Likelihood-ratio test

In statistics, the likelihood-ratio test assesses the goodness of fit of two competing statistical models based on the ratio of their likelihoods, specifically one found by maximization over the entire parameter space and another found after imposing some constraint. If the constraint (i.e., the null hypothesis) is supported by the observed data, the two likelihoods should not differ by more than sampling error. Thus the likelihood-ratio test tests whether this ratio is significantly different from one, or equivalently whether its natural logarithm is significantly different from zero.

The likelihood-ratio test, also known as the Wilks test, is the oldest of the three classical approaches to hypothesis testing, together with the Lagrange multiplier test and the Wald test. In fact, the latter two can be conceptualized as approximations to the likelihood-ratio test, and are asymptotically equivalent. In the case of comparing two models each of which has no unknown parameters, use of the likelihood-ratio test can be justified by the Neyman–Pearson lemma. The lemma demonstrates that the test has the highest power among all competitors.


Definition


General

Suppose that we have a statistical model with parameter space \Theta. A null hypothesis is often stated by saying that the parameter \theta is in a specified subset \Theta_0 of \Theta. The alternative hypothesis is thus that \theta is in the complement of \Theta_0, i.e. in \Theta \setminus \Theta_0, which is denoted by \Theta_0^\complement. The likelihood ratio test statistic for the null hypothesis H_0 \, : \, \theta \in \Theta_0 is given by:

:\lambda_\text{LR} = -2 \ln \left[ \frac{ \sup_{\theta \in \Theta_0} \mathcal{L}(\theta) }{ \sup_{\theta \in \Theta} \mathcal{L}(\theta) } \right]

where the quantity inside the brackets is called the likelihood ratio. Here, the \sup notation refers to the supremum. As all likelihoods are positive, and as the constrained maximum cannot exceed the unconstrained maximum, the likelihood ratio is bounded between zero and one.

Often the likelihood-ratio test statistic is expressed as a difference between the log-likelihoods

:\lambda_\text{LR} = -2 \left[ \ell( \theta_0 ) - \ell( \hat{\theta} ) \right]

where

:\ell( \hat{\theta} ) \equiv \ln \left[ \sup_{\theta \in \Theta} \mathcal{L}(\theta) \right]

is the logarithm of the maximized likelihood function \mathcal{L}, and \ell(\theta_0) is the maximal value in the special case that the null hypothesis is true (but not necessarily a value that maximizes \mathcal{L} for the sampled data), and

:\theta_0 \in \Theta_0 \qquad \text{and} \qquad \hat{\theta} \in \Theta

denote the respective arguments of the maxima and the allowed ranges in which they are embedded. Multiplying by −2 ensures mathematically that (by Wilks' theorem) \lambda_\text{LR} converges asymptotically to being \chi^2-distributed if the null hypothesis happens to be true. The finite-sample distributions of likelihood-ratio tests are generally unknown.

The likelihood-ratio test requires that the models be nested – i.e. the more complex model can be transformed into the simpler model by imposing constraints on the former's parameters. Many common test statistics are tests for nested models and can be phrased as log-likelihood ratios or approximations thereof: e.g. the ''Z''-test, the ''F''-test, the ''G''-test, and Pearson's chi-squared test; for an illustration with the one-sample ''t''-test, see below. If the models are not nested, then instead of the likelihood-ratio test, there is a generalization of the test that can usually be used: for details, see ''relative likelihood''.
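To make the definition concrete, here is a minimal sketch in Python (assuming NumPy and SciPy are available; the Poisson model, the data, and the null rate are invented for illustration) that computes \lambda_\text{LR} for a null hypothesis fixing a single parameter:

```python
import numpy as np
from scipy import stats

# Hypothetical data: Poisson counts; test H0: rate = 3 against the
# unconstrained alternative (rate free). Values are illustrative only.
x = np.array([2, 4, 3, 5, 3, 2, 4, 6, 3, 2])
rate_0 = 3.0          # value fixed under the null hypothesis
rate_hat = x.mean()   # unconstrained MLE of a Poisson rate

# Log-likelihoods at the constrained and unconstrained maxima
ell_0 = stats.poisson.logpmf(x, rate_0).sum()
ell_hat = stats.poisson.logpmf(x, rate_hat).sum()

# Test statistic: lambda_LR = -2 [ ell(theta_0) - ell(theta_hat) ]
lambda_lr = -2.0 * (ell_0 - ell_hat)

# By Wilks' theorem (see below), asymptotically chi-squared with
# df = difference in dimensionality (here 1 free parameter vs. 0)
p_value = stats.chi2.sf(lambda_lr, df=1)
print(f"lambda_LR = {lambda_lr:.4f}, p = {p_value:.4f}")
```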


Case of simple hypotheses

A simple-vs.-simple hypothesis test has completely specified models under both the null hypothesis and the alternative hypothesis, which for convenience are written in terms of fixed values of a notional parameter \theta:

:\begin{align} H_0 &: \theta = \theta_0, \\ H_1 &: \theta = \theta_1. \end{align}

In this case, under either hypothesis, the distribution of the data is fully specified: there are no unknown parameters to estimate. For this case, a variant of the likelihood-ratio test is available:

:\Lambda(x) = \frac{ \mathcal{L}(\theta_0 \mid x) }{ \mathcal{L}(\theta_1 \mid x) }

Some older references may use the reciprocal of the function above as the definition. Thus, the likelihood ratio is small if the alternative model is better than the null model.

The likelihood-ratio test provides the decision rule as follows:

:If ~\Lambda > c~, do not reject H_0;
:If ~\Lambda < c~, reject H_0;
:If ~\Lambda = c~, reject H_0 with probability ~q~.

The values c and q are usually chosen to obtain a specified significance level \alpha, via the relation

:q \, \operatorname{P}(\Lambda = c \mid H_0) + \operatorname{P}(\Lambda < c \mid H_0) = \alpha.

The Neyman–Pearson lemma states that this likelihood-ratio test is the most powerful among all level-\alpha tests for this case.
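A minimal sketch of the simple-vs.-simple case in Python (assuming NumPy and SciPy; the two hypothesized normal distributions, the data, and the threshold c are all invented for illustration):

```python
import numpy as np
from scipy import stats

# Two fully specified hypotheses about normally distributed data
# (values illustrative): H0: N(0, 1) versus H1: N(1, 1).
x = np.array([0.9, 1.3, 0.2, 1.1, 0.7])   # hypothetical sample

# Likelihood ratio Lambda(x) = L(theta_0 | x) / L(theta_1 | x)
like_0 = stats.norm.pdf(x, loc=0.0, scale=1.0).prod()
like_1 = stats.norm.pdf(x, loc=1.0, scale=1.0).prod()
Lambda = like_0 / like_1

# In practice c is calibrated from the null distribution of Lambda to
# achieve a chosen significance level; here it is an arbitrary placeholder.
c = 1.0
print("reject H0" if Lambda < c else "do not reject H0")
```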


Interpretation

The likelihood ratio is a function of the data x; therefore, it is a statistic, although unusual in that the statistic's value depends on a parameter, \theta. The likelihood-ratio test rejects the null hypothesis if the value of this statistic is too small. How small is too small depends on the significance level of the test, i.e. on what probability of Type I error is considered tolerable (Type I errors consist of the rejection of a null hypothesis that is true).

The numerator corresponds to the likelihood of an observed outcome under the null hypothesis. The denominator corresponds to the maximum likelihood of an observed outcome, varying parameters over the whole parameter space. The numerator of this ratio is less than the denominator; so, the likelihood ratio is between 0 and 1. Low values of the likelihood ratio mean that the observed result was much less likely to occur under the null hypothesis as compared to the alternative. High values of the statistic mean that the observed outcome was nearly as likely to occur under the null hypothesis as under the alternative, and so the null hypothesis cannot be rejected.


An example

The following example is adapted and abridged from the literature. Suppose that we have a random sample, of size n, from a population that is normally distributed. Both the mean, \mu, and the standard deviation, \sigma, of the population are unknown. We want to test whether the mean is equal to a given value, \mu_0. Thus, our null hypothesis is H_0: \mu = \mu_0 and our alternative hypothesis is H_1: \mu \neq \mu_0. The likelihood function is

:\mathcal{L}(\mu,\sigma \mid x) = \left(2\pi\sigma^2\right)^{-n/2} \exp\left( -\sum_{i=1}^n \frac{(x_i - \mu)^2}{2\sigma^2} \right)\,.

With some calculation (omitted here), it can then be shown that

:\lambda = \left(1 + \frac{t^2}{n-1}\right)^{-n/2}

where t is the ''t''-statistic with n-1 degrees of freedom. Hence we may use the known exact distribution of t_{n-1} to draw inferences.
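As a numerical check of the identity above, the following sketch (Python, assuming NumPy and SciPy; the sample and \mu_0 are invented for illustration) computes the likelihood ratio \lambda both directly from the two maximized likelihoods and via the ''t''-statistic formula:

```python
import numpy as np
from scipy import stats

# Hypothetical sample and null mean (both invented for illustration)
x = np.array([4.2, 5.1, 3.8, 4.9, 5.5, 4.4, 4.7, 5.0])
mu_0 = 4.0
n = len(x)

# Likelihood ratio computed directly from the two maximized likelihoods:
# under H0 the MLE of sigma^2 is the mean squared deviation about mu_0;
# unrestricted, it is the mean squared deviation about the sample mean.
s2_null = np.mean((x - mu_0) ** 2)
s2_alt = np.mean((x - x.mean()) ** 2)
lam_direct = (s2_alt / s2_null) ** (n / 2)

# The same quantity via the t-statistic formula from the text
t = (x.mean() - mu_0) / (x.std(ddof=1) / np.sqrt(n))
lam_formula = (1 + t ** 2 / (n - 1)) ** (-n / 2)

print(lam_direct, lam_formula)   # the two values agree

# Exact inference uses the t distribution with n - 1 degrees of freedom
p_value = 2 * stats.t.sf(abs(t), df=n - 1)
print(f"t = {t:.4f}, p = {p_value:.4f}")
```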


Asymptotic distribution: Wilks’ theorem

If the distribution of the likelihood ratio corresponding to a particular null and alternative hypothesis can be explicitly determined, then it can directly be used to form decision regions (to sustain or reject the null hypothesis). In most cases, however, the exact distribution of the likelihood ratio corresponding to specific hypotheses is very difficult to determine. Assuming H_0 is true, there is a fundamental result by Samuel S. Wilks: as the sample size n approaches \infty, the test statistic \lambda_\text{LR} defined above will be asymptotically chi-squared distributed (\chi^2) with degrees of freedom equal to the difference in dimensionality of \Theta and \Theta_0. This implies that for a great variety of hypotheses, we can calculate the likelihood ratio \lambda for the data and then compare the observed \lambda_\text{LR} to the \chi^2 value corresponding to a desired statistical significance as an ''approximate'' statistical test. Other extensions exist.
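A small simulation can illustrate the theorem. The following sketch (Python, assuming NumPy and SciPy; the exponential model and all settings are invented for illustration) draws repeated samples under the null and checks that the rejection rate at the \chi^2 cutoff is near the nominal level:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Monte Carlo illustration of Wilks' theorem (settings invented):
# exponential model with rate theta; H0 fixes theta at its true value 1.
# Under H0, lambda_LR should be approximately chi-squared with 1 df.
n, reps, theta_0 = 200, 5_000, 1.0

def log_lik(theta, x):
    # Log-likelihood of an exponential(rate=theta) sample
    return len(x) * np.log(theta) - theta * x.sum()

lr_stats = np.empty(reps)
for i in range(reps):
    x = rng.exponential(scale=1 / theta_0, size=n)
    theta_hat = 1 / x.mean()  # unconstrained MLE of the rate
    lr_stats[i] = -2 * (log_lik(theta_0, x) - log_lik(theta_hat, x))

# Rejection rate at the 5% chi-squared(1) cutoff should be close to 0.05
print(np.mean(lr_stats > stats.chi2.ppf(0.95, df=1)))
```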


See also

* Akaike information criterion
* Bayes factor
* Johansen test
* Model selection
* Vuong's closeness test
* Sup-LR test
* Error exponents in hypothesis testing


External links


* Practical application of likelihood ratio test described
* R Package: Wald's Sequential Probability Ratio Test
* Online Clinical Calculator