The likelihood function (often simply called the likelihood) represents the probability of random variable realizations conditional on particular values of the statistical parameters. Thus, when evaluated on a given sample, the likelihood function indicates which parameter values are more ''likely'' than others, in the sense that they would have made the observed data more probable. Consequently, the likelihood is often written as $\mathcal{L}(\theta \mid X)$ instead of $P(X \mid \theta)$, to emphasize that it is to be understood as a function of the parameters $\theta$ rather than of the random variable $X$.
In maximum likelihood estimation, the arg max of the likelihood function serves as a point estimate for $\theta$, while the local curvature of the likelihood (approximated by its Hessian matrix) indicates the estimate's precision. Meanwhile, in Bayesian statistics, parameter estimates are derived from the converse of the likelihood, the so-called posterior probability, which is calculated via Bayes' rule.
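As a concrete numerical sketch of maximum likelihood estimation (the binomial model, the counts, and the grid below are illustrative assumptions, not taken from the text), one can evaluate the likelihood on a grid of parameter values and read off the arg max as the point estimate:

```python
import numpy as np
from math import comb

# Hypothetical data: k = 7 heads observed in n = 10 independent tosses.
n, k = 10, 7

# Grid of candidate parameter values theta = P(heads).
theta = np.linspace(0.001, 0.999, 999)

# Binomial likelihood L(theta | k) = C(n, k) * theta^k * (1 - theta)^(n - k),
# viewed as a function of theta with the data (n, k) held fixed.
likelihood = comb(n, k) * theta**k * (1 - theta) ** (n - k)

# The arg max of the likelihood serves as the point estimate of theta.
theta_hat = theta[np.argmax(likelihood)]
print(theta_hat)  # ~0.7 = k/n, matching the analytic binomial MLE
```

The grid search here only mirrors the definition; in practice the binomial MLE $k/n$ is obtained analytically or by numerical optimization.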
Definition
The likelihood function, parameterized by a (possibly multivariate) parameter $\theta$, is usually defined differently for discrete and continuous probability distributions (a more general definition is discussed below). Given a probability density or mass function
: $x \mapsto f(x \mid \theta),$
where $x$ is a realization of the random variable $X$, the likelihood function is
: $\theta \mapsto f(x \mid \theta),$
often written
: $\mathcal{L}(\theta \mid x).$
In other words, when $f(x \mid \theta)$ is viewed as a function of $x$ with $\theta$ fixed, it is a probability density function, and when viewed as a function of $\theta$ with $x$ fixed, it is a likelihood function. The likelihood function does ''not'' specify the probability that $\theta$ is the truth, given the observed sample $X = x$. Such an interpretation is a common error, with potentially disastrous consequences (see prosecutor's fallacy).
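This dual reading can be made concrete with a small sketch (Python with SciPy; the normal model with known scale and the particular numbers are assumptions for illustration): the same density formula is evaluated once as a function of $x$ with the parameter fixed, and once as a function of the parameter with $x$ fixed.

```python
from scipy.stats import norm

# Normal density f(x | mu) with known standard deviation 1.
# Viewed as a function of x with mu = 0 fixed: a probability density.
def density_at(x):
    return norm.pdf(x, loc=0.0, scale=1.0)

print(density_at(0.5))  # density of the observation x = 0.5 under mu = 0

# The same formula viewed as a function of mu with x = 0.5 fixed:
# the likelihood of each candidate mean given that observation.
def likelihood_of(mu):
    return norm.pdf(0.5, loc=mu, scale=1.0)

print(likelihood_of(0.0), likelihood_of(0.5))  # peaks at mu = x = 0.5
```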
Discrete probability distribution
Let $X$ be a discrete random variable with probability mass function $p$ depending on a parameter $\theta$. Then the function
: $\mathcal{L}(\theta \mid x) = p_\theta(x) = P_\theta(X = x),$
considered as a function of $\theta$, is the ''likelihood function'', given the outcome $x$ of the random variable $X$. Sometimes the probability of "the value $x$ of $X$ for the parameter value $\theta$" is written as $P(X = x \mid \theta)$ or $P(X = x; \theta)$. The likelihood is the probability that a particular outcome $x$ is observed when the true value of the parameter is $\theta$, equivalent to the probability mass on $x$; it is ''not'' a probability density over the parameter $\theta$. The likelihood, $\mathcal{L}(\theta \mid x)$, should not be confused with $P(\theta \mid x)$, which is the posterior probability of $\theta$ given the data $x$.
Given no event (no data), the likelihood is 1; any non-trivial event will have a lower likelihood.
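A minimal sketch of this discrete case (Python with SciPy; the Poisson model and the observed count are assumptions chosen for illustration) evaluates the probability mass at a fixed observed count as a function of the parameter:

```python
import numpy as np
from scipy.stats import poisson

x_obs = 4                            # a hypothetical observed count
theta = np.linspace(0.5, 10.0, 96)   # grid of candidate Poisson means

# The pmf at the fixed outcome x_obs, read as a function of theta,
# is the likelihood L(theta | x_obs) = P_theta(X = x_obs).
likelihood = poisson.pmf(x_obs, mu=theta)

print(theta[np.argmax(likelihood)])  # ~4.0: the Poisson MLE equals x_obs
```

Note that `likelihood` sums to no particular value over the `theta` grid; it is a probability mass in $x$, not in $\theta$.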
Example
Consider a simple statistical model of a coin flip: a single parameter $p_\text{H}$ that expresses the "fairness" of the coin. The parameter is the probability that the coin lands heads up ("H") when tossed. $p_\text{H}$ can take on any value within the range 0.0 to 1.0. For a perfectly fair coin, $p_\text{H} = 0.5$.

Imagine flipping a fair coin twice, and observing two heads in two tosses ("HH"). Assuming that each successive coin flip is i.i.d., the probability of observing HH is
: $P(\text{HH} \mid p_\text{H} = 0.5) = 0.5^2 = 0.25.$
Equivalently, the likelihood at $p_\text{H} = 0.5$, given that "HH" was observed, is 0.25:
: $\mathcal{L}(p_\text{H} = 0.5 \mid \text{HH}) = 0.25.$
This is not the same as saying that $P(p_\text{H} = 0.5 \mid \text{HH}) = 0.25$, a conclusion which could only be reached via Bayes' theorem given knowledge about the marginal probabilities $P(p_\text{H} = 0.5)$ and $P(\text{HH})$.
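Spelled out, the Bayes' theorem step alluded to here has the following form; the prior $P(p_\text{H} = 0.5)$ and the marginal $P(\text{HH})$ are precisely the quantities that the likelihood alone does not supply:
: $P(p_\text{H} = 0.5 \mid \text{HH}) = \dfrac{P(\text{HH} \mid p_\text{H} = 0.5)\, P(p_\text{H} = 0.5)}{P(\text{HH})}.$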
Now suppose that the coin is not a fair coin, but instead that $p_\text{H} = 0.3$. Then the probability of two heads on two flips is
: $P(\text{HH} \mid p_\text{H} = 0.3) = 0.3^2 = 0.09.$
Hence
: $\mathcal{L}(p_\text{H} = 0.3 \mid \text{HH}) = 0.09.$
More generally, for each value of $p_\text{H}$, we can calculate the corresponding likelihood. The result of such calculations is displayed in Figure 1. Note that the integral of $\mathcal{L}$ over [0, 1] is 1/3; likelihoods need not integrate or sum to one over the parameter space.
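The likelihood curve and the 1/3 integral can be checked numerically; the following is a minimal sketch (Python with NumPy; the grid resolution is an arbitrary choice):

```python
import numpy as np

# Likelihood of p_H given the observation "HH": L(p_H | HH) = p_H ** 2.
p = np.linspace(0.0, 1.0, 101)  # grid over the parameter space [0, 1]
likelihood = p**2

print(likelihood[50])  # p_H = 0.5 -> 0.25, the fair-coin value above
print(likelihood[30])  # p_H = 0.3 -> 0.09

# Left Riemann sum approximating the integral of p^2 over [0, 1]:
# the exact value is 1/3, so the likelihood is not a density in p_H.
print((likelihood[:-1] * np.diff(p)).sum())  # ~0.328, i.e. about 1/3
```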
Continuous probability distribution
Let $X$ be a random variable following an absolutely continuous probability distribution with density function $f$ (a function of $x$) which depends on a parameter $\theta$. Then the function
: $\mathcal{L}(\theta \mid x) = f_\theta(x),$
considered as a function of $\theta$, is the ''likelihood function'' (of $\theta$, given the outcome $x$ of $X$). Again, note that $\mathcal{L}$ is not a probability density or mass function over $\theta$, despite being a function of $\theta$ given the observation $x$.
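For a concrete instance of the continuous case (the exponential model and the observed value below are assumptions chosen for illustration), the density $f(x \mid \lambda) = \lambda e^{-\lambda x}$ evaluated at a fixed observation becomes a likelihood over the rate $\lambda$:

```python
import numpy as np

x_obs = 2.0                        # a single hypothetical observation
lam = np.linspace(0.01, 3.0, 300)  # grid of candidate rate parameters

# Exponential density f(x | lam) = lam * exp(-lam * x), read as a
# function of lam with x fixed at the observation: the likelihood.
likelihood = lam * np.exp(-lam * x_obs)

# The curve peaks at lam = 1 / x_obs, but it does not integrate to one
# over the lam axis, so it is not a probability density in lam.
print(lam[np.argmax(likelihood)])  # ~0.5 = 1 / x_obs
```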
Relationship between the likelihood and probability density functions
The use of the probability density in specifying the likelihood function above is justified as follows. Given an observation $x_j$, the likelihood for the interval