In
probability theory, the central limit theorem (CLT) establishes that, in many situations, when
independent random variables are summed up, their properly
normalized sum tends toward a
normal distribution even if the original variables themselves are not normally distributed.
The theorem is a key concept in probability theory because it implies that probabilistic and statistical methods that work for normal distributions can be applicable to many problems involving other types of distributions.
This theorem has seen many changes during the formal development of probability theory. Previous versions of the theorem date back to 1811, but in its modern general form, this fundamental result in probability theory was precisely stated as late as 1920, thereby serving as a bridge between classical and modern probability theory.
If $X_1, X_2, \ldots, X_n$ are random samples drawn from a population with overall mean $\mu$ and finite variance $\sigma^2$, and if $\bar{X}_n$ is the sample mean of the first $n$ samples, then the limiting form of the distribution of $Z = \lim_{n\to\infty} \sqrt{n}\left(\frac{\bar{X}_n - \mu}{\sigma}\right)$ is a standard normal distribution.
For example, suppose that a sample is obtained containing many observations, each observation being randomly generated in a way that does not depend on the values of the other observations, and that the arithmetic mean of the observed values is computed. If this procedure is performed many times, the central limit theorem says that the probability distribution of the average will closely approximate a normal distribution.
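This repeated-sampling procedure is easy to sketch numerically. The following simulation is only an illustration: the exponential population, the sample size, and the repetition count are arbitrary choices, not part of the theorem. It draws many samples from a clearly non-normal distribution and checks that the sample means cluster the way the normal approximation predicts.

```python
import random
import statistics

random.seed(42)

n = 100              # observations per sample
repetitions = 5000   # how many times the sampling procedure is repeated

# Population: exponential with rate 1 (mean 1, variance 1) -- clearly non-normal.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(repetitions)
]

# The CLT predicts the means are approximately N(mu, sigma^2 / n) = N(1, 1/100),
# i.e. mean close to 1.0 and standard deviation close to 0.1.
print(statistics.fmean(sample_means))
print(statistics.stdev(sample_means))
```

Although each observation is exponentially distributed (heavily skewed), the histogram of the 5000 sample means is nearly symmetric and bell-shaped.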
The central limit theorem has several variants. In its common form, the random variables must be
independent and identically distributed (i.i.d.). In variants, convergence of the mean to the normal distribution also occurs for non-identical distributions or for non-independent observations, if they comply with certain conditions.
The earliest version of this theorem, that the normal distribution may be used as an approximation to the binomial distribution, is the de Moivre–Laplace theorem.
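As a rough illustration of the de Moivre–Laplace approximation, one can compare exact binomial probabilities with the matching normal density; the parameters $n = 100$, $p = 0.5$ below are arbitrary choices for the sketch.

```python
import math

def binom_pmf(k, n, p):
    """Exact binomial probability P(X = k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

n, p = 100, 0.5
mu, sigma = n * p, math.sqrt(n * p * (1 - p))  # mean 50, standard deviation 5

# Near the mean, the normal density closely matches the binomial pmf.
for k in (45, 50, 55):
    print(k, binom_pmf(k, n, p), normal_pdf(k, mu, sigma))
```

Near the mean the two values agree to within a few units in the fourth decimal place; the approximation is weakest far out in the tails.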
Independent sequences
Classical CLT
Let $\{X_1, \ldots, X_n, \ldots\}$ be a sequence of random samples, that is, a sequence of i.i.d. random variables drawn from a distribution of expected value given by $\mu$ and finite variance given by $\sigma^2$. Suppose we are interested in the sample average
$$\bar{X}_n \equiv \frac{X_1 + \cdots + X_n}{n}$$
of the first $n$ samples.
By the law of large numbers, the sample averages converge almost surely (and therefore also converge in probability) to the expected value $\mu$ as $n \to \infty$.

The classical central limit theorem describes the size and the distributional form of the stochastic fluctuations around the deterministic number $\mu$ during this convergence. More precisely, it states that as $n$ gets larger, the distribution of the difference between the sample average $\bar{X}_n$ and its limit $\mu$, when multiplied by the factor $\sqrt{n}$ (that is, $\sqrt{n}(\bar{X}_n - \mu)$), approximates the normal distribution with mean $0$ and variance $\sigma^2$. For large enough $n$, the distribution of $\bar{X}_n$ gets arbitrarily close to the normal distribution with mean $\mu$ and variance $\sigma^2/n$.
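The $\sqrt{n}$ scaling can be checked empirically: if the fluctuations $\sqrt{n}(\bar{X}_n - \mu)$ really stabilize at variance $\sigma^2$, their empirical standard deviation should stay near $\sigma$ as $n$ grows. A minimal sketch, in which the Uniform(0, 1) population and the particular values of $n$ are illustrative assumptions:

```python
import random
import statistics

random.seed(0)

mu, sigma = 0.5, (1 / 12) ** 0.5  # mean and std of Uniform(0, 1)

def scaled_fluctuations(n, reps=4000):
    """Sample sqrt(n) * (sample average - mu) many times."""
    return [
        n**0.5 * (statistics.fmean(random.random() for _ in range(n)) - mu)
        for _ in range(reps)
    ]

# The CLT predicts these fluctuations have std close to sigma (~0.2887)
# for every large n, even though n differs by a factor of 16 here.
for n in (25, 400):
    print(n, statistics.stdev(scaled_fluctuations(n)))
```

Without the $\sqrt{n}$ factor the fluctuations would shrink to zero (the law of large numbers); with it, they settle at a fixed normal distribution.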
The usefulness of the theorem is that the distribution of $\sqrt{n}(\bar{X}_n - \mu)$ approaches normality regardless of the shape of the distribution of the individual $X_i$. Formally, the theorem can be stated as follows:

Lindeberg–Lévy CLT. Suppose $\{X_1, \ldots, X_n, \ldots\}$ is a sequence of i.i.d. random variables with $\mathbb{E}[X_i] = \mu$ and $\operatorname{Var}[X_i] = \sigma^2 < \infty$. Then, as $n$ approaches infinity, the random variables $\sqrt{n}(\bar{X}_n - \mu)$ converge in distribution to a normal $\mathcal{N}(0, \sigma^2)$:
$$\sqrt{n}\left(\bar{X}_n - \mu\right) \,\xrightarrow{d}\, \mathcal{N}\left(0, \sigma^2\right).$$
In the case $\sigma > 0$, convergence in distribution means that the cumulative distribution functions of $\sqrt{n}(\bar{X}_n - \mu)$ converge pointwise to the cdf of the $\mathcal{N}(0, \sigma^2)$ distribution: for every real number $z$,
$$\lim_{n\to\infty} \Pr\left[\sqrt{n}(\bar{X}_n - \mu) \le z\right] = \Phi\left(\frac{z}{\sigma}\right),$$
where $\Phi(z)$ is the standard normal cdf evaluated at $z$. The convergence is uniform in $z$ in the sense that
$$\lim_{n\to\infty} \, \sup_{z\in\mathbb{R}} \left| \Pr\left[\sqrt{n}(\bar{X}_n - \mu) \le z\right] - \Phi\left(\frac{z}{\sigma}\right) \right| = 0,$$
where $\sup$ denotes the least upper bound (or supremum) of the set.
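This uniform (supremum) convergence can be observed directly. For a Binomial($n$, 1/2) sum the standardized cdf and its jump sizes are exactly computable, so the Kolmogorov distance $\sup_z |F_n(z) - \Phi(z/\sigma)|$ can be evaluated without simulation; the choice of the binomial family here is an illustrative assumption.

```python
import math

def phi(z):
    """Standard normal cdf."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def kolmogorov_distance(n, p=0.5):
    """sup_z |P(standardized Binomial(n, p) sum <= z) - Phi(z)|."""
    mu, sigma = n * p, math.sqrt(n * p * (1 - p))
    cdf, dist = 0.0, 0.0
    for k in range(n + 1):
        pmf = math.comb(n, k) * p**k * (1 - p) ** (n - k)
        z = (k - mu) / sigma
        # The cdf jumps by pmf at z; the sup is attained at such a jump,
        # so compare Phi(z) with the cdf just before and just after it.
        dist = max(dist, abs(cdf - phi(z)), abs(cdf + pmf - phi(z)))
        cdf += pmf
    return dist

# The supremum distance shrinks as n grows, as the theorem asserts.
for n in (10, 100, 1000):
    print(n, kolmogorov_distance(n))
```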
Lyapunov CLT
The theorem is named after Russian mathematician
Aleksandr Lyapunov. In this variant of the central limit theorem the random variables $X_i$ have to be independent, but not necessarily identically distributed. The theorem also requires that the random variables $|X_i|$ have moments of some order $(2+\delta)$ and that the rate of growth of these moments is limited by the Lyapunov condition given below.

Lyapunov CLT. Suppose $\{X_1, \ldots, X_n, \ldots\}$ is a sequence of independent random variables, each with finite expected value $\mu_i$ and variance $\sigma_i^2$. Define
$$s_n^2 = \sum_{i=1}^{n} \sigma_i^2.$$
If for some $\delta > 0$, Lyapunov's condition
$$\lim_{n\to\infty} \frac{1}{s_n^{2+\delta}} \sum_{i=1}^{n} \mathbb{E}\left[\left|X_i - \mu_i\right|^{2+\delta}\right] = 0$$
is satisfied, then a sum of $\frac{X_i - \mu_i}{s_n}$ converges in distribution to a standard normal random variable as $n$ goes to infinity:
$$\frac{1}{s_n} \sum_{i=1}^{n} \left(X_i - \mu_i\right) \,\xrightarrow{d}\, \mathcal{N}(0, 1).$$

In practice it is usually easiest to check Lyapunov's condition for $\delta = 1$.
If a sequence of random variables satisfies Lyapunov's condition, then it also satisfies Lindeberg's condition. The converse implication, however, does not hold.
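A concrete way to see Lyapunov's condition at work is to evaluate the ratio numerically for a specific non-identically-distributed family. The family $X_i \sim \mathrm{Uniform}(-i, i)$ used below is an illustrative assumption; for it, $\mu_i = 0$, $\sigma_i^2 = i^2/3$, and $\mathbb{E}|X_i|^3 = i^3/4$ exactly, so the $\delta = 1$ ratio can be computed in closed form.

```python
# Lyapunov ratio (delta = 1) for independent X_i ~ Uniform(-i, i):
# sum of E|X_i - mu_i|^3 divided by s_n^3, which should tend to 0.
def lyapunov_ratio(n):
    s_n_sq = sum(i * i / 3 for i in range(1, n + 1))        # s_n^2, exact variances
    third_moments = sum(i**3 / 4 for i in range(1, n + 1))  # exact third absolute moments
    return third_moments / s_n_sq**1.5

# The ratio decays roughly like 1/sqrt(n), so the condition holds.
for n in (10, 1000, 100000):
    print(n, lyapunov_ratio(n))
```

Since the ratio tends to zero, the standardized sums of this family converge to a standard normal distribution even though no two summands share the same distribution.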
Lindeberg CLT
In the same setting and with the same notation as above, the Lyapunov condition can be replaced with the following weaker one (from
Lindeberg in 1920).
Suppose that for every $\varepsilon > 0$
$$\lim_{n\to\infty} \frac{1}{s_n^2} \sum_{i=1}^{n} \mathbb{E}\left[(X_i - \mu_i)^2 \cdot \mathbf{1}_{\{|X_i - \mu_i| > \varepsilon s_n\}}\right] = 0,$$
where $\mathbf{1}_{\{\ldots\}}$ is the indicator function. Then the distribution of the standardized sums
$$\frac{1}{s_n} \sum_{i=1}^{n} \left(X_i - \mu_i\right)$$
converges towards the standard normal distribution $\mathcal{N}(0, 1)$.
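Lindeberg's condition can likewise be evaluated in closed form for simple families. For the illustrative independent family $X_i \sim \mathrm{Uniform}(-i, i)$ (an assumption for this sketch, with $\mu_i = 0$), the truncated second moment $\mathbb{E}[X_i^2 \, \mathbf{1}_{\{|X_i| > a\}}]$ equals $(i^3 - a^3)/(3i)$ when $a < i$ and $0$ otherwise, so the Lindeberg sum needs no simulation. The code fixes one $\varepsilon$; the condition requires the limit to be $0$ for every $\varepsilon > 0$.

```python
# Lindeberg sum for independent X_i ~ Uniform(-i, i), evaluated at one fixed eps.
def lindeberg_sum(n, eps=0.1):
    s_n = sum(i * i / 3 for i in range(1, n + 1)) ** 0.5  # sqrt of summed variances
    a = eps * s_n                                          # truncation threshold
    total = 0.0
    for i in range(1, n + 1):
        if a < i:
            # Exact truncated second moment of Uniform(-i, i): E[X^2 1{|X|>a}]
            total += (i**3 - a**3) / (3 * i)
    return total / s_n**2

# Since |X_i| <= i while eps * s_n grows like n^1.5, every summand is eventually
# truncated away and the sum drops to exactly 0.
for n in (100, 500, 2000):
    print(n, lindeberg_sum(n))
```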
Multidimensional CLT
Proofs that use characteristic functions can be extended to cases where each individual $\mathbf{X}_i$ is a random vector in $\mathbb{R}^k$, with mean vector $\boldsymbol{\mu} = \mathbb{E}[\mathbf{X}_i]$ and covariance matrix $\boldsymbol{\Sigma}$ (among the components of the vector), and these random vectors are independent and identically distributed.