In
statistics, a normal distribution or Gaussian distribution is a type of
continuous probability distribution for a
real-valued
random variable. The general form of its
probability density function
is
:$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}.$$
The parameter $\mu$ is the
mean
or
expectation
of the distribution (and also its
median and
mode), while the parameter
$\sigma$ is its
standard deviation. The
variance
of the distribution is $\sigma^2$. A random variable with a Gaussian distribution is said to be normally distributed, and is called a normal deviate.
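As an illustration, the density can be evaluated directly from its two parameters. The following is a minimal sketch, assuming the usual closed form $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}((x-\mu)/\sigma)^2}$; the function name `normal_pdf` and the sample values are illustrative choices, not part of any standard API.

```python
import math


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# The density is highest at the mean and symmetric about it.
peak = normal_pdf(10.0, mu=10.0, sigma=2.0)
left = normal_pdf(8.0, mu=10.0, sigma=2.0)
right = normal_pdf(12.0, mu=10.0, sigma=2.0)
```

Here `left` and `right` are evaluated one standard deviation either side of the mean, so they should agree with each other and be smaller than `peak`.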
Normal distributions are important in
statistics and are often used in the
natural
and
social sciences to represent real-valued
random variables whose distributions are not known. Their importance is partly due to the
central limit theorem. It states that, under some conditions, the average of many samples (observations) of a random variable with finite mean and variance is itself a random variable whose distribution converges to a normal distribution as the number of samples increases. Therefore, physical quantities that are expected to be the sum of many independent processes, such as
measurement errors, often have distributions that are nearly normal.
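The effect described by the central limit theorem can be observed in a small simulation. The sketch below is only an illustration under assumed settings: it averages draws from a uniform distribution on (0, 1), and the sample sizes, the seed, and the helper name `sample_means` are arbitrary choices.

```python
import random
import statistics


def sample_means(n_samples: int, n_means: int, rng: random.Random) -> list[float]:
    """Return n_means averages, each taken over n_samples uniform(0, 1) draws."""
    return [
        statistics.fmean(rng.random() for _ in range(n_samples))
        for _ in range(n_means)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
means = sample_means(n_samples=50, n_means=10_000, rng=rng)

# Uniform(0, 1) has mean 1/2 and variance 1/12, so the average of n draws
# has mean 1/2 and standard deviation sqrt(1 / (12 n)).
expected_sd = (1.0 / (12 * 50)) ** 0.5
observed_mean = statistics.fmean(means)
observed_sd = statistics.stdev(means)

# A normal distribution puts about 68.3% of its mass within one standard
# deviation of the mean; the simulated averages should come close to that.
within_one_sd = sum(abs(m - 0.5) <= expected_sd for m in means) / len(means)
```

Although each individual draw is uniform rather than bell-shaped, the fraction of averages falling within one standard deviation of 1/2 should be close to the normal value.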
Moreover, Gaussian distributions have some unique properties that are valuable in analytic studies. For instance, any linear combination of a fixed collection of independent normal deviates is a normal deviate. Many results and methods, such as
propagation of uncertainty
and
least squares
parameter fitting, can be derived analytically in explicit form when the relevant variables are normally distributed.
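The linear-combination property can be checked numerically for its first two moments. The sketch below assumes two independent normal deviates $X$ and $Y$ (independence is an assumption made here), for which $aX + bY$ has mean $a\mu_X + b\mu_Y$ and variance $a^2\sigma_X^2 + b^2\sigma_Y^2$. The coefficients and parameters are arbitrary illustrative values, and the check covers only the mean and variance, not the full shape of the distribution.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable
a, b = 2.0, -3.0
mu_x, sigma_x = 1.0, 0.5
mu_y, sigma_y = -2.0, 1.5

# Draw independent X and Y and form the linear combination a*X + b*Y.
combo = [
    a * rng.gauss(mu_x, sigma_x) + b * rng.gauss(mu_y, sigma_y)
    for _ in range(50_000)
]

expected_mean = a * mu_x + b * mu_y                    # 8.0
expected_var = a**2 * sigma_x**2 + b**2 * sigma_y**2   # 21.25
observed_mean = statistics.fmean(combo)
observed_var = statistics.variance(combo)
```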
A normal distribution is sometimes informally called a bell curve.
However, many other distributions are bell-shaped (such as the
Cauchy,
Student's ''t'', and
logistic distributions). For other names, see
Naming.
The univariate probability distribution is generalized for vectors in the
multivariate normal distribution
and for matrices in the
matrix normal distribution.
Definitions
Standard normal distribution
The simplest case of a normal distribution is known as the ''standard normal distribution'' or ''unit normal distribution''. This is a special case when $\mu = 0$ and $\sigma = 1$, and it is described by this
probability density function
(or density):
:$$\varphi(z) = \frac{e^{-z^2/2}}{\sqrt{2\pi}}.$$
The variable $z$ has a mean of 0 and a variance and standard deviation of 1. The density $\varphi(z)$ has its peak $1/\sqrt{2\pi}$ at $z = 0$ and inflection points at $z = +1$ and $z = -1$.
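The location of the peak and of the inflection points follows from differentiating the density twice. A short derivation, assuming the standard form $\varphi(z) = e^{-z^2/2}/\sqrt{2\pi}$:

```latex
\varphi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \qquad
\varphi'(z) = -z\,\varphi(z), \qquad
\varphi''(z) = (z^2 - 1)\,\varphi(z).
```

Since $\varphi(z) > 0$ everywhere, $\varphi'$ vanishes only at $z = 0$, where the density takes the value $1/\sqrt{2\pi} \approx 0.3989$, and $\varphi''$ changes sign exactly at $z = \pm 1$.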
Although the density above is most commonly known as the ''standard normal,'' a few authors have used that term to describe other versions of the normal distribution.
Carl Friedrich Gauss, for example, once defined the standard normal as
:$$\varphi(z) = \frac{e^{-z^2}}{\sqrt{\pi}},$$
which has a variance of 1/2, and
Stephen Stigler once defined the standard normal as
:$$\varphi(z) = e^{-\pi z^2},$$
which has a simple functional form and a variance of $\sigma^2 = 1/(2\pi)$.
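The variances of these alternative forms can be checked by numerical integration. The sketch below assumes the three densities are $e^{-z^2/2}/\sqrt{2\pi}$, $e^{-z^2}/\sqrt{\pi}$ and $e^{-\pi z^2}$. The helper `variance_of_density`, the midpoint rule, and the integration range are illustrative choices.

```python
import math


def variance_of_density(density, half_width: float = 12.0, steps: int = 200_000) -> float:
    """Approximate the variance of a zero-mean density with the midpoint rule."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        z = -half_width + (i + 0.5) * h
        total += z * z * density(z) * h
    return total


usual = variance_of_density(lambda z: math.exp(-z * z / 2) / math.sqrt(2 * math.pi))
gauss = variance_of_density(lambda z: math.exp(-z * z) / math.sqrt(math.pi))
stigler = variance_of_density(lambda z: math.exp(-math.pi * z * z))
```

Under these assumptions the three results should be close to 1, 1/2 and $1/(2\pi) \approx 0.159$ respectively.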
General normal distribution
Every normal distribution is a version of the standard normal distribution, whose domain has been stretched by a factor $\sigma$ (the standard deviation) and then translated by $\mu$ (the mean value):
:$$f(x \mid \mu, \sigma^2) = \frac{1}{\sigma}\,\varphi\!\left(\frac{x-\mu}{\sigma}\right).$$
The probability density must be scaled by $1/\sigma$ so that the integral is still 1.
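The need for the scaling factor can be seen from a change of variables. Assuming $\sigma > 0$ and writing $\varphi$ for the standard normal density, substitute $z = (x - \mu)/\sigma$:

```latex
\int_{-\infty}^{\infty} \frac{1}{\sigma}\,\varphi\!\left(\frac{x-\mu}{\sigma}\right) dx
  = \int_{-\infty}^{\infty} \frac{1}{\sigma}\,\varphi(z)\,\sigma\,dz
  = \int_{-\infty}^{\infty} \varphi(z)\,dz
  = 1,
\qquad z = \frac{x-\mu}{\sigma},\quad dx = \sigma\,dz .
```

Without the factor $1/\sigma$ the stretched density would integrate to $\sigma$ rather than 1.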
If $Z$ is a standard normal deviate, then $X = \sigma Z + \mu$ will have a normal distribution with expected value $\mu$ and standard deviation $\sigma$. This is equivalent to saying that the "standard" normal distribution $Z$ can be scaled/stretched by a factor of $\sigma$ and shifted by $\mu$ to yield a different normal distribution, called $X$. Conversely, if $X$ is a normal deviate with parameters $\mu$ and $\sigma^2$, then this $X$ distribution can be re-scaled and shifted via the formula $Z = (X - \mu)/\sigma$ to convert it to the "standard" normal distribution. This variate is also called the standardized form of $X$.
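The two conversions can be written as a pair of small functions. This is a minimal sketch, assuming the relations $Z = (X - \mu)/\sigma$ and $X = \sigma Z + \mu$; the function names and the example values (mean 100, standard deviation 15) are hypothetical.

```python
def standardize(x: float, mu: float, sigma: float) -> float:
    """Map a value of X ~ N(mu, sigma^2) to its standard score Z = (X - mu) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


def unstandardize(z: float, mu: float, sigma: float) -> float:
    """Map a standard score Z back to X = sigma * Z + mu."""
    return sigma * z + mu


z = standardize(130.0, mu=100.0, sigma=15.0)   # two standard deviations above the mean
x = unstandardize(z, mu=100.0, sigma=15.0)     # recovers the original value
```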
Notation
The probability density of the standard Gaussian distribution (standard normal distribution, with zero mean and unit variance) is often denoted with the Greek letter $\phi$ (phi). The alternative form of the Greek letter phi, $\varphi$, is also used quite often.
The normal distribution is often referred to as $N(\mu, \sigma^2)$ or $\mathcal{N}(\mu, \sigma^2)$. Thus when a random variable $X$ is normally distributed with mean $\mu$ and standard deviation $\sigma$, one may write
:$$X \sim \mathcal{N}(\mu, \sigma^2).$$
Alternative parameterizations
Some authors advocate using the
precision $\tau$ as the parameter defining the width of the distribution, instead of the deviation $\sigma$ or the variance $\sigma^2$. The precision is normally defined as the reciprocal of the variance, $\tau = 1/\sigma^2$. The formula for the distribution then becomes
:$$f(x) = \sqrt{\frac{\tau}{2\pi}}\, e^{-\tau (x-\mu)^2/2}.$$
This choice is claimed to have advantages in numerical computations when $\sigma$ is very close to zero, and simplifies formulas in some contexts, such as in the
Bayesian inference of variables with
multivariate normal distribution.
Alternatively, the reciprocal of the standard deviation $\tau' = 1/\sigma$ might be defined as the ''precision'', in which case the expression of the normal distribution becomes
:$$f(x) = \frac{\tau'}{\sqrt{2\pi}}\, e^{-(\tau')^2 (x-\mu)^2/2}.$$
According to Stigler, this formulation is advantageous because of a much simpler and easier-to-remember formula, and simple approximate formulas for the
quantiles of the distribution.
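The parameterizations describe the same density, which can be confirmed numerically. The sketch below assumes $\tau = 1/\sigma^2$ for the precision and $\tau' = 1/\sigma$ for the reciprocal standard deviation; the function names and test values are illustrative.

```python
import math


def pdf_from_sigma(x: float, mu: float, sigma: float) -> float:
    """Density written in terms of the standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def pdf_from_precision(x: float, mu: float, tau: float) -> float:
    """Density written in terms of the precision tau = 1 / sigma**2."""
    return math.sqrt(tau / (2 * math.pi)) * math.exp(-0.5 * tau * (x - mu) ** 2)


def pdf_from_inverse_sd(x: float, mu: float, tau_prime: float) -> float:
    """Density written in terms of tau_prime = 1 / sigma."""
    return tau_prime / math.sqrt(2 * math.pi) * math.exp(-0.5 * (tau_prime * (x - mu)) ** 2)


mu, sigma, x = 1.0, 0.25, 1.1
by_sigma = pdf_from_sigma(x, mu, sigma)
by_precision = pdf_from_precision(x, mu, 1.0 / sigma**2)
by_inverse_sd = pdf_from_inverse_sd(x, mu, 1.0 / sigma)
```

All three values should agree up to floating-point rounding.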
Normal distributions form an
exponential family
with
natural parameters
$\theta_1 = \mu/\sigma^2$ and $\theta_2 = -1/(2\sigma^2)$, and natural statistics $x$ and $x^2$. The dual expectation parameters for the normal distribution are $\eta_1 = \mu$ and $\eta_2 = \mu^2 + \sigma^2$.
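The exponential-family form can be read off by expanding the square in the exponent of the density. A short derivation, assuming the standard density with mean $\mu$ and variance $\sigma^2$, and using $A$ as a label (introduced here) for the log-normalizer:

```latex
f(x) = \exp\!\left( \frac{\mu}{\sigma^2}\,x \;-\; \frac{1}{2\sigma^2}\,x^2
       \;-\; \frac{\mu^2}{2\sigma^2} \;-\; \frac{1}{2}\ln\!\left(2\pi\sigma^2\right) \right)
     = \exp\!\bigl( \theta_1 x + \theta_2 x^2 - A(\theta_1, \theta_2) \bigr),

\theta_1 = \frac{\mu}{\sigma^2}, \qquad
\theta_2 = -\frac{1}{2\sigma^2}, \qquad
A(\theta_1, \theta_2) = -\frac{\theta_1^2}{4\theta_2} + \frac{1}{2}\ln\!\left(-\frac{\pi}{\theta_2}\right).
```

The expectation parameters are the means of the natural statistics, $\operatorname{E}[x] = \mu$ and $\operatorname{E}[x^2] = \mu^2 + \sigma^2$.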
Cumulative distribution functions
The
cumulative distribution function
(CDF) of the standard normal distribution, usually denoted with the capital Greek letter $\Phi$ (phi), is the integral
:$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt.$$
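This integral has no elementary closed form, but it can be evaluated through the error function using the identity $\Phi(x) = \tfrac{1}{2}\left[1 + \operatorname{erf}\!\left(x/\sqrt{2}\right)\right]$. The following is a minimal sketch using `math.erf` from the Python standard library; the function names are illustrative.

```python
import math


def standard_normal_cdf(x: float) -> float:
    """Phi(x), computed as 0.5 * (1 + erf(x / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """CDF of N(mu, sigma^2), obtained by standardizing x first."""
    return standard_normal_cdf((x - mu) / sigma)


# Probability that a standard normal deviate lies within one standard
# deviation of its mean.
p_within_one_sd = standard_normal_cdf(1.0) - standard_normal_cdf(-1.0)
```

The value of `p_within_one_sd` is approximately 0.6827.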
The related
error function
$\operatorname{erf}(x)$ gives the probability of a random variable, with normal distribution of mean 0 and variance 1/2, falling in the range $[-x, x]$