Normally distributed and uncorrelated does not imply independent

Students of statistics and probability theory sometimes develop misconceptions about the normal distribution, ideas that may seem plausible but are mathematically untrue. For example, it is sometimes mistakenly thought that two linearly uncorrelated, normally distributed random variables must be statistically independent. However, this is untrue, as can be demonstrated by counterexample. Likewise, it is sometimes mistakenly thought that a linear combination of normally distributed random variables will itself be normally distributed, but again, counterexamples prove this wrong.

To say that the pair (X,Y) of random variables has a bivariate normal distribution means that every linear combination aX+bY of X and Y for constant (i.e. not random) coefficients a and b (not both equal to zero) has a univariate normal distribution. In that case, if X and Y are uncorrelated then they are independent. However, it is possible for two random variables X and Y to be so distributed jointly that each one alone is marginally normally distributed, and they are uncorrelated, but they are not independent; examples are given below.


Examples


A symmetric example

Suppose X has a normal distribution with expected value 0 and variance 1. Let W have the Rademacher distribution, so that W=1 or W=-1, each with probability 1/2, and assume W is independent of X. Let Y=WX. Then X and Y are uncorrelated, as can be verified by calculating their covariance. Moreover, both have the same normal distribution. And yet, X and Y are not independent.

To see that X and Y are not independent, observe that |Y|=|X|, or that \Pr(Y > 1 \mid -1/2 < X < 1/2) = 0 whereas \Pr(Y > 1) = \Pr(X > 1) > 0; if X and Y were independent, conditioning on X could not change the probability that Y > 1.

Finally, the distribution of the simple linear combination X+Y concentrates positive probability at 0: \Pr(X+Y=0) = 1/2, since X+Y=0 whenever W=-1. Therefore, the random variable X+Y is not normally distributed, and so also X and Y are not jointly normally distributed (by the definition above).
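A minimal Monte Carlo sketch in Python (assuming NumPy; the sample size and seed are arbitrary) makes the three claims above concrete:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000

    x = rng.standard_normal(n)            # X ~ N(0, 1)
    w = rng.choice([-1.0, 1.0], size=n)   # Rademacher W, independent of X
    y = w * x                             # Y = W X

    print(np.cov(x, y)[0, 1])    # near 0: X and Y are uncorrelated
    print(y.mean(), y.var())     # near 0 and 1: Y is standard normal
    print(np.mean(x + y == 0))   # near 1/2: X + Y has an atom at 0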


An asymmetric example

Suppose X has a normal distribution with expected value 0 and variance 1. Let

Y=\begin{cases} X & \text{if } |X| \leq c \\ -X & \text{if } |X| > c \end{cases}

where c is a positive number to be specified below. If c is very small, then the correlation \operatorname{corr}(X,Y) is near -1; if c is very large, then \operatorname{corr}(X,Y) is near 1. Since the correlation is a continuous function of c, the intermediate value theorem implies there is some particular value of c that makes the correlation 0. That value is approximately 1.54. In that case, X and Y are uncorrelated, but they are clearly not independent, since X completely determines Y.

To see that Y is normally distributed—indeed, that its distribution is the same as that of X—one may compute its cumulative distribution function:

\begin{align}
\Pr(Y \leq x) &= \Pr(\{|X| \leq c \text{ and } X \leq x\} \text{ or } \{|X| > c \text{ and } -X \leq x\})\\
&= \Pr(|X| \leq c \text{ and } X \leq x) + \Pr(|X| > c \text{ and } -X \leq x)\\
&= \Pr(|X| \leq c \text{ and } X \leq x) + \Pr(|X| > c \text{ and } X \leq x)\\
&= \Pr(X \leq x),
\end{align}

where the next-to-last equality follows from the symmetry of the distribution of X and the symmetry of the condition that |X| \leq c.

In this example, the difference X-Y is nowhere near being normally distributed, since it has a substantial probability (about 0.88) of being equal to 0. By contrast, the normal distribution, being a continuous distribution, has no discrete part—that is, it does not concentrate more than zero probability at any single point. Consequently, X and Y are not ''jointly'' normally distributed, even though they are separately normally distributed.
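The value of c can be checked numerically. The sketch below assumes SciPy; the closed form \operatorname{corr}(X,Y) = 4\Phi(c) - 4c\varphi(c) - 3 is derived here (not taken from the text above) from \operatorname{E}[XY] = \operatorname{E}[X^2; |X| \leq c] - \operatorname{E}[X^2; |X| > c] and integration by parts:

    from scipy.optimize import brentq
    from scipy.stats import norm

    def corr_xy(c):
        # E[X^2; |X| <= c] = 2*Phi(c) - 1 - 2*c*phi(c) (integration by parts),
        # and corr(X, Y) = 2*E[X^2; |X| <= c] - 1 because E[X^2] = 1.
        return 4 * norm.cdf(c) - 4 * c * norm.pdf(c) - 3

    c_star = brentq(corr_xy, 0.1, 5.0)   # root of corr(c) = 0
    print(c_star)                        # about 1.54
    print(2 * norm.cdf(c_star) - 1)      # Pr(X - Y = 0) = Pr(|X| <= c), about 0.88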


Examples with support almost everywhere in the plane

Suppose that the coordinates (X,Y) of a random point in the plane are chosen according to the probability density function

p(x,y) = \frac{1}{2\pi\sqrt{3}}\left[\exp\left(-\frac{2}{3}(x^2 + xy + y^2)\right) + \exp\left(-\frac{2}{3}(x^2 - xy + y^2)\right)\right].

Then the random variables X and Y are uncorrelated, and each of them is normally distributed (with mean 0 and variance 1), but they are not independent.
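Expanding the two quadratic forms shows that p is an equal mixture of two bivariate normal densities with unit variances and correlations -1/2 and +1/2, which gives a simple way to sample from it. A Python sketch under that mixture reading (assuming NumPy; the dependence check via \operatorname{E}[X^2 Y^2] is illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    # Pick a mixture component (correlation -1/2 or +1/2), then draw Y | X.
    rho = rng.choice([-0.5, 0.5], size=n)
    x = rng.standard_normal(n)
    y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)

    print(np.corrcoef(x, y)[0, 1])   # near 0: uncorrelated
    print(y.mean(), y.var())         # near 0 and 1: standard normal marginal
    print(np.mean(x**2 * y**2))      # near 3/2, not 1 = E[X^2]E[Y^2]: dependent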
It is well known that the ratio C of two independent standard normal random deviates X_{i} and Y_{i} has a Cauchy distribution. One can equally well start with the Cauchy random variable C and derive the conditional distribution of Y_{i} to satisfy the requirement that X_{i}=CY_{i} with X_{i} and Y_{i} independent and standard normal. It follows that

Y_{i}=W_{i}\sqrt{\frac{\chi_{i}^{2}(k=2)}{1+C^{2}}},

in which W_{i} is a Rademacher random variable and \chi_{i}^{2}(k=2) is a chi-squared random variable with two degrees of freedom.

Consider two pairs \left(X_{i},Y_{i}\right), i\in\left\{1,2\right\}. Note that C is not indexed by i – that is, the same Cauchy random variable C is used in the definition of both \left(X_{1},Y_{1}\right) and \left(X_{2},Y_{2}\right). This sharing of C results in dependences across indices: neither X_{1} nor Y_{1} is independent of Y_{2}. Nevertheless, all of the X_{i} and Y_{i} are uncorrelated, as the bivariate distributions all have reflection symmetry across the axes.

Scatterplots of samples drawn from the above construction furnish two examples of bivariate distributions that are uncorrelated and have normal marginal distributions but are not independent. The joint distribution of X_{1} and Y_{2} has support everywhere but at the origin. The joint distribution of Y_{1} and Y_{2} has support everywhere except along the axes and has a discontinuity at the origin: the density diverges when the origin is approached along any straight path except along the axes.
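The construction can be simulated directly (a Python sketch assuming NumPy; the single shared C is what couples the two pairs, and the squared-value correlation is just one illustrative dependence check):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    c = rng.standard_cauchy(n)   # the shared C, deliberately not indexed by i
    y1 = rng.choice([-1.0, 1.0], n) * np.sqrt(rng.chisquare(2, n) / (1 + c**2))
    y2 = rng.choice([-1.0, 1.0], n) * np.sqrt(rng.chisquare(2, n) / (1 + c**2))
    x1, x2 = c * y1, c * y2      # X_i = C * Y_i

    print(np.corrcoef(y1, y2)[0, 1])        # near 0: uncorrelated by symmetry
    print(y1.var(), x1.var(), x2.var())     # near 1: standard normal marginals
    print(np.corrcoef(y1**2, y2**2)[0, 1])  # clearly positive: Y_1, Y_2 dependent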


See also

* Correlation and dependence

