Central limit theorem for directional statistics

In probability theory, the central limit theorem states conditions under which the average of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed.

Directional statistics is the subdiscipline of statistics that deals with directions (unit vectors in \mathbb{R}^n), axes (lines through the origin in \mathbb{R}^n) or rotations in \mathbb{R}^n. The means and variances of directional quantities are all finite, so the central limit theorem may be applied to the particular case of directional statistics. This article deals only with unit vectors in 2-dimensional space (\mathbb{R}^2), but the method described can be extended to the general case.


The central limit theorem

A sample of angles \theta_i is measured. Since each angle is defined only up to an additive integer multiple of 2\pi, the single-valued complex quantity z_i=e^{i\theta_i}=\cos(\theta_i)+i\sin(\theta_i) is used as the random variate. The probability distribution from which the sample is drawn may be characterized by its moments, which may be expressed in Cartesian and polar form:
:m_n=E(z^n)= C_n +i S_n = R_n e^{i \theta_n}\,
It follows that:
:C_n=E(\cos (n\theta))\,
:S_n=E(\sin (n\theta))\,
:R_n=|E(z^n)|=\sqrt{C_n^2+S_n^2}\,
:\theta_n=\arg(E(z^n))\,
Sample moments for N trials are:
:\overline{m_n}=\frac{1}{N}\sum_{i=1}^N z_i^n =\overline{C_n} +i \overline{S_n} = \overline{R_n} e^{i \overline{\theta_n}}
where
:\overline{C_n}=\frac{1}{N}\sum_{i=1}^N\cos(n\theta_i)
:\overline{S_n}=\frac{1}{N}\sum_{i=1}^N\sin(n\theta_i)
:\overline{R_n}=|\overline{m_n}|=\sqrt{\overline{C_n}^2+\overline{S_n}^2}
:\overline{\theta_n}=\arg(\overline{m_n})
The last two definitions are the modulus and argument of the sample moment itself, as the polar form \overline{R_n} e^{i \overline{\theta_n}} requires. Averaging the moduli |z_i^n| would always give 1, and averaging the arguments \arg(z_i^n) would depend on the branch chosen for each angle.

The vector [\overline{C_1},\overline{S_1}] may be used as a representation of the sample mean (\overline{m_1}) and may be taken as a 2-dimensional random variate. The bivariate central limit theorem states that the joint probability distribution for \overline{C_1} and \overline{S_1} in the limit of a large number of samples is given by:
:[\overline{C_1},\overline{S_1}] \xrightarrow{d} \mathcal{N}([C_1,S_1],\Sigma/N)
where \mathcal{N}() is the bivariate normal distribution and \Sigma is the covariance matrix for the circular distribution:
:\Sigma = \begin{bmatrix} \sigma_{CC} & \sigma_{CS} \\ \sigma_{SC} & \sigma_{SS} \end{bmatrix}
:\sigma_{CC}=E(\cos^2\theta)-E(\cos\theta)^2\,
:\sigma_{CS}=\sigma_{SC}=E(\cos\theta\sin\theta)-E(\cos\theta)E(\sin\theta)\,
:\sigma_{SS}=E(\sin^2\theta)-E(\sin\theta)^2\,
Note that the bivariate normal distribution is defined over the entire plane, while the mean is confined to the unit disk (on or inside the unit circle). This means that the integral of the limiting (bivariate normal) distribution over the unit disk will not be equal to unity, but will approach unity as ''N'' approaches infinity.

It remains to express the limiting bivariate distribution in terms of the moments of the circular distribution.
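The statement above can be checked numerically. The following is a minimal sketch rather than a reference implementation: the helper names `sample_moment` and `covariance_of_means`, the `draw` callback, and the choice of a uniform distribution on a quarter circle are all assumptions made for illustration. It draws many samples of size ''N'', computes [\overline{C_1},\overline{S_1}] for each, and estimates the covariance of these means, which the theorem predicts to be close to \Sigma/N.

```python
import cmath
import math
import random


def sample_moment(angles, n=1):
    """n-th sample trigonometric moment as (C_bar, S_bar, R_bar, theta_bar)."""
    m = sum(cmath.exp(1j * n * t) for t in angles) / len(angles)
    return m.real, m.imag, abs(m), cmath.phase(m)


def covariance_of_means(draw, N, replicates, rng):
    """Empirical mean and covariance of (C_bar_1, S_bar_1) over repeated samples.

    `draw(rng)` is a hypothetical callback returning one angle in radians.
    """
    means = []
    for _ in range(replicates):
        c, s, _, _ = sample_moment([draw(rng) for _ in range(N)])
        means.append((c, s))
    mc = sum(c for c, _ in means) / replicates
    ms = sum(s for _, s in means) / replicates
    v_cc = sum((c - mc) ** 2 for c, _ in means) / replicates
    v_cs = sum((c - mc) * (s - ms) for c, s in means) / replicates
    v_ss = sum((s - ms) ** 2 for _, s in means) / replicates
    return (mc, ms), [[v_cc, v_cs], [v_cs, v_ss]]


if __name__ == "__main__":
    rng = random.Random(0)

    # Assumed example distribution: uniform on the quarter circle [0, pi/2).
    def draw(r):
        return r.uniform(0.0, math.pi / 2)

    N = 200
    (mc, ms), cov = covariance_of_means(draw, N, 2000, rng)
    print("mean of (C_bar_1, S_bar_1):", mc, ms)
    # Multiplying by N should give an estimate of Sigma itself.
    print("N * covariance of the means:", [[N * v for v in row] for row in cov])
```

For this assumed distribution the exact values are C_1=S_1=2/\pi, \sigma_{CC}=\sigma_{SS}=1/2-4/\pi^2 and \sigma_{CS}=1/\pi-4/\pi^2, so the printed estimates can be compared against them directly.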


Covariance matrix in terms of moments

Using the multiple-angle trigonometric identities:
:C_2= E(\cos(2\theta)) = E(2\cos^2\theta-1)=E(1-2\sin^2\theta)\,
:S_2= E(\sin(2\theta)) = E(2\cos\theta\sin\theta)\,
It follows that:
:\sigma_{CC}=E(\cos^2\theta)-E(\cos\theta)^2 =\frac{1}{2}\left(1 + C_2 - 2C_1^2\right)
:\sigma_{CS}=E(\cos\theta\sin\theta)-E(\cos\theta)E(\sin\theta)=\frac{1}{2}\left(S_2 - 2 C_1 S_1 \right)
:\sigma_{SS}=E(\sin^2\theta)-E(\sin\theta)^2 =\frac{1}{2}\left(1 - C_2 - 2S_1^2\right)
The covariance matrix is now expressed in terms of the moments of the circular distribution.

The central limit theorem may also be expressed in terms of the polar components of the mean. If P(\overline{C_1},\overline{S_1})\,d\overline{C_1}\,d\overline{S_1} is the probability of finding the mean in the area element d\overline{C_1}\,d\overline{S_1}, then that probability may also be written P(\overline{R_1}\cos(\overline{\theta_1}),\overline{R_1}\sin(\overline{\theta_1}))\,\overline{R_1}\,d\overline{R_1}\,d\overline{\theta_1}.
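As an illustration of the two results in this section, the sketch below builds \Sigma from the first two trigonometric moments and evaluates the limiting density in polar coordinates. The function names (`trig_moment`, `covariance_from_moments`, `covariance_direct`, `polar_density`) are hypothetical and chosen only for this example. Because the double-angle identities hold for every individual angle, the moment-based and direct covariance estimates of a sample should agree up to rounding error.

```python
import cmath
import math


def trig_moment(angles, n):
    """Sample estimate of m_n = C_n + i S_n from angles given in radians."""
    return sum(cmath.exp(1j * n * t) for t in angles) / len(angles)


def covariance_from_moments(m1, m2):
    """Sigma expressed through the first two trigonometric moments."""
    C1, S1, C2, S2 = m1.real, m1.imag, m2.real, m2.imag
    s_cc = 0.5 * (1 + C2 - 2 * C1 ** 2)
    s_cs = 0.5 * (S2 - 2 * C1 * S1)
    s_ss = 0.5 * (1 - C2 - 2 * S1 ** 2)
    return [[s_cc, s_cs], [s_cs, s_ss]]


def covariance_direct(angles):
    """Sigma from its definition as covariances of cos(theta) and sin(theta)."""
    N = len(angles)
    c = [math.cos(t) for t in angles]
    s = [math.sin(t) for t in angles]
    c_bar, s_bar = sum(c) / N, sum(s) / N
    s_cc = sum(x * x for x in c) / N - c_bar ** 2
    s_cs = sum(x * y for x, y in zip(c, s)) / N - c_bar * s_bar
    s_ss = sum(y * y for y in s) / N - s_bar ** 2
    return [[s_cc, s_cs], [s_cs, s_ss]]


def polar_density(R, theta, mean, cov):
    """Bivariate normal density in polar form: P(R cos t, R sin t) * R.

    `mean` is [C_1, S_1]; `cov` is the covariance of the mean, i.e. Sigma / N.
    """
    dx = R * math.cos(theta) - mean[0]
    dy = R * math.sin(theta) - mean[1]
    (a, b), (_, d) = cov
    det = a * d - b * b
    # Quadratic form with the explicit inverse of a symmetric 2x2 matrix.
    quad = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return R * math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))


if __name__ == "__main__":
    angles = [0.1, 0.7, 2.0, 3.5, 5.9]  # arbitrary illustrative data
    m1, m2 = trig_moment(angles, 1), trig_moment(angles, 2)
    print(covariance_from_moments(m1, m2))
    print(covariance_direct(angles))
```

The factor `R` in `polar_density` is the Jacobian of the change from Cartesian to polar coordinates; without it the function would return the Cartesian density evaluated at a polar point rather than a density in (\overline{R_1},\overline{\theta_1}).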

