Directional statistics (also circular statistics or spherical statistics) is the subdiscipline of statistics that deals with directions (unit vectors in $\mathbb{R}^n$), axes (lines through the origin in $\mathbb{R}^n$) or rotations in $\mathbb{R}^n$. More generally, directional statistics deals with observations on compact Riemannian manifolds.
The fact that 0 degrees and 360 degrees are identical angles, so that for example 180 degrees is not a sensible mean of 2 degrees and 358 degrees, provides one illustration that special statistical methods are required for the analysis of some types of data (in this case, angular data). Other examples of data that may be regarded as directional include statistics involving temporal periods (e.g. time of day, week, month, year, etc.), compass directions, dihedral angles in molecules, orientations, rotations and so on.
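The 2 degree / 358 degree example can be checked numerically. The following is a minimal sketch, not a reference implementation: the function name `circular_mean_deg` is hypothetical, and the mean direction is taken as the angle of the summed unit vectors.

```python
import cmath
import math

def circular_mean_deg(angles_deg):
    """Mean direction in degrees in [0, 360), from the sum of unit vectors."""
    z = sum(cmath.exp(1j * math.radians(a)) for a in angles_deg)
    result = math.degrees(cmath.phase(z)) % 360.0
    # Guard against floating-point rounding that can yield exactly 360.0.
    return 0.0 if result >= 360.0 else result

angles = [2.0, 358.0]
naive = sum(angles) / len(angles)   # arithmetic mean: 180.0, pointing the wrong way
circ = circular_mean_deg(angles)    # within rounding error of 0 degrees
```

If the unit vectors cancel exactly (for example 0 and 180 degrees), the mean direction is undefined; this sketch does not treat that case specially.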

Circular distributions

Any probability density function (pdf) $p(x)$ on the line can be "wrapped" around the circumference of a circle of unit radius. That is, the pdf of the wrapped variable
:$\theta = x_w = x \bmod 2\pi \in (-\pi,\pi]$
is
:$p_w(\theta)=\sum_{k=-\infty}^{\infty} p(\theta+2\pi k).$
This concept can be extended to the multivariate context by replacing the single sum with $F$ sums, one for each dimension of the feature space:
:$p_w(\vec\theta)=\sum_{k_1=-\infty}^{\infty}\cdots \sum_{k_F=-\infty}^\infty p(\vec\theta+2\pi k_1\mathbf{e}_1+\dots+2\pi k_F\mathbf{e}_F)$
where $\mathbf{e}_k=(0,\dots,0,1,0,\dots,0)^{\mathsf{T}}$ is the $k$th Euclidean basis vector. The following sections show some relevant circular distributions.
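As an illustration of wrapping in the univariate case, the infinite sum over shifts of $2\pi$ can be approximated by truncating it. The sketch below is only an assumption-laden example: `wrap_pdf`, `normal_pdf` and the cutoff `k_max` are hypothetical names, and a normal density is used merely as a convenient linear pdf.

```python
import math

def wrap_pdf(p, theta, k_max=50):
    """Approximate p_w(theta) = sum over k of p(theta + 2*pi*k), keeping |k| <= k_max."""
    return sum(p(theta + 2.0 * math.pi * k) for k in range(-k_max, k_max + 1))

def normal_pdf(x, mu=0.0, sigma=2.0):
    """Linear normal density, used here only as an example of p(x)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Midpoint-rule integral of the wrapped density over one period (-pi, pi].
n = 2000
h = 2.0 * math.pi / n
total = sum(wrap_pdf(normal_pdf, -math.pi + (i + 0.5) * h) for i in range(n)) * h
```

Here `total` should be very close to 1, since wrapping only redistributes probability mass around the circle. A fixed cutoff is adequate for rapidly decaying densities; heavy-tailed densities need many more terms or a closed form.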

von Mises circular distribution

The ''von Mises distribution'' is a circular distribution which, like any other circular distribution, may be thought of as a wrapping of a certain linear probability distribution around the circle. The underlying linear probability distribution for the von Mises distribution is mathematically intractable; however, for statistical purposes, there is no need to deal with the underlying linear distribution. The usefulness of the von Mises distribution is twofold: it is the most mathematically tractable of all circular distributions, allowing simpler statistical analysis, and it is a close approximation to the wrapped normal distribution, which, analogously to the linear normal distribution, is important because it is the limiting case for the sum of a large number of small angular deviations. In fact, the von Mises distribution is often known as the "circular normal" distribution because of its ease of use and its close relationship to the wrapped normal distribution (Fisher, 1993). The pdf of the von Mises distribution is:
::$f(\theta;\mu,\kappa)=\frac{e^{\kappa\cos(\theta-\mu)}}{2\pi I_0(\kappa)}$
where $I_0$ is the modified Bessel function of order 0.
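The von Mises density, $e^{\kappa\cos(\theta-\mu)}/(2\pi I_0(\kappa))$, can be evaluated with the standard library alone. This is a rough sketch under stated assumptions: `bessel_i0` is a hypothetical helper that sums the power series of $I_0$ with a fixed number of terms, which is adequate for moderate $\kappa$ but is not a robust general-purpose implementation.

```python
import math

def bessel_i0(x, terms=60):
    """I_0(x) from its power series: sum over k of (x^2/4)^k / (k!)^2."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x * x / 4.0) / ((k + 1) ** 2)
    return total

def von_mises_pdf(theta, mu, kappa):
    """von Mises density with mean direction mu and concentration kappa >= 0."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))

# Setting kappa = 0 gives the constant density 1/(2*pi) of the circular uniform case.
uniform_value = von_mises_pdf(1.0, 0.0, 0.0)
```

For large $\kappa$ a scaled Bessel routine from a numerical library would be a safer choice than this series.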

Circular uniform distribution

The probability density function (pdf) of the ''circular uniform distribution'' is given by
::$U(\theta)=1/(2\pi).$
It can also be thought of as the $\kappa = 0$ case of the von Mises distribution above.

Wrapped normal distribution

The pdf of the ''wrapped normal distribution'' (WN) is:
::$WN(\theta;\mu,\sigma)=\frac{1}{\sigma\sqrt{2\pi}}\sum^{\infty}_{k=-\infty}\exp\left[\frac{-(\theta-\mu+2\pi k)^2}{2\sigma^2}\right]=\frac{1}{2\pi}\vartheta\left(\frac{\theta-\mu}{2\pi},\frac{i\sigma^2}{2\pi}\right)$
:where μ and σ are the mean and standard deviation of the unwrapped distribution, respectively, and $\vartheta(\theta,\tau)$ is the Jacobi theta function:
::$\vartheta(\theta,\tau)=\sum_{n=-\infty}^\infty (w^2)^n q^{n^2}$
where $w \equiv e^{i\pi\theta}$ and $q \equiv e^{i\pi\tau}.$
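The two expressions for the wrapped normal density can be compared numerically. In the sketch below (function names and cutoffs are hypothetical), `wn_direct` truncates the sum of shifted normal densities, while `wn_fourier` truncates the theta-function series, which for these arguments reduces to $\frac{1}{2\pi}\left[1+2\sum_{n\ge 1}e^{-n^2\sigma^2/2}\cos(n(\theta-\mu))\right]$.

```python
import math

def wn_direct(theta, mu, sigma, k_max=50):
    """Wrapped normal pdf as a truncated sum of shifted normal densities."""
    c = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return c * sum(
        math.exp(-((theta - mu + 2.0 * math.pi * k) ** 2) / (2.0 * sigma ** 2))
        for k in range(-k_max, k_max + 1)
    )

def wn_fourier(theta, mu, sigma, n_max=50):
    """Wrapped normal pdf from the truncated theta-function (Fourier) series."""
    s = 1.0 + 2.0 * sum(
        math.exp(-0.5 * (n * sigma) ** 2) * math.cos(n * (theta - mu))
        for n in range(1, n_max + 1)
    )
    return s / (2.0 * math.pi)

difference = abs(wn_direct(0.7, 0.2, 1.0) - wn_fourier(0.7, 0.2, 1.0))
```

The direct sum converges fastest for small σ and the series fastest for large σ, so with a fixed cutoff neither form is accurate for every σ.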

Wrapped Cauchy distribution

The pdf of the ''wrapped Cauchy distribution'' (WC) is:
::$WC(\theta;\theta_0,\gamma)=\sum_{n=-\infty}^\infty \frac{\gamma}{\pi(\gamma^2+(\theta+2\pi n-\theta_0)^2)}=\frac{1}{2\pi}\,\,\frac{\sinh\gamma}{\cosh\gamma-\cos(\theta-\theta_0)}$
:where $\gamma$ is the scale factor and $\theta_0$ is the peak position.
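The agreement between the wrapped sum of Cauchy densities and the closed form $\frac{1}{2\pi}\frac{\sinh\gamma}{\cosh\gamma-\cos(\theta-\theta_0)}$ can be illustrated as follows. This is only a sketch with hypothetical function names; because the Cauchy tails decay slowly, the truncated sum matches the closed form only to a few decimal places.

```python
import math

def wc_direct(theta, theta0, gamma, n_max=2000):
    """Wrapped Cauchy pdf as a truncated sum of shifted Cauchy densities."""
    return sum(
        gamma / (math.pi * (gamma ** 2 + (theta + 2.0 * math.pi * n - theta0) ** 2))
        for n in range(-n_max, n_max + 1)
    )

def wc_closed(theta, theta0, gamma):
    """Wrapped Cauchy pdf from the closed-form expression."""
    return math.sinh(gamma) / (
        2.0 * math.pi * (math.cosh(gamma) - math.cos(theta - theta0))
    )

gap = abs(wc_direct(0.3, 1.0, 0.5) - wc_closed(0.3, 1.0, 0.5))
```

In practice the closed form is preferable, since the truncation error of the sum shrinks only in proportion to `1 / n_max`.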

Wrapped Lévy distribution

The pdf of the ''wrapped Lévy distribution'' (WL) is:
::$f_{WL}(\theta;\mu,c)=\sum_{n=-\infty}^\infty \sqrt{\frac{c}{2\pi}}\,\frac{e^{-c/(2(\theta+2\pi n-\mu))}}{(\theta+2\pi n-\mu)^{3/2}}$
where the value of the summand is taken to be zero when $\theta+2\pi n-\mu \le 0$, $c$ is the scale factor and $\mu$ is the location parameter.

Distributions on higher-dimensional manifolds

There also exist distributions on the two-dimensional sphere (such as the Kent distribution), the ''N''-dimensional sphere (the von Mises–Fisher distribution) or the torus (the bivariate von Mises distribution). The matrix von Mises–Fisher distribution is a distribution on the Stiefel manifold, and can be used to construct probability distributions over rotation matrices. The Bingham distribution is a distribution over axes in ''N'' dimensions, or equivalently, over points on the (''N'' − 1)-dimensional sphere with the antipodes identified. For example, if ''N'' = 2, the axes are undirected lines through the origin in the plane. In this case, each axis cuts the unit circle in the plane (which is the one-dimensional sphere) at two points that are each other's antipodes. For ''N'' = 4, the Bingham distribution is a distribution over the space of unit quaternions. Since a unit quaternion corresponds to a rotation matrix, the Bingham distribution for ''N'' = 4 can be used to construct probability distributions over the space of rotations, just like the matrix von Mises–Fisher distribution. These distributions are used, for example, in geology, crystallography and bioinformatics.

** Moments **

The raw vector (or trigonometric) moments of a circular distribution are defined as
:$m_n=\operatorname{E}(z^n)=\int_\Gamma P(\theta) z^n \, d\theta$
where $\Gamma$ is any interval of length $2\pi$, $P(\theta)$ is the PDF of the circular distribution, and $z=e^{i\theta}$. Since the integral of $P(\theta)$ over $\Gamma$ is unity, and the integration interval is finite, it follows that the moments of any circular distribution are always finite and well defined.
Sample moments are analogously defined:
:$\overline{m}_n=\frac{1}{N}\sum_{i=1}^N z_i^n.$
The population resultant vector, length, and mean angle are defined in analogy with the corresponding sample parameters.
:$\rho=m_1$
:$R=|m_1|$
:$\theta_n=\operatorname{Arg}(m_n).$
In addition, the lengths of the higher moments are defined as:
:$R_n=|m_n|$
while the angular parts of the higher moments are just $(n \theta_n) \bmod 2\pi$. The lengths of all moments will lie between 0 and 1.
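The sample moments can be computed directly with complex arithmetic. The following is a small sketch; the function name `sample_moment` and the example angles are illustrative assumptions.

```python
import cmath
import math

def sample_moment(thetas, n=1):
    """n-th sample trigonometric moment: the mean of z_i**n with z_i = exp(i*theta_i)."""
    return sum(cmath.exp(1j * n * t) for t in thetas) / len(thetas)

thetas = [0.0, math.pi / 2.0]
m1 = sample_moment(thetas, 1)
R_bar = abs(m1)                         # sample resultant length, sqrt(1/2) here
theta_bar = cmath.phase(m1)             # sample mean angle, pi/4 here
R2_bar = abs(sample_moment(thetas, 2))  # length of the second moment, 0 here up to rounding
```

The second moment vanishes in this example because doubling the angles 0 and π/2 gives 0 and π, which point in opposite directions.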

** Measures of location and spread **

Various measures of location and spread may be defined for both the population and a sample drawn from that population (Fisher, 1993). The most common measure of location is the circular mean. The population circular mean is simply the first moment of the distribution while the sample mean is the first moment of the sample. The sample mean will serve as an unbiased estimator of the population mean.
When data is concentrated, the median and mode may be defined by analogy to the linear case, but for more dispersed or multi-modal data, these concepts are not useful.
The most common measures of circular spread are:
* The circular variance. For the sample the circular variance is defined as:
::$\overline{\operatorname{Var}(z)}=1-\overline{R}$
:and for the population
::$\operatorname{Var}(z)=1-R$
:Both will have values between 0 and 1.
* The circular standard deviation
::$S(z)=\sqrt{\ln(1/R^2)}=\sqrt{-2\ln(R)}$
::$\overline{S}(z)=\sqrt{\ln(1/\overline{R}^2)}=\sqrt{-2\ln(\overline{R})}$
:with values between 0 and infinity. This definition of the standard deviation (rather than the square root of the variance) is useful because for a wrapped normal distribution, it is an estimator of the standard deviation of the underlying normal distribution. It will therefore allow the circular distribution to be standardized as in the linear case, for small values of the standard deviation. This also applies to the von Mises distribution which closely approximates the wrapped normal distribution. Note that for small $S(z)$, we have $S(z)^2\approx 2 \operatorname{Var}(z)$.
* The circular dispersion
::$\delta=\frac{1-R_2}{2R^2}$
::$\overline{\delta}=\frac{1-\overline{R}_2}{2\overline{R}^2}$
:with values between 0 and infinity. This measure of spread is found useful in the statistical analysis of variance.
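The three sample measures of spread can be sketched in a few lines. The function name `circular_spread` and the dictionary keys are hypothetical. The code assumes the definitions $1-\overline{R}$, $\sqrt{-2\ln\overline{R}}$ and $(1-\overline{R}_2)/(2\overline{R}^2)$, and simply raises an error when the resultant length is zero, where the last two are unbounded.

```python
import cmath
import math

def circular_spread(thetas):
    """Sample circular variance, standard deviation and dispersion."""
    n = len(thetas)
    m1 = sum(cmath.exp(1j * t) for t in thetas) / n
    m2 = sum(cmath.exp(2j * t) for t in thetas) / n
    # Clamp to 1 so rounding cannot push the resultant length above its bound.
    r1 = min(abs(m1), 1.0)
    r2 = min(abs(m2), 1.0)
    if r1 == 0.0:
        raise ValueError("resultant length is zero; spread measures are unbounded")
    return {
        "variance": 1.0 - r1,
        "std": math.sqrt(-2.0 * math.log(r1)),
        "dispersion": (1.0 - r2) / (2.0 * r1 ** 2),
    }

spread = circular_spread([0.0, math.pi / 2.0])
```

For the two angles 0 and π/2 the resultant length is $\sqrt{1/2}$, so the standard deviation is $\sqrt{\ln 2}$ and the dispersion is 1.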

** Distribution of the mean **

Given a set of ''N'' measurements $z_n=e^{i\theta_n}$ the mean value of ''z'' is defined as:
:$\overline{z}=\frac{1}{N}\sum_{n=1}^N z_n$
which may be expressed as
:$\overline{z} = \overline{C}+i\overline{S}$
where
:$\overline{C} = \frac{1}{N}\sum_{n=1}^N \cos(\theta_n) \text{ and } \overline{S} = \frac{1}{N}\sum_{n=1}^N \sin(\theta_n)$
or, alternatively as:
:$\overline{z} = \overline{R}e^{i\overline{\theta}}$
where
:$\overline{R} = \sqrt{\overline{C}^2+\overline{S}^2} \text{ and } \overline{\theta} = \arctan (\overline{S} / \overline{C}).$
The distribution of the mean ($\overline{z}$) for a circular pdf ''P''(''θ'') will be given by:
:$P(\overline{C},\overline{S}) \, d\overline{C} \, d\overline{S} = P(\overline{R},\overline{\theta}) \, d\overline{R} \, d\overline{\theta} = \int_\Gamma \cdots \int_\Gamma \prod_{n=1}^N \left[P(\theta_n) \, d\theta_n \right]$
where $\Gamma$ is over any interval of length $2\pi$ and the integral is subject to the constraint that $\overline{S}$ and $\overline{C}$ are constant, or, alternatively, that $\overline{R}$ and $\overline{\theta}$ are constant.
The calculation of the distribution of the mean for most circular distributions is not analytically possible, and in order to carry out an analysis of variance, numerical or mathematical approximations are needed.
The central limit theorem may be applied to the distribution of the sample means (main article: Central limit theorem for directional statistics). It can be shown that the distribution of $[\overline{C},\overline{S}]$ approaches a bivariate normal distribution in the limit of large sample size.
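Because the exact distribution of the mean is rarely available, simulation is one simple way to explore it. The sketch below rests on illustrative assumptions: it draws von Mises samples with the standard library's `random.vonmisesvariate`, uses hypothetical function names and parameter values, and computes the mean angle with the two-argument arctangent `atan2` so that it falls in the correct quadrant. It then compares how much the mean cosine component varies between replicates for a small and a large sample size.

```python
import math
import random
import statistics

def mean_components(thetas):
    """Return (C_bar, S_bar, R_bar, theta_bar) for a list of angles in radians."""
    n = len(thetas)
    c = sum(math.cos(t) for t in thetas) / n
    s = sum(math.sin(t) for t in thetas) / n
    return c, s, math.hypot(c, s), math.atan2(s, c)

def spread_of_c_bar(n, replicates=200, mu=1.0, kappa=2.0, seed=0):
    """Standard deviation of C_bar across simulated von Mises samples of size n."""
    rng = random.Random(seed)
    c_bars = [
        mean_components([rng.vonmisesvariate(mu, kappa) for _ in range(n)])[0]
        for _ in range(replicates)
    ]
    return statistics.stdev(c_bars)

sd_small = spread_of_c_bar(10)    # spread of C_bar for N = 10
sd_large = spread_of_c_bar(400)   # spread of C_bar for N = 400, expected to be much smaller
```

Under the central limit theorem the spread should shrink roughly in proportion to $1/\sqrt{N}$. A histogram of the simulated $(\overline{C},\overline{S})$ pairs would show the approach to a bivariate normal shape more directly.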

Goodness of fit and significance testing

For cyclic data (e.g., to test whether it is uniformly distributed):
* Rayleigh test for a unimodal cluster
* Kuiper's test for possibly multimodal data.
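A Rayleigh test can be sketched from the sample resultant length. The example below is an assumption-laden illustration rather than a definitive implementation: the function name is hypothetical, the statistic is taken as $Z = N\overline{R}^2$, and the p-value uses only the first-order large-sample approximation $e^{-Z}$, omitting the small-sample correction terms found in the literature.

```python
import math

def rayleigh_test(thetas):
    """Return (Z, approximate p-value) for the null hypothesis of circular uniformity."""
    n = len(thetas)
    c = sum(math.cos(t) for t in thetas) / n
    s = sum(math.sin(t) for t in thetas) / n
    r_bar = math.hypot(c, s)
    z = n * r_bar ** 2
    # First-order approximation; reasonable only for moderately large n.
    return z, math.exp(-z)

clustered = [0.1 * k for k in range(10)]                 # angles bunched near 0.45 rad
evenly_spread = [2.0 * math.pi * k / 12 for k in range(12)]

z_c, p_c = rayleigh_test(clustered)      # small p-value: uniformity is rejected
z_u, p_u = rayleigh_test(evenly_spread)  # p-value near 1: no evidence against uniformity
```

The test has little power against multimodal alternatives whose unit vectors cancel out, which is why a different test is listed for that case.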

** See also **

* Complex normal distribution
* Yamartino method
* Wrapped distribution

References

Books on directional statistics

* Batschelet, E., ''Circular statistics in biology'', Academic Press, London, 1981.
* Fisher, N.I., ''Statistical Analysis of Circular Data'', Cambridge University Press, 1993.
* Fisher, N.I., Lewis, T., Embleton, B.J.J., ''Statistical Analysis of Spherical Data'', Cambridge University Press, 1993.
* Jammalamadaka, S. Rao and SenGupta, A., ''Topics in Circular Statistics'', World Scientific, 2001.
* Mardia, K.V. and Jupp, P., ''Directional Statistics (2nd edition)'', John Wiley and Sons Ltd., 2000.
* Ley, C. and Verdebout, T., ''Modern Directional Statistics'', CRC Press Taylor & Francis Group, 2017.
