
Directional statistics (also circular statistics or spherical statistics) is the subdiscipline of statistics that deals with directions (unit vectors in Euclidean space, R''n''), axes (lines through the origin in R''n'') or rotations in R''n''. More generally, directional statistics deals with observations on compact Riemannian manifolds, including the Stiefel manifold. The fact that 0 degrees and 360 degrees are identical angles, so that for example 180 degrees is not a sensible mean of 2 degrees and 358 degrees, provides one illustration that special statistical methods are required for the analysis of some types of data (in this case, angular data). Other examples of data that may be regarded as directional include statistics involving temporal periods (e.g. time of day, week, month, year, etc.), compass directions, dihedral angles in molecules, orientations, rotations and so on.
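
The 2 degrees and 358 degrees example can be checked numerically. The following is a minimal sketch, not a reference implementation: it averages the unit vectors that correspond to the angles and reads the direction of the result. The function name `circular_mean_deg` is chosen here for illustration only.

```python
import math

def circular_mean_deg(angles_deg):
    """Mean direction of angles given in degrees, returned in [-180, 180]."""
    # Sum the unit vectors (cos a, sin a) and take the direction of the sum.
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c))

angles = [2.0, 358.0]
arithmetic = sum(angles) / len(angles)   # the misleading linear average
circular = circular_mean_deg(angles)     # direction of the summed unit vectors

print(f"arithmetic mean: {arithmetic:.1f} degrees")
print(f"circular mean:   {circular:.6f} degrees")
```

The arithmetic mean points in the opposite direction from both observations, while the vector-based mean lands at 0 degrees up to floating-point rounding.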


Circular distributions

Any probability density function (pdf) p(x) on the line can be "wrapped" around the circumference of a circle of unit radius. That is, the pdf of the wrapped variable \theta = x_w = x \bmod 2\pi \in (-\pi,\pi] is p_w(\theta) = \sum_{k=-\infty}^{\infty} p(\theta + 2\pi k). This concept can be extended to the multivariate context by replacing the single sum with F sums, one for each dimension of the feature space: p_w(\boldsymbol\theta) = \sum_{k_1=-\infty}^{\infty} \cdots \sum_{k_F=-\infty}^{\infty} p(\boldsymbol\theta + 2\pi k_1 \mathbf{e}_1 + \dots + 2\pi k_F \mathbf{e}_F) where \mathbf{e}_k = (0, \dots, 0, 1, 0, \dots, 0)^{\mathsf{T}} is the k-th Euclidean basis vector. The following sections show some relevant circular distributions.
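
As an illustration of the wrapping construction, the sketch below approximates the infinite sum over shifts of 2π by a finite one. The helper names `wrap_pdf` and `normal_pdf`, the truncation at 50 terms on each side, and the choice of a normal density with standard deviation 2 are all assumptions made for this example.

```python
import numpy as np

def wrap_pdf(pdf, theta, n_terms=50):
    """Approximate the wrapped density: sum of pdf(theta + 2*pi*k) for |k| <= n_terms."""
    theta = np.asarray(theta, dtype=float)
    k = np.arange(-n_terms, n_terms + 1)
    # Shape (len(k), *theta.shape): one shifted copy of theta per k.
    shifts = 2.0 * np.pi * k.reshape((-1,) + (1,) * theta.ndim)
    return pdf(theta[None, ...] + shifts).sum(axis=0)

def normal_pdf(x, mu=0.0, sigma=2.0):
    """Density of the (unwrapped) normal distribution on the line."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Evaluate on a uniform grid over one period and check the total mass.
grid = np.linspace(-np.pi, np.pi, 2000, endpoint=False)
density = wrap_pdf(normal_pdf, grid)
total_mass = density.sum() * (2.0 * np.pi / grid.size)
print(f"mass over one period: {total_mass:.6f}")
```

The truncated sum is adequate for light-tailed densities such as the normal; heavy-tailed densities need many more terms or a closed form.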


von Mises circular distribution

The ''von Mises distribution'' is a circular distribution which, like any other circular distribution, may be thought of as a wrapping of a certain linear probability distribution around the circle. The underlying linear probability distribution for the von Mises distribution is mathematically intractable; however, for statistical purposes, there is no need to deal with the underlying linear distribution. The usefulness of the von Mises distribution is twofold: it is the most mathematically tractable of all circular distributions, allowing simpler statistical analysis, and it is a close approximation to the wrapped normal distribution, which, analogously to the linear normal distribution, is important because it is the limiting case for the sum of a large number of small angular deviations. In fact, the von Mises distribution is often known as the "circular normal" distribution because of its ease of use and its close relationship to the wrapped normal distribution (Fisher, 1993). The pdf of the von Mises distribution is: f(\theta;\mu,\kappa) = \frac{e^{\kappa\cos(\theta-\mu)}}{2\pi I_0(\kappa)} where I_0 is the modified Bessel function of order 0.
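
A minimal sketch of the von Mises density, exp(κ cos(θ − μ)) / (2π I_0(κ)), is given below. It uses NumPy's `np.i0` for the modified Bessel function of order 0. The function name `von_mises_pdf` and the parameter values are illustrative assumptions.

```python
import numpy as np

def von_mises_pdf(theta, mu, kappa):
    """von Mises density with mean direction mu and concentration kappa >= 0."""
    theta = np.asarray(theta, dtype=float)
    return np.exp(kappa * np.cos(theta - mu)) / (2.0 * np.pi * np.i0(kappa))

grid = np.linspace(-np.pi, np.pi, 2000, endpoint=False)
step = 2.0 * np.pi / grid.size

pdf = von_mises_pdf(grid, mu=0.5, kappa=2.0)
mass = pdf.sum() * step                  # should be close to 1
mode = grid[np.argmax(pdf)]              # should be close to mu

flat = von_mises_pdf(grid, mu=0.5, kappa=0.0)   # kappa = 0: constant density
print(f"mass={mass:.6f}, mode={mode:.3f}, flat value={flat[0]:.6f}")
```

With κ = 0 the density is constant at 1/(2π), and larger κ concentrates the mass around μ.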


Circular uniform distribution

The probability density function (pdf) of the ''circular uniform distribution'' is given by U(\theta) = \frac{1}{2\pi}. It is also the special case \kappa = 0 of the von Mises distribution above.


Wrapped normal distribution

The pdf of the ''wrapped normal distribution'' (WN) is: WN(\theta;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}} \sum_{k=-\infty}^{\infty} \exp\left[\frac{-(\theta-\mu+2\pi k)^2}{2\sigma^2}\right] = \frac{1}{2\pi}\vartheta\left(\frac{\theta-\mu}{2\pi},\frac{i\sigma^2}{2\pi}\right) where \mu and \sigma are the mean and standard deviation of the unwrapped distribution, respectively, and \vartheta(\theta,\tau) is the Jacobi theta function: \vartheta(\theta,\tau) = \sum_{n=-\infty}^{\infty} (w^2)^n q^{n^2} where w \equiv e^{i\pi\theta} and q \equiv e^{i\pi\tau}.
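
The two forms of the wrapped normal density (the sum of shifted normal densities and the Jacobi theta series) can be compared numerically. In the sketch below, both series are truncated at 50 terms on each side. The function names and the values of μ and σ are assumptions made for illustration. With the theta-function arguments (θ − μ)/(2π) and iσ²/(2π), the series terms reduce to e^{in(θ − μ)} e^{−n²σ²/2}.

```python
import numpy as np

def wn_direct(theta, mu, sigma, n_terms=50):
    """Wrapped normal density as a truncated sum of shifted normal densities."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    k = np.arange(-n_terms, n_terms + 1)[:, None]
    x = theta[None, :] - mu + 2.0 * np.pi * k
    terms = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    return terms.sum(axis=0) / (sigma * np.sqrt(2.0 * np.pi))

def wn_theta_series(theta, mu, sigma, n_terms=50):
    """Wrapped normal density via the truncated Jacobi theta series."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    n = np.arange(-n_terms, n_terms + 1)[:, None]
    q = np.exp(-sigma ** 2 / 2.0)                  # q = exp(i*pi*tau), tau = i*sigma^2/(2*pi)
    w2n = np.exp(1j * n * (theta[None, :] - mu))   # (w^2)^n with w = exp(i*(theta - mu)/2)
    series = (w2n * q ** (n ** 2)).sum(axis=0)
    return series.real / (2.0 * np.pi)

grid = np.linspace(-np.pi, np.pi, 200, endpoint=False)
direct = wn_direct(grid, mu=0.3, sigma=1.0)
via_theta = wn_theta_series(grid, mu=0.3, sigma=1.0)
max_diff = np.max(np.abs(direct - via_theta))
print(f"largest difference between the two forms: {max_diff:.2e}")
```

The direct sum converges quickly for small σ and the theta series for large σ, so in practice one form or the other may be preferred depending on the spread.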


Wrapped Cauchy distribution

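The pdf of the ''wrapped Cauchy distribution'' (WC) is: WC(\theta;\theta_0,\gamma) = \sum_{n=-\infty}^{\infty} \frac{\gamma}{\pi\left(\gamma^2+(\theta+2\pi n-\theta_0)^2\right)} = \frac{1}{2\pi}\,\frac{\sinh\gamma}{\cosh\gamma-\cos(\theta-\theta_0)} where \gamma is the scale factor and \theta_0 is the peak position.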
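
The closed form of the wrapped Cauchy density, sinh γ / (2π (cosh γ − cos(θ − θ_0))), can be checked against a truncated version of the infinite sum of shifted Cauchy densities. This is a rough numerical sketch: the function names, the parameter values and the truncation at 20000 terms on each side are assumptions. Because the Cauchy tails decay slowly, the truncated sum agrees with the closed form only to a modest tolerance.

```python
import numpy as np

def wc_closed(theta, theta0, gamma):
    """Closed-form wrapped Cauchy density."""
    theta = np.asarray(theta, dtype=float)
    return np.sinh(gamma) / (2.0 * np.pi * (np.cosh(gamma) - np.cos(theta - theta0)))

def wc_truncated_sum(theta, theta0, gamma, n_terms=20000):
    """Wrapped Cauchy density as a truncated sum of shifted Cauchy densities."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    n = np.arange(-n_terms, n_terms + 1)[:, None]
    x = theta[None, :] + 2.0 * np.pi * n - theta0
    return (gamma / (np.pi * (gamma ** 2 + x ** 2))).sum(axis=0)

grid = np.linspace(-np.pi, np.pi, 50, endpoint=False)
closed = wc_closed(grid, theta0=0.4, gamma=1.0)
summed = wc_truncated_sum(grid, theta0=0.4, gamma=1.0)
max_diff = np.max(np.abs(closed - summed))
print(f"largest difference: {max_diff:.2e}")
```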


Wrapped Lévy distribution

The pdf of the ''wrapped Lévy distribution'' (WL) is: f_{WL}(\theta;\mu,c) = \sum_{n=-\infty}^{\infty} \sqrt{\frac{c}{2\pi}}\,\frac{e^{-c/\left(2(\theta+2\pi n-\mu)\right)}}{(\theta+2\pi n-\mu)^{3/2}} where the value of the summand is taken to be zero when \theta+2\pi n-\mu \le 0, c is the scale factor and \mu is the location parameter.


Distributions on higher-dimensional manifolds

There also exist distributions on the two-dimensional sphere (such as the Kent distribution), the ''N''-dimensional sphere (the von Mises–Fisher distribution) or the torus (the bivariate von Mises distribution). The matrix von Mises–Fisher distribution is a distribution on the Stiefel manifold, and can be used to construct probability distributions over rotation matrices. The Bingham distribution is a distribution over axes in ''N'' dimensions, or equivalently, over points on the (''N'' − 1)-dimensional sphere with the antipodes identified. For example, if ''N'' = 2, the axes are undirected lines through the origin in the plane. In this case, each axis cuts the unit circle in the plane (which is the one-dimensional sphere) at two points that are each other's antipodes. For ''N'' = 4, the Bingham distribution is a distribution over the space of unit quaternions (versors). Since a versor corresponds to a rotation matrix, the Bingham distribution for ''N'' = 4 can be used to construct probability distributions over the space of rotations, just like the matrix von Mises–Fisher distribution. These distributions are used, for example, in geology, crystallography and bioinformatics.


Moments

The raw vector (or trigonometric) moments of a circular distribution are defined as
: m_n = \operatorname{E}(z^n) = \int_\Gamma P(\theta) z^n \, d\theta
where \Gamma is any interval of length 2\pi, P(\theta) is the pdf of the circular distribution, and z = e^{i\theta}. Since the integral of P(\theta) over \Gamma is unity, and the integration interval is finite, it follows that the moments of any circular distribution are always finite and well defined. Sample moments are analogously defined:
: \overline{m}_n = \frac{1}{N}\sum_{i=1}^N z_i^n.
The population resultant vector, length, and mean angle are defined in analogy with the corresponding sample parameters:
: \rho = m_1
: R = |m_1|
: \theta_n = \operatorname{Arg}(m_n).
In addition, the lengths of the higher moments are defined as
: R_n = |m_n|
while the angular parts of the higher moments are just (n \theta_n) \bmod 2\pi. The lengths of all moments will lie between 0 and 1.
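
The sample moments translate directly into complex arithmetic: each angle becomes a point z_i = e^{iθ_i} on the unit circle, and the n-th sample moment is the average of z_i^n. The sketch below is illustrative only. The function name `sample_moment` and the data values are assumptions.

```python
import numpy as np

def sample_moment(theta, n=1):
    """n-th sample trigonometric moment: the mean of exp(i*n*theta_i)."""
    z = np.exp(1j * np.asarray(theta, dtype=float))
    return np.mean(z ** n)

theta = np.deg2rad([10.0, 20.0, 30.0, 340.0])   # made-up observations

m1 = sample_moment(theta, 1)
R_bar = abs(m1)              # sample resultant length, between 0 and 1
theta_bar = np.angle(m1)     # sample mean angle in (-pi, pi]

m2 = sample_moment(theta, 2)
R2_bar = abs(m2)             # length of the second sample moment

print(f"R_bar={R_bar:.4f}, mean angle={np.rad2deg(theta_bar):.2f} degrees, R2_bar={R2_bar:.4f}")
```

Because every z_i^n has modulus 1, the average can never have modulus above 1, which is the boundedness property stated above.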


Measures of location and spread

Various measures of central tendency and statistical dispersion may be defined for both the population and a sample drawn from that population (Fisher, 1993).


Central tendency

The most common measure of location is the circular mean. The population circular mean is simply the first moment of the distribution while the sample mean is the first moment of the sample. The sample mean will serve as an unbiased estimator of the population mean. When data is concentrated, the median and mode may be defined by analogy to the linear case, but for more dispersed or multi-modal data, these concepts are not useful.


Dispersion

The most common measures of circular spread are:
* The circular variance. For the sample the circular variance is defined as: \overline{\operatorname{Var}(z)} = 1 - \overline{R} and for the population \operatorname{Var}(z) = 1 - R. Both will have values between 0 and 1.
* The circular standard deviation S(z) = \sqrt{\ln(1/R^2)} = \sqrt{-2\ln(R)}, with sample version \overline{S}(z) = \sqrt{\ln(1/\overline{R}^2)} = \sqrt{-2\ln(\overline{R})}, with values between 0 and infinity. This definition of the standard deviation (rather than the square root of the variance) is useful because for a wrapped normal distribution, it is an estimator of the standard deviation of the underlying normal distribution. It will therefore allow the circular distribution to be standardized as in the linear case, for small values of the standard deviation. This also applies to the von Mises distribution, which closely approximates the wrapped normal distribution. Note that for small S(z), we have S(z)^2 \approx 2 \operatorname{Var}(z).
* The circular dispersion \delta = \frac{1-R_2}{2R^2}, with sample version \overline{\delta} = \frac{1-\overline{R}_2}{2\overline{R}^2}, with values between 0 and infinity. This measure of spread is found useful in the statistical analysis of variance.
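
The three sample measures of spread can all be computed from the lengths of the first two sample moments: circular variance 1 − R̄, circular standard deviation sqrt(−2 ln R̄), and circular dispersion (1 − R̄_2)/(2R̄²). The sketch below simulates a wrapped normal sample with underlying standard deviation 0.2. The function name `circular_spread`, the seed and the parameters are assumptions chosen for illustration.

```python
import numpy as np

def circular_spread(theta):
    """Return (variance, standard deviation, dispersion) of a sample of angles."""
    z = np.exp(1j * np.asarray(theta, dtype=float))
    r1 = abs(np.mean(z))         # length of the first sample moment
    r2 = abs(np.mean(z ** 2))    # length of the second sample moment
    variance = 1.0 - r1
    std = np.sqrt(-2.0 * np.log(r1))
    dispersion = (1.0 - r2) / (2.0 * r1 ** 2)
    return variance, std, dispersion

# Simulated wrapped normal sample: draw on the line, then wrap into [0, 2*pi).
rng = np.random.default_rng(0)
sigma = 0.2
theta = np.mod(rng.normal(loc=1.0, scale=sigma, size=20000), 2.0 * np.pi)

variance, std, dispersion = circular_spread(theta)
print(f"variance={variance:.4f}, std={std:.4f}, dispersion={dispersion:.4f}")
```

For this concentrated sample the circular standard deviation is close to the underlying σ, and its square is close to twice the circular variance, in line with the small-spread approximation.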


Distribution of the mean

Given a set of ''N'' measurements z_n = e^{i\theta_n}, the mean value of ''z'' is defined as:
: \overline{z} = \frac{1}{N}\sum_{n=1}^N z_n
which may be expressed as
: \overline{z} = \overline{C} + i\overline{S}
where
: \overline{C} = \frac{1}{N}\sum_{n=1}^N \cos(\theta_n) \text{ and } \overline{S} = \frac{1}{N}\sum_{n=1}^N \sin(\theta_n)
or, alternatively, as:
: \overline{z} = \overline{R}e^{i\overline{\theta}}
where
: \overline{R} = \sqrt{\overline{C}^2+\overline{S}^2} \text{ and } \overline{\theta} = \arctan(\overline{S} / \overline{C}),
with the arctangent taken in the quadrant determined by the signs of \overline{C} and \overline{S}. The distribution of the mean angle (\overline{\theta}) for a circular pdf ''P''(''θ'') will be given by:
: P(\overline{C},\overline{S}) \, d\overline{C} \, d\overline{S} = P(\overline{R},\overline{\theta}) \, d\overline{R} \, d\overline{\theta} = \int_\Gamma \cdots \int_\Gamma \prod_{n=1}^N \left[ P(\theta_n) \, d\theta_n \right]
where \Gamma is over any interval of length 2\pi and the integral is subject to the constraint that \overline{S} and \overline{C} are constant, or, alternatively, that \overline{R} and \overline{\theta} are constant. The calculation of the distribution of the mean for most circular distributions is not analytically possible, and in order to carry out an analysis of variance, numerical or mathematical approximations are needed. The central limit theorem may be applied to the distribution of the sample means (main article: Central limit theorem for directional statistics). It can be shown that the distribution of [\overline{C},\overline{S}] approaches a bivariate normal distribution in the limit of large sample size.
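
The large-sample behaviour of the sample mean components can be explored by simulation. The sketch below uses the circular uniform distribution, chosen here only because its moments are simple: cos θ and sin θ each have mean 0 and variance 1/2 and are uncorrelated, so the means of ''N'' cosines and of ''N'' sines should each have variance close to 1/(2''N''). The sample size, number of replicates and seed are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, reps = 50, 20000   # sample size and number of simulated samples

# Each row is one sample of N angles from the circular uniform distribution.
theta = rng.uniform(-np.pi, np.pi, size=(reps, N))

C_bar = np.cos(theta).mean(axis=1)    # mean cosine of each sample
S_bar = np.sin(theta).mean(axis=1)    # mean sine of each sample
R_bar = np.hypot(C_bar, S_bar)        # resultant length of each sample
theta_bar = np.arctan2(S_bar, C_bar)  # mean angle, quadrant-aware

cov = np.cov(C_bar, S_bar)            # 2x2 empirical covariance matrix
print("empirical covariance of (C_bar, S_bar):")
print(cov)
print("theoretical variance 1/(2N):", 1.0 / (2 * N))
```

A scatter plot of `C_bar` against `S_bar` would show a roughly circular cloud around the origin, consistent with the bivariate normal limit.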


Goodness of fit and significance testing

For cyclic data (e.g., to test whether it is uniformly distributed):
* Rayleigh test for a unimodal cluster
* Kuiper's test for possibly multimodal data.
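
A rough sketch of the Rayleigh test is given below, under stated assumptions. The statistic is Z = N·R̄², and the p-value uses the first-order large-sample approximation exp(−Z); small-sample corrections exist but are omitted here. The function name `rayleigh_test` and the example data are illustrative.

```python
import numpy as np

def rayleigh_test(theta):
    """Rayleigh test of uniformity; returns (Z, approximate p-value).

    Uses the large-sample approximation p ~ exp(-Z) with Z = N * R_bar**2.
    """
    theta = np.asarray(theta, dtype=float)
    n = theta.size
    r_bar = abs(np.mean(np.exp(1j * theta)))   # sample resultant length
    z_stat = n * r_bar ** 2
    return z_stat, float(np.exp(-z_stat))

rng = np.random.default_rng(0)

# A unimodal cluster around 0: uniformity should be rejected.
clustered = rng.normal(loc=0.0, scale=0.3, size=100)
z_clustered, p_clustered = rayleigh_test(clustered)

# Perfectly evenly spaced angles: no evidence against uniformity.
even = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
z_even, p_even = rayleigh_test(even)

print(f"clustered: Z={z_clustered:.2f}, p={p_clustered:.3g}")
print(f"even:      Z={z_even:.2e}, p={p_even:.3f}")
```

Because the statistic depends only on the resultant length, the test has little power against multimodal alternatives whose unit vectors cancel, which is where a test such as Kuiper's is better suited.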


See also

* Circular correlation coefficient
* Complex normal distribution
* Wrapped distribution


References


Books on directional statistics

* Batschelet, E., ''Circular statistics in biology'', Academic Press, London, 1981.
* Fisher, NI., ''Statistical Analysis of Circular Data'', Cambridge University Press, 1993.
* Fisher, NI., Lewis, T., Embleton, BJJ., ''Statistical Analysis of Spherical Data'', Cambridge University Press, 1993.
* Jammalamadaka, S. Rao and SenGupta, A., ''Topics in Circular Statistics'', World Scientific, 2001.
* Mardia, KV. and Jupp, P., ''Directional Statistics (2nd edition)'', John Wiley and Sons Ltd., 2000.
* Ley, C. and Verdebout, T., ''Modern Directional Statistics'', CRC Press Taylor & Francis Group, 2017.