
V-statistics are a class of statistics named for Richard von Mises, who developed their asymptotic distribution theory in a fundamental paper in 1947. V-statistics are closely related to U-statistics (U for "unbiased"), introduced by Wassily Hoeffding in 1948. A V-statistic is a statistical function (of a sample) defined by a particular statistical functional of a probability distribution.


Statistical functions

Statistics that can be represented as functionals T(F_n) of the empirical distribution function F_n are called ''statistical functionals''. Differentiability of the functional ''T'' plays a key role in the von Mises approach; thus von Mises considers ''differentiable statistical functionals''.


Examples of statistical functions

  1. The ''k''-th central moment is the ''functional'' T(F)=\int(x-\mu)^k \, dF(x), where \mu = E[X] is the expected value of ''X''. The associated ''statistical function'' is the sample ''k''-th central moment, : T_n=m_k=T(F_n) = \frac{1}{n} \sum_{i=1}^n (x_i - \overline x)^k.
  2. The chi-squared goodness-of-fit statistic is a statistical function ''T''(''F''''n''), corresponding to the statistical functional : T(F) = \sum_{i=1}^k \frac{\left(\int_{A_i} dF - p_i\right)^2}{p_i}, where ''A''''i'' are the ''k'' cells and ''p''''i'' are the specified probabilities of the cells under the null hypothesis.
  3. The Cramér–von Mises and Anderson–Darling goodness-of-fit statistics are based on the functional : T(F) = \int (F(x) - F_0(x))^2 \, w(x;F_0) \, dF_0(x), where ''w''(''x''; ''F''0) is a specified weight function and ''F''0 is a specified null distribution. If ''w'' is identically 1, then ''T''(''F''''n'') is the well-known Cramér–von Mises goodness-of-fit statistic; if w(x;F_0)=[F_0(x)(1-F_0(x))]^{-1}, then ''T''(''F''''n'') is the Anderson–Darling statistic.
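
As a rough illustration of examples 1 and 2, the plug-in values T(F_n) can be computed directly from a sample. This is a minimal sketch rather than a reference implementation: the function names and the half-open cells [edges[i], edges[i+1]) are assumptions made for the example. Note that n times the chi-squared functional gives the usual Pearson statistic.

```python
def central_moment(xs, k):
    """Plug-in estimate T(F_n) of the k-th central moment."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** k for x in xs) / n


def chi_squared_functional(xs, edges, probs):
    """T(F_n) = sum_i (F_n(A_i) - p_i)^2 / p_i.

    Assumption for this sketch: cell A_i is the half-open interval
    [edges[i], edges[i + 1]) and probs[i] is its null probability p_i.
    """
    n = len(xs)
    total = 0.0
    for i, p in enumerate(probs):
        lo, hi = edges[i], edges[i + 1]
        # Empirical probability of the cell under F_n.
        freq = sum(1 for x in xs if lo <= x < hi) / n
        total += (freq - p) ** 2 / p
    return total


sample = [0.1, 0.2, 0.3, 0.9]
m2 = central_moment(sample, 2)
pearson = len(sample) * chi_squared_functional(sample, [0.0, 0.5, 1.0], [0.5, 0.5])
```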


Representation as a V-statistic

Suppose ''x''1, ..., ''x''''n'' is a sample. In typical applications the statistical function has a representation as the V-statistic : V_{mn} = \frac{1}{n^m} \sum_{i_1=1}^n \cdots \sum_{i_m=1}^n h(x_{i_1}, x_{i_2}, \dots, x_{i_m}), where ''h'' is a symmetric kernel function. Serfling (1980, Section 6.5) discusses how to find the kernel in practice. ''V''''mn'' is called a V-statistic of degree ''m''. A symmetric kernel of degree 2 is a function ''h''(''x'', ''y'') such that ''h''(''x'', ''y'') = ''h''(''y'', ''x'') for all ''x'' and ''y'' in the domain of ''h''. For samples ''x''1, ..., ''x''''n'', the corresponding V-statistic is defined as : V_{2,n} = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n h(x_i, x_j).
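
The definition above translates almost literally into code: average the kernel over all ''m''-tuples of sample points, with repetition allowed. The sketch below is only illustrative; the name `v_statistic` is an assumption, and the brute-force enumeration costs on the order of n^m kernel evaluations, so it is suitable only for small samples.

```python
from itertools import product


def v_statistic(xs, h, m):
    """V_{mn} = n^{-m} * sum of h over all m-tuples of sample points.

    Tuples are drawn with repetition (indices i_1, ..., i_m each range
    over 1..n), which is what distinguishes a V-statistic from the
    corresponding U-statistic.
    """
    n = len(xs)
    total = sum(h(*tup) for tup in product(xs, repeat=m))
    return total / n ** m


# Degree 1 with h(x) = x reduces to the sample mean.
mean_as_v = v_statistic([1.0, 2.0, 3.0], lambda x: x, 1)
# Degree 2 with the product kernel h(x, y) = x * y gives the squared mean.
squared_mean_as_v = v_statistic([1.0, 2.0, 3.0], lambda x, y: x * y, 2)
```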


Example of a V-statistic

  1. An example of a degree-2 V-statistic is the second central moment ''m''2. If h(x, y) = (x - y)^2/2, the corresponding V-statistic is : V_{2,n} = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \frac{1}{2}(x_i - x_j)^2 = \frac{1}{n} \sum_{i=1}^n (x_i - \bar x)^2, which is the maximum likelihood estimator of variance. With the same kernel, the corresponding U-statistic is the (unbiased) sample variance: : s^2 = \binom{n}{2}^{-1} \sum_{i<j} \frac{1}{2}(x_i - x_j)^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar x)^2.
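
The two identities in this example can be checked numerically on a small sample. The following sketch uses hypothetical helper names (`v_stat_2`, `u_stat_2`) and compares the results with `statistics.pvariance` (divisor n) and `statistics.variance` (divisor n − 1) from the Python standard library.

```python
import math
import statistics
from itertools import combinations, product


def variance_kernel(x, y):
    """Symmetric degree-2 kernel h(x, y) = (x - y)^2 / 2."""
    return (x - y) ** 2 / 2


def v_stat_2(xs, h):
    """Average of h over all n^2 ordered pairs, including i == j."""
    n = len(xs)
    return sum(h(x, y) for x, y in product(xs, repeat=2)) / n ** 2


def u_stat_2(xs, h):
    """Average of h over the n-choose-2 pairs with i < j."""
    pairs = list(combinations(xs, 2))
    return sum(h(x, y) for x, y in pairs) / len(pairs)


sample = [2.0, 4.0, 4.0, 5.0, 7.0]
v = v_stat_2(sample, variance_kernel)  # matches the divisor-n variance
u = u_stat_2(sample, variance_kernel)  # matches the divisor-(n - 1) variance

assert math.isclose(v, statistics.pvariance(sample))
assert math.isclose(u, statistics.variance(sample))
```

For this kernel the diagonal terms h(x_i, x_i) are zero, so the two statistics differ only by the factor (n − 1)/n.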


Asymptotic distribution

In examples 1–3, the asymptotic distribution of the statistic is different: in (1) it is normal, in (2) it is chi-squared, and in (3) it is a weighted sum of chi-squared variables. Von Mises' approach is a unifying theory that covers all of the cases above. Informally, the type of asymptotic distribution of a statistical function depends on the order of "degeneracy," which is determined by the first non-vanishing term in the Taylor expansion of the functional ''T''. If that term is the linear term, the limit distribution is normal; otherwise higher-order types of distributions arise (under suitable conditions such that a central limit theorem holds). There is a hierarchy of cases parallel to the asymptotic theory of U-statistics. Let ''A''(''m'') be the property defined by: :''A''(''m''):
  1. Var(''h''(''X''1, ..., ''X''''k'')) = 0 for ''k'' < ''m'', and Var(''h''(''X''1, ..., ''X''''k'')) > 0 for ''k'' = ''m'';
  2. n^{m/2} R_{mn} tends to zero (in probability). (R_{mn} is the remainder term in the Taylor series for ''T''.)
Case ''m'' = 1 (Non-degenerate kernel): If ''A''(1) is true, the statistic is a sample mean and the Central Limit Theorem implies that T(F_n) is asymptotically normal. In the variance example above (the degree-2 V-statistic ''m''2), ''m''2 is asymptotically normal with mean \sigma^2 and variance (\mu_4 - \sigma^4)/n, where \mu_4=E(X-E(X))^4.
Case ''m'' = 2 (Degenerate kernel): Suppose ''A''(2) is true, and E[h^2(X_1,X_2)] < \infty, \, E|h(X_1,X_1)| < \infty, and E[h(x,X_1)] \equiv 0. Then n V_{2,n} converges in distribution to a weighted sum of independent chi-squared variables: : n V_{2,n} \xrightarrow{d} \sum_{k=1}^\infty \lambda_k Z^2_k, where Z_k are independent standard normal variables and \lambda_k are constants that depend on the distribution ''F'' and the functional ''T''. In this case the asymptotic distribution is called a ''quadratic form of centered Gaussian random variables''. The statistic ''V''2,''n'' is called a ''degenerate kernel V-statistic''. The V-statistic associated with the Cramér–von Mises functional (Example 3) is an example of a degenerate kernel V-statistic. See Lee (1990, p. 160) for the kernel function.
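
A small simulation can make the degenerate case concrete. The sketch below picks, purely for illustration, the product kernel h(x, y) = xy with standard normal data; this choice is an assumption of the example, not one taken from the references. For this kernel E[h(x, X_1)] = x E[X_1] = 0, and n V_{2,n} = (\sqrt{n} \bar x)^2, so the limiting weighted sum has a single nonzero weight \lambda_1 = 1 (a chi-squared variable with one degree of freedom, whose mean is 1).

```python
import random
from itertools import product


def v_stat_product_kernel(xs):
    """V_{2,n} for h(x, y) = x * y.

    Uses the identity sum_i sum_j x_i x_j = (sum_i x_i)^2 to avoid
    the O(n^2) double sum.
    """
    n = len(xs)
    return sum(xs) ** 2 / n ** 2


def v_stat_brute_force(xs):
    """Same statistic computed from the double-sum definition."""
    n = len(xs)
    return sum(x * y for x, y in product(xs, repeat=2)) / n ** 2


def simulate_scaled_v(n, reps, seed=0):
    """Draw `reps` realisations of n * V_{2,n} from N(0, 1) samples."""
    rng = random.Random(seed)
    return [
        n * v_stat_product_kernel([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    ]


draws = simulate_scaled_v(n=50, reps=2000)
mean_draw = sum(draws) / len(draws)  # should be close to 1
```

Under these assumptions the simulated values of n V_{2,n} are nonnegative and right-skewed rather than bell-shaped, in contrast to the normal limit of the non-degenerate case.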


See also

* U-statistic
* Asymptotic distribution
* Asymptotic theory (statistics)

