V-statistics are a class of statistics named for Richard von Mises, who developed their asymptotic distribution theory in a fundamental paper in 1947. V-statistics are closely related to U-statistics (U for "unbiased"), introduced by Wassily Hoeffding in 1948. A V-statistic is a statistical function (of a sample) defined by a particular statistical functional of a probability distribution.
Statistical functions
Statistics that can be represented as functionals ''T''(F_n) of the empirical distribution function F_n are called ''statistical functionals''.
Differentiability of the functional ''T'' plays a key role in the von Mises approach; thus von Mises considers ''differentiable statistical functionals''.
Examples of statistical functions
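- The ''k''-th central moment is the ''functional''
:
T(F) = \int (x - \mu)^k \, dF(x),
where \mu = E[X] is the expected value of ''X''. The associated statistical function is the sample ''k''-th central moment, m_k = T(F_n) = \frac{1}{n} \sum_{i=1}^n (x_i - \bar x)^k.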
- The chi-squared goodness-of-fit statistic is a statistical function ''T''(F_n), corresponding to the statistical functional
:
T(F) = \sum_{i=1}^k \frac{\left(\int_{A_i} dF - p_i\right)^2}{p_i},
where A_i are the ''k'' cells and p_i are the specified probabilities of the cells under the null hypothesis.
- The Cramér–von Mises and Anderson–Darling goodness-of-fit statistics are based on the functional
:
T(F) = \int (F(x) - F_0(x))^2 \, w(x;F_0) \, dF_0(x),
where ''w''(''x''; F_0) is a specified weight function and F_0 is a specified null distribution. If ''w'' is identically equal to 1, then ''T''(F_n) is the well-known Cramér–von Mises goodness-of-fit statistic; if w(x;F_0) = [F_0(x)(1-F_0(x))]^{-1}, then ''T''(F_n) is the Anderson–Darling statistic.
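The three functionals above can be evaluated at the empirical distribution by direct plug-in. The following is a minimal sketch, assuming NumPy is available; the function names are hypothetical, and the Cramér–von Mises example assumes a Uniform(0, 1) null distribution with unit weight so that the integral has a simple closed form.

```python
import numpy as np

def central_moment(x, k):
    """Plug-in value T(F_n) of T(F) = integral of (x - mu)^k dF(x)."""
    x = np.asarray(x, dtype=float)
    return np.mean((x - x.mean()) ** k)

def chi_squared_functional(x, edges, p):
    """Plug-in value of T(F) = sum_i (F(A_i) - p_i)^2 / p_i.

    The cells A_i are the bins defined by `edges`; `p` holds the null
    cell probabilities. Multiplying the result by n gives the usual
    Pearson chi-squared statistic.
    """
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    counts, _ = np.histogram(x, bins=edges)
    p_hat = counts / x.size          # empirical cell probabilities F_n(A_i)
    return np.sum((p_hat - p) ** 2 / p)

def cramer_von_mises_uniform(x):
    """T(F_n) = integral of (F_n(u) - u)^2 du for F_0 = Uniform(0, 1), w = 1.

    Uses the closed form 1/(12 n^2) + (1/n) sum_i (u_(i) - (2i - 1)/(2n))^2,
    where u_(i) are the order statistics.
    """
    u = np.sort(np.asarray(x, dtype=float))
    n = u.size
    i = np.arange(1, n + 1)
    return 1.0 / (12 * n**2) + np.mean((u - (2 * i - 1) / (2 * n)) ** 2)
```

For a general null distribution F_0, the same closed form applies after transforming the sample by F_0 (the probability integral transform).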
Representation as a V-statistic
Suppose x_1, ..., x_n is a sample. In typical applications the statistical function has a representation as the V-statistic
:
V_{mn} = \frac{1}{n^m} \sum_{i_1=1}^n \cdots \sum_{i_m=1}^n h(x_{i_1}, x_{i_2}, \dots, x_{i_m}),
where ''h'' is a symmetric kernel function. Serfling (1980, Section 6.5) discusses how to find the kernel in practice. V_{mn} is called a V-statistic of degree ''m''.
A symmetric kernel of degree 2 is a function ''h''(''x'', ''y'') such that ''h''(''x'', ''y'') = ''h''(''y'', ''x'') for all ''x'' and ''y'' in the domain of ''h''. For samples x_1, ..., x_n, the corresponding V-statistic is defined as
:
V_{2,n} = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n h(x_i, x_j).
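The definition translates directly into an average of the kernel over all index tuples drawn with replacement. The sketch below is illustrative only: `v_statistic` is a hypothetical helper, and the brute-force loop costs n^m kernel evaluations, so it is practical only for small samples and low degree.

```python
from itertools import product

import numpy as np

def v_statistic(x, h, m):
    """Average of the kernel h over all n^m index tuples (with replacement)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    total = sum(
        h(*(x[i] for i in idx))          # h(x_{i_1}, ..., x_{i_m})
        for idx in product(range(n), repeat=m)
    )
    return total / n**m

# Degree 1 with h(x) = x gives the sample mean;
# degree 2 with h(x, y) = (x - y)^2 / 2 gives the second central moment.
sample = [1.0, 2.0, 3.0, 4.0]
mean_value = v_statistic(sample, lambda a: a, 1)
second_moment = v_statistic(sample, lambda a, b: (a - b) ** 2 / 2, 2)
```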
Example of a V-statistic
- An example of a degree-2 V-statistic is the second central moment m_2.
If ''h''(''x'', ''y'') = (''x'' − ''y'')^2/2, the corresponding V-statistic is
:
V_{2,n} = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \frac{1}{2}(x_i - x_j)^2 = \frac{1}{n} \sum_{i=1}^n (x_i - \bar x)^2,
which is the maximum likelihood estimator of the variance under a normal model. With the same kernel, the corresponding U-statistic is the (unbiased) sample variance:
:
s^2 = \binom{n}{2}^{-1} \sum_{i<j} \frac{1}{2}(x_i - x_j)^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar x)^2.
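The two identities can be checked numerically. This is a small sketch assuming NumPy; `variance_v_and_u` is a hypothetical name. It builds the kernel matrix h(x_i, x_j) once, averages all n^2 entries for the V-statistic, and averages only the pairs with i < j for the U-statistic.

```python
import numpy as np

def variance_v_and_u(x):
    """Return (V, U) for the kernel h(x, y) = (x - y)^2 / 2."""
    x = np.asarray(x, dtype=float)
    n = x.size
    h = 0.5 * (x[:, None] - x[None, :]) ** 2    # kernel matrix h(x_i, x_j)
    v = h.sum() / n**2                           # all pairs, diagonal included
    upper = h[np.triu_indices(n, k=1)]           # pairs with i < j only
    u = upper.sum() / (n * (n - 1) / 2)
    return v, u

x = np.array([1.0, 2.0, 3.0, 4.0])
v, u = variance_v_and_u(x)
# v agrees with np.var(x) (divisor n); u agrees with np.var(x, ddof=1) (divisor n - 1)
```

The only difference between the two is the treatment of the diagonal terms i = j, which contribute zero here but still count in the V-statistic's divisor.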
Asymptotic distribution
In examples 1–3, the asymptotic distribution of the statistic is different: in (1) it is normal, in (2) it is chi-squared, and in (3) it is a weighted sum of chi-squared variables.
Von Mises' approach is a unifying theory that covers all of the cases above. Informally, the type of asymptotic distribution of a statistical function depends on the order of "degeneracy," which is determined by which term is the first non-vanishing term in the Taylor expansion of the functional ''T''. If it is the linear term, the limit distribution is normal; otherwise higher-order types of distributions arise (under suitable conditions such that a central limit theorem holds).
There is a hierarchy of cases parallel to the asymptotic theory of U-statistics. Let ''A''(''m'') be the property defined by:
:''A''(''m''):
- Var(''h''(X_1, ..., X_k)) = 0 for ''k'' < ''m'', and Var(''h''(X_1, ..., X_k)) > 0 for ''k'' = ''m'';
- n^{m/2} R_{mn} tends to zero (in probability). (R_{mn} is the remainder term in the Taylor series for ''T''.)
Case ''m'' = 1 (Non-degenerate kernel):
If ''A''(1) is true, the leading term of the statistic is a sample mean, and the Central Limit Theorem implies that ''T''(F_n) is asymptotically normal.
In the variance example (4), m_2 is asymptotically normal with mean \sigma^2 and variance (\mu_4 - \sigma^4)/n, where \mu_4 = E(X - E(X))^4.
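A quick Monte Carlo check of the stated mean and variance is sketched below. It assumes standard normal data, for which \sigma^2 = 1 and \mu_4 = 3, so that \mu_4 - \sigma^4 = 2; the sample size, number of replications, and seed are arbitrary illustrative choices, and `simulate_m2` is a hypothetical name.

```python
import numpy as np

def simulate_m2(n=400, reps=5000, seed=0):
    """Draw `reps` standard normal samples of size n and return m_2 for each."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((reps, n))
    return x.var(axis=1)                 # ddof=0, i.e. the V-statistic m_2

m2 = simulate_m2()
empirical_mean = m2.mean()               # should be close to sigma^2 = 1
scaled_variance = 400 * m2.var()         # should be close to mu_4 - sigma^4 = 2
```

A histogram of `m2` would be expected to look approximately normal for this sample size, although the sketch only checks the first two moments.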
Case ''m'' = 2 (Degenerate kernel):
Suppose ''A''(2) is true, and E[h^2(X_1,X_2)] < \infty, \, E|h(X_1,X_1)| < \infty, and E[h(x,X_1)] \equiv 0. Then n V_{2,n} converges in distribution to a weighted sum of independent chi-squared variables:
: n V_{2,n} {\stackrel{d}{\longrightarrow}} \sum_{k=1}^\infty \lambda_k Z^2_k,
where Z_k are independent standard normal variables and \lambda_k are constants that depend on the distribution ''F'' and the functional ''T''. In this case the asymptotic distribution is called a ''quadratic form of centered Gaussian random variables''. The statistic V_{2,n} is called a ''degenerate kernel V-statistic''. The V-statistic associated with the Cramér–von Mises functional (Example 3) is an example of a degenerate kernel V-statistic; see Lee (1990, p. 160) for the kernel function.
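As a simple illustration of the degenerate case (not the Cramér–von Mises kernel itself), consider the kernel ''h''(''x'', ''y'') = ''xy'' with standard normal data. Then E[h(x, X_1)] = 0 for every ''x'', the V-statistic reduces to the squared sample mean, and n V_{2,n} = n \bar x^2 has a single-term limit \lambda_1 Z_1^2 with \lambda_1 = 1, that is, a chi-squared distribution with one degree of freedom (mean 1, variance 2). The sketch below, with the hypothetical helper `scaled_degenerate_v` and arbitrary simulation settings, checks those two moments by simulation.

```python
import numpy as np

def scaled_degenerate_v(n=200, reps=20000, seed=0):
    """Simulate n * V_{2,n} for the degenerate kernel h(x, y) = x * y."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((reps, n))
    # V_{2,n} = n^{-2} * sum_i sum_j x_i x_j = (sample mean)^2
    return n * x.mean(axis=1) ** 2

draws = scaled_degenerate_v()
limit_mean = draws.mean()     # should be close to 1, the mean of chi-squared(1)
limit_var = draws.var()       # should be close to 2, the variance of chi-squared(1)
```

Note the scaling by ''n'' rather than by the square root of ''n'': under degeneracy the linear term vanishes, so the statistic converges at a faster rate and the limit is no longer normal.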
See also
* U-statistic
* Asymptotic distribution
* Asymptotic theory (statistics)