In statistics, a pivotal quantity or pivot is a function of observations and unobservable parameters such that the function's probability distribution does not depend on the unknown parameters (including nuisance parameters). A pivotal quantity need not be a statistic: the function and its ''value'' can depend on the parameters of the model, but its ''distribution'' must not. If it is a statistic, then it is known as an ''ancillary statistic''. More formally, let

X = (X_1,X_2,\ldots,X_n)

be a random sample from a distribution that depends on a parameter (or vector of parameters)

\theta

. Let

g(X,\theta)

be a random variable whose distribution is the same for all

\theta

. Then

g

is called a ''pivotal quantity'' (or simply a ''pivot''). Pivotal quantities are commonly used for normalization to allow data from different data sets to be compared. It is relatively easy to construct pivots for location and scale parameters: for the former we form differences so that location cancels, for the latter ratios so that scale cancels. Pivotal quantities are fundamental to the construction of test statistics, as they allow the statistic to not depend on parameters; for example, Student's t-statistic serves this purpose for a normal distribution with unknown variance (and mean). They also provide one method of constructing confidence intervals, and the use of pivotal quantities improves performance of the bootstrap. In the form of ancillary statistics, they can be used to construct frequentist prediction intervals (predictive confidence intervals).
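The cancellation described above for location and scale parameters can be checked by simulation. The following is a minimal sketch rather than a definitive implementation. It assumes a normal location family with unit variance and an exponential scale family, both chosen here only for illustration, and the helper names (`simulate_pivot`, `quantile`, `location_pivot`, `scale_pivot`) are hypothetical. If the quantities are pivots, their empirical quantiles should agree across very different parameter values, up to Monte Carlo error.

```python
import random
import statistics

N = 10  # sample size, an arbitrary choice for this illustration


def simulate_pivot(pivot, draw_sample, theta, reps=20_000, seed=0):
    """Return sorted simulated values of pivot(sample, theta)."""
    rng = random.Random(seed)
    return sorted(pivot(draw_sample(rng, theta), theta) for _ in range(reps))


def quantile(sorted_values, p):
    """Crude empirical quantile: the entry at relative position p."""
    return sorted_values[int(p * (len(sorted_values) - 1))]


# Location family N(mu, 1): subtracting mu cancels the location.
def draw_normal(rng, mu):
    return [rng.gauss(mu, 1.0) for _ in range(N)]


def location_pivot(sample, mu):
    return statistics.fmean(sample) - mu


# Scale family, exponential with mean theta: dividing by theta cancels the scale.
def draw_exponential(rng, theta):
    return [rng.expovariate(1.0 / theta) for _ in range(N)]


def scale_pivot(sample, theta):
    return statistics.fmean(sample) / theta


cases = [
    ("location", location_pivot, draw_normal, (-3.0, 50.0)),
    ("scale", scale_pivot, draw_exponential, (0.2, 40.0)),
]
for name, pivot, draw, thetas in cases:
    for seed, theta in enumerate(thetas):
        # Different seeds, so agreement is not an artifact of shared random draws.
        values = simulate_pivot(pivot, draw, theta, seed=seed)
        summary = [round(quantile(values, p), 2) for p in (0.05, 0.5, 0.95)]
        print(name, "theta =", theta, "quantiles:", summary)
```

Within each family the printed quantiles should be close to one another even though the parameter values differ by orders of magnitude.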

Examples

Normal distribution

One of the simplest pivotal quantities is the z-score; given a normal distribution with mean

\mu

and variance

\sigma^2

, and an observation ''x'', the z-score:

z = \frac{x - \mu}{\sigma},

has distribution

N(0,1)

– a normal distribution with mean 0 and variance 1. Similarly, since the ''n''-sample sample mean has sampling distribution

N(\mu,\sigma^2/n),

the z-score of the mean

z = \frac{\overline{X} - \mu}{\sigma/\sqrt{n}}

also has distribution

N(0,1).
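Because the z-score of the mean follows N(0,1) whatever the value of the mean, it can be inverted into an interval for the mean when the standard deviation is treated as known. The sketch below illustrates that inversion under this assumption. The function names `z_interval` and `coverage` are hypothetical, and the parameter values in the usage lines are arbitrary.

```python
import math
import random
import statistics


def z_interval(sample, sigma, level=0.95):
    """Interval for mu from the pivot z = (mean - mu) / (sigma / sqrt(n)).

    sigma is assumed known; the quantile comes from N(0, 1) and therefore
    involves neither mu nor sigma.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    q = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    half_width = q * sigma / math.sqrt(n)
    return mean - half_width, mean + half_width


def coverage(mu, sigma, n=20, reps=5000, seed=0):
    """Fraction of simulated samples whose interval contains the true mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma)
        hits += low <= mu <= high
    return hits / reps


# The same N(0, 1) quantile is used for both parameter settings.
print(coverage(mu=0.0, sigma=1.0))
print(coverage(mu=100.0, sigma=25.0, seed=1))
```

Both printed coverage rates should be close to the nominal 0.95, which is what one would expect if the distribution of the pivot is the same in both settings.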

Note that while these functions depend on the parameters – and thus one can only compute them if the parameters are known (they are not statistics) – the distribution is independent of the parameters. Given

n

independent, identically distributed (i.i.d.) observations

X = (X_1, X_2, \ldots, X_n)

from the normal distribution with unknown mean

\mu

and variance

\sigma^2

, a pivotal quantity can be obtained from the function:

g(x,X) = \sqrt{n}\frac{x - \overline{X}}{s}

where

\overline{X} = \frac{1}{n}\sum_{i=1}^n{X_i}

and

s^2 = \frac{1}{n-1}\sum_{i=1}^n{(X_i - \overline{X})^2}

are unbiased estimates of

\mu

and

\sigma^2

, respectively. The function

g(x,X)

is the Student's t-statistic for a new value

x

, to be drawn from the same population as the already observed set of values

X

. Using

x=\mu

the function

g(\mu,X)

becomes a pivotal quantity, which follows the Student's t-distribution with

\nu = n-1

degrees of freedom. As required, even though

\mu

appears as an argument to the function

g

, the distribution of

g(\mu,X)

does not depend on the parameters

\mu

or

\sigma

of the normal probability distribution that governs the observations

X_1,\ldots,X_n

. This can be used to compute a prediction interval for the next observation

X_{n+1};

see Prediction interval: Normal distribution.
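One practical consequence of the pivotal property is that the quantiles of g(\mu,X) can be obtained without knowing the mean or the variance, for example by simulating from any convenient normal distribution. The sketch below takes that route using only the Python standard library, then inverts the pivot into an interval for the mean. It is an illustration under stated assumptions, not a replacement for tabulated t quantiles. The function names are hypothetical, the data in the usage lines are made up, and the simulated quantile carries Monte Carlo error.

```python
import math
import random
import statistics


def t_pivot(mu, sample):
    """g(mu, X) = sqrt(n) * (mu - mean) / s, with s the sample standard deviation."""
    n = len(sample)
    return math.sqrt(n) * (mu - statistics.fmean(sample)) / statistics.stdev(sample)


def pivot_quantile(n, p, reps=20_000, seed=0):
    """Estimate the p-quantile of the pivot by simulating N(0, 1) samples.

    Since the pivot's distribution does not depend on mu or sigma, the
    standard normal is used purely for convenience.
    """
    rng = random.Random(seed)
    draws = sorted(
        t_pivot(0.0, [rng.gauss(0.0, 1.0) for _ in range(n)]) for _ in range(reps)
    )
    return draws[int(p * (reps - 1))]


def mean_interval(sample, level=0.95):
    """Interval for mu obtained by inverting the (symmetric) pivot."""
    n = len(sample)
    q = pivot_quantile(n, 0.5 + level / 2)
    half_width = q * statistics.stdev(sample) / math.sqrt(n)
    center = statistics.fmean(sample)
    return center - half_width, center + half_width


# Hypothetical measurements, for illustration only.
data = [4.9, 5.3, 5.1, 4.7, 5.6, 5.0, 4.8, 5.2, 5.4, 5.0]
print(pivot_quantile(len(data), 0.975))
print(mean_interval(data))
```

For n = 10 the simulated 0.975 quantile should land near the tabulated Student's t value for 9 degrees of freedom (about 2.26), which gives a rough check on the simulation.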

Bivariate normal distribution

In more complicated cases, it is impossible to construct exact pivots. However, having approximate pivots improves convergence to asymptotic normality. Suppose a sample of size

n

of vectors

(X_i,Y_i)'

is taken from a bivariate normal distribution with unknown correlation

\rho

. An estimator of

\rho

is the sample (Pearson, moment) correlation

r = \frac{\frac{1}{n-1}\sum_{i=1}^n (X_i - \overline{X})(Y_i - \overline{Y})}{s_X s_Y}

where

s_X^2, s_Y^2

are the sample variances of

X

and

Y

. The sample statistic

r

has an asymptotically normal distribution:

\sqrt{n}\frac{r-\rho}{1-\rho^2} \Rightarrow N(0,1)

. However, a variance-stabilizing transformation

z = \operatorname{tanh}^{-1} r = \frac12 \ln \frac{1+r}{1-r}

known as Fisher's ''z'' transformation of the correlation coefficient makes the distribution of

z

asymptotically independent of unknown parameters:

\sqrt{n}(z-\zeta) \Rightarrow N(0,1)

where

\zeta = \operatorname{tanh}^{-1} \rho

is the corresponding distribution parameter. For finite sample sizes

n

, the random variable

z

will have distribution closer to normal than that of

r

An even closer approximation to the standard normal distribution is obtained by using a better approximation for the exact variance: the usual form is

\operatorname{Var}(z) \approx \frac{1}{n-3} .
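The approximate pivot built from Fisher's transformation can be inverted to give an approximate interval for the correlation. The sketch below is one way to do this, using the variance approximation 1/(n-3) stated above and the standard normal quantile. The function names `pearson_r` and `fisher_interval` are hypothetical, the data are made up, and the interval is only as good as the bivariate normal assumption and the asymptotic approximation.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Sample (Pearson, moment) correlation with n - 1 denominators."""
    n = len(xs)
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return cov / (statistics.stdev(xs) * statistics.stdev(ys))


def fisher_interval(r, n, level=0.95):
    """Approximate interval for rho from the pivot sqrt(n - 3) * (z - zeta)."""
    if n <= 3:
        raise ValueError("the 1/(n - 3) variance approximation needs n > 3")
    z = math.atanh(r)  # Fisher's z; undefined for r = +1 or r = -1
    q = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    half_width = q / math.sqrt(n - 3)
    # Map the interval for zeta back to the correlation scale with tanh.
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Hypothetical paired measurements, for illustration only.
xs = [2.1, 3.4, 1.9, 5.0, 4.2, 3.8, 2.7, 4.9]
ys = [1.8, 3.9, 2.2, 4.6, 4.4, 3.1, 2.9, 5.3]
r = pearson_r(xs, ys)
print(r, fisher_interval(r, len(xs)))
```

Working on the z scale and transforming back keeps the interval inside (-1, 1) and makes its width on the z scale depend only on n, not on the unknown correlation.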

Robustness

From the point of view of

robust statistics Robust statistics are statistics with good performance for data drawn from a wide range of probability distributions, especially for distributions that are not normal. Robust statistical methods have been developed for many common problems, su ...

, pivotal quantities are robust to changes in the parameters – indeed, independent of the parameters – but not in general robust to changes in the model, such as violations of the assumption of normality. This is fundamental to the robust critique of non-robust statistics, often derived from pivotal quantities: such statistics may be robust within the family, but are not robust outside it.
