In statistics, a pivotal quantity or pivot is a function of observations and unobservable parameters such that the function's probability distribution does not depend on the unknown parameters (including nuisance parameters). A pivotal quantity need not be a statistic: the function and its ''value'' can depend on the parameters of the model, but its ''distribution'' must not. If it is a statistic, then it is known as an ''ancillary statistic''.
More formally, let <math>X = (X_1, X_2, \ldots, X_n)</math> be a random sample from a distribution that depends on a parameter (or vector of parameters) <math>\theta</math>. Let <math>g(X, \theta)</math> be a random variable whose distribution is the same for all <math>\theta</math>. Then <math>g</math> is called a ''pivotal quantity'' (or simply a ''pivot'').
Pivotal quantities are commonly used for normalization to allow data from different data sets to be compared. It is relatively easy to construct pivots for location and scale parameters: for the former we form differences so that location cancels, for the latter ratios so that scale cancels.
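The cancellation described above can be checked numerically. The following is a minimal sketch, not a general recipe: the function names, the seed, and the particular ratio of differences are illustrative choices rather than anything prescribed by the text. Because the quantity here is computed from the data alone, it is an ancillary statistic.

```python
import random


def location_scale_ratio(sample):
    """(X1 - X2) / (X3 - X4): differences cancel location, the ratio cancels scale."""
    x1, x2, x3, x4 = sample[:4]
    return (x1 - x2) / (x3 - x4)


rng = random.Random(0)                       # fixed seed, chosen arbitrarily
z = [rng.gauss(0.0, 1.0) for _ in range(4)]  # four standard normal draws


def shifted_scaled(mu, sigma):
    """The same underlying draws, expressed as a sample from N(mu, sigma^2)."""
    return [mu + sigma * zi for zi in z]


baseline = location_scale_ratio(z)
for mu, sigma in [(0.0, 1.0), (5.0, 2.0), (-3.0, 0.5)]:
    value = location_scale_ratio(shifted_scaled(mu, sigma))
    # The printed value is the same (up to rounding) for every (mu, sigma).
    print(mu, sigma, value)
```

Since the value itself is unchanged by the choice of location and scale, its distribution cannot depend on them either.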
Pivotal quantities are fundamental to the construction of test statistics, as they allow the statistic to not depend on parameters; for example, Student's t-statistic is for a normal distribution with unknown variance (and mean). They also provide one method of constructing confidence intervals, and the use of pivotal quantities improves performance of the bootstrap. In the form of ancillary statistics, they can be used to construct frequentist prediction intervals (predictive confidence intervals).
Examples
Normal distribution
One of the simplest pivotal quantities is the z-score; given a normal distribution with mean <math>\mu</math> and variance <math>\sigma^2</math>, and an observation ''x'', the z-score
:<math>z = \frac{x - \mu}{\sigma}</math>
has distribution <math>N(0,1)</math>, a normal distribution with mean 0 and variance 1. Similarly, since the ''n''-sample sample mean has sampling distribution <math>N(\mu, \sigma^2/n)</math>, the z-score of the mean
:<math>z = \frac{\overline{X} - \mu}{\sigma/\sqrt{n}}</math>
also has distribution <math>N(0,1)</math>. Note that while these functions depend on the parameters, and thus one can only compute them if the parameters are known (they are not statistics), the distribution is independent of the parameters.
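A small simulation can make this concrete. The sketch below is illustrative only: the function names, sample size, replication count, and seed are assumptions made for the example. It computes the z-score of the sample mean under two different parameter settings and prints the simulated mean and variance, which should both be close to 0 and 1 respectively.

```python
import math
import random
import statistics


def z_of_mean(sample, mu, sigma):
    """z-score of the sample mean; requires the true mu and sigma, so it is not a statistic."""
    n = len(sample)
    return (statistics.fmean(sample) - mu) / (sigma / math.sqrt(n))


def simulate_pivot(mu, sigma, n=10, reps=20000, seed=1):
    """Draw `reps` samples of size n from N(mu, sigma^2) and return the pivot for each."""
    rng = random.Random(seed)
    return [
        z_of_mean([rng.gauss(mu, sigma) for _ in range(n)], mu, sigma)
        for _ in range(reps)
    ]


for mu, sigma in [(0.0, 1.0), (50.0, 7.0)]:
    draws = simulate_pivot(mu, sigma, reps=10000)
    # Simulated mean and variance of the pivot under each parameter setting.
    print(mu, sigma, statistics.fmean(draws), statistics.pvariance(draws))
```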
Given <math>n</math> independent, identically distributed (i.i.d.) observations <math>X = (X_1, X_2, \ldots, X_n)</math> from the normal distribution with unknown mean <math>\mu</math> and variance <math>\sigma^2</math>, a pivotal quantity can be obtained from the function
:<math>g(x, X) = \sqrt{n}\,\frac{x - \overline{X}}{s}</math>
where
:<math>\overline{X} = \frac{1}{n}\sum_{i=1}^n X_i</math>
and
:<math>s^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \overline{X})^2</math>
are unbiased estimates of <math>\mu</math> and <math>\sigma^2</math>, respectively. The function <math>g(x, X)</math> is the Student's t-statistic for a new value <math>x</math>, to be drawn from the same population as the already observed set of values <math>X</math>.
Using <math>x = \mu</math>, the function <math>g(\mu, X)</math> becomes a pivotal quantity, which is also distributed by the Student's t-distribution with <math>\nu = n - 1</math> degrees of freedom. As required, even though <math>\mu</math> appears as an argument to the function <math>g</math>, the distribution of <math>g(\mu, X)</math> does not depend on the parameters <math>\mu</math> or <math>\sigma</math> of the normal probability distribution that governs the observations <math>X_1, \ldots, X_n</math>.
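One consequence is that the quantiles of this pivot can be obtained once, under any convenient parameter values, and then reused for any normal sample of the same size. The sketch below is a hedged illustration of that idea using only the Python standard library: it estimates the 95% quantile of the absolute pivot by Monte Carlo at <math>\mu = 0, \sigma = 1</math> (rather than looking it up in a t table) and inverts the pivot into a confidence interval for <math>\mu</math>. The function names, the seed, and the measurements are all made up for the example.

```python
import math
import random
import statistics


def t_pivot(sample, mu):
    """sqrt(n) * (mu - mean) / s; it takes the unknown mu as an argument."""
    n = len(sample)
    return math.sqrt(n) * (mu - statistics.fmean(sample)) / statistics.stdev(sample)


def pivot_abs_quantile(n, level=0.95, reps=20000, seed=2):
    """Monte Carlo estimate of q with P(|pivot| <= q) = level, simulated at mu=0, sigma=1."""
    rng = random.Random(seed)
    draws = sorted(
        abs(t_pivot([rng.gauss(0.0, 1.0) for _ in range(n)], 0.0))
        for _ in range(reps)
    )
    return draws[min(reps - 1, int(level * reps))]


def mean_confidence_interval(sample, q):
    """Invert |pivot| <= q into an interval for mu."""
    n = len(sample)
    half_width = q * statistics.stdev(sample) / math.sqrt(n)
    centre = statistics.fmean(sample)
    return centre - half_width, centre + half_width


# Hypothetical measurements, invented purely for illustration.
data = [9.8, 10.4, 10.1, 9.6, 10.9, 10.2, 9.9, 10.5, 10.0, 10.3]
q = pivot_abs_quantile(len(data))
print(q, mean_confidence_interval(data, q))
```

The estimated quantile should land near the tabulated value for Student's t with 9 degrees of freedom (about 2.262), with some Monte Carlo error.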
This can be used to compute a prediction interval for the next observation <math>X_{n+1}</math>; see Prediction interval: Normal distribution.
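As a rough check of the prediction-interval idea, the sketch below simulates how often an interval built from ten observations captures an eleventh. It assumes the usual normal-theory scaling, in which the difference between the new observation and the sample mean is divided by <math>s\sqrt{1 + 1/n}</math> to account for the variability of both quantities; that scaling, the tabulated t quantile, and all names in the code are assumptions of this example rather than statements from the text above.

```python
import math
import random
import statistics

# Tabulated 97.5% quantile of Student's t with 9 degrees of freedom (n = 10).
T_975_DF9 = 2.262


def prediction_interval(sample, t_quantile):
    """Interval for one future observation from the same normal population."""
    n = len(sample)
    centre = statistics.fmean(sample)
    half_width = t_quantile * statistics.stdev(sample) * math.sqrt(1.0 + 1.0 / n)
    return centre - half_width, centre + half_width


def coverage(mu, sigma, n=10, reps=5000, seed=3):
    """Fraction of simulated intervals that contain the next observation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lo, hi = prediction_interval(sample, T_975_DF9)
        hits += lo <= rng.gauss(mu, sigma) <= hi
    return hits / reps


# Coverage should be close to 0.95 regardless of the parameter values used.
print(coverage(0.0, 1.0), coverage(100.0, 15.0))
```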
Bivariate normal distribution
In more complicated cases, it is impossible to construct exact pivots. However, having approximate pivots improves convergence to asymptotic normality.
Suppose a sample of size <math>n</math> of vectors <math>(X_i, Y_i)'</math> is taken from a bivariate normal distribution with unknown correlation <math>\rho</math>.
An estimator of <math>\rho</math> is the sample (Pearson, moment) correlation
:<math>r = \frac{\frac{1}{n-1}\sum_{i=1}^n (X_i - \overline{X})(Y_i - \overline{Y})}{s_X s_Y}</math>
where <math>s_X^2, s_Y^2</math> are sample variances of <math>X</math> and <math>Y</math>. The sample statistic <math>r</math> has an asymptotically normal distribution:
:<math>\sqrt{n}\,\frac{r - \rho}{1 - \rho^2} \Rightarrow N(0,1)</math>.
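However, a variance-stabilizing transformation
:<math>z = \tanh^{-1} r = \frac{1}{2} \ln \frac{1 + r}{1 - r}</math>
known as Fisher's ''z'' transformation of the correlation coefficient allows creating the distribution of <math>z</math> asymptotically independent of unknown parameters:
:<math>\sqrt{n}\,(z - \zeta) \Rightarrow N(0,1)</math>
where <math>\zeta = \tanh^{-1} \rho</math> is the corresponding distribution parameter. For finite sample sizes <math>n</math>, the random variable <math>z</math> will have distribution closer to normal than that of <math>r</math>. An even closer approximation to the standard normal distribution is obtained by using a better approximation for the exact variance: the usual form is
:<math>\operatorname{Var}(z) \approx \frac{1}{n - 3}</math>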
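The approximate pivot above is commonly inverted into a confidence interval for the correlation. The following sketch shows one way this might be done with the Python standard library, assuming the <math>1/(n-3)</math> variance approximation and a sample size greater than 3; the function names and the tiny data set are hypothetical and serve only as an illustration.

```python
import math
from statistics import NormalDist


def pearson_r(xs, ys):
    """Sample (Pearson) correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_interval(r, n, level=0.95):
    """Approximate interval for rho via z = atanh(r), Var(z) ~ 1/(n - 3); needs n > 3."""
    z = math.atanh(r)
    std_err = 1.0 / math.sqrt(n - 3)
    q = NormalDist().inv_cdf(0.5 + level / 2.0)  # standard normal quantile
    # Map the interval for zeta back to the correlation scale with tanh.
    return math.tanh(z - q * std_err), math.tanh(z + q * std_err)


# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = pearson_r(xs, ys)
print(r, fisher_interval(r, len(xs)))
```

Because the interval is built on the transformed scale and mapped back with tanh, its endpoints always stay strictly inside (-1, 1), and the interval is generally not symmetric around <math>r</math>.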
Robustness
From the point of view of robust statistics, pivotal quantities are robust to changes in the parameters (indeed, independent of the parameters) but not in general robust to changes in the model, such as violations of the assumption of normality.
This is fundamental to the robust critique of non-robust statistics, often derived from pivotal quantities: such statistics may be robust within the family, but are not robust outside it.
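This distinction can be illustrated by simulation. The sketch below is a rough illustration under stated assumptions: it applies the same nominal 95% t interval for the mean to normal data and to exponential data (a skewed distribution chosen arbitrarily as the model violation) and compares how often each interval covers the true mean. The tabulated t quantile, sample size, seed, and function names are all choices made for the example.

```python
import math
import random
import statistics

# Tabulated 97.5% quantile of Student's t with 9 degrees of freedom (n = 10).
T_975_DF9 = 2.262


def t_interval_covers(sample, true_mean, t_quantile=T_975_DF9):
    """True if the nominal 95% t interval for the mean contains true_mean."""
    n = len(sample)
    half_width = t_quantile * statistics.stdev(sample) / math.sqrt(n)
    return abs(statistics.fmean(sample) - true_mean) <= half_width


def coverage(draw, true_mean, n=10, reps=5000, seed=4):
    """Simulated coverage when observations are produced by draw(rng)."""
    rng = random.Random(seed)
    hits = sum(
        t_interval_covers([draw(rng) for _ in range(n)], true_mean)
        for _ in range(reps)
    )
    return hits / reps


# Within the normal family the parameter values do not matter.
normal_cov = coverage(lambda rng: rng.gauss(3.0, 2.0), 3.0)
# Outside it (exponential with mean 1) the t quantity is no longer an exact pivot.
skewed_cov = coverage(lambda rng: rng.expovariate(1.0), 1.0)
print(normal_cov, skewed_cov)
```

In runs of this kind the normal-data coverage stays near the nominal level, while the skewed-data coverage tends to fall short of it.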
See also
* Normalization (statistics)