Basu's Theorem

In statistics, Basu's theorem states that any boundedly complete and sufficient statistic is independent of any ancillary statistic. This is a 1955 result of Debabrata Basu. It is often used in statistics as a tool to prove independence of two statistics, by first demonstrating that one is complete sufficient and the other is ancillary, then appealing to the theorem. An example of this is to show that the sample mean and sample variance of a normal distribution are independent statistics, which is done in the Example section below. This property (independence of sample mean and sample variance) characterizes normal distributions.


Statement

Let (P_\theta; \theta \in \Theta) be a family of distributions on a measurable space (X, \mathcal{A}), and let T be a statistic mapping from (X, \mathcal{A}) to some measurable space (Y, \mathcal{B}). If T is a boundedly complete sufficient statistic for \theta, and A is ancillary to \theta, then, conditional on \theta, T is independent of A. That is, T \perp\!\!\!\perp A \mid \theta.
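The notion of ancillarity in the statement can be illustrated numerically. The following sketch (the choice of statistic, sample size, and tolerances are illustrative assumptions, not from the source) uses the sample range of an i.i.d. normal sample with known variance, which is ancillary for the location parameter \mu: shifting \mu shifts every observation equally, so the range's distribution is unchanged.

```python
import random
import statistics

random.seed(0)

def sample_range(mu, n=10):
    """Range of an i.i.d. N(mu, 1) sample; ancillary for the location mu."""
    xs = [random.gauss(mu, 1.0) for _ in range(n)]
    return max(xs) - min(xs)

# Simulate the range under two different values of the location parameter.
ranges_mu0 = [sample_range(0.0) for _ in range(20000)]
ranges_mu5 = [sample_range(5.0) for _ in range(20000)]

# The empirical distributions should agree closely, since the range
# does not depend on mu at all.
print(round(statistics.mean(ranges_mu0), 2), round(statistics.mean(ranges_mu5), 2))
```

The two empirical means (and spreads) agree up to Monte Carlo error, as ancillarity predicts.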


Proof

Let P_\theta^T and P_\theta^A be the marginal distributions of T and A respectively. Denote by A^{-1}(B) the preimage of a set B under the map A. For any measurable set B \in \mathcal{B} we have

:P_\theta^A(B) = P_\theta(A^{-1}(B)) = \int_Y P_\theta(A^{-1}(B) \mid T = t) \, P_\theta^T(dt).

The distribution P_\theta^A does not depend on \theta because A is ancillary. Likewise, P_\theta(\cdot \mid T = t) does not depend on \theta because T is sufficient. Therefore

: \int_Y \big[ P(A^{-1}(B) \mid T = t) - P^A(B) \big] \, P_\theta^T(dt) = 0.

Note that the integrand (the function inside the integral) is a function of t and not of \theta. Therefore, since T is boundedly complete, the function

:g(t) = P(A^{-1}(B) \mid T = t) - P^A(B)

is zero for P_\theta^T-almost all values of t, and thus

:P(A^{-1}(B) \mid T = t) = P^A(B)

for almost all t. Therefore, A is independent of T.


Example


Independence of sample mean and sample variance of a normal distribution

Let ''X''1, ''X''2, ..., ''X''''n'' be independent, identically distributed normal random variables with mean ''μ'' and variance ''σ''2. Then with respect to the parameter ''μ'', one can show that

:\widehat{\mu} = \frac{X_1 + \cdots + X_n}{n},

the sample mean, is a complete and sufficient statistic – it is all the information one can derive to estimate ''μ'', and no more – and

:\widehat{\sigma}^2 = \frac{\sum_{i=1}^n (X_i - \widehat{\mu})^2}{n - 1},

the sample variance, is an ancillary statistic – its distribution does not depend on ''μ''. Therefore, from Basu's theorem it follows that these statistics are independent conditional on \mu, for each fixed \sigma^2. This independence result can also be proven by Cochran's theorem. Further, this property (that the sample mean and sample variance of the normal distribution are independent) ''characterizes'' the normal distribution – no other distribution has this property.

