Exchangeable random variables

In statistics, an exchangeable sequence of random variables (also sometimes interchangeable) is a sequence X_1, X_2, X_3, ... (which may be finitely or infinitely long) whose joint probability distribution does not change when the positions in the sequence in which finitely many of them appear are altered. Thus, for example, the sequences
: X_1, X_2, X_3, X_4, X_5, X_6 \quad \text{and} \quad X_3, X_6, X_1, X_5, X_2, X_4
both have the same joint probability distribution.

Exchangeability is closely related to the use of independent and identically distributed random variables in statistical models. Exchangeable sequences of random variables arise in cases of simple random sampling.


Definition

Formally, an exchangeable sequence of random variables is a finite or infinite sequence X_1, X_2, X_3, ... of random variables such that for any finite permutation σ of the indices 1, 2, 3, ... (the permutation acts on only finitely many indices, with the rest fixed), the joint probability distribution of the permuted sequence
: X_{\sigma(1)}, X_{\sigma(2)}, X_{\sigma(3)}, \dots
is the same as the joint probability distribution of the original sequence. In short, the order of the sequence of random variables does not affect its joint probability distribution (Chow and Teicher 1997).

(A sequence E_1, E_2, E_3, ... of events is said to be exchangeable precisely if the sequence of its indicator functions is exchangeable.) The distribution function F_{X_1,\ldots,X_n}(x_1, \ldots, x_n) of a finite sequence of exchangeable random variables is symmetric in its arguments x_1, \ldots, x_n. Olav Kallenberg provided an appropriate definition of exchangeability for continuous-time stochastic processes (Kallenberg 2005).
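For a short sequence of binary variables, the definition can be checked by brute force. The sketch below is illustrative only: the function names `is_exchangeable` and `urn_pmf` and both example distributions are assumptions made for this example, not part of the cited texts. It represents a joint distribution as a dictionary from outcomes to exact probabilities and tests invariance under every permutation of the coordinates.

```python
from fractions import Fraction
from itertools import permutations, product


def is_exchangeable(pmf, n):
    """True if the joint pmf on {0,1}^n is unchanged by every permutation of coordinates."""
    for outcome in product((0, 1), repeat=n):
        p = pmf.get(outcome, Fraction(0))
        for perm in permutations(range(n)):
            permuted = tuple(outcome[i] for i in perm)
            if pmf.get(permuted, Fraction(0)) != p:
                return False
    return True


def urn_pmf(red, blue):
    """Joint pmf of the red-indicators when all marbles are drawn without replacement."""
    marbles = [1] * red + [0] * blue
    orders = list(permutations(range(len(marbles))))
    pmf = {}
    for order in orders:
        outcome = tuple(marbles[i] for i in order)
        pmf[outcome] = pmf.get(outcome, Fraction(0)) + Fraction(1, len(orders))
    return pmf


# Hypothetical example 1: urn with 2 red and 1 blue marble (exchangeable).
urn = urn_pmf(2, 1)

# Hypothetical example 2: independent but NOT identically distributed coins,
# X_1 ~ Bernoulli(1/2) and X_2 ~ Bernoulli(1/4) (not exchangeable).
non_identical = {
    (0, 0): Fraction(3, 8),
    (0, 1): Fraction(1, 8),
    (1, 0): Fraction(3, 8),
    (1, 1): Fraction(1, 8),
}

print(is_exchangeable(urn, 3))            # True
print(is_exchangeable(non_identical, 2))  # False
```

The check enumerates all n! permutations, so it is only practical for small n; it is meant to make the definition concrete rather than to serve as a general test.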


History

The concept was introduced by William Ernest Johnson in his 1924 book ''Logic, Part III: The Logical Foundations of Science''. Exchangeability is equivalent to the concept of statistical control introduced by Walter Shewhart also in 1924.


Exchangeability and the i.i.d. statistical model

The property of exchangeability is closely related to the use of independent and identically distributed (i.i.d.) random variables in statistical models. A sequence of random variables that are i.i.d., conditional on some underlying distributional form, is exchangeable. This follows directly from the structure of the joint probability distribution generated by the i.i.d. form.

Mixtures of exchangeable sequences (in particular, sequences of i.i.d. variables) are exchangeable. The converse can be established for infinite sequences, through an important representation theorem by Bruno de Finetti (later extended by other probability theorists such as Halmos and Savage). The extended versions of the theorem show that in any infinite sequence of exchangeable random variables, the random variables are conditionally independent and identically distributed, given the underlying distributional form. This theorem is stated briefly below. (De Finetti's original theorem only showed this to be true for random indicator variables, but this was later extended to encompass all sequences of random variables.) Another way of putting this is that de Finetti's theorem characterizes exchangeable sequences as mixtures of i.i.d. sequences: while an exchangeable sequence need not itself be unconditionally i.i.d., it can be expressed as a mixture of underlying i.i.d. sequences.

This means that infinite sequences of exchangeable random variables can be regarded equivalently as sequences of conditionally i.i.d. random variables, based on some underlying distributional form. (Note that this equivalence does not quite hold for finite exchangeability. However, for finite vectors of random variables there is a close approximation to the i.i.d. model.) An infinite exchangeable sequence is strictly stationary, and so a law of large numbers in the form of the Birkhoff–Khinchin theorem applies. This means that the underlying distribution can be given an operational interpretation as the limiting empirical distribution of the sequence of values. The close relationship between exchangeable sequences of random variables and the i.i.d. form means that the latter can be justified on the basis of infinite exchangeability. This notion is central to Bruno de Finetti's development of predictive inference and to Bayesian statistics. It can also be shown to be a useful foundational assumption in frequentist statistics and to link the two paradigms (O'Neill 2009).

The representation theorem: This statement is based on the presentation in O'Neill (2009), listed in the bibliography. Given an infinite sequence of random variables \mathbf{X}=(X_1,X_2,X_3,\ldots) we define the limiting empirical distribution function F_\mathbf{X} by
: F_\mathbf{X}(x) = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^n I(X_i \le x).
(This is the Cesàro limit of the indicator functions. In cases where the Cesàro limit does not exist, this function can instead be defined as the Banach limit of the indicator functions, which is an extension of this limit. This latter limit always exists for sums of indicator functions, so that the empirical distribution is always well-defined.) If the sequence \mathbf{X} is exchangeable, then its elements are conditionally independent and identically distributed with distribution function F_\mathbf{X}, given F_\mathbf{X}. This means that for any vector of random variables in the sequence we have joint distribution function given by
: \Pr (X_1 \le x_1,X_2 \le x_2,\ldots,X_n \le x_n) = \int \prod_{i=1}^n F_\mathbf{X}(x_i)\,dP(F_\mathbf{X}).
If the distribution function F_\mathbf{X} is indexed by another parameter \theta then (with densities appropriately defined) we have
: p_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = \int \prod_{i=1}^n p_{X_i}(x_i\mid\theta)\,dP(\theta).
These equations show the joint distribution or density characterised as a mixture distribution based on the underlying limiting empirical distribution (or a parameter indexing this distribution).

Note that not all finite exchangeable sequences are mixtures of i.i.d. sequences. To see this, consider sampling without replacement from a finite set until no elements are left. The resulting sequence is exchangeable, but not a mixture of i.i.d. sequences. Indeed, conditioned on all other elements in the sequence, the remaining element is known.
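The mixture form of the joint density can be made concrete with binary variables and a discrete mixing distribution, in which case the integral over \theta reduces to a finite sum. The following is a minimal sketch under assumed values: the two-point mixing distribution `MIXING` and the function name `mixture_pmf` are hypothetical choices for illustration. It shows that the resulting joint probabilities depend only on how many ones occur (so the sequence is exchangeable) while the variables are not unconditionally independent.

```python
from fractions import Fraction
from itertools import product

# Hypothetical mixing distribution P(theta): two equally likely success probabilities.
MIXING = {Fraction(1, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)}


def mixture_pmf(xs, mixing=MIXING):
    """Joint pmf of binary xs: sum over theta of P(theta) * prod_i p(x_i | theta)."""
    total = Fraction(0)
    for theta, weight in mixing.items():
        likelihood = Fraction(1)
        for x in xs:
            likelihood *= theta if x == 1 else 1 - theta
        total += weight * likelihood
    return total


# Reordering the outcome does not change its probability.
p_110 = mixture_pmf((1, 1, 0))
p_011 = mixture_pmf((0, 1, 1))

# The variables are not unconditionally independent.
p_1 = mixture_pmf((1,))
p_11 = mixture_pmf((1, 1))

# Sanity check: the pmf over all outcomes of length 3 sums to one.
total_mass = sum(mixture_pmf(xs) for xs in product((0, 1), repeat=3))

print(p_110, p_011)    # 3/32 3/32
print(p_11, p_1 * p_1) # 5/16 1/4
```

Conditional on either value of \theta the variables are i.i.d. Bernoulli; the dependence visible in `p_11` arises only from averaging over \theta.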


Covariance and correlation

Exchangeable sequences have some basic covariance and correlation properties which mean that they are generally positively correlated. For infinite sequences of exchangeable random variables, the covariance between the random variables is equal to the variance of the mean of the underlying distribution function. For finite exchangeable sequences the covariance is also a fixed value which does not depend on the particular random variables in the sequence. The lower bound is weaker than for infinite exchangeability, and negative correlation is possible.

Covariance for exchangeable sequences (infinite): If the sequence X_1,X_2,X_3,\ldots is exchangeable then
: \operatorname{cov}(X_i,X_j) = \operatorname{var}(\operatorname{E}(X_i\mid F_\mathbf{X})) = \operatorname{var}(\operatorname{E}(X_i\mid\theta)) \ge 0 \quad\text{for } i \ne j.

Covariance for exchangeable sequences (finite): If X_1,X_2,\ldots,X_n is exchangeable with \sigma^2 = \operatorname{var}(X_i) then
: \operatorname{cov}(X_i,X_j) \ge - \frac{\sigma^2}{n-1} \quad\text{for } i \ne j.

The finite sequence result may be proved as follows. Using the fact that the values are exchangeable we have
: \begin{align} 0 & \le \operatorname{var}(X_1 + \cdots + X_n) \\ & = \operatorname{var}(X_1) + \cdots + \operatorname{var}(X_n) + \underbrace{\operatorname{cov}(X_1,X_2) + \cdots}_{\text{all ordered pairs}} \\ & = n\sigma^2 + n(n-1)\operatorname{cov}(X_1,X_2). \end{align}
Dividing the inequality n\sigma^2 + n(n-1)\operatorname{cov}(X_1,X_2) \ge 0 by n(n-1) and rearranging yields the stated lower bound. The non-negativity of the covariance for the infinite sequence can then be obtained as a limiting result: an infinite exchangeable sequence satisfies the finite bound for every n, and -\sigma^2/(n-1) \to 0 as n \to \infty.

Equality in the lower bound for finite sequences is achieved in a simple urn model: an urn contains 1 red marble and n − 1 green marbles, and these are sampled without replacement until the urn is empty. Let X_i = 1 if the red marble is drawn on the i-th trial and 0 otherwise. A finite sequence that achieves the lower covariance bound cannot be extended to a longer exchangeable sequence.
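The claim that the one-red-marble urn attains the finite lower bound can be verified with exact arithmetic. The sketch below is an illustration rather than part of the cited proof; the function name `urn_covariance_check` and the choice n = 5 are assumptions. It enumerates the n equally likely positions of the red marble and compares the covariance of X_1 and X_2 with -\sigma^2/(n-1).

```python
from fractions import Fraction


def urn_covariance_check(n):
    """Exact variance, covariance and lower bound for an urn with 1 red and n - 1 green marbles.

    X_i = 1 if the red marble is drawn on trial i. The red marble is equally
    likely to occupy each of the n draw positions, so we enumerate that position.
    """
    if n < 2:
        raise ValueError("need at least two marbles")
    prob = Fraction(1, n)
    # E[X_1]: the red marble is in position 0.
    mean = sum(prob * (1 if red_pos == 0 else 0) for red_pos in range(n))
    # X_1 is an indicator, so E[X_1^2] = E[X_1].
    second_moment = mean
    # E[X_1 X_2]: the red marble would have to be in positions 0 and 1 at once.
    cross_moment = sum(
        prob * (1 if red_pos == 0 else 0) * (1 if red_pos == 1 else 0)
        for red_pos in range(n)
    )
    variance = second_moment - mean * mean
    covariance = cross_moment - mean * mean
    lower_bound = -variance / (n - 1)
    return variance, covariance, lower_bound


variance, covariance, lower_bound = urn_covariance_check(5)
print(variance, covariance, lower_bound)  # 4/25 -1/25 -1/25
```

By exchangeability the same variance and covariance hold for every X_i and every pair X_i, X_j with i ≠ j, so checking the first two coordinates is sufficient.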


Examples

* Any convex combination or mixture distribution of i.i.d. sequences of random variables is exchangeable. A converse proposition is de Finetti's theorem.
* Suppose an urn contains n red and m blue marbles. Suppose marbles are drawn without replacement until the urn is empty. Let X_i be the indicator random variable of the event that the i-th marble drawn is red. Then \{X_i\}_{i=1,\ldots,n+m} is an exchangeable sequence. This sequence cannot be extended to any longer exchangeable sequence.
* Let (X, Y) have a bivariate normal distribution with parameters \mu = 0, \sigma_x = \sigma_y = 1 and an arbitrary correlation coefficient \rho\in (-1, 1). The random variables X and Y are then exchangeable, but independent only if \rho=0. The density function is p(x, y) = p(y, x) \propto \exp\left[-\frac{1}{2(1-\rho^2)}(x^2+y^2-2\rho xy)\right].
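The bivariate normal example can be checked numerically. The sketch below is illustrative: the function names and the test point (0.3, −1.2) are arbitrary assumptions. It evaluates the standard bivariate normal density with its usual normalising constant 1/(2\pi\sqrt{1-\rho^2}), confirms that swapping the arguments leaves the value unchanged, and compares the joint density with the product of the standard normal marginals.

```python
import math


def bvn_density(x, y, rho):
    """Bivariate normal density with zero means, unit variances and correlation rho."""
    norm = 1.0 / (2.0 * math.pi * math.sqrt(1.0 - rho * rho))
    quad = (x * x + y * y - 2.0 * rho * x * y) / (2.0 * (1.0 - rho * rho))
    return norm * math.exp(-quad)


def std_normal_density(x):
    """Standard normal density, the marginal of each coordinate."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


x, y = 0.3, -1.2  # arbitrary test point

# Exchangeability: p(x, y) = p(y, x) for any rho in (-1, 1).
symmetric = math.isclose(bvn_density(x, y, 0.6), bvn_density(y, x, 0.6))

# Independence: the joint density factorises into the marginals only when rho = 0.
product_of_marginals = std_normal_density(x) * std_normal_density(y)
factorises_at_zero = math.isclose(bvn_density(x, y, 0.0), product_of_marginals)
factorises_at_06 = math.isclose(bvn_density(x, y, 0.6), product_of_marginals)

print(symmetric, factorises_at_zero, factorises_at_06)  # True True False
```

A single test point cannot prove symmetry or independence in general; it only illustrates the two properties stated in the example.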


Applications

The von Neumann extractor is a randomness extractor that depends on exchangeability: it gives a method to take an exchangeable sequence of 0s and 1s (Bernoulli trials), with some probability p of 0 and q=1-p of 1, and produce a (shorter) exchangeable sequence of 0s and 1s with probability 1/2. Partition the sequence into non-overlapping pairs: if the two elements of the pair are equal (00 or 11), discard it; if the two elements of the pair are unequal (01 or 10), keep the first. This yields a sequence of Bernoulli trials with p=1/2, as, by exchangeability, the odds of a given pair being 01 or 10 are equal.

Exchangeable random variables arise in the study of U-statistics, particularly in the Hoeffding decomposition.
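The pairing rule of the von Neumann extractor translates directly into code. The following is a minimal sketch: the function name `von_neumann_extract`, the bias of 0.8 towards 1, and the fixed seed are assumptions chosen for the demonstration. A trailing unpaired bit, if any, is ignored.

```python
import random


def von_neumann_extract(bits):
    """Apply the pairing rule: drop 00 and 11 pairs, keep the first bit of 01 and 10 pairs."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            out.append(first)
    return out


# Small deterministic example: pairs are 01, 11, 10, 00, and a leftover 1.
print(von_neumann_extract([0, 1, 1, 1, 1, 0, 0, 0, 1]))  # [0, 1]

# Demonstration on a biased i.i.d. source (hypothetical bias: P(1) = 0.8).
rng = random.Random(0)  # fixed seed only so the demo is repeatable
biased = [1 if rng.random() < 0.8 else 0 for _ in range(200_000)]
unbiased = von_neumann_extract(biased)
fraction_of_ones = sum(unbiased) / len(unbiased)
print(len(unbiased), fraction_of_ones)
```

The output is shorter than the input because equal pairs are discarded: with P(1) = 0.8 only about 2 × 0.8 × 0.2 = 32% of pairs are kept, and the fraction of ones in the output should be close to 1/2.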


See also

* Resampling
* Statistical tests based on exchanging between groups
* U-statistic


Bibliography

* Aldous, David J., ''Exchangeability and related topics'', in: École d'Été de Probabilités de Saint-Flour XIII – 1983, Lecture Notes in Math. 1117, pp. 1–198, Springer, Berlin, 1985.
* Barlow, R. E. & Irony, T. Z. (1992) "Foundations of statistical quality control", in Ghosh, M. & Pathak, P. K. (eds.) ''Current Issues in Statistical Inference: Essays in Honor of D. Basu'', Hayward, CA: Institute of Mathematical Statistics, pp. 99–112.
* Bergman, B. (2009) "Conceptualistic Pragmatism: A framework for Bayesian analysis?", ''IIE Transactions'', 41, pp. 86–93.
* Chow, Yuan Shih and Teicher, Henry, ''Probability theory. Independence, interchangeability, martingales'', Springer Texts in Statistics, 3rd ed., Springer, New York, 1997. xxii+488 pp.
* Kallenberg, O., ''Probabilistic symmetries and invariance principles'', Springer-Verlag, New York, 2005. 510 pp.
* Kingman, J. F. C., ''Uses of exchangeability'', Ann. Probability 6 (1978), pp. 183–197.
* O'Neill, B. (2009) "Exchangeability, Correlation and Bayes' Effect", ''International Statistical Review'' 77(2), pp. 241–250.
* Zabell, S. L. (1988) "Symmetry and its discontents", in Skyrms, B. & Harper, W. L. (eds.) ''Causation, Chance and Credence'', pp. 155–190, Kluwer.