In probability theory, the probability integral transform (also known as universality of the uniform) is the result that data values modelled as random variables from any given continuous distribution can be converted to random variables having a standard uniform distribution. This holds exactly provided that the distribution used is the true distribution of the random variables; if the distribution is one fitted to the data, the result holds approximately in large samples. The result is sometimes modified or extended so that the transformation yields a standard distribution other than the uniform distribution, such as the exponential distribution.


Applications

One use for the probability integral transform in statistical data analysis is to provide the basis for testing whether a set of observations can reasonably be modelled as arising from a specified distribution. Specifically, the probability integral transform is applied to construct an equivalent set of values, and a test is then made of whether a uniform distribution is appropriate for the constructed dataset. Examples of this are P–P plots and Kolmogorov–Smirnov tests.

A second use for the transformation is in the theory of copulas, which are a means of both defining and working with distributions for statistically dependent multivariate data. Here the problem of defining or manipulating a joint probability distribution for a set of random variables is simplified by applying the probability integral transform to each of the components and then working with a joint distribution whose marginal variables have uniform distributions.

A third use is based on applying the inverse of the probability integral transform to convert random variables from a uniform distribution into random variables with a selected distribution: this is known as inverse transform sampling.
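The first and third uses can be sketched together in a few lines of Python. The example below is an illustration rather than a reference implementation: the helper names (`exponential_quantile`, `exponential_cdf`, `ks_distance_to_uniform`), the sample size, the seed and the choice of the exponential distribution are all assumptions made for this sketch. It draws data by inverse transform sampling, applies the probability integral transform under a correct and an incorrect model, and compares the Kolmogorov–Smirnov distance of each transformed sample from the uniform distribution with the usual large-sample 5% critical value.

```python
import math
import random


def exponential_quantile(u, rate=1.0):
    """Inverse CDF of the exponential distribution: solves 1 - exp(-rate * x) = u."""
    return -math.log(1.0 - u) / rate


def exponential_cdf(x, rate=1.0):
    """CDF of the exponential distribution for x >= 0."""
    return 1.0 - math.exp(-rate * x)


def ks_distance_to_uniform(values):
    """Kolmogorov-Smirnov distance between a sample and the Uniform(0, 1) CDF."""
    ordered = sorted(values)
    n = len(ordered)
    above = max((i + 1) / n - v for i, v in enumerate(ordered))
    below = max(v - i / n for i, v in enumerate(ordered))
    return max(above, below)


rng = random.Random(0)  # fixed seed (an arbitrary choice) so the run is repeatable
n = 2000

# Third use: inverse transform sampling turns uniform draws into exponential draws.
data = [exponential_quantile(rng.random(), rate=1.0) for _ in range(n)]

# First use: transform the data with a candidate CDF, then check for uniformity.
d_true = ks_distance_to_uniform([exponential_cdf(x, rate=1.0) for x in data])
d_wrong = ks_distance_to_uniform([exponential_cdf(x, rate=2.0) for x in data])

# Large-sample 5% critical value of the one-sample Kolmogorov-Smirnov statistic.
critical = 1.36 / math.sqrt(n)

print(f"KS distance under the true model (rate 1):  {d_true:.4f}")
print(f"KS distance under the wrong model (rate 2): {d_wrong:.4f}")
print(f"approximate 5% critical value:              {critical:.4f}")
```

A distance well above the critical value, as expected for the rate-2 model, indicates that the transformed values are not plausibly uniform and hence that the candidate distribution does not fit the data.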


Statement

Suppose that a random variable ''X'' has a continuous distribution for which the cumulative distribution function (CDF) is <math>F_X</math>. Then the random variable ''Y'' defined as
:<math>Y = F_X(X)</math>
has a standard uniform distribution (Dodge, Y. (2006) ''The Oxford Dictionary of Statistical Terms'', Oxford University Press).

Equivalently, the distribution of ''X'' on <math>\R</math> is the pushforward measure of the uniform measure on <math>[0, 1]</math> under <math>F_X^{-1}</math>.
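The statement can be checked numerically. The sketch below is a minimal simulation under assumed settings: the Weibull distribution, its parameters, the seed, the grid of test points and the helper names `weibull_cdf` and `empirical_cdf` are illustrative choices, not part of the result. It samples ''X'', computes ''Y'' as the CDF of ''X'' evaluated at ''X'', and compares the empirical CDF of ''Y'' with the identity function, which is the CDF of the standard uniform distribution.

```python
import math
import random


def weibull_cdf(x, scale, shape):
    """CDF of the Weibull distribution: 1 - exp(-(x / scale) ** shape) for x >= 0."""
    return 1.0 - math.exp(-((x / scale) ** shape))


def empirical_cdf(sample, y):
    """Fraction of sample values that are less than or equal to y."""
    return sum(1 for v in sample if v <= y) / len(sample)


rng = random.Random(42)  # fixed seed (an arbitrary choice) for repeatability
scale, shape = 2.0, 1.5  # illustrative parameters

# random.weibullvariate(alpha, beta) takes the scale first and the shape second.
xs = [rng.weibullvariate(scale, shape) for _ in range(10_000)]

# Probability integral transform: Y = F_X(X).
ys = [weibull_cdf(x, scale, shape) for x in xs]

# For a Uniform(0, 1) variable, P(Y <= y) = y at every y in [0, 1].
grid = [0.1, 0.25, 0.5, 0.75, 0.9]
deviations = [abs(empirical_cdf(ys, y) - y) for y in grid]
max_deviation = max(deviations)

for y, dev in zip(grid, deviations):
    print(f"y = {y:.2f}: |empirical CDF - y| = {dev:.4f}")
```

With ten thousand draws the deviations should be of the order of one percent, shrinking as the sample size grows.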


Proof

Given any continuous random variable ''X'', define <math>Y = F_X(X)</math>. Given <math>y \in [0,1]</math>, if <math>F_X^{-1}(y)</math> exists (i.e., if there exists a unique ''x'' such that <math>F_X(x) = y</math>), then:
:<math>\begin{align}
F_Y(y) &= \operatorname{P}(Y \leq y) \\
&= \operatorname{P}(F_X(X) \leq y) \\
&= \operatorname{P}(X \leq F_X^{-1}(y)) \\
&= F_X(F_X^{-1}(y)) \\
&= y
\end{align}</math>
If <math>F_X^{-1}(y)</math> does not exist, then it can be replaced in this proof by the function <math>\chi</math>, where we define <math>\chi(0) = -\infty</math>, <math>\chi(1) = \infty</math>, and <math>\chi(y) \equiv \inf\{x : F_X(x) \geq y\}</math> for <math>y \in (0,1)</math>, with the same result that <math>F_Y(y) = y</math>. Thus, <math>F_Y</math> is just the CDF of a <math>\mathrm{Uniform}(0,1)</math> random variable, so that ''Y'' has a uniform distribution on the interval <math>[0, 1]</math>.
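The case where the inverse of the CDF does not exist can be illustrated with a continuous distribution whose CDF has a flat stretch. The sketch below assumes, purely for illustration, an equal mixture of Uniform(0, 1) and Uniform(2, 3), whose CDF equals 1/2 everywhere on [1, 2], so that no unique ''x'' maps to 1/2. The hypothetical helper `chi` approximates the infimum used in the proof by bisection, and the simulation then checks that the transformed variable still looks uniform.

```python
import random


def cdf(x):
    """CDF of an equal mixture of Uniform(0, 1) and Uniform(2, 3); flat on [1, 2]."""
    if x <= 0.0:
        return 0.0
    if x <= 1.0:
        return 0.5 * x
    if x <= 2.0:
        return 0.5
    if x <= 3.0:
        return 0.5 + 0.5 * (x - 2.0)
    return 1.0


def chi(y, lo=-1.0, hi=4.0, tol=1e-12):
    """Approximate inf{x : cdf(x) >= y} for 0 < y < 1 by bisection.

    Invariant: cdf(lo) < y <= cdf(hi), so the infimum lies in (lo, hi].
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) >= y:
            hi = mid
        else:
            lo = mid
    return hi


# On the flat stretch the infimum picks the left end point, x = 1.
print(f"chi(0.5) is approximately {chi(0.5):.6f}")

rng = random.Random(7)  # fixed seed (an arbitrary choice) for repeatability


def draw():
    """One draw from the mixture: pick a component, then a point inside it."""
    offset = 0.0 if rng.random() < 0.5 else 2.0
    return offset + rng.random()


ys = [cdf(draw()) for _ in range(10_000)]

# Compare the empirical CDF of Y = F_X(X) with the uniform CDF on a small grid.
grid = [0.1, 0.3, 0.5, 0.7, 0.9]
max_gap = max(abs(sum(1 for v in ys if v <= y) / len(ys) - y) for y in grid)
print(f"largest gap from the uniform CDF on the grid: {max_gap:.4f}")
```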


Examples

For an illustrative example, let ''X'' be a random variable with a standard normal distribution <math>\mathcal{N}(0,1)</math>. Then its CDF is
:<math>\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^x e^{-t^2/2} \, dt = \frac12\Big[\, 1 + \operatorname{erf}\Big(\frac{x}{\sqrt{2}}\Big)\,\Big], \quad x\in\mathbb{R},</math>
where <math>\operatorname{erf}</math> is the error function. Then the new random variable ''Y'', defined by ''Y'' = Φ(''X''), is uniformly distributed.

If ''X'' has an exponential distribution with unit mean, then its CDF is
:<math>F(x) = 1-\exp(-x)</math>
for <math>x \geq 0</math>, and the immediate result of the probability integral transform is that
:<math>Y = 1-\exp(-X)</math>
has a uniform distribution. The symmetry of the uniform distribution can then be used to show that
:<math>Y' = \exp(-X)</math>
also has a uniform distribution.
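Both examples can be reproduced with the Python standard library. The sketch below is illustrative only: the helper names `phi` and `mean_and_variance`, the seed and the sample size are assumptions. It evaluates the standard normal CDF through the error function, applies the transform to normal and unit-mean exponential samples, and compares the sample mean and variance of each transformed sample with the values 1/2 and 1/12 of the standard uniform distribution.

```python
import math
import random
from statistics import NormalDist, fmean


def phi(x):
    """Standard normal CDF written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mean_and_variance(values):
    """Sample mean and (population-style) variance of a list of numbers."""
    m = fmean(values)
    return m, fmean([(v - m) ** 2 for v in values])


# The erf formula should agree with the library's own normal CDF.
phi_gap = max(abs(phi(x) - NormalDist().cdf(x)) for x in (-3.0, -1.0, 0.0, 0.5, 2.0))

rng = random.Random(1)  # fixed seed (an arbitrary choice) for repeatability
n = 20_000

# Normal example: Y = Phi(X).
normal_y = [phi(rng.gauss(0.0, 1.0)) for _ in range(n)]

# Exponential example: Y = 1 - exp(-X) and Y' = exp(-X) for unit-mean X.
expo_x = [rng.expovariate(1.0) for _ in range(n)]
expo_y = [1.0 - math.exp(-x) for x in expo_x]
expo_y_prime = [math.exp(-x) for x in expo_x]

results = {
    "Phi(X)": mean_and_variance(normal_y),
    "1 - exp(-X)": mean_and_variance(expo_y),
    "exp(-X)": mean_and_variance(expo_y_prime),
}

print(f"Uniform(0, 1) reference: mean 0.5000, variance {1 / 12:.4f}")
for name, (m, v) in results.items():
    print(f"{name:>12}: mean {m:.4f}, variance {v:.4f}")
```

Matching the first two moments is only a quick sanity check, not a proof of uniformity; a fuller check would compare the whole empirical CDF with the uniform CDF.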


See also

* Inverse transform sampling

