Formalized by John Tukey, the Tukey lambda distribution is a continuous, symmetric probability distribution defined in terms of its quantile function. It is typically used to identify an appropriate distribution (see the comments below) and not used in statistical models directly. The Tukey lambda distribution has a single shape parameter, λ, and as with other probability distributions, it can be transformed with a location parameter, μ, and a scale parameter, σ. Since the general form of a probability distribution can be expressed in terms of the standard distribution, the subsequent formulas are given for the standard form of the function.


Quantile function

For the standard form of the Tukey lambda distribution, the quantile function, Q(p) (i.e. the inverse function to the cumulative distribution function), and the quantile density function, q = dQ/dp, are

: Q(p;\lambda) = \begin{cases} \frac{1}{\lambda}\left[ p^\lambda - (1-p)^\lambda \right], & \text{if } \lambda \ne 0, \\ \ln\left(\frac{p}{1-p}\right), & \text{if } \lambda = 0, \end{cases}

: q(p;\lambda) = \frac{\mathrm{d}Q}{\mathrm{d}p} = p^{\lambda-1} + (1-p)^{\lambda-1}.

For most values of the shape parameter, λ, the probability density function (PDF) and cumulative distribution function (CDF) must be computed numerically. The Tukey lambda distribution has a simple, closed form for the CDF and/or PDF only for a few exceptional values of the shape parameter, for example λ = 1 (see uniform distribution) and λ = 0 (see logistic distribution). However, for any value of λ both the CDF and PDF can be tabulated for any number of cumulative probabilities, p, using the quantile function Q to calculate the value x = Q(p) for each cumulative probability p, with the probability density given by 1/q(p), the reciprocal of the quantile density function. As is the usual case with statistical distributions, the Tukey lambda distribution can readily be used by looking up values in a prepared table.
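The tabulation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`quantile`, `quantile_density`, `tabulate`, `cdf`) are hypothetical, and the bisection used to invert Q is simply one possible numerical method, chosen because Q is strictly increasing in p.

```python
import math

def quantile(p, lam):
    """Standard Tukey lambda quantile function Q(p; lam), for 0 < p < 1."""
    if lam == 0:
        return math.log(p / (1.0 - p))
    return (p**lam - (1.0 - p)**lam) / lam

def quantile_density(p, lam):
    """Quantile density q(p; lam) = dQ/dp = p^(lam-1) + (1-p)^(lam-1)."""
    return p**(lam - 1.0) + (1.0 - p)**(lam - 1.0)

def tabulate(lam, n_points=9):
    """Return rows (p, x, pdf) on an even grid of cumulative probabilities.

    For each p, x = Q(p) is the point whose CDF value is p, and the
    density at x is 1 / q(p).
    """
    rows = []
    for i in range(1, n_points + 1):
        p = i / (n_points + 1)
        rows.append((p, quantile(p, lam), 1.0 / quantile_density(p, lam)))
    return rows

def cdf(x, lam, tol=1e-12):
    """Numerically invert Q by bisection on p (an illustrative choice)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if quantile(mid, lam) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for p, x, pdf in tabulate(0.14):
        print(f"p={p:.1f}  x={x:+.4f}  pdf={pdf:.4f}")
```

For λ = 0 the value returned by `cdf` can be compared against the logistic CDF, 1/(1 + e^(−x)), as a sanity check.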


Moments

The Tukey lambda distribution is symmetric around zero, therefore the expected value of this distribution is equal to zero. The variance exists for ''λ'' > −1/2 and is given by the formula (except when ''λ'' = 0)

: \operatorname{Var}[X] = \frac{2}{\lambda^2}\left( \frac{1}{1+2\lambda} - \frac{\Gamma(\lambda+1)^2}{\Gamma(2\lambda+2)} \right).

More generally, the ''n''-th order moment is finite when ''λ'' > −1/''n'' and is expressed in terms of the beta function ''Β''(''x'',''y'') (except when ''λ'' = 0):

: \mu_n = \operatorname{E}[X^n] = \frac{1}{\lambda^n} \sum_{k=0}^n (-1)^k \binom{n}{k} \, \Beta(\lambda k+1,\, \lambda(n-k)+1).

Note that due to symmetry of the density function, all moments of odd orders are equal to zero.
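As a rough numerical check of the moment formulas, the sketch below evaluates the beta-function sum and the variance expression using only the Python standard library. The helper names (`beta`, `raw_moment`, `variance`) are hypothetical, and computing the beta function through `math.lgamma` is an implementation choice made here, not something prescribed by the formulas.

```python
import math

def beta(a, b):
    """Beta function B(a, b) via log-gamma; assumes a > 0 and b > 0."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def raw_moment(n, lam):
    """n-th moment of the standard Tukey lambda distribution, lam != 0."""
    if lam == 0:
        raise ValueError("the beta-function formula excludes lam = 0")
    if lam <= -1.0 / n:
        raise ValueError("the n-th moment is finite only for lam > -1/n")
    total = sum(
        (-1) ** k * math.comb(n, k) * beta(lam * k + 1, lam * (n - k) + 1)
        for k in range(n + 1)
    )
    return total / lam**n

def variance(lam):
    """Var[X] = (2/lam^2) * (1/(1 + 2 lam) - Gamma(lam+1)^2 / Gamma(2 lam+2))."""
    if lam == 0:
        raise ValueError("the variance formula excludes lam = 0")
    if lam <= -0.5:
        raise ValueError("the variance exists only for lam > -1/2")
    gamma_ratio = math.exp(2.0 * math.lgamma(lam + 1) - math.lgamma(2 * lam + 2))
    return 2.0 / lam**2 * (1.0 / (1.0 + 2.0 * lam) - gamma_ratio)

if __name__ == "__main__":
    # lam = 1 is the uniform distribution on (-1, 1), whose variance is 1/3.
    print(variance(1.0), raw_moment(2, 1.0))
```

Because the mean is zero, the second moment from `raw_moment(2, lam)` should agree with `variance(lam)` wherever both are defined.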


L-moments

Unlike the central moments, L-moments can be expressed in a closed form. The L-moment of order ''r'' > 1 is given by

: L_r = \frac{1+(-1)^r}{\lambda} \sum_{k=0}^{r-1} (-1)^{r-1-k} \binom{r-1}{k} \binom{r+k-1}{k} \left( \frac{1}{k+1+\lambda} \right).

The first six L-moments can be presented as follows:

: L_1 = 0
: L_2 = \frac{2}{\lambda}\left[ -\frac{1}{1+\lambda} + \frac{2}{2+\lambda} \right]
: L_3 = 0
: L_4 = \frac{2}{\lambda}\left[ -\frac{1}{1+\lambda} + \frac{12}{2+\lambda} - \frac{30}{3+\lambda} + \frac{20}{4+\lambda} \right]
: L_5 = 0
: L_6 = \frac{2}{\lambda}\left[ -\frac{1}{1+\lambda} + \frac{30}{2+\lambda} - \frac{210}{3+\lambda} + \frac{560}{4+\lambda} - \frac{630}{5+\lambda} + \frac{252}{6+\lambda} \right].
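The general L-moment formula lends itself to a direct transcription. The sketch below is illustrative only: `l_moment` is a hypothetical name, and the guard `lam > -1` reflects the assumption that L-moments are only considered where the mean of the distribution is finite.

```python
import math

def l_moment(r, lam):
    """L-moment of order r > 1 of the standard Tukey lambda distribution.

    Uses the closed-form sum from the text; lam = 0 is excluded because
    the formula divides by lam.
    """
    if r < 2:
        raise ValueError("the formula applies to orders r > 1")
    if lam == 0:
        raise ValueError("the formula excludes lam = 0")
    if lam <= -1:
        # Assumption: require a finite mean so that the L-moments exist.
        raise ValueError("requires lam > -1")
    if r % 2 == 1:
        return 0.0  # the factor 1 + (-1)^r vanishes for odd r
    total = sum(
        (-1) ** (r - 1 - k)
        * math.comb(r - 1, k)
        * math.comb(r + k - 1, k)
        / (k + 1 + lam)
        for k in range(r)
    )
    return (1 + (-1) ** r) / lam * total

if __name__ == "__main__":
    for r in range(2, 7):
        print(r, l_moment(r, 0.14))
```

For λ = 1 (uniform on (−1, 1)) the sketch gives L_2 = 1/3 and vanishing higher-order L-moments, which matches the explicit expressions for L_2, L_4 and L_6.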


Comments

The Tukey lambda distribution is actually a family of distributions that can approximate a number of common distributions. For example, λ = −1 approximates the Cauchy distribution, λ = 0 gives the logistic distribution exactly, λ ≈ 0.14 approximates the normal distribution, and λ = 1 gives a uniform distribution exactly.

The most common use of this distribution is to generate a Tukey lambda probability plot correlation coefficient (PPCC) plot of a data set. Based on the PPCC plot, an appropriate model for the data is suggested. For example, if the best fit of the curve to the data occurs for a value of λ at or near 0.14, then the data could be well modeled with a normal distribution. Values of λ less than 0.14 suggest a heavier-tailed distribution; a milepost at λ = 0 (logistic) would indicate quite fat tails, with the extreme limit at λ = −1, approximating Cauchy. That is, as the best-fit value of λ varies from 0.14 towards −1, a bell-shaped PDF with increasingly heavy tails is suggested. Similarly, an optimal value of λ greater than 0.14 suggests a distribution with ''exceptionally'' thin tails (based on the point of view that the normal distribution itself is thin-tailed to begin with). For λ > 0 the suggested distributions have finite support, between −1/λ and +1/λ; for λ ≤ 0 the support is the whole real line.

Since the Tukey lambda distribution is a symmetric distribution, the use of the Tukey lambda PPCC plot to determine a reasonable distribution to model the data only applies to symmetric distributions. A histogram of the data should provide evidence as to whether the data can be reasonably modeled with a symmetric distribution.
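The PPCC procedure can be sketched as a grid search over λ. The code below is a simplified illustration under stated assumptions: all function names are hypothetical, plotting positions use Filliben's approximation to the medians of uniform order statistics (one common choice, not the only one), and the data set is assumed to contain at least three distinct values.

```python
import math

def quantile(p, lam):
    """Standard Tukey lambda quantile function Q(p; lam), for 0 < p < 1."""
    if lam == 0:
        return math.log(p / (1.0 - p))
    return (p**lam - (1.0 - p)**lam) / lam

def uniform_order_medians(n):
    """Filliben's approximate medians of uniform order statistics."""
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ppcc(data, lam):
    """Correlation between the sorted data and Tukey lambda quantiles."""
    ordered = sorted(data)
    theoretical = [quantile(m, lam) for m in uniform_order_medians(len(ordered))]
    return correlation(ordered, theoretical)

def best_lambda(data, grid):
    """Grid value of lam that maximizes the PPCC."""
    return max(grid, key=lambda lam: ppcc(data, lam))

if __name__ == "__main__":
    # Synthetic, perfectly logistic-shaped sample (illustration only).
    sample = [quantile(m, 0.0) for m in uniform_order_medians(50)]
    grid = [k / 10 for k in range(-10, 11)]
    print(best_lambda(sample, grid))
```

Plotting `ppcc(data, lam)` against `lam` over the grid gives the PPCC plot itself; the location of its maximum is then read against the mileposts described above (about 0.14 for normal-like data, 0 for logistic, towards −1 for Cauchy-like tails).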

