Gram–Charlier Theory
In probability theory, the Gram–Charlier A series (named in honor of Jørgen Pedersen Gram and Carl Charlier) and the Edgeworth series (named in honor of Francis Ysidro Edgeworth) are series that approximate a probability distribution over the real line (-\infty,\infty) in terms of its cumulants. The two series are formally the same, but they arrange the terms differently, and therefore differ in the accuracy obtained when the series is truncated. The key idea of these expansions is to write the characteristic function of the distribution whose probability density function f is to be approximated in terms of the characteristic function of a distribution with known and suitable properties, and to recover f through the inverse Fourier transform.


Gram–Charlier A series

We examine a continuous random variable. Let \hat{f} be the characteristic function of its distribution, whose density function is f, and let \kappa_r be its cumulants. We expand in terms of a known distribution with probability density function \psi, characteristic function \hat{\psi}, and cumulants \gamma_r. The density \psi is generally chosen to be that of the normal distribution, but other choices are possible as well. By the definition of the cumulants, we have (see Wallace, 1958)
: \hat{f}(t) = \exp\left[\sum_{r=1}^\infty \kappa_r \frac{(it)^r}{r!}\right]
and
: \hat{\psi}(t) = \exp\left[\sum_{r=1}^\infty \gamma_r \frac{(it)^r}{r!}\right],
which gives the following formal identity:
: \hat{f}(t) = \exp\left[\sum_{r=1}^\infty (\kappa_r - \gamma_r) \frac{(it)^r}{r!}\right] \hat{\psi}(t)\,.

By the properties of the Fourier transform, (it)^r \hat{\psi}(t) is the Fourier transform of (-1)^r [D^r \psi](-x), where D is the differential operator with respect to x. Thus, after replacing x with -x on both sides of the equation, we find for f the formal expansion
: f(x) = \exp\left[\sum_{r=1}^\infty (\kappa_r - \gamma_r) \frac{(-D)^r}{r!}\right] \psi(x)\,.

If \psi is chosen as the normal density
: \phi(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]
with mean and variance as given by f, that is, mean \mu = \kappa_1 and variance \sigma^2 = \kappa_2, then the expansion becomes
: f(x) = \exp\left[\sum_{r=3}^\infty \kappa_r \frac{(-D)^r}{r!}\right] \phi(x),
since \gamma_r = 0 for all r > 2, as the higher cumulants of the normal distribution are 0. By expanding the exponential and collecting terms according to the order of the derivatives, we arrive at the Gram–Charlier A series. Such an expansion can be written compactly in terms of Bell polynomials as
: \exp\left[\sum_{r=3}^\infty \kappa_r \frac{(-D)^r}{r!}\right] = \sum_{n=0}^\infty B_n(0,0,\kappa_3,\ldots,\kappa_n) \frac{(-D)^n}{n!}.

Since the n-th derivative of the Gaussian function \phi is given in terms of Hermite polynomials as
: \phi^{(n)}(x) = \frac{(-1)^n}{\sigma^n} He_n\left(\frac{x-\mu}{\sigma}\right) \phi(x),
this gives us the final expression of the Gram–Charlier A series as
: f(x) = \phi(x) \sum_{n=0}^\infty \frac{1}{n!\,\sigma^n} B_n(0,0,\kappa_3,\ldots,\kappa_n) He_n\left(\frac{x-\mu}{\sigma}\right).

Integrating the series gives us the cumulative distribution function
: F(x) = \int_{-\infty}^x f(u)\,du = \Phi(x) - \phi(x) \sum_{n=3}^\infty \frac{1}{n!\,\sigma^{n-1}} B_n(0,0,\kappa_3,\ldots,\kappa_n) He_{n-1}\left(\frac{x-\mu}{\sigma}\right),
where \Phi is the CDF of the normal distribution.

If we include only the first two correction terms to the normal distribution, we obtain
: f(x) \approx \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] \left[1 + \frac{\kappa_3}{3!\,\sigma^3} He_3\left(\frac{x-\mu}{\sigma}\right) + \frac{\kappa_4}{4!\,\sigma^4} He_4\left(\frac{x-\mu}{\sigma}\right)\right],
with He_3(x) = x^3 - 3x and He_4(x) = x^4 - 6x^2 + 3.

This expression is not guaranteed to be non-negative, so it is not in general a valid probability density. The Gram–Charlier A series diverges in many cases of interest: it converges only if f(x) falls off faster than \exp(-x^2/4) at infinity (Cramér 1957). When it does not converge, the series is also not a true asymptotic expansion, because it is not possible to estimate the error of the expansion. For this reason, the Edgeworth series (see next section) is generally preferred over the Gram–Charlier A series.
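As a rough illustration of the two-term truncation, the following Python sketch evaluates the approximate density and shows that it can dip below zero. The function names are hypothetical, and the example cumulants (\kappa_1 = \kappa_2 = 1, \kappa_3 = 2, \kappa_4 = 6, those of an exponential distribution with unit rate) are an assumption chosen here for illustration, not part of the derivation.

```python
import math

def he3(z):
    """Probabilists' Hermite polynomial He_3."""
    return z**3 - 3.0 * z

def he4(z):
    """Probabilists' Hermite polynomial He_4."""
    return z**4 - 6.0 * z**2 + 3.0

def gram_charlier_pdf(x, mu, sigma, kappa3, kappa4):
    """Normal density times the first two Gram-Charlier A correction terms."""
    z = (x - mu) / sigma
    phi = math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)
    correction = (1.0
                  + kappa3 / (6.0 * sigma**3) * he3(z)
                  + kappa4 / (24.0 * sigma**4) * he4(z))
    return phi * correction

# Assumed example: cumulants of a unit-rate exponential distribution.
mu, sigma, kappa3, kappa4 = 1.0, 1.0, 2.0, 6.0
for x in (-1.0, 0.0, 1.0, 2.0, 3.0):
    print(x, gram_charlier_pdf(x, mu, sigma, kappa3, kappa4))
```

At x = -1 the bracketed correction factor is negative, so the truncated series returns a negative "density" there, which is the failure mode noted above.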


The Edgeworth series

Edgeworth developed a similar expansion as an improvement to the central limit theorem. The advantage of the Edgeworth series is that the error is controlled, so that it is a true asymptotic expansion.

Let \{Z_i\} be a sequence of independent and identically distributed (i.i.d.) random variables with finite mean \mu and variance \sigma^2, and let X_n be their standardized sums:
: X_n = \frac{1}{\sqrt{n}} \sum_{i=1}^n \frac{Z_i - \mu}{\sigma}.

Let F_n denote the cumulative distribution functions of the variables X_n. Then by the central limit theorem,
: \lim_{n\to\infty} F_n(x) = \Phi(x) \equiv \int_{-\infty}^x \tfrac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}q^2}\,dq
for every x, as long as the mean and variance are finite.

The standardization of \{Z_i\} ensures that the first two cumulants of X_n are \kappa_1^{F_n} = 0 and \kappa_2^{F_n} = 1. Now assume that, in addition to having mean \mu and variance \sigma^2, the i.i.d. random variables Z_i have higher cumulants \kappa_r. From the additivity and homogeneity properties of cumulants, the cumulants of X_n in terms of the cumulants of Z_i are, for r \geq 2,
: \kappa_r^{F_n} = \frac{n \kappa_r}{\sigma^r n^{r/2}} = \frac{\lambda_r}{n^{r/2-1}} \quad \mathrm{where} \quad \lambda_r = \frac{\kappa_r}{\sigma^r}.

If we expand the formal expression of the characteristic function \hat{f}_n(t) of F_n in terms of the standard normal distribution, that is, if we set
: \phi(x) = \frac{1}{\sqrt{2\pi}} \exp(-\tfrac{1}{2}x^2),
then the cumulant differences in the expansion are
: \kappa^{F_n}_1 - \gamma_1 = 0,
: \kappa^{F_n}_2 - \gamma_2 = 0,
: \kappa^{F_n}_r - \gamma_r = \frac{\lambda_r}{n^{r/2-1}}; \qquad r \geq 3.

The Gram–Charlier A series for the density function of X_n is now
: f_n(x) = \phi(x) \sum_{r=0}^\infty \frac{1}{r!} B_r\left(0,0,\frac{\lambda_3}{n^{1/2}},\ldots,\frac{\lambda_r}{n^{r/2-1}}\right) He_r(x).

The Edgeworth series is developed similarly to the Gram–Charlier A series, except that terms are now collected according to powers of n. The coefficient of the n^{-m/2} term can be obtained by collecting the monomials of the Bell polynomials corresponding to the integer partitions of m. Thus, we have the characteristic function as
: \hat{f}_n(t) = \left[1 + \sum_{j=1}^\infty \frac{P_j(it)}{n^{j/2}}\right] \exp(-t^2/2)\,,
where P_j(x) is a polynomial of degree 3j. Again, after the inverse Fourier transform, the density function f_n follows as
: f_n(x) = \phi(x) + \sum_{j=1}^\infty \frac{P_j(-D)}{n^{j/2}} \phi(x)\,.

Likewise, integrating the series, we obtain the distribution function
: F_n(x) = \Phi(x) + \sum_{j=1}^\infty \frac{1}{n^{j/2}} \frac{P_j(-D)}{D} \phi(x)\,.

We can explicitly write the polynomial P_m(-D) as
: P_m(-D) = \sum \prod_i \frac{1}{k_i!} \left(\frac{\lambda_{l_i}}{l_i!}\right)^{k_i} (-D)^s,
where the summation is over all the integer partitions of m such that \sum_i i k_i = m, with l_i = i + 2 and s = \sum_i k_i l_i.

For example, if m = 3, then there are three ways to partition this number: 1 + 1 + 1 = 2 + 1 = 3. As such we need to examine three cases:
* 1 + 1 + 1 = 1 · k_1, so we have k_1 = 3, l_1 = 3, and s = 9.
* 1 + 2 = 1 · k_1 + 2 · k_2, so we have k_1 = 1, k_2 = 1, l_1 = 3, l_2 = 4, and s = 7.
* 3 = 3 · k_3, so we have k_3 = 1, l_3 = 5, and s = 5.

Thus, the required polynomial is
: \begin{align} P_3(-D) &= \frac{1}{3!} \left(\frac{\lambda_3}{3!}\right)^3 (-D)^9 + \frac{1}{1!\,1!} \left(\frac{\lambda_3}{3!}\right) \left(\frac{\lambda_4}{4!}\right) (-D)^7 + \frac{1}{1!} \left(\frac{\lambda_5}{5!}\right) (-D)^5 \\ &= \frac{\lambda_3^3}{1296} (-D)^9 + \frac{\lambda_3 \lambda_4}{144} (-D)^7 + \frac{\lambda_5}{120} (-D)^5. \end{align}

The first five terms of the expansion are
: \begin{align} f_n(x) &= \phi(x) \\ &\quad - n^{-\frac{1}{2}} \left(\tfrac{1}{6}\lambda_3\,\phi^{(3)}(x)\right) \\ &\quad + n^{-1} \left(\tfrac{1}{24}\lambda_4\,\phi^{(4)}(x) + \tfrac{1}{72}\lambda_3^2\,\phi^{(6)}(x)\right) \\ &\quad - n^{-\frac{3}{2}} \left(\tfrac{1}{120}\lambda_5\,\phi^{(5)}(x) + \tfrac{1}{144}\lambda_3\lambda_4\,\phi^{(7)}(x) + \tfrac{1}{1296}\lambda_3^3\,\phi^{(9)}(x)\right) \\ &\quad + n^{-2} \left(\tfrac{1}{720}\lambda_6\,\phi^{(6)}(x) + \left(\tfrac{1}{1152}\lambda_4^2 + \tfrac{1}{720}\lambda_3\lambda_5\right)\phi^{(8)}(x) + \tfrac{1}{1728}\lambda_3^2\lambda_4\,\phi^{(10)}(x) + \tfrac{1}{31104}\lambda_3^4\,\phi^{(12)}(x)\right) \\ &\quad + O\left(n^{-\frac{5}{2}}\right). \end{align}

Here, \phi^{(j)}(x) is the j-th derivative of \phi at the point x. Recalling that the derivatives of the density of the normal distribution are related to the normal density by \phi^{(n)}(x) = (-1)^n He_n(x) \phi(x), where He_n is the Hermite polynomial of order n, this explains the alternative representations in terms of the density function. Blinnikov and Moessner (1998) have given a simple algorithm to calculate higher-order terms of the expansion.

For lattice distributions (which take discrete values), the Edgeworth expansion must be adjusted to account for the discontinuous jumps between lattice points.
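The partition rule for P_m(-D) can be sketched in a few lines of Python. This is only an illustrative enumeration under the formula stated above, not the Blinnikov and Moessner algorithm; the function names `partitions` and `edgeworth_terms` are hypothetical. Each term is returned as the derivative order s, the rational coefficient, and a mapping from the cumulant index l_i to the power k_i of \lambda_{l_i}.

```python
from fractions import Fraction
from math import factorial

def partitions(m, max_part=None):
    """Yield the integer partitions of m as dicts {part i: multiplicity k_i}."""
    if max_part is None:
        max_part = m
    if m == 0:
        yield {}
        return
    for part in range(min(m, max_part), 0, -1):
        for rest in partitions(m - part, part):
            result = dict(rest)
            result[part] = result.get(part, 0) + 1
            yield result

def edgeworth_terms(m):
    """Terms of P_m(-D) as (s, coefficient, {l_i: k_i}) with l_i = i + 2."""
    terms = []
    for p in partitions(m):
        coeff = Fraction(1)
        s = 0
        powers = {}
        for i, k in p.items():
            l = i + 2
            # Factor (1/k_i!) * (1/l_i!)^k_i multiplying lambda_{l_i}^{k_i}.
            coeff *= Fraction(1, factorial(k)) * Fraction(1, factorial(l)) ** k
            s += k * l
            powers[l] = k
        terms.append((s, coeff, powers))
    return sorted(terms, key=lambda t: t[0])

for s, coeff, powers in edgeworth_terms(3):
    print(f"(-D)^{s}: coefficient {coeff}, lambda powers {powers}")
```

For m = 3 this reproduces the three terms with coefficients 1/120, 1/144 and 1/1296, and for m = 4 it gives the coefficients of the n^{-2} group in the five-term expansion.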


Illustration: density of the sample mean of three χ² distributions

Take X_i \sim \chi^2(k=2), \, i=1, 2, 3 \, (n=3) and the sample mean \bar X = \frac{1}{3} \sum_{i=1}^{3} X_i. We can use several distributions for \bar X:
* The exact distribution, which follows a gamma distribution: \bar X \sim \mathrm{Gamma}\left(\alpha = n\cdot k/2, \theta = 2/n\right) = \mathrm{Gamma}\left(\alpha = 3, \theta = 2/3\right).
* The asymptotic normal distribution: \bar X \xrightarrow{n\to\infty} N(k, 2\cdot k/n) = N(2, 4/3).
* Two Edgeworth expansions, of degrees 2 and 3.
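The comparison can be sketched numerically as follows. This is a minimal sketch under stated assumptions: it uses the standard cumulants of the \chi^2(k) distribution, \kappa_3 = 8k and \kappa_4 = 48k (so \lambda_3 = 2 and \lambda_4 = 6 for k = 2), and it truncates the Edgeworth density after the n^{-1/2} and n^{-1} terms, which may not correspond exactly to the "degrees 2 and 3" of the original illustration. All function names are hypothetical.

```python
import math

def hermite(order, x):
    """Probabilists' Hermite polynomial via He_{k+1} = x*He_k - k*He_{k-1}."""
    prev, cur = 1.0, x
    if order == 0:
        return prev
    for k in range(1, order):
        prev, cur = cur, x * cur - k * prev
    return cur

def edgeworth_pdf_standardized(x, lam3, lam4, n, order):
    """Edgeworth density of the standardized sum X_n, up to n^(-order/2)."""
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    total = 1.0
    if order >= 1:
        total += lam3 / (6.0 * math.sqrt(n)) * hermite(3, x)
    if order >= 2:
        total += (lam4 / 24.0 * hermite(4, x)
                  + lam3**2 / 72.0 * hermite(6, x)) / n
    return phi * total

def sample_mean_pdfs(y, k=2, n=3):
    """Densities of the mean of n chi-squared(k) variables at y > 0."""
    mu, sigma = float(k), math.sqrt(2.0 * k)
    lam3 = 8.0 * k / sigma**3    # assumes kappa_3 = 8k
    lam4 = 48.0 * k / sigma**4   # assumes kappa_4 = 48k
    scale = math.sqrt(n) / sigma  # change of variables from X_n to the mean
    x = (y - mu) * scale
    alpha, theta = n * k / 2.0, 2.0 / n
    exact = (y**(alpha - 1.0) * math.exp(-y / theta)
             / (math.gamma(alpha) * theta**alpha))
    return {
        "exact": exact,
        "normal": scale * edgeworth_pdf_standardized(x, lam3, lam4, n, 0),
        "edgeworth_1": scale * edgeworth_pdf_standardized(x, lam3, lam4, n, 1),
        "edgeworth_2": scale * edgeworth_pdf_standardized(x, lam3, lam4, n, 2),
    }

for y in (0.5, 1.0, 2.0, 3.0, 4.0):
    print(y, sample_mean_pdfs(y))
```

Near the centre of the distribution (for instance at y = 1 and y = 2) the n^{-1} truncation lies noticeably closer to the exact gamma density than the normal approximation does.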


Discussion of results

* For finite samples, an Edgeworth expansion is not guaranteed to be a proper probability distribution, as the CDF values at some points may fall outside [0,1].
* The expansions control absolute errors (asymptotically); relative errors can be assessed by comparing the leading Edgeworth term in the remainder with the overall leading term.
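The first point can be checked with a short sketch. Keeping only the j = 1 term of the distribution-function expansion, with P_1(-D) = (\lambda_3/3!)(-D)^3 and \phi''(x) = He_2(x)\phi(x), gives F_n(x) \approx \Phi(x) - \frac{\lambda_3}{6\sqrt{n}}(x^2 - 1)\phi(x). The Python below evaluates this one-term approximation; the function names are hypothetical, and the values \lambda_3 = 2, n = 3 are an assumption corresponding to the \chi^2(2) illustration.

```python
import math

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def edgeworth_cdf_first_order(x, lam3, n):
    """One-term Edgeworth approximation to the CDF of the standardized sum."""
    return normal_cdf(x) - lam3 / (6.0 * math.sqrt(n)) * (x * x - 1.0) * normal_pdf(x)

# Assumed example values: lambda_3 = 2 and n = 3.
for x in (-3.0, -2.0, 0.0, 2.0):
    print(x, edgeworth_cdf_first_order(x, 2.0, 3))
```

In the far left tail (for example at x = -3) the correction term exceeds \Phi(x), so the approximate CDF becomes slightly negative.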


See also

* Cornish–Fisher expansion
* Edgeworth binomial tree


Further reading

* H. Cramér (1957). ''Mathematical Methods of Statistics''. Princeton University Press, Princeton.
* M. Kendall & A. Stuart (1977). ''The Advanced Theory of Statistics'', Vol. 1: Distribution Theory, 4th ed. Macmillan, New York.
* P. McCullagh (1987). ''Tensor Methods in Statistics''. Chapman and Hall, London.
* D. R. Cox and O. E. Barndorff-Nielsen (1989). ''Asymptotic Techniques for Use in Statistics''. Chapman and Hall, London.
* P. Hall (1992). ''The Bootstrap and Edgeworth Expansion''. Springer, New York.
* J. E. Kolassa (2006). ''Series Approximation Methods in Statistics'' (3rd ed.). Lecture Notes in Statistics #88. Springer, New York.