In probability theory and statistics, the generalized chi-squared distribution (or generalized chi-square distribution) is the distribution of a quadratic form of a multinormal variable (normal vector), or a linear combination of different normal variables and squares of normal variables. Equivalently, it is also a linear sum of independent noncentral chi-square variables and a normal variable. There are several other such generalizations for which the same term is sometimes used; some of them are special cases of the family discussed here, for example the gamma distribution.
Definition
The generalized chi-squared variable may be described in multiple ways. One is to write it as a linear sum of independent noncentral chi-square variables and a normal variable: [Davies, R.B. (1973) "Numerical inversion of a characteristic function", ''Biometrika'', 60 (2), 415–417] [Davies, R.B. (1980) "Algorithm AS155: The distribution of a linear combination of ''χ''² random variables", ''Applied Statistics'', 29, 323–333]
:$\tilde{\chi} = \sum_i w_i Y_i + X, \quad Y_i \sim \chi'^2(k_i, \lambda_i), \quad X \sim N(m, s^2).$
Here the parameters are the weights $w_i$, the degrees of freedom $k_i$ and non-centralities $\lambda_i$ of the constituent chi-squares, and the normal parameters $m$ and $s$. Some important special cases of this have all weights $w_i$ of the same sign, or have central chi-squared components, or omit the normal term.
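The linear-sum description can be turned directly into a sampler. The following is a minimal sketch using only the Python standard library; the function names and the example parameter values are illustrative assumptions, the symbols `w`, `k`, `lam`, `m`, `s` stand for the weights, degrees of freedom, non-centralities and the mean and standard deviation of the normal term, and integer degrees of freedom are assumed so that each noncentral chi-square can be built from squared normals.

```python
import math
import random

def sample_noncentral_chi2(k, lam, rng):
    """One draw from a noncentral chi-square with integer k >= 1 degrees of
    freedom and noncentrality lam, built as a sum of k squared unit-variance
    normals; all of the noncentrality is placed on the first term."""
    total = (rng.gauss(0.0, 1.0) + math.sqrt(lam)) ** 2
    for _ in range(k - 1):
        total += rng.gauss(0.0, 1.0) ** 2
    return total

def sample_gen_chi2(w, k, lam, m, s, rng):
    """One draw of sum_i w_i * Y_i + X with Y_i noncentral chi-square
    and X normal with mean m and standard deviation s."""
    value = rng.gauss(m, s)
    for w_i, k_i, lam_i in zip(w, k, lam):
        value += w_i * sample_noncentral_chi2(k_i, lam_i, rng)
    return value

def gen_chi2_mean_var(w, k, lam, m, s):
    """Exact mean and variance, using E[Y_i] = k_i + lam_i and
    Var[Y_i] = 2 * (k_i + 2 * lam_i) for a noncentral chi-square."""
    mean = m + sum(w_i * (k_i + l_i) for w_i, k_i, l_i in zip(w, k, lam))
    var = s ** 2 + sum(2.0 * w_i ** 2 * (k_i + 2.0 * l_i)
                       for w_i, k_i, l_i in zip(w, k, lam))
    return mean, var

# Hypothetical example parameters with weights of mixed sign.
rng = random.Random(0)
w, k, lam, m, s = [1.0, -2.0], [2, 1], [0.5, 1.0], 3.0, 0.5
draws = [sample_gen_chi2(w, k, lam, m, s, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
exact_mean, exact_var = gen_chi2_mean_var(w, k, lam, m, s)
```

Comparing `sample_mean` with `exact_mean` gives a quick sanity check on the sampler; the two should agree to within Monte Carlo error.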
Since a non-central chi-squared variable is a sum of squares of normal variables with different means, the generalized chi-square variable can also be defined as a weighted sum of squares of independent normal variables, plus an independent normal variable: that is, a quadratic in normal variables.
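Another equivalent way is to formulate it as a quadratic form of a normal vector $\boldsymbol{x}$:
:$\tilde{\chi} = q(\boldsymbol{x}) = \boldsymbol{x}^\mathsf{T} \mathbf{Q_2} \boldsymbol{x} + \boldsymbol{q_1}^\mathsf{T} \boldsymbol{x} + q_0.$
Here $\mathbf{Q_2}$ is a matrix, $\boldsymbol{q_1}$ is a vector, and $q_0$ is a scalar. These, together with the mean $\boldsymbol{\mu}$ and covariance matrix $\mathbf{\Sigma}$ of the normal vector $\boldsymbol{x}$, parameterize the distribution. The parameters of the former expression (in terms of non-central chi-squares, a normal and a constant) can be calculated in terms of the parameters of the latter expression (quadratic form of a normal vector).
If the matrix $\mathbf{Q_2}$ in this formulation is positive-definite (or negative-definite), then all the weights $w_i$ in the first formulation will have the same sign.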
For the most general case, a reduction towards a common standard form can be made by using a representation of the following form: [Sheil, J., O'Muircheartaigh, I. (1977) "Algorithm AS106: The distribution of non-negative quadratic forms in normal variables", ''Applied Statistics'', 26, 92–98]
:$(x+b)^\mathsf{T} D (x+b) + d^\mathsf{T} x + e,$
where ''D'' is a diagonal matrix, ''b'' and ''d'' are constant vectors, ''e'' is a constant, and where ''x'' represents a vector of uncorrelated standard normal random variables.
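Once the quadratic has a diagonal matrix and uncorrelated standard normal components, completing the square in each component yields the weights, non-centralities and normal term of the linear-sum description. The sketch below assumes the reduced form is written per component as D_i (x_i + b_i)^2 + d_i x_i, plus a constant e; the symbol names, the function name `standard_form_to_params` and the example values are assumptions made for illustration, not notation taken from the cited algorithms.

```python
import math

def standard_form_to_params(D_diag, b, d, e):
    """Map sum_i [D_i (x_i + b_i)^2 + d_i x_i] + e, with x_i independent
    standard normals, to (w, k, lam, m, s) of the linear-sum description.

    For D_i != 0, completing the square gives
        D_i (x_i + b_i)^2 + d_i x_i
          = D_i (x_i + b_i + d_i / (2 D_i))^2 - d_i b_i - d_i^2 / (4 D_i),
    a weighted noncentral chi-square with 1 degree of freedom plus a constant.
    For D_i == 0 the component is purely normal with variance d_i^2.
    """
    w, k, lam = [], [], []
    m, s2 = e, 0.0
    for D_i, b_i, d_i in zip(D_diag, b, d):
        if D_i != 0.0:
            shift = b_i + d_i / (2.0 * D_i)
            w.append(D_i)
            k.append(1)
            lam.append(shift ** 2)
            m += -d_i * b_i - d_i ** 2 / (4.0 * D_i)
        else:
            s2 += d_i ** 2
    return w, k, lam, m, math.sqrt(s2)

# Hypothetical example: one positive weight, one negative, one linear-only term.
w, k, lam, m, s = standard_form_to_params(
    D_diag=[2.0, -1.0, 0.0], b=[1.0, 0.0, 5.0], d=[0.0, 2.0, 3.0], e=1.0)

# Mean implied by the linear-sum parameters: sum_i w_i (k_i + lam_i) + m.
mean_from_params = sum(w_i * (k_i + l_i) for w_i, k_i, l_i in zip(w, k, lam)) + m
```

Components with equal weights could be merged afterwards by adding their degrees of freedom and non-centralities; the sketch leaves them separate for clarity.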
Computing the pdf/cdf/inverse cdf/random numbers
The probability density, cumulative distribution, and inverse cumulative distribution functions of a generalized chi-squared variable do not have simple closed-form expressions. However, numerical algorithms and computer code (Fortran and C, MATLAB, Python) have been published to evaluate some of these, and to generate random samples.
In the case where
, it is possible to obtain an exact expression for the mean and variance of
, as shown in the article on
quadratic forms.
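For reference, the standard moment formulas for a quadratic form of a normal vector can be evaluated in a few lines. The sketch below assumes the quadratic is written as x' Q2 x + q1' x + q0 with a symmetric matrix Q2, and that x is normal with mean mu and covariance Sigma; these symbol names, the helper functions and the example values are illustrative assumptions. It uses mean = tr(Q2 Sigma) + mu' Q2 mu + q1' mu + q0 and variance = 2 tr((Q2 Sigma)^2) + g' Sigma g with g = 2 Q2 mu + q1.

```python
def matvec(A, v):
    """Matrix-vector product for lists of lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def matmul(A, B):
    """Matrix-matrix product for lists of lists."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def quadratic_form_moments(Q2, q1, q0, mu, Sigma):
    """Mean and variance of x' Q2 x + q1' x + q0 for x ~ N(mu, Sigma).
    Q2 is assumed symmetric."""
    QS = matmul(Q2, Sigma)
    mean = trace(QS) + dot(mu, matvec(Q2, mu)) + dot(q1, mu) + q0
    g = [2.0 * a + b for a, b in zip(matvec(Q2, mu), q1)]  # gradient at mu
    var = 2.0 * trace(matmul(QS, QS)) + dot(g, matvec(Sigma, g))
    return mean, var

# Sanity checks against known cases:
# sum of squares of two standard normals is chi-square with 2 degrees of freedom.
chi2_mean, chi2_var = quadratic_form_moments(
    [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], 0.0, [0.0, 0.0],
    [[1.0, 0.0], [0.0, 1.0]])
# square of N(3, 1) is noncentral chi-square with k = 1, noncentrality 9.
nc_mean, nc_var = quadratic_form_moments([[1.0]], [0.0], 0.0, [3.0], [[1.0]])
```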
Applications
The generalized chi-squared is the distribution of
statistical estimates in cases where the usual
statistical theory does not hold, as in the examples below.
In model fitting and selection
If a predictive model is fitted by least squares, but the residuals have either autocorrelation or heteroscedasticity, then alternative models can be compared (in model selection) by relating changes in the sum of squares to an asymptotically valid generalized chi-squared distribution. [Jones, D.A. (1983) "Statistical analysis of empirical models fitted by optimisation", ''Biometrika'', 70 (1), 67–88]
Classifying normal vectors using Gaussian discriminant analysis
If $\boldsymbol{x}$ is a normal vector, its log likelihood is a quadratic form of $\boldsymbol{x}$, and is hence distributed as a generalized chi-squared. The log likelihood ratio that $\boldsymbol{x}$ arises from one normal distribution versus another is also a quadratic form, so distributed as a generalized chi-squared.
In Gaussian discriminant analysis, samples from multinormal distributions are optimally separated by using a quadratic classifier, a boundary that is a quadratic function (e.g. the curve defined by setting the likelihood ratio between two Gaussians to 1). The classification error rates of different types (false positives and false negatives) are integrals of the normal distributions within the quadratic regions defined by this classifier. Since this is mathematically equivalent to integrating a quadratic form of a normal vector, each error rate is an integral of the density of a generalized chi-squared variable, that is, a tail probability of that variable.
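A one-dimensional case makes the quadratic structure concrete. The sketch below writes the log likelihood ratio between two univariate normals as q2 x^2 + q1 x + q0 and estimates one error rate by Monte Carlo as the probability that this quadratic is negative for samples from the first class. The class labels a and b, the function names and the example parameters are assumptions chosen for illustration; published code for the distribution would evaluate this probability numerically rather than by sampling.

```python
import math
import random

def log_normal_pdf(x, mu, sigma):
    """Log density of a univariate normal with mean mu and std sigma."""
    return (-0.5 * ((x - mu) / sigma) ** 2
            - math.log(sigma) - 0.5 * math.log(2.0 * math.pi))

def llr_coefficients(mu_a, sigma_a, mu_b, sigma_b):
    """Coefficients (q2, q1, q0) such that
    log p_a(x) - log p_b(x) = q2 * x**2 + q1 * x + q0."""
    q2 = 0.5 / sigma_b ** 2 - 0.5 / sigma_a ** 2
    q1 = mu_a / sigma_a ** 2 - mu_b / sigma_b ** 2
    q0 = (0.5 * mu_b ** 2 / sigma_b ** 2 - 0.5 * mu_a ** 2 / sigma_a ** 2
          + math.log(sigma_b / sigma_a))
    return q2, q1, q0

def miss_rate(mu_a, sigma_a, mu_b, sigma_b, n, rng):
    """Monte Carlo estimate of P(log likelihood ratio < 0) when x is drawn
    from class a, i.e. the rate at which class-a samples are assigned to b."""
    q2, q1, q0 = llr_coefficients(mu_a, sigma_a, mu_b, sigma_b)
    misses = 0
    for _ in range(n):
        x = rng.gauss(mu_a, sigma_a)
        if q2 * x * x + q1 * x + q0 < 0.0:
            misses += 1
    return misses / n

# Hypothetical classes with different means and variances.
q2, q1, q0 = llr_coefficients(0.0, 1.0, 2.0, 1.5)
rate = miss_rate(0.0, 1.0, 2.0, 1.5, 20000, random.Random(1))
```

With equal variances the quadratic term vanishes and the boundary reduces to a single threshold, which gives a simple check against the normal tail probability.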
In signal processing
The following application arises in the context of Fourier analysis in signal processing, renewal theory in probability theory, and multi-antenna systems in wireless communication. The common factor of these areas is that the sum of exponentially distributed variables is of importance (or identically, the sum of squared magnitudes of circularly-symmetric centered complex Gaussian variables).
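If $Z_i$ are ''k'' independent, circularly-symmetric centered complex Gaussian random variables with mean 0 and variance $\sigma_i^2$, then the random variable
:$\tilde{Q} = \sum_{i=1}^k |Z_i|^2$
has a generalized chi-squared distribution of a particular form. The difference from the standard chi-squared distribution is that $Z_i$ are complex and can have different variances, and the difference from the more general generalized chi-squared distribution is that the relevant scaling matrix ''A'' is diagonal. If $\mu = \sigma_i^2$ for all ''i'', then $\tilde{Q}$, scaled down by $\mu/2$ (i.e. multiplied by $2/\mu$), has a chi-squared distribution, $\chi^2(2k)$, also known as an Erlang distribution. If $\sigma_i^2$ have distinct values for all ''i'', then $\tilde{Q}$ has the pdf
:$f(x; k, \sigma_1^2, \ldots, \sigma_k^2) = \sum_{i=1}^k \frac{e^{-x/\sigma_i^2}}{\sigma_i^2 \prod_{j=1, j \neq i}^k \left(1 - \frac{\sigma_j^2}{\sigma_i^2}\right)} \quad \text{for } x \geq 0.$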
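As a rough illustration of the distinct-variance case, the sketch below evaluates the density of a sum of independent exponential variables with distinct means, which is what the squared magnitudes reduce to. It uses the standard formula for that sum, with each term an exponential divided by its variance times a product of factors (1 - variance_j / variance_i). The function names and the example variances are illustrative assumptions.

```python
import math

def pdf_distinct(x, variances):
    """Density at x of a sum of independent exponential variables whose means
    are the given variances; all variances are assumed pairwise distinct."""
    if x < 0.0:
        return 0.0
    total = 0.0
    for i, v_i in enumerate(variances):
        denom = v_i
        for j, v_j in enumerate(variances):
            if j != i:
                denom *= 1.0 - v_j / v_i
        total += math.exp(-x / v_i) / denom
    return total

def trapezoid_mass(variances, upper=60.0, step=0.01):
    """Trapezoid-rule integral of the density over [0, upper], used only as a
    numerical check that the density integrates to roughly 1."""
    n = int(round(upper / step))
    mass = 0.5 * (pdf_distinct(0.0, variances) + pdf_distinct(upper, variances))
    for t in range(1, n):
        mass += pdf_distinct(t * step, variances)
    return mass * step

# Hypothetical example with three distinct variances.
example_variances = [1.0, 2.0, 0.5]
mass = trapezoid_mass(example_variances)
```

The formula divides by differences of the variances, so it becomes numerically unstable when two variances are nearly equal and undefined when they coincide.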
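If there are sets of repeated variances among $\sigma_i^2$, assume that they are divided into ''M'' sets, each representing a certain variance value. Denote $\mathbf{r} = (r_1, r_2, \ldots, r_M)$ to be the number of repetitions in each group. That is, the ''m''th set contains $r_m$ variables that have variance $\sigma_m^2$. It represents an arbitrary linear combination of independent $\chi^2$-distributed random variables with different degrees of freedom:
:$\tilde{Q} = \sum_{m=1}^M \frac{\sigma_m^2}{2} Q_m, \quad Q_m \sim \chi^2(2 r_m).$
The pdf of $\tilde{Q}$ is [E. Björnson, D. Hammarwall, B. Ottersten (2009) "Exploiting Quantized Channel Norm Feedback through Conditional Statistics in Arbitrarily Correlated MIMO Systems", ''IEEE Transactions on Signal Processing'', 57, 4027–4041]
:$f(x; \mathbf{r}, \sigma_1^2, \ldots, \sigma_M^2) = \prod_{m=1}^M \frac{1}{\sigma_m^{2 r_m}} \sum_{k=1}^M \sum_{l=1}^{r_k} \frac{\Psi_{k,l,\mathbf{r}}}{(r_k - l)!} (-x)^{r_k - l} e^{-x/\sigma_k^2} \quad \text{for } x \geq 0,$
where
:$\Psi_{k,l,\mathbf{r}} = (-1)^{r_k - 1} \sum_{\mathbf{i} \in \Omega_{k,l}} \prod_{j \neq k} \binom{i_j + r_j - 1}{i_j} \left(\frac{1}{\sigma_j^2} - \frac{1}{\sigma_k^2}\right)^{-(r_j + i_j)},$
with $\mathbf{i} = [i_1, \ldots, i_M]^\mathsf{T}$ from the set $\Omega_{k,l}$ of all partitions of $l - 1$ (with $i_k = 0$) defined as
:$\Omega_{k,l} = \left\{ [i_1, \ldots, i_M] \in \mathbb{Z}^M ; \ \sum_{j=1}^M i_j = l - 1, \ i_k = 0, \ i_j \geq 0 \text{ for all } j \right\}.$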
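The repeated-variance density can be evaluated by enumerating the index vectors directly. The following sketch implements the double sum over groups k and orders l, with the coefficient computed as a sum over all non-negative integer vectors that add up to l - 1 and have a zero in position k. The function names, the 0-based group indexing and the example values are illustrative assumptions; the enumeration is exponential in the number of groups, so this is meant for small cases only.

```python
import math

def _compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 0:
        if total == 0:
            yield ()
        return
    for first in range(total + 1):
        for rest in _compositions(total - first, parts - 1):
            yield (first,) + rest

def psi(k, l, r, var):
    """Coefficient for group k (0-based) and order l (1-based); r holds the
    repetition counts and var the distinct variance values."""
    others = [j for j in range(len(r)) if j != k]
    acc = 0.0
    for idx in _compositions(l - 1, len(others)):
        term = 1.0
        for j, i_j in zip(others, idx):
            term *= (math.comb(i_j + r[j] - 1, i_j)
                     * (1.0 / var[j] - 1.0 / var[k]) ** (-(r[j] + i_j)))
        acc += term
    return (-1) ** (r[k] - 1) * acc

def pdf_repeated(x, r, var):
    """Density at x of a sum of exponentials in which variance var[m]
    occurs r[m] times; the values in var are assumed pairwise distinct."""
    if x < 0.0:
        return 0.0
    prefactor = 1.0
    for r_m, v_m in zip(r, var):
        prefactor /= v_m ** r_m
    total = 0.0
    for k in range(len(r)):
        for l in range(1, r[k] + 1):
            total += (psi(k, l, r, var) / math.factorial(r[k] - l)
                      * (-x) ** (r[k] - l) * math.exp(-x / var[k]))
    return prefactor * total

# Hypothetical example: variance 1.0 repeated twice, variance 2.0 once.
value = pdf_repeated(1.3, [2, 1], [1.0, 2.0])
```

With a single group the sum collapses to the Erlang density, and with all repetition counts equal to one it reduces to the distinct-variance formula, which gives two convenient consistency checks.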
See also
* Degrees of freedom (statistics)#Alternative
* Noncentral chi-squared distribution
* Chi-squared distribution
References
External links
* Davies, R.B.: Fortran and C source code for "Linear combination of chi-squared random variables"
* Das, A.: MATLAB code to compute the statistics, pdf, cdf, inverse cdf and random numbers of the generalized chi-square distribution