In probability and statistics, a compound probability distribution (also known as a mixture distribution or contagious distribution) is the probability distribution that results from assuming that a random variable is distributed according to some parametrized distribution, with (some of) the parameters of that distribution themselves being random variables. If the parameter is a scale parameter, the resulting mixture is also called a scale mixture.
The compound distribution ("unconditional distribution") is the result of marginalizing (integrating) over the ''latent'' random variable(s) representing the parameter(s) of the parametrized distribution ("conditional distribution").
Definition
A compound probability distribution is the probability distribution that results from assuming that a random variable <math>X</math> is distributed according to some parametrized distribution <math>F</math> with an unknown parameter <math>\theta</math> that is again distributed according to some other distribution <math>G</math>. The resulting distribution <math>H</math> is said to be the distribution that results from compounding <math>F</math> with <math>G</math>. The parameter's distribution <math>G</math> is also called the mixing distribution or latent distribution. Technically, the ''unconditional'' distribution <math>H</math> results from ''marginalizing'' over <math>G</math>, i.e., from integrating out the unknown parameter(s) <math>\theta</math>. Its probability density function is given by:
:<math>p_H(x) = \int p_F(x \mid \theta)\, p_G(\theta)\, d\theta</math>
The same formula applies analogously if some or all of the variables are vectors.
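From the above formula, one can see that a compound distribution is essentially a special case of a marginal distribution: the ''joint distribution'' of <math>x</math> and <math>\theta</math> is given by <math>p(x, \theta) = p_F(x \mid \theta)\, p_G(\theta)</math>, and the compound results as its marginal distribution: <math>p_H(x) = \int p(x, \theta)\, d\theta</math>.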
If the domain of <math>\theta</math> is discrete, then the distribution is again a special case of a mixture distribution.
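The marginalization integral in the definition can be approximated numerically when no closed form is at hand. The following is a minimal sketch, not a reference implementation: it assumes a normal conditional distribution whose mean is itself normally distributed, uses a plain midpoint rule, and compares the result with the known normal closed form. The helper names and the parameter values `MU`, `SIGMA2` and `TAU2` are illustrative assumptions.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def compound_pdf(x, cond_pdf, mix_pdf, lo, hi, steps=20000):
    """Approximate the integral of cond_pdf(x, theta) * mix_pdf(theta)
    over theta in [lo, hi] with the midpoint rule."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        theta = lo + (i + 0.5) * h
        total += cond_pdf(x, theta) * mix_pdf(theta)
    return total * h

# Assumed illustrative values: mean and variance of the mixing distribution,
# and the variance of the conditional distribution.
MU, SIGMA2, TAU2 = 1.0, 4.0, 2.25

def cond(x, theta):
    """Conditional density: normal with mean theta and variance TAU2."""
    return normal_pdf(x, theta, TAU2)

def mix(theta):
    """Mixing density: normal with mean MU and variance SIGMA2."""
    return normal_pdf(theta, MU, SIGMA2)

x0 = 0.3
half_width = 12.0 * math.sqrt(SIGMA2)   # integrate over MU +/- 12 standard deviations
numeric = compound_pdf(x0, cond, mix, MU - half_width, MU + half_width)
closed_form = normal_pdf(x0, MU, TAU2 + SIGMA2)
print(numeric, closed_form)
```

The two printed values should agree to many decimal places, because a normal distribution compounded over a normally distributed mean is again normal, with the two variances added.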
Properties
The compound distribution <math>H</math> will depend on the specific expression of each distribution, as well as which parameter of <math>F</math> is distributed according to the distribution <math>G</math>, and the parameters of <math>H</math> will include any parameters of <math>G</math> that are not marginalized, or integrated, out.
The support of <math>H</math> is the same as that of <math>F</math>, and if the latter is a two-parameter distribution parameterized with the mean and variance, some general properties exist.
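The compound distribution's first two moments are given by the law of total expectation and the law of total variance:
:<math>\operatorname{E}_H[X] = \operatorname{E}_G\bigl[\operatorname{E}_F[X \mid \theta]\bigr]</math>
:<math>\operatorname{Var}_H(X) = \operatorname{E}_G\bigl[\operatorname{Var}_F(X \mid \theta)\bigr] + \operatorname{Var}_G\bigl(\operatorname{E}_F[X \mid \theta]\bigr)</math>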
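If the mean of <math>F</math> is distributed as <math>G</math>, which in turn has mean <math>\mu</math> and variance <math>\sigma^2</math>, the expressions above imply <math>\operatorname{E}_H[X] = \operatorname{E}_G[\theta] = \mu</math> and <math>\operatorname{Var}_H(X) = \operatorname{Var}_F(X \mid \theta) + \operatorname{Var}_G(\theta) = \tau^2 + \sigma^2</math>, where <math>\tau^2</math> is the variance of <math>F</math> (taken not to depend on <math>\theta</math>).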
Proof
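Let <math>F</math> and <math>G</math> be probability distributions parameterized by mean and variance as
:<math>X \mid \theta \sim F(\theta, \tau^2), \qquad \theta \sim G(\mu, \sigma^2).</math>
Denote the probability density functions by <math>f(x \mid \theta) = p_F(x \mid \theta)</math> and <math>g(\theta) = p_G(\theta)</math> respectively, and let <math>h(x)</math> be the probability density of <math>H</math>. Interchanging the order of integration, we have
:<math>\operatorname{E}_H[X] = \int x\, h(x)\, dx = \int x \int f(x \mid \theta)\, g(\theta)\, d\theta\, dx = \int \left( \int x\, f(x \mid \theta)\, dx \right) g(\theta)\, d\theta = \int \operatorname{E}_F[X \mid \theta]\, g(\theta)\, d\theta,</math>
and from the parameterization of <math>F</math> and <math>G</math> we have
:<math>\operatorname{E}_F[X \mid \theta] = \int x\, f(x \mid \theta)\, dx = \theta, \qquad \operatorname{E}_G[\theta] = \int \theta\, g(\theta)\, d\theta = \mu,</math>
and therefore the mean of the compound distribution is <math>\operatorname{E}_H[X] = \mu</math>, as per the expression for its first moment above.
The variance of <math>H</math> is given by <math>\operatorname{E}_H[X^2] - (\operatorname{E}_H[X])^2</math>, and
:<math>\operatorname{E}_H[X^2] = \int x^2\, h(x)\, dx = \int g(\theta) \int x^2 f(x \mid \theta)\, dx\, d\theta = \int g(\theta)\, (\tau^2 + \theta^2)\, d\theta = \tau^2 + \sigma^2 + \mu^2,</math>
given the fact that <math>\int x^2 f(x \mid \theta)\, dx = \operatorname{E}_F[X^2 \mid \theta] = \operatorname{Var}_F(X \mid \theta) + (\operatorname{E}_F[X \mid \theta])^2 = \tau^2 + \theta^2</math> and <math>\int \theta^2 g(\theta)\, d\theta = \operatorname{E}_G[\theta^2] = \operatorname{Var}_G(\theta) + (\operatorname{E}_G[\theta])^2 = \sigma^2 + \mu^2</math>. Finally we get
:<math>\operatorname{Var}_H(X) = \operatorname{E}_H[X^2] - (\operatorname{E}_H[X])^2 = \tau^2 + \sigma^2 + \mu^2 - \mu^2 = \tau^2 + \sigma^2.</math>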
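The moment formulas of this section can be checked by simulation. The sketch below is only an illustration under assumed choices: the mixing distribution is taken to be uniform on an interval (so that the result does not rely on normality of the mixing distribution), the conditional distribution is normal with a fixed standard deviation, and the seed, sample size and parameter values are arbitrary.

```python
import random

def sample_compound(n, lo, hi, tau, rng):
    """Draw n values of X where theta ~ Uniform(lo, hi) and
    X given theta is normal with mean theta and standard deviation tau."""
    draws = []
    for _ in range(n):
        theta = rng.uniform(lo, hi)           # draw the parameter from the mixing distribution
        draws.append(rng.gauss(theta, tau))   # draw X from the conditional distribution
    return draws

rng = random.Random(12345)
A, B, TAU = 2.0, 8.0, 1.5                     # assumed illustrative values
xs = sample_compound(200_000, A, B, TAU, rng)

mu = (A + B) / 2.0                            # mean of the uniform mixing distribution
sigma2 = (B - A) ** 2 / 12.0                  # variance of the uniform mixing distribution

sample_mean = sum(xs) / len(xs)
sample_var = sum((x - sample_mean) ** 2 for x in xs) / len(xs)
print("mean:    ", sample_mean, "expected", mu)
print("variance:", sample_var, "expected", TAU ** 2 + sigma2)
```

Up to Monte Carlo error, the sample mean should be close to the mean of the mixing distribution, and the sample variance close to the sum of the conditional variance and the variance of the mixing distribution.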
Applications
Testing
Distributions of common test statistics result as compound distributions under their null hypothesis, for example in Student's t-test (where the test statistic results as the ratio of a normal random variable and the square root of a scaled chi-squared random variable), or in the F-test (where the test statistic is the ratio of two scaled chi-squared random variables).
Overdispersion modeling
Compound distributions are useful for modeling outcomes exhibiting overdispersion, i.e., a greater amount of variability than would be expected under a certain model. For example, count data are commonly modeled using the Poisson distribution, whose variance is equal to its mean. The distribution may be generalized by allowing for variability in its rate parameter, implemented via a gamma distribution, which results in a marginal negative binomial distribution. This distribution is similar in its shape to the Poisson distribution, but it allows for larger variances. Similarly, a binomial distribution may be generalized to allow for additional variability by compounding it with a beta distribution for its success probability parameter, which results in a beta-binomial distribution.
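The gamma-Poisson case can be verified numerically. The following sketch integrates the Poisson probability mass function against a gamma density with a midpoint rule and compares the result with the negative binomial formula. The shape and rate values, the integration limit and the function names are assumptions made for illustration only.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of k events at rate lam (computed on the log scale)."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def gamma_pdf(lam, shape, rate):
    """Gamma density in the shape/rate parameterization."""
    return math.exp(shape * math.log(rate) + (shape - 1.0) * math.log(lam)
                    - rate * lam - math.lgamma(shape))

def mixed_poisson_pmf(k, shape, rate, upper=100.0, steps=20000):
    """Midpoint-rule approximation of the Poisson pmf averaged over a
    gamma-distributed rate parameter."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        total += poisson_pmf(k, lam) * gamma_pdf(lam, shape, rate)
    return total * h

def negative_binomial_pmf(k, r, p):
    """Probability of k failures before the r-th success, success probability p."""
    return math.exp(math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
                    + r * math.log(p) + k * math.log(1.0 - p))

SHAPE, RATE = 3.0, 0.5                  # assumed illustrative values
P = RATE / (1.0 + RATE)                 # implied negative binomial success probability
for k in (0, 2, 5, 10):
    print(k, mixed_poisson_pmf(k, SHAPE, RATE), negative_binomial_pmf(k, SHAPE, P))

mean = SHAPE / RATE                     # mean of the rate, hence of the counts
variance = mean + SHAPE / RATE ** 2     # law of total variance: E[rate] + Var(rate)
print("mean", mean, "variance", variance)
```

With these values the marginal variance exceeds the mean, which is the overdispersion that a plain Poisson model cannot represent.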
Bayesian inference
Marginal distributions in general may be seen as special cases of compound distributions. In Bayesian inference specifically, compound distributions arise when, in the notation above, ''F'' represents the distribution of future observations and ''G'' is the posterior distribution of the parameters of ''F'', given the information in a set of observed data. This gives a posterior predictive distribution. Correspondingly, for the prior predictive distribution, ''F'' is the distribution of a new data point while ''G'' is the prior distribution of the parameters.
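As an illustration of prior and posterior predictive distributions, the sketch below uses the conjugate beta-binomial pair, for which the compound distribution is available in closed form. The prior, the observed counts and the number of future trials are assumed values chosen only for the example.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(k, m, a, b):
    """Probability of k successes in m trials when the success probability
    follows a Beta(a, b) distribution."""
    log_choose = math.lgamma(m + 1) - math.lgamma(k + 1) - math.lgamma(m - k + 1)
    return math.exp(log_choose + log_beta(k + a, m - k + b) - log_beta(a, b))

# Assumed illustrative setup: Beta(1, 1) prior, 7 successes observed in 10 trials.
PRIOR_A, PRIOR_B = 1.0, 1.0
SUCCESSES, TRIALS = 7, 10
post_a = PRIOR_A + SUCCESSES              # conjugate update of the beta parameters
post_b = PRIOR_B + TRIALS - SUCCESSES

M = 5                                     # number of future trials to predict
prior_predictive = [beta_binomial_pmf(k, M, PRIOR_A, PRIOR_B) for k in range(M + 1)]
posterior_predictive = [beta_binomial_pmf(k, M, post_a, post_b) for k in range(M + 1)]

print("prior predictive:    ", prior_predictive)
print("posterior predictive:", posterior_predictive)
```

Here the mixing distribution is the prior in the first case and the posterior in the second; the conditional distribution (binomial with `M` trials) is the same in both.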
Convolution
Convolution of probability distributions (to derive the probability distribution of sums of random variables) may also be seen as a special case of compounding; here the sum's distribution essentially results from considering one summand as a random location parameter for the other summand.
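A small discrete sketch of this view, assuming two independent fair dice as the summands: one die acts as a random shift (location parameter) applied to the distribution of the other, and summing over the shifts gives the distribution of the total. The function name is hypothetical.

```python
from fractions import Fraction

def compound_by_location(base_pmf, shift_pmf):
    """Distribution of X + Y for independent X and Y, treating Y as a random
    location parameter: p(z) is the sum over y of p_X(z - y) * p_Y(y)."""
    result = {}
    for y, p_y in shift_pmf.items():        # mixing distribution over the shift
        for x, p_x in base_pmf.items():     # conditional distribution, shifted by y
            result[x + y] = result.get(x + y, Fraction(0)) + p_x * p_y
    return result

die = {face: Fraction(1, 6) for face in range(1, 7)}
two_dice = compound_by_location(die, die)
print(two_dice[7])   # 1/6
```

Exact fractions are used so that the result can be compared with the familiar two-dice probabilities without rounding.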
Computation
Compound distributions derived from exponential family distributions often have a closed form.
If analytical integration is not possible, numerical methods may be necessary.
Compound distributions may relatively easily be investigated using Monte Carlo methods, i.e., by generating random samples. It is often easy to generate random numbers from the distributions <math>p_G(\theta)</math> as well as <math>p_F(x \mid \theta)</math> and then utilize these to perform ''collapsed Gibbs sampling'' to generate samples from <math>p_H(x)</math>.
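The simplest way to use the two samplers is plain two-stage (ancestral) sampling: draw the parameter first, then draw the observation given that parameter. This is a simplification of the scheme named above, not an implementation of a Gibbs sampler. The sketch assumes a normal distribution whose precision is gamma distributed, which marginally gives a Student's t-distribution; the degrees of freedom, seed and sample size are arbitrary illustrative choices.

```python
import random

def sample_student_t(n, dof, rng):
    """Two-stage sampling: draw a precision from a gamma distribution, then a
    zero-mean normal variate with that precision. Marginally the draws follow
    a Student's t-distribution with `dof` degrees of freedom."""
    draws = []
    for _ in range(n):
        # random.gammavariate takes (shape, scale); a rate of dof/2 is a scale of 2/dof.
        precision = rng.gammavariate(dof / 2.0, 2.0 / dof)
        draws.append(rng.gauss(0.0, precision ** -0.5))
    return draws

rng = random.Random(2024)
DOF = 10.0                                  # assumed illustrative value
xs = sample_student_t(200_000, DOF, rng)

sample_mean = sum(xs) / len(xs)
sample_var = sum((x - sample_mean) ** 2 for x in xs) / len(xs)
print("mean:    ", sample_mean, "expected 0")
print("variance:", sample_var, "expected", DOF / (DOF - 2.0))
```

The sample variance should be close to the t-distribution variance (degrees of freedom divided by degrees of freedom minus two), which is larger than the unit variance of the underlying standard normal.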
A compound distribution may usually also be approximated to a sufficient degree by a mixture distribution using a finite number of mixture components, which allows one to derive an approximate density, distribution function, etc.
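One possible way to build such a finite mixture is to place equally weighted components at quantiles of the mixing distribution. The sketch below does this for a normal scale mixture with exponentially distributed variance, which corresponds to a Laplace distribution. The choice of mid-quantile nodes and the number of components are assumptions made for illustration, not a recommended scheme.

```python
import math

def normal_pdf(x, var):
    """Density of a zero-mean normal distribution with variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def finite_mixture_pdf(x, b, components=500):
    """Approximate a Laplace(0, b) density by an equally weighted mixture of
    zero-mean normals whose variances are mid-quantiles of an exponential
    distribution with mean 2 * b**2."""
    mean_var = 2.0 * b * b
    total = 0.0
    for i in range(components):
        u = (i + 0.5) / components
        var = -mean_var * math.log(1.0 - u)   # exponential quantile function
        total += normal_pdf(x, var)
    return total / components

def laplace_pdf(x, b):
    """Exact Laplace density with location 0 and scale b."""
    return math.exp(-abs(x) / b) / (2.0 * b)

for x in (0.5, 1.0, 3.0):
    print(x, finite_mixture_pdf(x, 1.0), laplace_pdf(x, 1.0))
```

The approximation is least accurate near zero, where the Laplace density has a kink that no finite mixture of normals can reproduce exactly.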
Parameter estimation (maximum-likelihood or maximum-a-posteriori estimation) within a compound distribution model may sometimes be simplified by utilizing the EM-algorithm.
Examples
* Gaussian scale mixtures:
** Compounding a normal distribution with variance distributed according to an inverse gamma distribution (or equivalently, with precision distributed as a gamma distribution) yields a non-standardized Student's t-distribution. This distribution has the same symmetrical shape as a normal distribution with the same central point, but has greater variance and heavy tails.
** Compounding a Gaussian (or normal) distribution with variance distributed according to an exponential distribution (or with standard deviation according to a Rayleigh distribution) yields a Laplace distribution. More generally, compounding a Gaussian (or normal) distribution with variance distributed according to a gamma distribution yields a variance-gamma distribution.
** Compounding a Gaussian distribution with variance distributed according to an exponential distribution whose rate parameter is itself distributed according to a gamma distribution yields a normal-exponential-gamma distribution. (This involves two compounding stages. The variance itself then follows a Lomax distribution; see below.)
** Compounding a Gaussian distribution with standard deviation distributed according to a (standard) inverse uniform distribution yields a slash distribution.
* Other Gaussian mixtures:
** Compounding a Gaussian distribution with mean distributed according to another Gaussian distribution yields (again) a Gaussian distribution.
** Compounding a Gaussian distribution with mean distributed according to a shifted exponential distribution yields an exponentially modified Gaussian distribution.
* Compounding a Bernoulli distribution with probability of success <math>p</math> distributed according to a distribution that has a defined expected value yields a Bernoulli distribution with success probability <math>\operatorname{E}[p]</math>.
* Compounding a binomial distribution with probability of success distributed according to a beta distribution yields a beta-binomial distribution. It possesses three parameters: a parameter <math>n</math> (number of trials) from the binomial distribution and the shape parameters <math>\alpha</math> and <math>\beta</math> from the beta distribution.
* Compounding a multinomial distribution with probability vector distributed according to a Dirichlet distribution yields a Dirichlet-multinomial distribution.
* Compounding a Poisson distribution with rate parameter distributed according to a gamma distribution yields a negative binomial distribution.
* Compounding a Poisson distribution with rate parameter distributed according to an exponential distribution yields a geometric distribution.
* Compounding an exponential distribution with its rate parameter distributed according to a gamma distribution yields a Lomax distribution.
* Compounding a gamma distribution with inverse scale parameter distributed according to another gamma distribution yields a three-parameter beta prime distribution.
* Compounding a half-normal distribution with its scale parameter distributed according to a Rayleigh distribution yields an exponential distribution. This follows immediately from the Laplace distribution resulting as a normal scale mixture; see above. The roles of conditional and mixing distributions may also be exchanged here; consequently, compounding a Rayleigh distribution with its scale parameter distributed according to a half-normal distribution ''also'' yields an exponential distribution.
* A Gamma(k=2, θ)-distributed random variable whose scale parameter θ is in turn uniformly distributed marginally follows an exponential distribution.
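As a worked instance of the entries above, the exponential-gamma case can be integrated in closed form. This is a sketch under an assumed parameterization: the rate <math>\lambda</math> of the exponential distribution is taken to be gamma distributed with shape <math>\alpha</math> and rate <math>\beta</math> (symbols introduced here for illustration only). Then, for <math>x \ge 0</math>,
:<math>p(x) = \int_0^\infty \lambda e^{-\lambda x}\, \frac{\beta^\alpha}{\Gamma(\alpha)}\, \lambda^{\alpha-1} e^{-\beta\lambda}\, d\lambda = \frac{\beta^\alpha}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha+1)}{(x+\beta)^{\alpha+1}} = \frac{\alpha}{\beta} \left(1 + \frac{x}{\beta}\right)^{-(\alpha+1)},</math>
which is the density of a Lomax distribution with shape <math>\alpha</math> and scale <math>\beta</math>.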
Similar terms
The notion of "compound distribution" as used e.g. in the definition of a compound Poisson distribution or compound Poisson process is different from the definition found in this article. The meaning in this article corresponds to what is used in e.g. Bayesian hierarchical modeling.
The special case of compound probability distributions where the parametrized distribution ''F'' is the Poisson distribution is also called a mixed Poisson distribution.
See also
* Mixture distribution
* Mixed Poisson distribution
* Bayesian hierarchical modeling
* Marginal distribution
* Conditional distribution
* Joint distribution
* Convolution
* Overdispersion
* EM-algorithm
Further reading
* Johnson, N. L.; Kemp, A. W.; Kotz, S. (2005). "8 ''Mixture distributions''". ''Univariate discrete distributions''. New York: Wiley. ISBN 978-0-471-27246-5.