In statistics, the generalized Pareto distribution (GPD) is a family of continuous probability distributions. It is often used to model the tails of another distribution. It is specified by three parameters: location <math>\mu</math>, scale <math>\sigma</math>, and shape <math>\xi</math>. Sometimes it is specified by only scale and shape and sometimes only by its shape parameter. Some references give the shape parameter as <math>\kappa = -\xi</math>.
Definition
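The standard cumulative distribution function (cdf) of the GPD is defined by
: <math>F_{\xi}(z) = \begin{cases} 1 - (1 + \xi z)^{-1/\xi} & \text{for } \xi \neq 0, \\ 1 - e^{-z} & \text{for } \xi = 0, \end{cases}</math>
where the support is <math>z \geq 0</math> for <math>\xi \geq 0</math> and <math>0 \leq z \leq -1/\xi</math> for <math>\xi < 0</math>. The corresponding probability density function (pdf) is
: <math>f_{\xi}(z) = \begin{cases} (1 + \xi z)^{-\frac{\xi + 1}{\xi}} & \text{for } \xi \neq 0, \\ e^{-z} & \text{for } \xi = 0. \end{cases}</math>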
Characterization
The related location-scale family of distributions is obtained by replacing the argument ''z'' by <math>\frac{x - \mu}{\sigma}</math> and adjusting the support accordingly.
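The cumulative distribution function of <math>X \sim GPD(\mu, \sigma, \xi)</math> (<math>\mu \in \mathbb{R}</math>, <math>\sigma > 0</math>, and <math>\xi \in \mathbb{R}</math>) is
: <math>F_{(\mu,\sigma,\xi)}(x) = \begin{cases} 1 - \left(1 + \frac{\xi (x - \mu)}{\sigma}\right)^{-1/\xi} & \text{for } \xi \neq 0, \\ 1 - \exp\left(-\frac{x - \mu}{\sigma}\right) & \text{for } \xi = 0, \end{cases}</math>
where the support of <math>X</math> is <math>x \geq \mu</math> when <math>\xi \geq 0</math>, and <math>\mu \leq x \leq \mu - \sigma/\xi</math> when <math>\xi < 0</math>.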
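The probability density function (pdf) of <math>X \sim GPD(\mu, \sigma, \xi)</math> is
: <math>f_{(\mu,\sigma,\xi)}(x) = \frac{1}{\sigma}\left(1 + \frac{\xi (x - \mu)}{\sigma}\right)^{-\left(\frac{1}{\xi} + 1\right)},</math>
again, for <math>x \geq \mu</math> when <math>\xi \geq 0</math>, and <math>\mu \leq x \leq \mu - \sigma/\xi</math> when <math>\xi < 0</math>. For <math>\xi = 0</math> the expression is read as its limit, <math>\frac{1}{\sigma} \exp\left(-\frac{x - \mu}{\sigma}\right)</math>.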
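The cdf and pdf above translate directly into code. The following is a minimal sketch using only the Python standard library; the function names `gpd_cdf` and `gpd_pdf` and their default arguments are illustrative choices, not part of any particular library.

```python
import math


def gpd_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """CDF of GPD(mu, sigma, xi) evaluated at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if z <= 0:
        return 0.0                      # at or below the lower endpoint mu
    if xi == 0:
        return 1.0 - math.exp(-z)       # exponential limit
    t = 1.0 + xi * z
    if t <= 0:
        return 1.0                      # xi < 0 and x beyond mu - sigma/xi
    return 1.0 - t ** (-1.0 / xi)


def gpd_pdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """PDF of GPD(mu, sigma, xi) evaluated at x (0 outside the support)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if z < 0:
        return 0.0
    if xi == 0:
        return math.exp(-z) / sigma
    t = 1.0 + xi * z
    if t <= 0:
        return 0.0                      # outside the support (endpoint treated as 0)
    return t ** (-1.0 / xi - 1.0) / sigma


if __name__ == "__main__":
    # Evaluate both functions at one point for a heavy-tailed case (xi > 0).
    print(gpd_cdf(2.0, mu=1.0, sigma=2.0, xi=0.3))
    print(gpd_pdf(2.0, mu=1.0, sigma=2.0, xi=0.3))
```

The branch on `t <= 0` handles the bounded support that arises when the shape is negative; treating the single upper endpoint as having density 0 is a simplification that does not affect any probability.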
The pdf is a solution of the following differential equation:
: <math>\left\{\begin{array}{l} f'(x)\,\bigl(\sigma + \xi (x - \mu)\bigr) + (\xi + 1)\, f(x) = 0, \\ f(\mu) = \frac{1}{\sigma} \end{array}\right\}</math>
Special cases
*If the shape <math>\xi</math> and location <math>\mu</math> are both zero, the GPD is equivalent to the exponential distribution.
*With shape <math>\xi = -1</math>, the GPD is equivalent to the continuous uniform distribution on <math>[\mu, \mu + \sigma]</math>. [Castillo, Enrique, and Ali S. Hadi. "Fitting the generalized Pareto distribution to data." Journal of the American Statistical Association 92.440 (1997): 1609-1620.]
*With shape <math>\xi > 0</math> and location <math>\mu = \sigma/\xi</math>, the GPD is equivalent to the Pareto distribution with scale <math>x_m = \sigma/\xi</math> and shape <math>\alpha = 1/\xi</math>.
*If <math>X \sim GPD(\mu = 0, \sigma, \xi)</math>, then <math>Y = \log(X) \sim exGPD(\sigma, \xi)</math>, where exGPD stands for the exponentiated generalized Pareto distribution.
*The GPD is similar to the Burr distribution.
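The first three special cases can be checked numerically by comparing cdfs. The sketch below assumes the standard textbook cdfs of the exponential, uniform, and Pareto distributions; all helper names (`gpd_cdf`, `exponential_cdf`, `uniform_cdf`, `pareto_cdf`) are hypothetical and written only for this illustration.

```python
import math


def gpd_cdf(x, mu, sigma, xi):
    """CDF of GPD(mu, sigma, xi); assumes sigma > 0."""
    z = (x - mu) / sigma
    if z <= 0:
        return 0.0
    if xi == 0:
        return 1.0 - math.exp(-z)
    t = 1.0 + xi * z
    return 1.0 if t <= 0 else 1.0 - t ** (-1.0 / xi)


def exponential_cdf(x, scale):
    """CDF of the exponential distribution with mean `scale`."""
    return 1.0 - math.exp(-x / scale) if x > 0 else 0.0


def uniform_cdf(x, a, b):
    """CDF of the continuous uniform distribution on [a, b]."""
    return min(max((x - a) / (b - a), 0.0), 1.0)


def pareto_cdf(x, x_m, alpha):
    """CDF of the Pareto distribution with scale x_m and shape alpha."""
    return 1.0 - (x_m / x) ** alpha if x > x_m else 0.0


sigma, xi = 2.0, 0.5
for x in (0.5, 1.0, 5.0, 10.0):
    # xi = 0, mu = 0  ->  exponential with mean sigma
    assert math.isclose(gpd_cdf(x, 0.0, sigma, 0.0), exponential_cdf(x, sigma))
    # xi = -1, mu = 0  ->  uniform on [0, sigma]
    assert math.isclose(gpd_cdf(x, 0.0, sigma, -1.0), uniform_cdf(x, 0.0, sigma))
    # xi > 0, mu = sigma/xi  ->  Pareto(x_m = sigma/xi, alpha = 1/xi)
    assert math.isclose(
        gpd_cdf(x, sigma / xi, sigma, xi),
        pareto_cdf(x, sigma / xi, 1.0 / xi),
    )
```

Agreement at a handful of points is only a sanity check, not a proof; the identities themselves follow by substituting the parameter values into the GPD cdf.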
Generating generalized Pareto random variables
Generating GPD random variables
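If ''U'' is uniformly distributed on (0, 1], then
: <math>X = \mu + \frac{\sigma (U^{-\xi} - 1)}{\xi} \sim GPD(\mu, \sigma, \xi \neq 0)</math>
and
: <math>X = \mu - \sigma \ln(U) \sim GPD(\mu, \sigma, \xi = 0).</math>
Both formulas are obtained by inversion of the cdf, using the fact that <math>1 - U</math> is uniformly distributed whenever <math>U</math> is.
In the MATLAB Statistics Toolbox, the "gprnd" command generates generalized Pareto random numbers.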
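The two inversion formulas can be sketched in a few lines of standard-library Python. The function name `gpd_sample`, the seed, and the parameter values are arbitrary choices for this illustration; the check against the mean uses the known GPD mean <math>\mu + \sigma/(1 - \xi)</math>, valid for <math>\xi < 1</math>.

```python
import math
import random


def gpd_sample(mu, sigma, xi, rng=random):
    """Draw one GPD(mu, sigma, xi) variate by inverting the cdf."""
    u = 1.0 - rng.random()              # rng.random() is in [0, 1), so u is in (0, 1]
    if xi == 0:
        return mu - sigma * math.log(u)
    return mu + sigma * (u ** (-xi) - 1.0) / xi


rng = random.Random(12345)              # fixed seed so the run is repeatable
draws = [gpd_sample(0.0, 1.0, 0.25, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)

# For xi < 1 the GPD mean is mu + sigma / (1 - xi), here 1 / 0.75.
print(sample_mean, 1.0 / 0.75)
```

Mapping `rng.random()` to `1.0 - rng.random()` keeps the uniform draw inside (0, 1], which avoids taking the logarithm of zero.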
GPD as an Exponential-Gamma Mixture
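A GPD random variable can also be expressed as an exponential random variable with a Gamma-distributed rate parameter. If
: <math>X \mid \Lambda \sim \operatorname{Exp}(\Lambda)</math>
and
: <math>\Lambda \sim \operatorname{Gamma}(\alpha, \beta)</math>
with shape <math>\alpha</math> and rate <math>\beta</math>, then
: <math>X \sim GPD(\xi = 1/\alpha,\ \sigma = \beta/\alpha).</math>
Notice, however, that since the parameters of the Gamma distribution must be greater than zero, we obtain the additional restriction that <math>\xi</math> must be positive.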
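The mixture representation can be illustrated by simulation. The sketch below assumes the Gamma distribution is parameterized by shape <math>\alpha</math> and rate <math>\beta</math>, and compares the empirical survival function of the mixture with the GPD survival function <math>(1 + \xi x/\sigma)^{-1/\xi}</math>. Note that Python's `random.gammavariate` takes a shape and a scale, so the rate is inverted; the helper names, seed, and parameter values are illustrative only.

```python
import random


def mixture_sample(alpha, beta, rng):
    """Draw X | Lambda ~ Exp(rate=Lambda), Lambda ~ Gamma(shape=alpha, rate=beta)."""
    lam = rng.gammavariate(alpha, 1.0 / beta)   # gammavariate takes (shape, scale)
    return rng.expovariate(lam)                 # expovariate takes the rate


def gpd_survival(x, sigma, xi):
    """P(X > x) for GPD(mu=0, sigma, xi) with xi > 0 and x >= 0."""
    return (1.0 + xi * x / sigma) ** (-1.0 / xi)


alpha, beta = 4.0, 2.0
xi, sigma = 1.0 / alpha, beta / alpha           # GPD parameters implied by the mixture
rng = random.Random(2024)                       # fixed seed so the run is repeatable
draws = [mixture_sample(alpha, beta, rng) for _ in range(100_000)]

for x in (0.25, 1.0, 3.0):
    empirical = sum(d > x for d in draws) / len(draws)
    print(x, round(empirical, 3), round(gpd_survival(x, sigma, xi), 3))
```

The agreement reflects the identity <math>P(X > x) = E[e^{-\Lambda x}] = (1 + x/\beta)^{-\alpha}</math>, which is the Laplace transform of the Gamma distribution evaluated at <math>x</math>.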
Exponentiated generalized Pareto distribution
The exponentiated generalized Pareto distribution (exGPD)

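If <math>X \sim GPD(\mu = 0, \sigma, \xi)</math>, then <math>Y = \log(X)</math> is distributed according to the exponentiated generalized Pareto distribution, denoted by <math>Y \sim exGPD(\sigma, \xi)</math>.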
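The probability density function (pdf) of <math>Y \sim exGPD(\sigma, \xi)</math> <math>(\sigma > 0)</math> is
: <math>g_{(\sigma,\xi)}(y) = \begin{cases} \frac{e^{y}}{\sigma}\left(1 + \frac{\xi e^{y}}{\sigma}\right)^{-1/\xi - 1} & \text{for } \xi \neq 0, \\ \frac{1}{\sigma}\, e^{y - e^{y}/\sigma} & \text{for } \xi = 0, \end{cases}</math>
where the support is <math>-\infty < y < \infty</math> for <math>\xi \geq 0</math>, and <math>-\infty < y \leq \log(-\sigma/\xi)</math> for <math>\xi < 0</math>.
For all <math>\xi</math>, <math>\log \sigma</math> acts as the location parameter. See the right panel for the pdf when the shape <math>\xi</math> is positive.
The exGPD has finite moments of all orders for all <math>\sigma > 0</math> and <math>-\infty < \xi < \infty</math>.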

The moment-generating function of <math>Y \sim exGPD(\sigma, \xi)</math> is
: <math>M_Y(s) = E[e^{sY}] = \begin{cases} -\frac{1}{\xi}\left(-\frac{\sigma}{\xi}\right)^{s} B(s+1, -1/\xi) & \text{for } s \in (-1, \infty),\ \xi < 0, \\ \frac{1}{\xi}\left(\frac{\sigma}{\xi}\right)^{s} B(s+1, 1/\xi - s) & \text{for } s \in (-1, 1/\xi),\ \xi > 0, \\ \sigma^{s}\, \Gamma(1+s) & \text{for } s \in (-1, \infty),\ \xi = 0, \end{cases}</math>
where <math>B(a,b)</math> and <math>\Gamma(a)</math> denote the beta function and gamma function, respectively.
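Because <math>E[e^{sY}] = E[X^{s}]</math> for <math>Y = \log X</math>, the moment-generating function can be sanity-checked against known moments of the GPD. The sketch below evaluates the three branches with the standard library, writing the beta function in terms of `math.gamma`; the names `beta_fn` and `exgpd_mgf` are hypothetical. At <math>s = 1</math> the result should equal the GPD mean <math>\sigma/(1 - \xi)</math> (for <math>\mu = 0</math> and <math>\xi < 1</math>).

```python
import math


def beta_fn(a, b):
    """Beta function via the gamma function; requires a > 0 and b > 0."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def exgpd_mgf(s, sigma, xi):
    """M_Y(s) = E[exp(sY)] = E[X**s] for Y = log(X), X ~ GPD(0, sigma, xi)."""
    if s <= -1:
        raise ValueError("the MGF requires s > -1")
    if xi == 0:
        return sigma ** s * math.gamma(1.0 + s)
    if xi < 0:
        return -(1.0 / xi) * (-sigma / xi) ** s * beta_fn(s + 1.0, -1.0 / xi)
    if s >= 1.0 / xi:
        raise ValueError("for xi > 0 the MGF requires s < 1/xi")
    return (1.0 / xi) * (sigma / xi) ** s * beta_fn(s + 1.0, 1.0 / xi - s)


sigma = 2.0
for xi in (-0.5, 0.0, 0.25):
    # M_Y(0) must be 1, and M_Y(1) = E[X] = sigma / (1 - xi).
    print(xi, exgpd_mgf(0.0, sigma, xi), exgpd_mgf(1.0, sigma, xi), sigma / (1.0 - xi))
```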
The expected value of <math>Y \sim exGPD(\sigma, \xi)</math> depends on the scale <math>\sigma</math> and shape <math>\xi</math> parameters, while <math>\xi</math> participates through the digamma function <math>\psi</math>:
: <math>E[Y] = \begin{cases} \log\left(-\frac{\sigma}{\xi}\right) + \psi(1) - \psi(-1/\xi + 1) & \text{for } \xi < 0, \\ \log\left(\frac{\sigma}{\xi}\right) + \psi(1) - \psi(1/\xi) & \text{for } \xi > 0, \\ \log \sigma + \psi(1) & \text{for } \xi = 0. \end{cases}</math>
Note that for a fixed value of <math>\xi \in (-\infty, \infty)</math>, <math>\log \sigma</math> plays the role of the location parameter under the exponentiated generalized Pareto distribution.
The variance of <math>Y \sim exGPD(\sigma, \xi)</math> depends on the shape parameter <math>\xi</math> only through the polygamma function of order 1 (also called the trigamma function):
: <math>\operatorname{Var}[Y] = \begin{cases} \psi'(1) - \psi'(-1/\xi + 1) & \text{for } \xi < 0, \\ \psi'(1) + \psi'(1/\xi) & \text{for } \xi > 0, \\ \psi'(1) & \text{for } \xi = 0. \end{cases}</math>
See the right panel for the variance as a function of <math>\xi</math>. Note that <math>\psi'(1) = \pi^2/6 \approx 1.644934</math>.
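The mean and variance formulas can be checked by simulation in two cases where the digamma and trigamma values are elementary, so that no special-function library is needed: for <math>\xi = 0</math>, <math>E[Y] = \log\sigma + \psi(1)</math> with <math>\psi(1) = -\gamma</math> (the Euler–Mascheroni constant) and <math>\operatorname{Var}[Y] = \pi^2/6</math>; for <math>\xi = 1</math>, <math>E[Y] = \log\sigma</math> and <math>\operatorname{Var}[Y] = 2\psi'(1) = \pi^2/3</math>. The sketch below is a Monte Carlo illustration under these assumptions; the helper names, seed, and sample size are arbitrary.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329        # psi(1) = -EULER_GAMMA


def exgpd_sample(sigma, xi, rng):
    """Draw Y = log(X) with X ~ GPD(0, sigma, xi), using inversion for X."""
    while True:
        u = 1.0 - rng.random()          # u in (0, 1]
        x = -sigma * math.log(u) if xi == 0 else sigma * (u ** (-xi) - 1.0) / xi
        if x > 0.0:                     # skip the (practically impossible) draw x == 0
            return math.log(x)


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    m = sum(values) / n
    return m, sum((v - m) ** 2 for v in values) / (n - 1)


rng = random.Random(7)                  # fixed seed so the run is repeatable
sigma = 2.0

# xi = 0: E[Y] = log(sigma) + psi(1), Var[Y] = psi'(1) = pi^2 / 6
m0, v0 = mean_and_variance([exgpd_sample(sigma, 0.0, rng) for _ in range(200_000)])
# xi = 1: psi(1) - psi(1/xi) = 0, Var[Y] = 2 * psi'(1) = pi^2 / 3
m1, v1 = mean_and_variance([exgpd_sample(sigma, 1.0, rng) for _ in range(200_000)])

print(m0, math.log(sigma) - EULER_GAMMA, v0, math.pi ** 2 / 6)
print(m1, math.log(sigma), v1, math.pi ** 2 / 3)
```

In both cases the variance does not involve <math>\sigma</math>, which illustrates that the scale enters the exGPD only as a shift by <math>\log\sigma</math>.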
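Note that the roles of the scale parameter <math>\sigma</math> and the shape parameter <math>\xi</math> under <math>Y \sim exGPD(\sigma, \xi)</math> are separately interpretable, which may lead to more robust and efficient estimation of <math>\xi</math> than using <math>X \sim GPD(\sigma, \xi)</math>. Under <math>X \sim GPD(\mu = 0, \sigma, \xi)</math> the roles of the two parameters are tied to each other (at least up to the second central moment); see the formula for the variance <math>\operatorname{Var}(X)</math>, in which both parameters appear.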
The Hill's estimator
Assume that <math>X_{1:n} = (X_1, \cdots, X_n)</math> are <math>n</math> observations (not necessarily i.i.d.) from an unknown heavy-tailed distribution <math>F</math> such that its tail distribution is regularly varying with tail index <math>1/\xi</math> (hence, the corresponding shape parameter is <math>\xi</math>). To be specific, the tail distribution is described as
: <math>\bar{F}(x) = 1 - F(x) = L(x) \cdot x^{-1/\xi}, \quad \text{for some } \xi > 0, \text{ where } L \text{ is a slowly varying function.}</math>
It is of particular interest in extreme value theory to estimate the shape parameter <math>\xi</math>, especially when <math>\xi</math> is positive (the so-called heavy-tailed case).
Let <math>F_u(y) = P(X - u \leq y \mid X > u)</math> be their conditional excess distribution function over a threshold <math>u</math>. The Pickands–Balkema–de Haan theorem (Pickands, 1975; Balkema and de Haan, 1974) states that for a large class of underlying distribution functions <math>F</math>, and large <math>u</math>, <math>F_u</math> is well approximated by the generalized Pareto distribution (GPD), which motivated Peak Over Threshold (POT) methods to estimate <math>\xi</math>: ''the GPD plays the key role in the POT approach.''
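A simple case in which the approximation is exact may help to illustrate the conditional excess distribution: if <math>F</math> is itself a GPD with <math>\mu = 0</math>, scale <math>\sigma</math>, and shape <math>\xi \geq 0</math>, then the excesses over any threshold <math>u</math> are again GPD with the same shape and scale <math>\sigma + \xi u</math>. The sketch below checks this identity numerically; it illustrates the definition of <math>F_u</math> only and is not a demonstration of the theorem for general <math>F</math>. The function names and parameter values are hypothetical.

```python
import math


def gpd_cdf(x, sigma, xi):
    """CDF of GPD(mu=0, sigma, xi) for xi >= 0."""
    if x <= 0:
        return 0.0
    if xi == 0:
        return 1.0 - math.exp(-x / sigma)
    return 1.0 - (1.0 + xi * x / sigma) ** (-1.0 / xi)


def conditional_excess_cdf(cdf, u, y):
    """F_u(y) = P(X - u <= y | X > u) = (F(u + y) - F(u)) / (1 - F(u))."""
    return (cdf(u + y) - cdf(u)) / (1.0 - cdf(u))


sigma, xi, u = 1.5, 0.3, 4.0


def F(x):
    return gpd_cdf(x, sigma, xi)


for y in (0.1, 1.0, 10.0):
    lhs = conditional_excess_cdf(F, u, y)
    rhs = gpd_cdf(y, sigma + xi * u, xi)    # same shape, scale sigma + xi * u
    assert math.isclose(lhs, rhs, rel_tol=1e-9)
    print(y, lhs, rhs)
```

In a POT analysis of real data the same idea is applied empirically: observations above a high threshold are reduced by the threshold, and a GPD is fitted to the resulting excesses.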
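A renowned estimator using the POT methodology is Hill's estimator. Its technical formulation is as follows. For <math>1 \leq i \leq n</math>, write <math>X_{(i)}</math> for the <math>i</math>-th largest value of <math>X_1, \cdots, X_n</math>. Then, with this notation, Hill's estimator (see page 190 of Reference 5 by Embrechts et al.) based on the <math>k</math> upper order statistics is defined as
: <math>\widehat{\xi}_k^{\text{Hill}} = \widehat{\xi}_k^{\text{Hill}}(X_{1:n}) = \frac{1}{k-1} \sum_{j=1}^{k-1} \log\left(\frac{X_{(j)}}{X_{(k)}}\right), \quad \text{for } 2 \leq k \leq n.</math>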
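In practice, the Hill estimator is used as follows. First, calculate the estimator <math>\widehat{\xi}_k^{\text{Hill}}</math> at each integer <math>k \in \{2, \cdots, n\}</math>, and then plot the ordered pairs <math>\{(k, \widehat{\xi}_k^{\text{Hill}})\}_{k=2}^{n}</math>. Then, select from the set of Hill estimators <math>\{\widehat{\xi}_k^{\text{Hill}}\}_{k=2}^{n}</math> those which are roughly constant with respect to <math>k</math>: these stable values are regarded as reasonable estimates for the shape parameter <math>\xi</math>. If <math>X_1, \cdots, X_n</math> are i.i.d., then Hill's estimator is a consistent estimator for the shape parameter <math>\xi</math>.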
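The estimator and the pairs used for the plot can be sketched as follows, taking the estimator to be the average of the log-ratios <math>\log(X_{(j)}/X_{(k)})</math> over <math>j = 1, \cdots, k-1</math>. The function names `hill_estimator` and `hill_curve` are hypothetical, and the test data are simulated from an exact Pareto distribution with <math>\xi = 0.5</math>, an assumption made only so that the true shape is known.

```python
import math
import random


def hill_estimator(data, k):
    """Hill estimate of xi from the k upper order statistics (2 <= k <= n)."""
    n = len(data)
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    top = sorted(data, reverse=True)[:k]        # X_(1) >= ... >= X_(k)
    if top[-1] <= 0:
        raise ValueError("the k largest observations must be positive")
    return sum(math.log(x / top[-1]) for x in top[:-1]) / (k - 1)


def hill_curve(data):
    """Pairs (k, Hill estimate) for k = 2, ..., n, as used in a Hill plot."""
    xs = sorted(data, reverse=True)
    logs = [math.log(x) for x in xs]            # requires all observations > 0
    pairs, running = [], logs[0]                # running = log X_(1) + ... + log X_(k-1)
    for k in range(2, len(xs) + 1):
        pairs.append((k, running / (k - 1) - logs[k - 1]))
        running += logs[k - 1]
    return pairs


rng = random.Random(0)                          # fixed seed so the run is repeatable
true_xi = 0.5
# Exact Pareto sample with tail index 1/xi: X = U**(-xi), U uniform on (0, 1].
data = [(1.0 - rng.random()) ** (-true_xi) for _ in range(5000)]

print(hill_estimator(data, 500))                # should be close to true_xi
curve = hill_curve(data)
print(curve[:3])                                # first few (k, estimate) pairs
```

`hill_curve` reuses a running sum of logarithms so that all <math>n - 1</math> estimates cost one sort plus a single pass; plotting its output against <math>k</math> gives the Hill plot from which a stable region is selected by eye.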
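Note that the Hill estimator <math>\widehat{\xi}_k^{\text{Hill}}</math> makes use of the log-transformation of the observations <math>X_{1:n} = (X_1, \cdots, X_n)</math>. (Pickands' estimator <math>\widehat{\xi}_k^{\text{Pickand}}</math> also employs the log-transformation, but in a slightly different way.)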
See also
*Burr distribution
*Pareto distribution
*Generalized extreme value distribution
*Exponentiated generalized Pareto distribution
*Pickands–Balkema–de Haan theorem
References
Further reading
* Chapter 20, Section 12: Generalized Pareto Distributions.
External links
Mathworks: Generalized Pareto distribution