In probability theory and statistics, the Gumbel distribution (also known as the type-I generalized extreme value distribution) is used to model the distribution of the maximum (or the minimum) of a number of samples of various distributions.
This distribution might be used to represent the distribution of the maximum level of a river in a particular year, given a list of maximum values for the past ten years. It is useful in predicting the chance that an extreme earthquake, flood or other natural disaster will occur. The potential applicability of the Gumbel distribution to represent the distribution of maxima relates to extreme value theory, which indicates that it is likely to be useful if the distribution of the underlying sample data is of the normal or exponential type. ''This article uses the Gumbel distribution to model the distribution of the maximum value''. ''To model the minimum value, use the negative of the original values.''
The Gumbel distribution is a particular case of the generalized extreme value distribution (also known as the Fisher-Tippett distribution). It is also known as the ''log-Weibull distribution'' and the ''double exponential distribution'' (a term that is alternatively sometimes used to refer to the Laplace distribution). It is related to the Gompertz distribution: when its density is first reflected about the origin and then restricted to the positive half line, the density of a Gompertz distribution is obtained, up to normalization.
In the latent variable formulation of the multinomial logit model, common in discrete choice theory, the errors of the latent variables follow a Gumbel distribution. This is useful because the difference of two independent Gumbel-distributed random variables with a common scale parameter has a logistic distribution.
The Gumbel distribution is named after Emil Julius Gumbel (1891–1966), based on his original papers describing the distribution.
Definitions
The cumulative distribution function of the Gumbel distribution is
: $F(x;\mu,\beta) = e^{-e^{-(x-\mu)/\beta}},$
where $\mu$ is the location parameter and $\beta > 0$ is the scale parameter.
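As an illustration, the cumulative distribution function in its usual location-scale form, $F(x;\mu,\beta) = \exp(-\exp(-(x-\mu)/\beta))$ with $\beta > 0$, and the corresponding density can be evaluated in a few lines of Python. This is a minimal sketch: the names `gumbel_cdf` and `gumbel_pdf` and the underflow cutoff are choices made for this example and are not taken from any particular library.

```python
import math


def gumbel_cdf(x: float, mu: float = 0.0, beta: float = 1.0) -> float:
    """CDF F(x; mu, beta) = exp(-exp(-(x - mu) / beta)), for beta > 0."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    z = (x - mu) / beta
    if z < -700.0:
        # exp(-z) would overflow a float; the true value is 0 to machine precision.
        return 0.0
    return math.exp(-math.exp(-z))


def gumbel_pdf(x: float, mu: float = 0.0, beta: float = 1.0) -> float:
    """Density f(x; mu, beta) = exp(-(z + exp(-z))) / beta, with z = (x - mu) / beta."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    z = (x - mu) / beta
    if z < -700.0:
        # Same overflow guard as in gumbel_cdf; the density is 0 to machine precision.
        return 0.0
    return math.exp(-(z + math.exp(-z))) / beta
```

Evaluating `gumbel_cdf(mu, mu, beta)` gives $e^{-1}$ for any positive `beta`, since $z = 0$ at the location parameter.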
Standard Gumbel distribution
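The standard Gumbel distribution is the case where $\mu = 0$ and $\beta = 1$, with cumulative distribution function
: $F(x) = e^{-e^{-x}}$
and probability density function
: $f(x) = e^{-(x + e^{-x})}.$
In this case the mode is 0, the median is $-\ln(\ln 2) \approx 0.3665$, the mean is $\gamma \approx 0.5772$ (the Euler–Mascheroni constant), and the standard deviation is $\pi/\sqrt{6} \approx 1.2825$.
The cumulants, for $n > 1$, are given by
: $\kappa_n = (n-1)!\,\zeta(n).$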
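The cumulants of the standard Gumbel distribution for $n > 1$ are $\kappa_n = (n-1)!\,\zeta(n)$, where $\zeta$ is the Riemann zeta function. The following Python sketch evaluates them numerically. The helper names `zeta` and `standard_gumbel_cumulant` are hypothetical, and the zeta function is approximated by a partial sum plus a midpoint-rule estimate of the tail, which is an assumption of this example rather than a library routine.

```python
import math


def zeta(n: int, terms: int = 10_000) -> float:
    """Approximate the Riemann zeta function for an integer n > 1.

    Sums the first `terms` terms directly and estimates the remaining tail
    by the integral of x**(-n) from terms + 1/2 to infinity.
    """
    if n <= 1:
        raise ValueError("n must be an integer greater than 1")
    head = sum(k ** -n for k in range(1, terms + 1))
    tail = (terms + 0.5) ** (1 - n) / (n - 1)
    return head + tail


def standard_gumbel_cumulant(n: int) -> float:
    """n-th cumulant of the standard Gumbel distribution: (n - 1)! * zeta(n), n > 1."""
    return math.factorial(n - 1) * zeta(n)
```

The second cumulant is the variance, so `math.sqrt(standard_gumbel_cumulant(2))` should agree with the standard deviation $\pi/\sqrt{6}$ up to the accuracy of the zeta approximation.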
Properties
The mode is $\mu$, while the median is $\mu - \beta \ln(\ln 2)$, and the mean is given by
: $\operatorname{E}(X) = \mu + \gamma\beta,$
where $\gamma$ is the Euler–Mascheroni constant.
The standard deviation $\sigma$ is $\beta\pi/\sqrt{6}$, hence $\beta = \sigma\sqrt{6}/\pi \approx 0.78\,\sigma$.
At the mode, where $x = \mu$, the value of $F(x;\mu,\beta)$ becomes $e^{-1} \approx 0.37$, irrespective of the value of $\beta$.
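The relations between the parameters and the summary statistics (mode $\mu$, median $\mu - \beta\ln(\ln 2)$, mean $\mu + \gamma\beta$, standard deviation $\beta\pi/\sqrt{6}$) can be inverted to obtain rough method-of-moments parameter values from a mean and a standard deviation. The sketch below assumes these standard formulas; the function names `gumbel_summary` and `gumbel_from_moments` are invented for this illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gumbel_summary(mu: float, beta: float) -> dict:
    """Mode, median, mean and standard deviation of Gumbel(mu, beta)."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    return {
        "mode": mu,
        "median": mu - beta * math.log(math.log(2.0)),
        "mean": mu + EULER_GAMMA * beta,
        "std": beta * math.pi / math.sqrt(6.0),
    }


def gumbel_from_moments(mean: float, std: float) -> tuple:
    """Recover (mu, beta) from a mean and a standard deviation."""
    if std <= 0:
        raise ValueError("std must be positive")
    beta = std * math.sqrt(6.0) / math.pi  # roughly 0.78 * std
    mu = mean - EULER_GAMMA * beta
    return mu, beta
```

Applying `gumbel_from_moments` to the mean and standard deviation returned by `gumbel_summary(mu, beta)` recovers the original parameters up to rounding error.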
Related distributions
* If $X$ has a Gumbel distribution, then the conditional distribution of ''Y'' = −''X'' given that ''Y'' is positive, or equivalently given that ''X'' is negative, has a Gompertz distribution. The cdf ''G'' of ''Y'' is related to ''F'', the cdf of ''X'', by the formula $G(y) = P(Y \le y) = P(X \ge -y \mid X \le 0) = (F(0) - F(-y))/F(0)$ for ''y'' > 0. Consequently, the densities are related by $g(y) = f(-y)/F(0)$: the Gompertz density is proportional to a reflected Gumbel density, restricted to the positive half-line.
* If ''X'' is an exponentially distributed variable with mean 1, then −log(''X'') has a standard Gumbel distribution.
* If $X \sim \mathrm{Gumbel}(\alpha_X, \beta)$ and $Y \sim \mathrm{Gumbel}(\alpha_Y, \beta)$ are independent, then $X - Y \sim \mathrm{Logistic}(\alpha_X - \alpha_Y, \beta)$ (see Logistic distribution).
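* If $X, Y \sim \mathrm{Gumbel}(\alpha, \beta)$ are independent, then $X + Y \nsim \mathrm{Logistic}(2\alpha, \beta)$. Note that $E(X+Y) = 2\alpha + 2\beta\gamma \neq 2\alpha = E(\mathrm{Logistic}(2\alpha, \beta))$, where $\gamma$ is the Euler–Mascheroni constant. More generally, the distribution of linear combinations of independent Gumbel random variables can be approximated by GNIG and GIG distributions.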
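Some of the relations in the list above can be checked by simulation. The Python sketch below draws standard Gumbel variates as $-\log E$ with $E$ exponentially distributed with mean 1, then compares the empirical distribution of the difference of two independent Gumbel variates (common scale $\beta$) with a logistic distribution. The parameter values, sample size, seed and all function names are arbitrary choices for this example.

```python
import math
import random


def standard_gumbel(rng: random.Random) -> float:
    """Standard Gumbel variate obtained as -log(E), with E ~ Exponential(mean 1)."""
    e = rng.expovariate(1.0)
    while e == 0.0:  # vanishingly rare, but log(0) is undefined
        e = rng.expovariate(1.0)
    return -math.log(e)


def gumbel(rng: random.Random, mu: float, beta: float) -> float:
    """Gumbel(mu, beta) variate by shifting and scaling a standard one."""
    return mu + beta * standard_gumbel(rng)


def logistic_cdf(x: float, loc: float, scale: float) -> float:
    """CDF of the logistic distribution with the given location and scale."""
    return 1.0 / (1.0 + math.exp(-(x - loc) / scale))


def empirical_cdf(samples, x: float) -> float:
    """Fraction of samples that are less than or equal to x."""
    return sum(s <= x for s in samples) / len(samples)


def simulate(n=20_000, seed=0, alpha_x=1.0, alpha_y=0.5, beta=2.0):
    """Return samples of X - Y and of X + X', all components independent."""
    rng = random.Random(seed)
    diffs = [gumbel(rng, alpha_x, beta) - gumbel(rng, alpha_y, beta) for _ in range(n)]
    sums = [gumbel(rng, alpha_x, beta) + gumbel(rng, alpha_x, beta) for _ in range(n)]
    return diffs, sums
```

With the default arguments, `empirical_cdf(diffs, x)` should track `logistic_cdf(x, 0.5, 2.0)`, while the sample mean of `sums` should lie near $2\alpha + 2\beta\gamma$ rather than $2\alpha$, consistent with the sum not being logistic with location $2\alpha$.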
Theory related to the generalized multivariate log-gamma distribution provides a multivariate version of the Gumbel distribution.
Occurrence and applications
Gumbel has shown that the maximum value (or last order statistic) in a sample of random variables following an exponential distribution minus the natural logarithm of the sample size approaches the Gumbel distribution as the sample size increases.
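Concretely, let $\rho(x) = e^{-x}$ be the probability distribution of $x$ and $Q(x) = 1 - e^{-x}$ its cumulative distribution. Then the maximum value out of $N$ realizations of $x$ is smaller than $X$ if and only if all realizations are smaller than $X$. So the cumulative distribution of the maximum value $\tilde{x}$ satisfies
: $P(\tilde{x} - \log N \le X) = P(\tilde{x} \le X + \log N) = [Q(X + \log N)]^N = \left(1 - \frac{e^{-X}}{N}\right)^N,$
and, for large $N$, the right-hand side converges to $e^{-e^{-X}}$.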
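The convergence can be inspected numerically. For $N$ independent unit-rate exponential draws, the maximum minus $\log N$ has cumulative distribution $(1 - e^{-X}/N)^N$ (taken as 0 where the base is negative), and this approaches the standard Gumbel value $e^{-e^{-X}}$ as $N$ grows. The Python sketch below measures the largest gap between the two on a grid; the grid limits and the function names are arbitrary choices for this illustration.

```python
import math


def max_exponential_cdf(x: float, n: int) -> float:
    """P(max of n iid Exp(1) draws - log(n) <= x) = (1 - exp(-x) / n) ** n.

    For x < -log(n) the event is impossible, so the base is clipped at 0.
    """
    base = 1.0 - math.exp(-x) / n
    return max(base, 0.0) ** n


def standard_gumbel_cdf(x: float) -> float:
    """Standard Gumbel CDF exp(-exp(-x))."""
    return math.exp(-math.exp(-x))


def max_gap(n: int) -> float:
    """Largest absolute difference between the two CDFs on a grid over [-2, 6]."""
    grid = [k / 10 for k in range(-20, 61)]
    return max(abs(max_exponential_cdf(x, n) - standard_gumbel_cdf(x)) for x in grid)


for n in (10, 1_000, 100_000):
    print(n, max_gap(n))  # the gap shrinks as n increases
```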
In hydrology, therefore, the Gumbel distribution is used to analyze such variables as monthly and annual maximum values of daily rainfall and river discharge volumes, and also to describe droughts.
Gumbel has also shown that the estimator $r/(n+1)$ for the probability of an event (where ''r'' is the rank number of the observed value in the data series and ''n'' is the total number of observations) is an unbiased estimator of the cumulative probability around the mode of the distribution. Therefore, this estimator is often used as a plotting position.
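A plotting position of the form $r/(n+1)$, with rank $r = 1$ assigned to the smallest of $n$ observations, can be computed as in the following Python sketch. The function name `plotting_positions` and the ranking convention for ties (sort order) are assumptions of this example.

```python
def plotting_positions(values):
    """Pair each observation with its plotting position r / (n + 1).

    Observations are sorted in ascending order; the smallest gets rank r = 1.
    Tied values receive consecutive ranks in sort order.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [(value, rank / (n + 1)) for rank, value in enumerate(ordered, start=1)]
```

For four observations the positions are 1/5, 2/5, 3/5 and 4/5, so no observation is assigned a cumulative probability of exactly 0 or 1.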
In number theory, the Gumbel distribution approximates the number of terms in a random partition of an integer as well as the trend-adjusted sizes of maximal prime gaps and maximal gaps between prime constellations.
Gumbel reparametrization tricks
In machine learning, the Gumbel distribution is sometimes employed to generate samples from the categorical distribution. This technique is called the "Gumbel-max trick" and is a special example of "reparametrization tricks".
In detail, let $(\pi_1, \ldots, \pi_n)$ be nonnegative, and not all zero, and let $g_1, \ldots, g_n$ be independent samples of Gumbel(0, 1). Then, by routine integration,
: $\Pr\left(j = \arg\max_i (g_i + \log \pi_i)\right) = \frac{\pi_j}{\sum_i \pi_i}.$
That is, $\arg\max_i (g_i + \log \pi_i) \sim \operatorname{Categorical}\left(\frac{\pi_j}{\sum_i \pi_i}\right)_j$.
Equivalently, given any $x_1, \ldots, x_n \in \mathbb{R}$, we can sample from its Boltzmann distribution by
: $\Pr\left(j = \arg\max_i (g_i + x_i)\right) = \frac{e^{x_j}}{\sum_i e^{x_i}}.$
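Related equations include:
* If $x \sim \operatorname{Exp}(\lambda)$, then $(-\ln x - \gamma) \sim \operatorname{Gumbel}(-\gamma + \ln \lambda, 1)$.
* $\arg\max_i (g_i + \log \pi_i) \sim \operatorname{Categorical}\left(\frac{\pi_j}{\sum_i \pi_i}\right)_j$.
* $\max_i (g_i + \log \pi_i) \sim \operatorname{Gumbel}\left(\log\left(\sum_i \pi_i\right), 1\right)$. That is, the Gumbel distribution is a max-stable distribution family.
* $\operatorname{E}\left[\max_i (g_i + \beta x_i)\right] = \log\left(\sum_i e^{\beta x_i}\right) + \gamma$.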
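The Gumbel-max trick can be sketched in Python as follows: add independent standard Gumbel noise to the logarithm of each nonnegative weight and return the index of the largest perturbed value, which selects index $j$ with probability proportional to its weight. The function names are hypothetical, and skipping zero weights (whose logarithm is $-\infty$ and can never win) is a design choice of this example.

```python
import math
import random


def sample_standard_gumbel(rng: random.Random) -> float:
    """Standard Gumbel variate -ln(-ln(U)), with U uniform on (0, 1)."""
    u = rng.random()
    while u == 0.0:  # random() may return 0.0, where ln is undefined
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_max_sample(weights, rng: random.Random) -> int:
    """Return index j with probability weights[j] / sum(weights)."""
    if any(w < 0 for w in weights) or not any(w > 0 for w in weights):
        raise ValueError("weights must be nonnegative and not all zero")
    best_index, best_score = -1, -math.inf
    for i, w in enumerate(weights):
        if w == 0:
            continue  # log(0) = -inf never attains the maximum
        score = sample_standard_gumbel(rng) + math.log(w)
        if score > best_score:
            best_index, best_score = i, score
    return best_index


def boltzmann_sample(logits, rng: random.Random) -> int:
    """Return index j with probability exp(logits[j]) / sum_i exp(logits[i])."""
    return max(range(len(logits)), key=lambda i: logits[i] + sample_standard_gumbel(rng))


def empirical_frequencies(sampler, args, draws: int, rng: random.Random):
    """Relative frequency of each index over repeated draws from `sampler`."""
    counts = [0] * len(args)
    for _ in range(draws):
        counts[sampler(args, rng)] += 1
    return [c / draws for c in counts]
```

For example, `empirical_frequencies(gumbel_max_sample, [1.0, 2.0, 0.0, 5.0], 30_000, random.Random(1))` should give frequencies close to 1/8, 2/8, 0 and 5/8. Working with logits in `boltzmann_sample` avoids computing the normalizing sum of exponentials explicitly.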
Random variate generation
Since the quantile function (inverse cumulative distribution function), $Q(p)$, of a Gumbel distribution is given by
: $Q(p) = \mu - \beta \ln(-\ln(p)),$
the variate $Q(U)$ has a Gumbel distribution with parameters $\mu$ and $\beta$ when the random variate $U$ is drawn from the uniform distribution on the interval $(0,1)$.
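Inverse-transform sampling with the quantile function $Q(p) = \mu - \beta\ln(-\ln p)$ can be sketched in Python as follows. The function names are invented for this example, and redrawing when the uniform generator returns exactly 0 is a precaution added here because Python's `random.random()` samples from the half-open interval $[0, 1)$.

```python
import math
import random


def gumbel_quantile(p: float, mu: float = 0.0, beta: float = 1.0) -> float:
    """Quantile function Q(p) = mu - beta * ln(-ln(p)) for 0 < p < 1, beta > 0."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if beta <= 0:
        raise ValueError("beta must be positive")
    return mu - beta * math.log(-math.log(p))


def gumbel_variate(rng: random.Random, mu: float = 0.0, beta: float = 1.0) -> float:
    """Draw one Gumbel(mu, beta) variate as Q(U), with U uniform on (0, 1)."""
    u = rng.random()
    while u == 0.0:  # exclude the endpoint 0, where Q is undefined
        u = rng.random()
    return gumbel_quantile(u, mu, beta)
```

As a quick check, `gumbel_quantile(0.5)` returns the standard median $-\ln(\ln 2)$, and the average of many draws from `gumbel_variate(rng, mu, beta)` should approach $\mu + \gamma\beta$.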
Probability paper

In pre-software times probability paper was used to picture the Gumbel distribution (see illustration). The paper is based on linearization of the cumulative distribution function $F$:
: $-\ln[-\ln(F)] = (x - \mu)/\beta$
In the paper the horizontal axis is constructed at a double log scale. The vertical axis is linear. By plotting $F$ on the horizontal axis of the paper and the $x$-variable on the vertical axis, the distribution is represented by a straight line, $x = \mu + \beta\,(-\ln[-\ln(F)])$, whose slope in these axes is $\beta$ (equivalently, $-\ln[-\ln(F)]$ grows with slope $1/\beta$ in $x$). When distribution fitting software like CumFreq became available, the task of plotting the distribution was made easier.
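The probability-paper construction can be imitated numerically: transform an estimate of the cumulative probability $F$ to the reduced variate $y = -\ln[-\ln(F)]$ and fit the straight line $x = \mu + \beta y$ by ordinary least squares. The Python sketch below assumes the plotting position $r/(n+1)$ as the estimate of $F$ for the $r$-th smallest of $n$ observations; the function name `gumbel_paper_fit` is hypothetical, and least squares is one simple way to draw the line, not necessarily the method used by any particular software.

```python
import math


def gumbel_paper_fit(values):
    """Estimate (mu, beta) from the line x = mu + beta * y on Gumbel paper.

    y = -ln(-ln(F)) is the reduced variate, with F = r / (n + 1) assumed as
    the plotting position of the r-th smallest observation.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("at least two observations are needed")
    ys = [-math.log(-math.log(r / (n + 1))) for r in range(1, n + 1)]
    y_mean = sum(ys) / n
    x_mean = sum(ordered) / n
    s_xy = sum((y - y_mean) * (x - x_mean) for y, x in zip(ys, ordered))
    s_yy = sum((y - y_mean) ** 2 for y in ys)
    beta = s_xy / s_yy            # slope of x against the reduced variate
    mu = x_mean - beta * y_mean   # intercept: value of x where y = 0
    return mu, beta
```

Data that lie exactly on the line, such as `[10 + 3 * y for y in reduced_variates]`, return the parameters used to generate them; for real observations the fitted values are only rough estimates.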
See also
* Type-2 Gumbel distribution
* Extreme value theory
* Generalized extreme value distribution
* Fisher–Tippett–Gnedenko theorem
* Emil Julius Gumbel