In probability and statistics, the Yule–Simon distribution is a discrete probability distribution named after Udny Yule and Herbert A. Simon. Simon originally called it the ''Yule distribution''.
The probability mass function (pmf) of the Yule–Simon (''ρ'') distribution is
: f(k;\rho) = \rho\,\operatorname{B}(k, \rho+1),
for integer ''k'' ≥ 1 and real ''ρ'' > 0, where B is the beta function. Equivalently, the pmf can be written in terms of the rising factorial as
: f(k;\rho) = \frac{\rho\,\Gamma(\rho+1)}{k^{\overline{\rho+1}}}, \qquad k^{\overline{\rho+1}} = \frac{\Gamma(k+\rho+1)}{\Gamma(k)},
where Γ is the gamma function. Thus, if ''ρ'' is an integer,
: f(k;\rho) = \frac{\rho\,\rho!\,(k-1)!}{(k+\rho)!}.
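The pmf can be evaluated numerically as in the following minimal sketch. It assumes the beta-function form ''f''(''k''; ''ρ'') = ''ρ'' B(''k'', ''ρ'' + 1) and computes the beta function through log-gamma values to avoid overflow for large ''k''. The function names are illustrative and do not come from any particular library.

```python
import math

def yule_simon_pmf(k, rho):
    """f(k; rho) = rho * B(k, rho + 1), evaluated via log-gamma for stability."""
    if k < 1 or rho <= 0:
        raise ValueError("requires integer k >= 1 and real rho > 0")
    log_beta = math.lgamma(k) + math.lgamma(rho + 1.0) - math.lgamma(k + rho + 1.0)
    return rho * math.exp(log_beta)

def yule_simon_pmf_integer_rho(k, rho):
    """Closed form rho * rho! * (k-1)! / (k+rho)!, valid only for integer rho."""
    return rho * math.factorial(rho) * math.factorial(k - 1) / math.factorial(k + rho)

# For k = 1 the pmf reduces to rho / (rho + 1).
print(yule_simon_pmf(1, 2.0), 2.0 / 3.0)
# The probabilities should sum to (nearly) one over a long range of k.
print(sum(yule_simon_pmf(k, 2.0) for k in range(1, 10001)))
```

Working in log space is a design choice for this sketch: the gamma values themselves overflow double precision for moderately large ''k'', while their logarithms do not.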
The parameter ''ρ'' can be estimated using a fixed point algorithm.
The probability mass function ''f'' has the property that for sufficiently large ''k'' we have
: f(k;\rho) \approx \frac{\rho\,\Gamma(\rho+1)}{k^{\rho+1}} \propto \frac{1}{k^{\rho+1}}.
This means that the tail of the Yule–Simon distribution is a realization of Zipf's law: ''f''(''k''; ''ρ'') can be used to model, for example, the relative frequency of the ''k''th most frequent word in a large collection of text, which according to Zipf's law is inversely proportional to a (typically small) power of ''k''.
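A brief sketch of why the tail takes a power-law form, assuming the beta-function pmf ''f''(''k''; ''ρ'') = ''ρ'' B(''k'', ''ρ'' + 1) and the standard asymptotic Γ(''k'')/Γ(''k'' + ''a'') ~ ''k''<sup>−''a''</sup> for fixed ''a'' as ''k'' → ∞:
: f(k;\rho) = \rho\,\Gamma(\rho+1)\,\frac{\Gamma(k)}{\Gamma(k+\rho+1)} \sim \rho\,\Gamma(\rho+1)\,k^{-(\rho+1)} \qquad (k \to \infty).
For example, with ''ρ'' = 1 the pmf reduces to
: f(k;1) = \frac{1}{k(k+1)} \approx \frac{1}{k^2},
so for large ''k'', doubling ''k'' divides the probability by roughly four.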
Occurrence
The Yule–Simon distribution arose originally as the limiting distribution of a particular model studied by Udny Yule in 1925 to analyze the growth in the number of species per genus in some higher taxon of biotic organisms.
The Yule model makes use of two related Yule processes, where a Yule process is defined as a continuous-time birth process which starts with one or more individuals. Yule proved that when time goes to infinity, the limit distribution of the number of species in a genus selected uniformly at random has a specific form and exhibits power-law behavior in its tail. Thirty years later, the Nobel laureate Herbert A. Simon proposed a discrete-time preferential attachment model to describe the appearance of new words in a large piece of text. The limit distribution of the number of occurrences of each word, when the number of words diverges, coincides with that of the number of species belonging to the randomly chosen genus in the Yule model, for a specific choice of the parameters. This explains the designation Yule–Simon distribution that is commonly assigned to that limit distribution. In the context of random graphs, the Barabási–Albert model also exhibits an asymptotic degree distribution that equals the Yule–Simon distribution for a specific choice of the parameters, and still presents power-law characteristics for more general choices of the parameters. The same holds for other preferential attachment random graph models.
The preferential attachment process can also be studied as an urn process in which balls are added to a growing number of urns, each ball being allocated to an urn with probability linear in the number of balls the urn already contains.
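The preferential attachment mechanism can be illustrated with a small simulation in the spirit of Simon's word model. This is a sketch under stated assumptions, not Simon's original formulation: the parameter `alpha` (the probability that the next token is a brand-new word) and the function name are hypothetical, and the comparison relies on the standard result that this model has a Yule–Simon limit with ''ρ'' = 1/(1 − `alpha`).

```python
import random
from collections import Counter

def simulate_simon(n_tokens, alpha, seed=0):
    """Simulate a Simon-style text model; return the occurrence count of each word."""
    rng = random.Random(seed)
    tokens = [0]        # word id of every token written so far
    n_words = 1
    for _ in range(n_tokens - 1):
        if rng.random() < alpha:
            tokens.append(n_words)      # introduce a new word
            n_words += 1
        else:
            # Copying a uniformly chosen earlier token selects a word with
            # probability proportional to its current number of occurrences.
            tokens.append(tokens[rng.randrange(len(tokens))])
    return list(Counter(tokens).values())

alpha = 0.5
counts = simulate_simon(200_000, alpha)
frac_once = sum(1 for c in counts if c == 1) / len(counts)
rho = 1.0 / (1.0 - alpha)
# Empirical share of words seen exactly once vs. the Yule-Simon value rho / (rho + 1).
print(frac_once, rho / (rho + 1.0))
```

Storing one entry per token keeps the proportional choice trivial (a uniform draw over tokens), at the cost of memory linear in the text length.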
The distribution also arises as a compound distribution, in which the parameter of a geometric distribution is treated as a function of a random variable having an exponential distribution. Specifically, assume that ''W'' follows an exponential distribution with scale 1/''ρ'' or rate ''ρ'':
: W \sim \operatorname{Exponential}(\rho),
with density
: h(w;\rho) = \rho \exp(-\rho w).
Then a Yule–Simon distributed variable ''K'' has the following geometric distribution conditional on ''W'':
: K \sim \operatorname{Geometric}(\exp(-W)).
The pmf of a geometric distribution is
: g(k;p) = p(1-p)^{k-1}
for ''k'' ∈ {1, 2, …}. The Yule–Simon pmf is then the following exponential-geometric compound distribution:
: f(k;\rho) = \int_0^\infty g(k; \exp(-w))\, h(w;\rho)\, dw.
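The compound construction suggests a simple way to draw samples: first draw ''W'' from an exponential distribution with rate ''ρ'', then draw ''K'' from a geometric distribution with success probability exp(−''W''). The sketch below uses inverse-transform sampling for the geometric step, which is one possible choice. The function name is illustrative, and the guard against underflow is an added assumption for very small ''ρ''.

```python
import math
import random

def sample_yule_simon(rho, rng):
    """Draw K via W ~ Exponential(rate=rho), then K | W ~ Geometric(exp(-W)) on {1, 2, ...}."""
    w = rng.expovariate(rho)
    p = math.exp(-w)                  # success probability of the geometric step
    if p >= 1.0:                      # w == 0: success on the first trial is certain
        return 1
    if p < 1e-300:                    # extreme w: K would not fit in a double
        raise OverflowError("sampled K is too large to represent")
    u = 1.0 - rng.random()            # uniform on (0, 1], so log(u) is finite
    # Inverse transform: P(K > k) = (1 - p)^k  =>  K = 1 + floor(log u / log(1 - p))
    return 1 + int(math.log(u) / math.log1p(-p))

rng = random.Random(1)
rho = 2.0
draws = [sample_yule_simon(rho, rng) for _ in range(100_000)]
frac_one = sum(1 for k in draws if k == 1) / len(draws)
# P(K = 1) = E[exp(-W)] = rho / (rho + 1)
print(frac_one, rho / (rho + 1.0))
```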
The maximum likelihood estimator for the parameter ''ρ'' given the observations ''k''<sub>1</sub>, ''k''<sub>2</sub>, …, ''k''<sub>''N''</sub> is the solution to the fixed point equation
: \rho^{(t+1)} = \frac{N + a - 1}{b + \sum_{i=1}^N \sum_{j=1}^{k_i} \frac{1}{\rho^{(t)} + j}},
where ''b'' = 0 and ''a'' = 1 are the rate and shape parameters of the gamma distribution prior on ''ρ''.
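A minimal sketch of the fixed point iteration follows, assuming the update ''ρ'' ← (''N'' + ''a'' − 1) / (''b'' + Σ<sub>''i''</sub> Σ<sub>''j''=1..''k<sub>i</sub>''</sub> 1/(''ρ'' + ''j'')) with ''a'' = 1 and ''b'' = 0 for plain maximum likelihood. The starting value, tolerance, iteration cap and the data are made up for illustration.

```python
def fit_rho(ks, a=1.0, b=0.0, rho0=1.0, tol=1e-12, max_iter=10_000):
    """Iterate rho <- (N + a - 1) / (b + sum_i sum_{j=1..k_i} 1 / (rho + j))."""
    n = len(ks)
    rho = rho0
    for _ in range(max_iter):
        denom = b + sum(1.0 / (rho + j) for k in ks for j in range(1, k + 1))
        new_rho = (n + a - 1.0) / denom
        if abs(new_rho - rho) < tol:
            return new_rho
        rho = new_rho
    raise RuntimeError("fixed point iteration did not converge")

data = [1, 1, 2, 1, 3, 1, 1, 5, 2, 1]   # made-up observations, for illustration only
rho_hat = fit_rho(data)
print(rho_hat)
```

If every observation equals 1, the update becomes ''ρ'' ← ''ρ'' + 1 and the iteration grows without bound, so this sketch raises an error rather than returning a value in that case.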
This algorithm was derived by Garcia by directly optimizing the likelihood. Roberts and Roberts generalize the algorithm to Bayesian settings with the compound geometric formulation described above. They also use the expectation–maximisation (EM) framework to show convergence of the fixed point algorithm, derive the sub-linearity of its convergence rate, and use the EM formulation to give two alternate derivations of the standard error of the estimator from the fixed point equation. The variance of the estimator is
: \operatorname{Var}(\hat\rho) = \frac{1}{\frac{N}{\hat\rho^2} - \sum_{i=1}^N \sum_{j=1}^{k_i} \frac{1}{(\hat\rho + j)^2}},
and the standard error is the square root of the quantity of this estimate divided by ''N''.
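The variance expression can be evaluated directly once an estimate of ''ρ'' is available. The sketch below assumes the form 1 / (''N''/''ρ̂''² − Σ<sub>''i''</sub> Σ<sub>''j''=1..''k<sub>i</sub>''</sub> 1/(''ρ̂'' + ''j'')²), which is the inverse of the observed information of the log-likelihood. The data and the plugged-in estimate are made up for illustration.

```python
def rho_variance(rho_hat, ks):
    """Return 1 / (N / rho^2 - sum_i sum_{j=1..k_i} 1 / (rho + j)^2)."""
    n = len(ks)
    curvature = sum(1.0 / (rho_hat + j) ** 2 for k in ks for j in range(1, k + 1))
    return 1.0 / (n / rho_hat ** 2 - curvature)

data = [1, 1, 2, 1, 3, 1, 1, 5, 2, 1]   # made-up observations, for illustration only
var_rho = rho_variance(1.96, data)      # 1.96 is close to the fixed point for this data
print(var_rho)
```

With only ten observations the variance is large relative to the estimate itself, which is expected for such a small sample.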
Generalizations
The two-parameter generalization of the original Yule distribution replaces the beta function with an incomplete beta function. The probability mass function of the generalized Yule–Simon(''ρ'', ''α'') distribution is defined as
: f(k;\rho,\alpha) = \frac{\rho}{1-\alpha^\rho}\, \operatorname{B}_{1-\alpha}(k, \rho+1),
with 0 ≤ ''α'' < 1. For ''α'' = 0 the ordinary Yule–Simon(''ρ'') distribution is obtained as a special case. The use of the incomplete beta function has the effect of introducing an exponential cutoff in the upper tail.
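The generalized pmf can be sketched numerically as follows, assuming the form ''f''(''k''; ''ρ'', ''α'') = ''ρ'' / (1 − ''α''<sup>''ρ''</sup>) · B<sub>1−''α''</sub>(''k'', ''ρ'' + 1) with the unregularized incomplete beta function. To stay within the standard library, the incomplete beta integral is approximated here with composite Simpson quadrature. That is a convenience chosen for this example, and the function names and the number of subintervals are hypothetical.

```python
def incomplete_beta(x, a, b, n=2000):
    """Approximate B_x(a, b) = integral_0^x t^(a-1) (1-t)^(b-1) dt by Simpson's rule (n even)."""
    h = x / n
    def g(t):
        return t ** (a - 1) * (1.0 - t) ** (b - 1)
    total = g(0.0) + g(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0

def generalized_yule_simon_pmf(k, rho, alpha):
    """rho / (1 - alpha^rho) * B_{1-alpha}(k, rho + 1), for integer k >= 1 and 0 <= alpha < 1."""
    return rho / (1.0 - alpha ** rho) * incomplete_beta(1.0 - alpha, k, rho + 1.0)

# alpha = 0 recovers the ordinary Yule-Simon value rho / (rho + 1) at k = 1.
print(generalized_yule_simon_pmf(1, 2.0, 0.0))
# With alpha > 0 the probabilities still sum to one, and the tail decays much faster.
total = sum(generalized_yule_simon_pmf(k, 2.0, 0.2) for k in range(1, 201))
print(total)
```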
See also
* Zeta distribution
* Scale-free network
* Beta negative binomial distribution
Bibliography
* Colin Rose and Murray D. Smith, ''Mathematical Statistics with Mathematica''. New York: Springer, 2002. (''See page 107, where it is called the "Yule distribution".'')