In
probability theory
Probability theory is the branch of mathematics concerned with probability. Although there are several different probability interpretations, probability theory treats the concept in a rigorous mathematical manner by expressing it through a set o ...
, the Gillespie algorithm (or the Doob-Gillespie algorithm or ''Stochastic Simulation Algorithm'', the SSA) generates a statistically correct trajectory (possible solution) of a
stochastic
Stochastic (, ) refers to the property of being well described by a random probability distribution. Although stochasticity and randomness are distinct in that the former refers to a modeling approach and the latter refers to phenomena themselve ...
equation system for which the
reaction rates
The reaction rate or rate of reaction is the speed at which a chemical reaction takes place, defined as proportional to the increase in the concentration of a product per unit time and to the decrease in the concentration of a reactant per unit ...
are known. It was created by
Joseph L. Doob
Joseph Leo Doob (February 27, 1910 – June 7, 2004) was an American mathematician, specializing in analysis and probability theory.
The theory of martingales was developed by Doob.
Early life and education
Doob was born in Cincinnati, Ohio, ...
and others (circa 1945), presented by
Dan Gillespie in 1976, and popularized in 1977 in a paper where he uses it to simulate chemical or biochemical systems of reactions efficiently and accurately using limited computational power (see
stochastic simulation A stochastic simulation is a simulation of a system that has variables that can change stochastically (randomly) with individual probabilities.DLOUHÝ, M.; FÁBRY, J.; KUNCOVÁ, M.. Simulace pro ekonomy. Praha : VŠE, 2005.
Realizations of these ...
). As computers have become faster, the algorithm has been used to simulate increasingly complex systems. The algorithm is particularly useful for simulating reactions within cells, where the number of
reagents is low and keeping track of the position and behaviour of individual molecules is computationally feasible. Mathematically, it is a variant of a
dynamic Monte Carlo method In chemistry, dynamic Monte Carlo (DMC) is a Monte Carlo method for modeling the dynamic behaviors of molecules by comparing the rates of individual steps with random numbers. It is essentially the same as Kinetic Monte Carlo. Unlike the Metrop ...
and similar to the
kinetic Monte Carlo methods. It is used heavily in
computational systems biology.
History
The process that led to the algorithm recognizes several important steps. In 1931,
Andrei Kolmogorov
Andrey Nikolaevich Kolmogorov ( rus, Андре́й Никола́евич Колмого́ров, p=ɐnˈdrʲej nʲɪkɐˈlajɪvʲɪtɕ kəlmɐˈɡorəf, a=Ru-Andrey Nikolaevich Kolmogorov.ogg, 25 April 1903 – 20 October 1987) was a Sovi ...
introduced the differential equations corresponding to the time-evolution of stochastic processes that proceed by jumps, today known as
Kolmogorov equations (Markov jump process)
In mathematics and statistics, in the context of Markov processes, the Kolmogorov equations, including Kolmogorov forward equations and Kolmogorov backward equations, are a pair of systems of differential equations that describe the time evolu ...
(a simplified version is known as
master equation
In physics, chemistry and related fields, master equations are used to describe the time evolution of a system that can be modelled as being in a probabilistic combination of states at any given time and the switching between states is determine ...
in the natural sciences). It was
William Feller
William "Vilim" Feller (July 7, 1906 – January 14, 1970), born Vilibald Srećko Feller, was a Croatian- American mathematician specializing in probability theory.
Early life and education
Feller was born in Zagreb to Ida Oemichen-Perc, a Croa ...
, in 1940, who found the conditions under which the Kolmogorov equations admitted (proper) probabilities as solutions. In his Theorem I (1940 work) he establishes that the time-to-the-next-jump was exponentially distributed and the probability of the next event is proportional to the rate. As such, he established the relation of Kolmogorov's equations with
stochastic processes.
Later, Doob (1942, 1945) extended Feller's solutions beyond the case of pure-jump processes. The method was implemented in computers by
David George Kendall
David George Kendall FRS (15 January 1918 – 23 October 2007) was an English statistician and mathematician, known for his work on probability, statistical shape analysis, ley lines and queueing theory. He spent most of his academic lif ...
(1950) using the
Manchester Mark 1
The Manchester Mark 1 was one of the earliest stored-program computers, developed at the Victoria University of Manchester, England from the Manchester Baby (operational in June 1948). Work began in August 1948, and the first version was operat ...
computer and later used by
Maurice S. Bartlett (1953) in his studies of epidemics outbreaks. Gillespie (1977) obtains the algorithm in a different manner by making use of a physical argument.
Idea behind the algorithm
Traditional continuous and deterministic biochemical
rate equation
In chemistry, the rate law or rate equation for a reaction is an equation that links the initial or forward reaction rate with the concentrations or pressures of the reactants and constant parameters (normally rate coefficients and partial rea ...
s do not accurately predict cellular reactions since they rely on bulk reactions that require the interactions of millions of molecules. They are typically modeled as a set of coupled ordinary differential equations. In contrast, the Gillespie algorithm allows a discrete and stochastic simulation of a system with few reactants because every reaction is explicitly simulated. A trajectory corresponding to a single Gillespie simulation represents an exact sample from the probability mass function that is the solution of the
master equation
In physics, chemistry and related fields, master equations are used to describe the time evolution of a system that can be modelled as being in a probabilistic combination of states at any given time and the switching between states is determine ...
.
The physical basis of the algorithm is the collision of molecules within a reaction vessel. It is assumed that collisions are frequent, but collisions with the proper orientation and energy are infrequent. Therefore, all reactions within the Gillespie framework must involve at most two molecules. Reactions involving three molecules are assumed to be extremely rare and are modeled as a sequence of binary reactions. It is also assumed that the reaction environment is well mixed.
Algorithm
A recent review (Gillespie, 2007) outlines three different, but equivalent formulations; the direct, first-reaction, and first-family methods, whereby the former two are special cases of the latter. The formulation of the direct and first-reaction methods is centered on performing the usual Monte-Carlo inversion steps on the so-called "fundamental premise of stochastic chemical kinetics", which mathematically is the function
:
,
where each of the
terms are propensity functions of an elementary reaction, whose argument is
, the vector of species counts. The
parameter is the time to the next reaction (or sojourn time), and
is the current time. To paraphrase Gillespie, this expression is read as "the probability, given
, that the system's next reaction will occur in the infinitesimal time interval
Examples
Reversible binding of A and B to form AB dimers
A simple example may help to explain how the Gillespie algorithm works. Consider a system of molecules of two types, and . In this system, and reversibly bind together to form dimers such that two reactions are possible: either A and B react reversibly to form an dimer, or an dimer dissociates into and . The reaction rate constant for a given single A molecule reacting with a given single molecule is
k_\mathrm, and the reaction rate for an dimer breaking up is
k_\mathrm.
If at time ''t'' there is one molecule of each type then the rate of dimer formation is
k_\mathrm, while if there are
n_\mathrm molecules of type and
n_\mathrm molecules of type , the rate of dimer formation is
k_\mathrmn_\mathrmn_\mathrm. If there are
n_\mathrm dimers then the rate of dimer dissociation is
k_\mathrmn_\mathrm.
The total reaction rate,
R_\mathrm, at time ''t'' is then given by
R_\mathrm=k_\mathrmn_\mathrmn_\mathrm+k_\mathrmn_\mathrm
So, we have now described a simple model with two reactions. This definition is independent of the Gillespie algorithm. We will now describe how to apply the Gillespie algorithm to this system.
In the algorithm, we advance forward in time in two steps: calculating the time to the next reaction, and determining which of the possible reactions the next reaction is. Reactions are assumed to be completely random, so if the reaction rate at a time ''t'' is
R_\mathrm, then the time, δ''t'', until the next reaction occurs is a random number drawn from exponential distribution function with mean
1/R_\mathrm. Thus, we advance time from ''t'' to ''t'' + δ''t''.

The probability that this reaction is an molecule binding to a molecule is simply the fraction of total rate due to this type of reaction, i.e.,
the probability that reaction is
P(\ce) = k_Dn_An_B/R_\ce
The probability that the next reaction is an dimer dissociating is just 1 minus that. So with these two probabilities we either form a dimer by reducing
n_\mathrm and
n_\mathrm by one, and increase
n_\mathrm by one, or we dissociate a dimer and increase
n_\mathrm and
n_\mathrm by one and decrease
n_\mathrm by one.
Now we have both advanced time to ''t'' + δ''t'', and performed a single reaction. The Gillespie algorithm just repeats these two steps as many times as needed to simulate the system for however long we want (i.e., for as many reactions). The result of a Gillespie simulation that starts with
n_\mathrm=n_\mathrm=10 and
n_\mathrm=0 at ''t''=0, and where
k_\mathrm=2 and
k_\mathrm=1, is shown at the right. For these parameter values, on average there are 8
n_\mathrm dimers and 2 of and but due to the small numbers of molecules fluctuations around these values are large. The Gillespie algorithm is often used to study systems where these fluctuations are important.
That was just a simple example, with two reactions. More complex systems with more reactions are handled in the same way. All reaction rates must be calculated at each time step, and one chosen with probability equal to its fractional contribution to the rate. Time is then advanced as in this example.
Further reading
*
*
*
*
*
*
*
*
*
*
*
*
*
* (Slepoy Thompson Plimpton 2008):
* (Bratsun et al. 2005):
* (Barrio et al. 2006):
* (Cai 2007):
* (Barnes Chu 2010):
* (Ramaswamy González-Segredo Sbalzarini 2009):
* (Ramaswamy Sbalzarini 2010):
* (Indurkhya Beal 2010):
* (Ramaswamy Sbalzarini 2011):
* (Yates Klingbeil 2013):
* {{cite journal , author=Gillespie, Daniel T. , title=Stochastic Simulation of Chemical Kinetics , journal=Annual Review of Physical Chemistry , volume=58 , pages=35–55 , year=2007 , doi=10.1146/annurev.physchem.58.032806.104637 , pmid=17037977 , bibcode=2007ARPC...58...35G
Chemical kinetics
Computational chemistry
Monte Carlo methods
Stochastic simulation