Branching process
   HOME

TheInfoList



OR:

In
probability theory Probability theory is the branch of mathematics concerned with probability. Although there are several different probability interpretations, probability theory treats the concept in a rigorous mathematical manner by expressing it through a set o ...
, a branching process is a type of mathematical object known as a
stochastic process In probability theory and related fields, a stochastic () or random process is a mathematical object usually defined as a family of random variables. Stochastic processes are widely used as mathematical models of systems and phenomena that appea ...
, which consists of collections of
random variables A random variable (also called random quantity, aleatory variable, or stochastic variable) is a mathematical formalization of a quantity or object which depends on random events. It is a mapping or a function from possible outcomes (e.g., the po ...
. The random variables of a stochastic process are indexed by the natural numbers. The original purpose of branching processes was to serve as a mathematical model of a population in which each individual in generation n produces some random number of individuals in generation n+1, according, in the simplest case, to a fixed
probability distribution In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon i ...
that does not vary from individual to individual. Branching processes are used to model reproduction; for example, the individuals might correspond to bacteria, each of which generates 0, 1, or 2 offspring with some probability in a single time unit. Branching processes can also be used to model other systems with similar dynamics, e.g., the spread of
surname In some cultures, a surname, family name, or last name is the portion of one's personal name that indicates one's family, tribe or community. Practices vary by culture. The family name may be placed at either the start of a person's full name ...
s in
genealogy Genealogy () is the study of families, family history, and the tracing of their lineages. Genealogists use oral interviews, historical records, genetic analysis, and other records to obtain information about a family and to demonstrate kins ...
or the propagation of neutrons in a
nuclear reactor A nuclear reactor is a device used to initiate and control a fission nuclear chain reaction or nuclear fusion reactions. Nuclear reactors are used at nuclear power plants for electricity generation and in nuclear marine propulsion. Heat from nu ...
. A central question in the theory of branching processes is the probability of ultimate extinction, where no individuals exist after some finite number of generations. Using Wald's equation, it can be shown that starting with one individual in generation zero, the expected size of generation ''n'' equals ''μ''''n'' where ''μ'' is the expected number of children of each individual. If ''μ'' < 1, then the expected number of individuals goes rapidly to zero, which implies ultimate extinction with probability 1 by
Markov's inequality In probability theory, Markov's inequality gives an upper bound for the probability that a non-negative function (mathematics), function of a random variable is greater than or equal to some positive Constant (mathematics), constant. It is named a ...
. Alternatively, if ''μ'' > 1, then the probability of ultimate extinction is less than 1 (but not necessarily zero; consider a process where each individual either has 0 or 100 children with equal probability. In that case, ''μ'' = 50, but probability of ultimate extinction is greater than 0.5, since that's the probability that the first individual has 0 children). If ''μ'' = 1, then ultimate extinction occurs with probability 1 unless each individual always has exactly one child. In
theoretical ecology Theoretical ecology is the scientific discipline devoted to the study of ecological systems using theoretical methods such as simple conceptual models, mathematical models, computational simulations, and advanced data analysis. Effective models im ...
, the parameter ''μ'' of a branching process is called the
basic reproductive rate In epidemiology, the basic reproduction number, or basic reproductive number (sometimes called basic reproduction ratio or basic reproductive rate), denoted R_0 (pronounced ''R nought'' or ''R zero''), of an infection is the expected number of ...
.


Mathematical formulation

The most common formulation of a branching process is that of the
Galton–Watson process The Galton–Watson process is a branching stochastic process arising from Francis Galton's statistical investigation of the extinction of family names. The process models family names as patrilineal (passed from father to son), while offspr ...
. Let ''Z''''n'' denote the state in period ''n'' (often interpreted as the size of generation ''n''), and let ''X''''n,i'' be a random variable denoting the number of direct successors of member ''i'' in period ''n'', where ''X''''n,i'' are
independent and identically distributed random variables In probability theory and statistics, a collection of random variables is independent and identically distributed if each random variable has the same probability distribution as the others and all are mutually independent. This property is us ...
over all ''n'' ∈ and ''i'' ∈ . Then the recurrence equation is :Z_ = \sum_^ X_ with ''Z''0 = 1. Alternatively, the branching process can be formulated as a
random walk In mathematics, a random walk is a random process that describes a path that consists of a succession of random steps on some mathematical space. An elementary example of a random walk is the random walk on the integer number line \mathbb Z ...
. Let ''S''''i'' denote the state in period ''i'', and let ''X''''i'' be a random variable that is
iid In probability theory and statistics, a collection of random variables is independent and identically distributed if each random variable has the same probability distribution as the others and all are mutually independent. This property is us ...
over all ''i''. Then the recurrence equation is :S_ = S_i+X_-1 = \sum_^ X_j-i with ''S''0 = 1. To gain some intuition for this formulation, imagine a walk where the goal is to visit every node, but every time a previously unvisited node is visited, additional nodes are revealed that must also be visited. Let ''S''''i'' represent the number of revealed but unvisited nodes in period ''i'', and let ''X''''i'' represent the number of new nodes that are revealed when node ''i'' is visited. Then in each period, the number of revealed but unvisited nodes equals the number of such nodes in the previous period, plus the new nodes that are revealed when visiting a node, minus the node that is visited. The process ends once all revealed nodes have been visited.


Continuous-time branching processes

For discrete-time branching processes, the "branching time" is fixed to be ''1'' for all individuals. For continuous-time branching processes, each individual waits for a random time (which is a continuous random variable), and then divides according to the given distribution. The waiting time for different individuals are independent, and are independent with the number of children. In general, the waiting time is an exponential variable with parameter ''λ'' for all individuals, so that the process is Markovian.


Extinction problem for a Galton Watson process

The ultimate extinction probability is given by :\lim_ \Pr(Z_n=0). For any nontrivial cases (trivial cases are ones in which the probability of having no offspring is zero for every member of the population - in such cases the probability of ultimate extinction is 0), the probability of ultimate extinction equals one if ''μ'' ≤ 1 and strictly less than one if ''μ'' > 1. The process can be analyzed using the method of
probability generating function In probability theory, the probability generating function of a discrete random variable is a power series representation (the generating function) of the probability mass function of the random variable. Probability generating functions are often ...
. Let ''p''0, ''p''1, ''p''2, ... be the probabilities of producing 0, 1, 2, ... offspring by each individual in each generation. Let ''d''''m'' be the extinction probability by the ''m''th generation. Obviously, ''d''0 = 0. Since the probabilities for all paths that lead to 0 by the ''m''th generation must be added up, the extinction probability is nondecreasing in generations. That is, :0=d_0 \leq d_1\leq d_2 \leq \cdots \leq 1. Therefore, ''d''''m'' converges to a limit ''d'', and ''d'' is the ultimate extinction probability. If there are ''j'' offspring in the first generation, then to die out by the mth generation, each of these lines must die out in ''m'' − 1 generations. Since they proceed independently, the probability is (''d''''m''−1) ''j''. Thus, :d_m=p_0+p_1d_+p_2(d_)^2+p_3(d_)^3+\cdots. \, The right-hand side of the equation is a probability generating function. Let ''h''(''z'') be the ordinary generating function for ''p''''i'': :h(z)=p_0+p_1z+p_2z^2+\cdots. \, Using the generating function, the previous equation becomes :d_m=h(d_). \, Since ''d''''m'' → ''d'', ''d'' can be found by solving :d=h(d). \, This is also equivalent to finding the intersection point(s) of lines ''y'' = ''z'' and ''y'' = ''h''(''z'') for ''z'' ≥ 0. ''y'' = ''z'' is a straight line. ''y'' = ''h''(''z'') is an increasing (since h'(z) = p_1 + 2 p_2 z + 3 p_3 z^2 + \cdots \geq 0) and convex (since h''(z) = 2 p_2 + 6 p_3 z + 12 p_4 z^2 + \cdots \geq 0) function. There are at most two intersection points. Since (1,1) is always an intersect point for the two functions, there only exist three cases: Case 1 has another intersect point at ''z'' < 1 (see the red curve in the graph). Case 2 has only one intersect point at ''z'' = 1.(See the green curve in the graph) Case 3 has another intersect point at ''z'' > 1.(See the black curve in the graph) In case 1, the ultimate extinction probability is strictly less than one. For case 2 and 3, the ultimate extinction probability equals to one. By observing that ''h′''(1) = ''p''1 + 2''p''2 + 3''p''3 + ... = ''μ'' is exactly the expected number of offspring a parent could produce, it can be concluded that for a branching process with generating function ''h''(''z'') for the number of offspring of a given parent, if the mean number of offspring produced by a single parent is less than or equal to one, then the ultimate extinction probability is one. If the mean number of offspring produced by a single parent is greater than one, then the ultimate extinction probability is strictly less than one.


Size dependent branching processes

Along with discussion of a more general model of branching processes known as age-dependent branching processes by Grimmett, in which individuals live for more than one generation, Krishna Athreya has identified three distinctions between size-dependent branching processes which have general application. Athreya identifies the three classes of size-dependent branching processes as sub-critical, stable, and super-critical branching measures. For Athreya, the central parameters are crucial to control if sub-critical and super-critical unstable branching is to be avoided. Size dependent branching processes are also discussed under the topic of
resource-dependent branching process A branching process (BP) (see e.g. Jagers (1975)) is a mathematical model to describe the development of a population. Here population is meant in a general sense, including a human population, animal populations, bacteria and others which reprod ...


Example of extinction problem

Consider a parent can produce at most two offspring. The extinction probability in each generation is: :d_m=p_0+p_1d_+p_2(d_)^2. \, with ''d''0 = 0. For the ultimate extinction probability, we need to find ''d'' which satisfies ''d'' = ''p''0 + ''p''1d + ''p''2''d''2. Taking as example probabilities for the numbers of offspring produced ''p''0 = 0.1, ''p''1 = 0.6, and ''p''2 = 0.3, the extinction probability for the first 20 generations is as follows: In this example, we can solve algebraically that ''d'' = 1/3, and this is the value to which the extinction probability converges with increasing generations.


Simulating branching processes

Branching processes can be simulated for a range of problems. One specific use of simulated branching process is in the field of evolutionary biology. Phylogenetic trees, for example, can be simulated under several models, helping to develop and validate estimation methods as well as supporting hypothesis testing.


Multitype branching processes

In multitype branching processes, individuals are not identical, but can be classified into ''n'' types. After each time step, an individual of type ''i'' will produce individuals of different types, and \mathbf_i, a random vector representing the numbers of children in different types, satisfies a probability distribution on \mathbb^n. For example, consider the population of cancer stem cells (CSCs) and non-stem cancer cells (NSCCs). After each time interval, each CSC has probability p_1 to produce two CSCs (symmetric division), probability p_2 to produce one CSC and one NSCC (asymmetric division), probability p_3 to produce one CSC (stagnation), and probability 1-p_1-p_2-p_3 to produce nothing (death); each NSCC has probability p_4 to produce two NSCCs (symmetric division), probability p_5 to produce one NSCC (stagnation), and probability 1-p_4-p_5 to produce nothing (death).


Law of large numbers for multitype branching processes

For multitype branching processes that the populations of different types grow exponentially, the proportions of different types converge almost surely to a constant vector under some mild conditions. This is the strong law of large numbers for multitype branching processes. For continuous-time cases, proportions of the population expectation satisfy an
ODE An ode (from grc, ᾠδή, ōdḗ) is a type of lyric poetry. Odes are elaborately structured poems praising or glorifying an event or individual, describing nature intellectually as well as emotionally. A classic ode is structured in three majo ...
system, which has a unique attracting fixed point. This fixed point is just the vector that the proportions converge to in the law of large numbers. The monograph by Athreya and Ney summarizes a common set of conditions under which this law of large numbers is valid. Later there are some improvements through discarding different conditions.


Other branching processes

There are many other branching processes, for example, branching processes in random environments, in which the reproduction law is chosen randomly at each generation, or branching processes, where the growth of the population is controlled by external influences or interacting processes. Branching processes where particles have to work (contribute resources to the environment) in order to be able to reproduce, and live in a changing society structure controlling the distribution of resources, are so-called resource-dependent branching processes. The scaling limit of near-critical branching processes can be used to obtain superprocesses.


See also

*
Galton–Watson process The Galton–Watson process is a branching stochastic process arising from Francis Galton's statistical investigation of the extinction of family names. The process models family names as patrilineal (passed from father to son), while offspr ...
*
Random tree In mathematics and computer science, a random tree is a tree or arborescence that is formed by a stochastic process. Types of random trees include: *Uniform spanning tree, a spanning tree of a given graph in which each different tree is equally ...
*
Branching random walk In probability theory, a branching random walk is a stochastic process that generalizes both the concept of a random walk and of a branching process. At every generation (a point of discrete time), a branching random walk's value is a set of ele ...
*
Resource-dependent branching process A branching process (BP) (see e.g. Jagers (1975)) is a mathematical model to describe the development of a population. Here population is meant in a general sense, including a human population, animal populations, bacteria and others which reprod ...
*
Bruss–Duerinckx theorem The theorem of the envelopment of societies for resource-dependent populations, also called the Bruss–Duerinckx theorem, is a mathematical result on the behavior of populations which choose their society form according to only two hypotheses, name ...
*
Martingale (probability theory) In probability theory, a martingale is a sequence of random variables (i.e., a stochastic process) for which, at a particular time, the conditional expectation of the next value in the sequence is equal to the present value, regardless of all ...
* Superprocess


References

* C. M. Grinstead and J. L. Snell
Introduction to Probability
, 2nd ed. Section 10.3 discusses branching processes in detail together with the application of generating functions to study them. * G. R. Grimmett and D. R. Stirzaker, ''Probability and Random Processes'', 2nd ed., Clarendon Press, Oxford, 1992. Section 5.4 discusses the model of branching processes described above. Section 5.5 discusses a more general model of branching processes known as age-dependent branching processes, in which individuals live for more than one generation. {{DEFAULTSORT:Branching Process Markov processes