
The Galton–Watson process, also called the Bienaymé–Galton–Watson process or the Galton–Watson branching process, is a branching stochastic process arising from Francis Galton's statistical investigation of the extinction of family names. The process models family names as patrilineal (passed from father to son), while offspring are randomly either male or female, and names become extinct if the family name line dies out (holders of the family name die without male descendants). Galton's investigation of this process laid the groundwork for the study of branching processes as a subfield of probability theory, and along with these subsequent processes the Galton–Watson process has found numerous applications across population genetics, computer science, and other fields.


History

There was concern amongst the Victorians that aristocratic surnames were becoming extinct. In 1869, Galton published ''Hereditary Genius'', in which he treated the extinction of different social groups. Galton originally posed a mathematical question regarding the distribution of surnames in an idealized population in an 1873 issue of ''The Educational Times'':
A large nation, of whom we will only concern ourselves with the adult males, $N$ in number, and who each bear separate surnames colonise a district. Their law of population is such that, in each generation, $a_0$ per cent. of the adult males have no male children who reach adult life; $a_1$ have one such male child; $a_2$ have two; and so on up to $a_5$ who have five. Find (1) what proportion of their surnames will have become extinct after $r$ generations; and (2) how many instances there will be of the surname being held by $m$ persons.
The Reverend Henry William Watson replied with a solution. Together, they then wrote an 1874 paper titled "On the probability of the extinction of families" in the ''Journal of the Anthropological Institute of Great Britain and Ireland'' (now the ''Journal of the Royal Anthropological Institute''). Galton and Watson appear to have derived their process independently of the earlier work by I. J. Bienaymé. Their solution was incomplete: according to it, ''all'' family names go extinct with probability 1.

Bienaymé had previously published the answer to the problem in 1845, with a promise to publish the derivation later; however, there is no known publication of his solution (although Bru (1991) purports to reconstruct the proof). He was inspired by Émile Littré and Louis-François Benoiston de Châteauneuf (a friend of Bienaymé).

Cournot published a solution in 1847, in Chapter V, §36 of ''De l'origine et des limites de la correspondance entre l'algèbre et la géométrie''. The problem in his formulation is the following: consider a gambler who buys lottery tickets. Each ticket costs 1 écu and pays $0, 1, \dots, n$ écus with probabilities $p_0, p_1, \dots, p_n$, respectively. The gambler starts with 1 écu before round 1, and at each round of gambling spends all their money to buy tickets. Let $a_k$ denote the probability that the gambler goes bankrupt before the $k$-th round of gambling. What is the limit of $a_k$?

Ronald A. Fisher in 1922 studied the same problem formulated in terms of genetics. Instead of the extinction of family names, he studied the probability for a mutant gene to eventually disappear in a large population. Haldane solved the problem in 1927.

Agner Krarup Erlang was a member of the prominent Krarup family, which was going extinct. In 1929, the same problem was published posthumously under his name (his obituary appears beside the problem). Erlang died childless. Steffensen solved it in 1930.

For a detailed history, see Kendall (1966 and 1975).


Concepts

Assume, for the sake of the model, that surnames are passed on to all male children by their father. Suppose the number of a man's sons to be a random variable distributed on the set {0, 1, 2, 3, ...}. Further suppose the numbers of different men's sons to be independent random variables, all having the same distribution. Then the simplest substantial mathematical conclusion is that if the average number of a man's sons is 1 or less, then their surname will almost surely die out, and if it is more than 1, then there is a positive probability that it will survive for any given number of generations. A corollary of high extinction probabilities is that if a lineage ''has'' survived, it is likely to have experienced, purely by chance, an unusually high growth rate in its early generations, at least when compared to the rest of the population.
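
As a small worked illustration of this dichotomy, suppose (purely as an assumed example, not Galton's data) that a man has 0, 1 or 2 sons with probabilities 1/4, 1/4 and 1/2. Write $\mu$ for the average number of sons and $q$ for the probability that the surname eventually dies out. A surname dies out exactly when the line of every son dies out, and the sons' lines are independent, so conditioning on the number of sons gives an equation for $q$. The extinction probability is the smallest solution in $[0, 1]$:

```latex
\begin{align*}
\mu &= 0\cdot\tfrac14 + 1\cdot\tfrac14 + 2\cdot\tfrac12 = \tfrac54 > 1, \\
q &= \tfrac14 + \tfrac14\,q + \tfrac12\,q^2
  \;\Longleftrightarrow\; 2q^2 - 3q + 1 = 0
  \;\Longleftrightarrow\; (2q - 1)(q - 1) = 0, \\
q &= \tfrac12 \quad\text{(the smaller of the two roots } \tfrac12 \text{ and } 1\text{)}.
\end{align*}
```

Under these assumed probabilities the surname therefore survives forever with probability 1/2, in line with the statement that an average above 1 leaves a positive chance of survival.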


Mathematical definition

A Galton–Watson process is a stochastic process $\{X_n\}$ which evolves according to the recurrence formula $X_0 = 1$ and
:X_{n+1} = \sum_{j=1}^{X_n} \xi_j^{(n)}
where $\{\xi_j^{(n)} : n, j \in \mathbb{N}\}$ is a set of independent and identically-distributed natural number-valued random variables. In the analogy with family names, $X_n$ can be thought of as the number of descendants (along the male line) in the $n$th generation, and $\xi_j^{(n)}$ can be thought of as the number of (male) children of the $j$th of these descendants. The recurrence relation states that the number of descendants in the $(n+1)$th generation is the sum, over all $n$th generation descendants, of the number of children of that descendant. The extinction probability (i.e. the probability of final extinction) is given by
:\lim_{n \to \infty} \Pr(X_n = 0).
This is clearly equal to zero if each member of the population has exactly one descendant. Excluding this case (usually called the trivial case) there exists a simple necessary and sufficient condition, which is given in the next section.
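
The recurrence can be simulated directly. The following is a minimal sketch rather than a reference implementation: the function names (`next_generation`, `extinct_by`, `estimate_extinction`), the finite offspring distributions and the trial count are all assumptions made for illustration. It draws one offspring count per individual, sums them to obtain the next generation, and estimates $\Pr(X_n = 0)$ by Monte Carlo.

```python
import random


def next_generation(x_n, offspring_probs, rng):
    """Apply X_{n+1} = sum_{j=1}^{X_n} xi_j^{(n)}.

    offspring_probs[k] is the assumed probability of having k children.
    """
    counts = range(len(offspring_probs))
    # One independent draw per individual of the current generation.
    return sum(rng.choices(counts, weights=offspring_probs, k=x_n))


def extinct_by(n, offspring_probs, rng):
    """Return True if a line started by one individual has X_n = 0."""
    x = 1  # X_0 = 1
    for _ in range(n):
        x = next_generation(x, offspring_probs, rng)
        if x == 0:
            return True  # zero is absorbing: the line stays extinct
    return False


def estimate_extinction(n, offspring_probs, trials=2000, seed=0):
    """Monte Carlo estimate of Pr(X_n = 0)."""
    rng = random.Random(seed)
    hits = sum(extinct_by(n, offspring_probs, rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    # Mean 0.75 children per individual: extinction is almost certain.
    print(estimate_extinction(20, [0.5, 0.25, 0.25]))
    # Mean 1.25 children per individual: a sizeable share of lines survive.
    print(estimate_extinction(20, [0.25, 0.25, 0.5]))
```

For the second distribution the estimate should settle near 0.5, which is the smaller root of $q = \tfrac14 + \tfrac14 q + \tfrac12 q^2$; the exact figure printed depends on the seed and the number of trials.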


Extinction criterion for Galton–Watson process

In the non-trivial case, the probability of final extinction is equal to 1 if $E[\xi_1] \le 1$ and strictly less than 1 if $E[\xi_1] > 1$. The process can be treated analytically using the method of probability generating functions. If the number of children $\xi_j$ at each node follows a Poisson distribution with parameter $\lambda$, a particularly simple recurrence can be found for the total extinction probability $x_n$ (the probability of extinction by generation $n$) for a process starting with a single individual at time $n = 0$:
:x_{n+1} = e^{\lambda (x_n - 1)},
with $x_0 = 0$, giving the above curves.
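
The Poisson recurrence is easy to iterate numerically. The sketch below is illustrative only; the function names, the tolerance and the iteration cap are assumptions, not part of the source. It starts from $x_0 = 0$ and applies $x_{n+1} = e^{\lambda (x_n - 1)}$ repeatedly, so the sequence increases towards the probability of final extinction.

```python
import math


def poisson_extinction_by_generation(lam, n):
    """Return [x_0, ..., x_n] with x_0 = 0 and x_{k+1} = exp(lam * (x_k - 1))."""
    xs = [0.0]
    for _ in range(n):
        xs.append(math.exp(lam * (xs[-1] - 1.0)))
    return xs


def poisson_final_extinction(lam, tol=1e-12, max_iter=100_000):
    """Approximate lim x_n, the smallest root of x = exp(lam * (x - 1)) in [0, 1]."""
    x = 0.0
    for _ in range(max_iter):
        x_next = math.exp(lam * (x - 1.0))
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x  # not converged within max_iter (convergence is slow near lam = 1)


if __name__ == "__main__":
    for lam in (0.5, 1.0, 2.0):
        print(lam, poisson_final_extinction(lam))
```

For $\lambda \le 1$ the iterates approach 1, and for $\lambda = 2$ they approach roughly 0.203, matching the criterion stated above. At $\lambda = 1$ the convergence is only algebraic, so the loop may stop at the iteration cap with a value slightly below 1.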


Bisexual Galton–Watson process

In the classical family surname Galton–Watson process described above, only men need to be considered, since only males transmit their family name to descendants. This effectively means that reproduction can be modeled as asexual. (Likewise, if mitochondrial transmission is analyzed, only women need to be considered, since only females transmit their mitochondria to descendants.) A model more closely following actual sexual reproduction is the so-called "bisexual Galton–Watson process", where only couples reproduce. (''Bisexual'' in this context refers to the number of sexes involved, not sexual orientation.) In this process, each child is assumed to be male or female, independently of the others, with a specified probability, and a so-called "mating function" determines how many couples will form in a given generation. As before, reproduction of different couples is considered to be independent of each other. Now the analogue of the trivial case corresponds to the case of each male and female reproducing in exactly one couple, having one male and one female descendant, and the mating function taking the value of the minimum of the number of males and females (which are then the same from the next generation onwards). Since the total reproduction within a generation now depends strongly on the mating function, there exists in general no simple necessary and sufficient condition for final extinction as there is in the classical Galton–Watson process. However, excluding the trivial case, the concept of the averaged reproduction mean (Bruss (1984)) allows for a general sufficient condition for final extinction, treated in the next section.
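
One generation of such a process might be sketched as follows. This is a hedged illustration, not Bruss's formulation: the function names, the finite offspring distribution per couple, the default male probability of 0.5 and the choice of mating function are all assumptions. The mating function used here is the minimum of the number of females and males, the one mentioned above in connection with the trivial case; any other function of the two counts could be passed in instead.

```python
import random


def monogamous_mating(females, males):
    """Example mating function: the number of couples is min(females, males)."""
    return min(females, males)


def bisexual_step(couples, offspring_probs, p_male, mating, rng):
    """One generation: couples reproduce independently, children are sexed, then mate.

    offspring_probs[k] is the assumed probability that a couple has k children.
    """
    counts = range(len(offspring_probs))
    children = sum(rng.choices(counts, weights=offspring_probs, k=couples))
    # Each child is male with probability p_male, independently of the others.
    males = sum(rng.random() < p_male for _ in range(children))
    females = children - males
    return mating(females, males)


def simulate_couples(n, offspring_probs, p_male=0.5, mating=monogamous_mating,
                     start_couples=1, seed=0):
    """Return the number of couples in generations 0..n."""
    rng = random.Random(seed)
    history = [start_couples]
    for _ in range(n):
        history.append(
            bisexual_step(history[-1], offspring_probs, p_male, mating, rng)
        )
    return history


if __name__ == "__main__":
    # Assumed example: each couple has 0 to 4 children with equal probability.
    print(simulate_couples(15, [0.2, 0.2, 0.2, 0.2, 0.2], start_couples=50))
```

Swapping in a different mating function changes the extinction behaviour without touching the rest of the code, which reflects why no single simple criterion covers all bisexual processes.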


Extinction criterion

If in the non-trivial case the averaged reproduction mean per couple stays bounded over all generations and will not exceed 1 for a sufficiently large population size, then the probability of final extinction is always 1.


Applicability to family name extinction

Citing historical examples of the Galton–Watson process is complicated because the history of family names often deviates significantly from the theoretical model. Notably, new names can be created, existing names can be changed over a person's lifetime, and people historically have often assumed the names of unrelated persons, particularly nobility. Thus, a small number of family names at present is not in itself evidence that names have become extinct over time, or that they did so because family name lines died out. That would require both that there were more names in the past ''and'' that they died out because the line died out, rather than because the name changed for other reasons, such as vassals assuming the name of their lord.

Chinese names are a well-studied example of surname extinction: there are currently only about 3,100 surnames in use in China, compared with close to 12,000 recorded in the past, with 22% of the population sharing the names Li, Wang and Zhang (numbering close to 300 million people), and the top 200 names (6½%) covering 96% of the population. Names have changed or become extinct for various reasons, such as people taking the names of their rulers, orthographic simplifications, and taboos against using characters from an emperor's name, among others. While family name lines dying out may be a factor in the surname extinction, it is by no means the only or even a significant factor. Indeed, the most significant factor affecting the surname frequency is other ethnic groups identifying as Han and adopting Han names. Further, while new names have arisen for various reasons, this has been outweighed by old names disappearing.

By contrast, some nations have adopted family names only recently. This means both that they have not experienced surname extinction for an extended period, and that the names were adopted when the nation had a relatively large population, rather than the smaller populations of ancient times. Further, these names have often been chosen creatively and are very diverse. Examples include:
* Japanese names, which in general use date only to the Meiji restoration in the late 19th century (when the population was over 30,000,000). There are over 100,000 family names, surnames are very varied, and the government restricts married couples to using the same surname.
* Many Dutch names have included a formal family name only since the Napoleonic Wars in the early 19th century. Earlier, surnames originated from patronyms (e.g., Jansen = John's son), personal qualities (e.g., de Rijke = the rich one), geographical locations (e.g., van Rotterdam), and occupations (e.g., Visser = the fisherman), sometimes even combined (e.g., Jan Jansz van Rotterdam). There are over 68,000 Dutch family names.
* Thai names have included a family name only since 1920, and only a single family can use a given family name; hence there are a great number of Thai names. Furthermore, Thai people change their family names with some frequency, complicating the analysis.

On the other hand, some examples of high concentration of family names are not primarily due to the Galton–Watson process:
* Vietnamese names have about 100 family names, with 60% of the population sharing three family names. The name Nguyễn alone is estimated to be used by almost 40% of the Vietnamese population, and 90% share 15 names. However, as the history of the Nguyễn name makes clear, this is in no small part due to names being forced on people or adopted for reasons unrelated to genetic relation.


Other applications

Modern applications include the survival probabilities for a new mutant gene, the initiation of a nuclear chain reaction, the dynamics of disease outbreaks in their first generations of spread, and the chances of extinction of small populations of organisms.


Nuclear fission

In the late 1930s, Leo Szilard independently reinvented Galton–Watson processes to describe the behavior of free neutrons during nuclear fission. This work involved generalizing formulas for extinction probabilities, which became essential for calculating the critical mass required for a continuous chain reaction with fissionable materials.


Genetics

The Galton–Watson model is an accurate description of Y chromosome transmission in genetics, and the model is thus useful for understanding human Y-chromosome DNA haplogroups. Likewise, since mitochondria are inherited only on the maternal line, the same mathematical formulation describes transmission of mitochondria. It explains (perhaps closest to Galton's original interest) why only a handful of males in the deep past of humanity now have ''any'' surviving male-line descendants, reflected in a rather small number of distinctive human Y-chromosome DNA haplogroups.


See also

* Branching process
* Resource-dependent branching process
* Pedigree collapse


Further reading

* F. Thomas Bruss (1984). "A Note on Extinction Criteria for Bisexual Galton–Watson Processes". ''Journal of Applied Probability'' 21: 915–919.
* C. C. Heyde and E. Seneta (1977). ''I.J. Bienayme: Statistical Theory Anticipated''. Berlin, Germany.


External links


"Survival of a Single Mutant" by Peter M. Lee of the University of York

The simple Galton–Watson process: Classical approach, University of Muenster