The infinite sites model (ISM) is a mathematical model of molecular evolution first proposed by Motoo Kimura in 1969.
Like other mutation models, the ISM provides a basis for understanding how mutation develops new alleles in DNA sequences. Using allele frequencies, it allows for the calculation of heterozygosity, or genetic diversity, in a finite population and for the estimation of genetic distances between populations of interest.
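As a small illustration of computing heterozygosity from allele frequencies, the sketch below uses the common definition of expected heterozygosity at a locus, one minus the sum of squared allele frequencies. The function name and the example frequencies are hypothetical and are not taken from the model's original description.
```python
def expected_heterozygosity(allele_freqs):
    """Return 1 - sum(p_i^2) for allele frequencies p_i at one locus.

    Assumes the frequencies describe all alleles at the locus and sum to 1.
    """
    total = sum(allele_freqs)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two randomly drawn gene copies carry different alleles.
    return 1.0 - sum(p * p for p in allele_freqs)


if __name__ == "__main__":
    # Hypothetical locus with three alleles.
    print(expected_heterozygosity([0.5, 0.3, 0.2]))
```
A locus fixed for a single allele gives a value of 0, and the value rises as alleles become more numerous and more evenly distributed.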
The assumptions of the ISM are that (1) there are an infinite number of sites where mutations can occur, (2) every new mutation occurs at a novel site, and (3) there is no recombination. The term ‘site’ refers to a single nucleotide base pair.
Because every new mutation has to occur at a novel site, there can be no homoplasy, or back-mutation to an allele that previously existed. All identical alleles are identical by descent. The four gamete rule can be applied to the data to check that they do not violate the model assumption of no recombination.
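The four gamete rule can be sketched as follows. For any pair of biallelic sites, observing all four allele combinations (00, 01, 10, 11) among the sampled haplotypes cannot be explained by single mutations at novel sites alone, so such a pair flags a violation of the no-recombination assumption. The encoding of haplotypes as strings of '0' (ancestral) and '1' (derived), and the function name, are assumptions made for this example.
```python
from itertools import combinations


def four_gamete_violations(haplotypes):
    """Return the site pairs (i, j) at which all four gametes are observed.

    haplotypes: list of equal-length strings over '0' and '1', one string
    per sampled sequence (a hypothetical encoding chosen for this sketch).
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    for h in haplotypes:
        if len(h) != n_sites:
            raise ValueError("all haplotypes must have the same length")
        if not set(h) <= {"0", "1"}:
            raise ValueError("haplotypes must contain only '0' and '1'")

    violations = []
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct two-site allele combinations in the sample.
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


if __name__ == "__main__":
    sample = ["000", "011", "110", "101"]
    print(four_gamete_violations(sample))
```
An empty result means the sample is compatible with the no-recombination assumption under this test; it does not prove that no recombination occurred.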
The population-scaled mutation rate ($\theta$) can be estimated as follows, where $\mu$ is the number of mutations found within a randomly selected DNA sequence (per generation) and $N_e$ is the effective population size:
$$\theta = 4 N_e \mu$$
The coefficient is twice the number of gene copies carried by each individual in the population. For diploid, biparentally inherited genes the appropriate coefficient is 4, whereas for uniparental, haploid genes, such as mitochondrial genes, the coefficient is 2 and is applied to the ''female'' effective population size, which for most species is roughly half of $N_e$.
When considering the length of a DNA sequence, the expected number of mutations is calculated as follows:
$$\mu = k v$$
where $k$ is the length of the DNA sequence and $v$ is the probability that a mutation will occur at a site.
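A worked example may make the two quantities concrete. The sketch below assumes the relations θ = c · N_e · μ (with c = 4 for diploid, biparentally inherited genes and c = 2 for haploid, uniparentally inherited genes) and μ = k · v, where v is the symbol used here for the per-site mutation probability. All numeric values and function names are hypothetical.
```python
def expected_mutations_per_sequence(k, v):
    """Expected mutations per sequence per generation: mu = k * v.

    k: sequence length in sites; v: per-site mutation probability.
    """
    return k * v


def scaled_mutation_rate(mu, n_e, coefficient=4):
    """Population-scaled mutation rate: theta = coefficient * N_e * mu.

    coefficient: 4 for diploid, biparentally inherited genes; 2 for
    haploid, uniparentally inherited genes (applied to the female N_e).
    """
    return coefficient * n_e * mu


if __name__ == "__main__":
    # Hypothetical values: a 1,000-site sequence, per-site rate 1e-8.
    mu = expected_mutations_per_sequence(k=1000, v=1e-8)
    n_e = 10_000

    theta_diploid = scaled_mutation_rate(mu, n_e, coefficient=4)
    # For a mitochondrial gene, use coefficient 2 and the female effective
    # size, taken here as roughly half of N_e.
    theta_mito = scaled_mutation_rate(mu, n_e / 2, coefficient=2)

    print(theta_diploid, theta_mito)
```
With these inputs the mitochondrial value is one quarter of the diploid value, because both the coefficient and the effective population size are halved.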
Watterson developed an estimator for the mutation rate that incorporates the number of segregating sites (Watterson's estimator).
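Watterson's estimator is usually written as θ_W = S / a_n, where S is the number of segregating sites in a sample of n sequences and a_n = 1 + 1/2 + ... + 1/(n − 1). A minimal sketch follows; the function names and the toy alignment are hypothetical, and the sequences are assumed to be aligned and of equal length.
```python
def count_segregating_sites(sequences):
    """Count alignment columns in which more than one nucleotide is seen."""
    if not sequences:
        return 0
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for column in zip(*sequences) if len(set(column)) > 1)


def watterson_theta(num_segregating_sites, sample_size):
    """Return S / a_n, with a_n the sum of 1/i for i = 1 .. n - 1."""
    if sample_size < 2:
        raise ValueError("at least two sequences are required")
    a_n = sum(1.0 / i for i in range(1, sample_size))
    return num_segregating_sites / a_n


if __name__ == "__main__":
    # Toy alignment of four sequences (hypothetical data).
    alignment = ["AAGT", "AAGA", "ACGT", "AAGT"]
    s = count_segregating_sites(alignment)
    print(s, watterson_theta(s, len(alignment)))
```
The result is an estimate for the whole sequence; dividing by the sequence length gives a per-site value.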
The ISM can also be applied to genome evolution, in which case the model is framed in terms of chromosomes. Chromosomes are made up of ''sites'', which are nucleotides represented by either A, C, G, or T. While individual chromosomes are not infinite, they are treated in the model as continuous intervals or continuous circles.
Multiple assumptions are applied to understanding the ISM in terms of genome evolution:
* ''k'' breaks are made in these chromosomes, which leaves ''2k'' free ends available. The ''2k'' free ends will rejoin in a new manner, rearranging the set of chromosomes (i.e. reciprocal translocation, fusion, fission, inversion, circularized incision, circularized excision).
* No break point is ever used twice.
* A set of chromosomes can be duplicated or lost.
* DNA that never existed before can be observed in the chromosomes, for example as a result of horizontal gene transfer or viral integration.
* If the chromosomes become different enough, evolution can form a new species.
* Substitutions that alter a single base pair are individually invisible and substitutions occur at a finite rate per site.
* The substitution rate is the same for all sites in a species, but is allowed to vary between species (i.e. no molecular clock is assumed).
* Instead of thinking about substitutions themselves, think about the effect of the substitution at each point along the chromosome as a continuous increase in evolutionary distance between the previous version of the genome at that site and the next version of the genome at the corresponding site in the descendant.
Further reading
* Tsitrone, Anne; Rousset, François; David, Patrice (2001). "Heterosis, marker mutational processes and population inbreeding history". Genetics. 159 (4): 1845–1859. doi:10.1093/genetics/159.4.1845. PMID 11779819. PMC 1461896. http://www.genetics.org/content/159/4/1845.short