
Genetic saturation is the result of multiple substitutions at the same site in a sequence, or identical substitutions in different sequences, such that the apparent sequence divergence is lower than the actual divergence that has occurred. When two or more nucleotide sequences are compared, the observed differences are only differences in the final state of each site. Nucleotides undergoing genetic saturation change multiple times, sometimes back to their original nucleotide or to a nucleotide shared with the sequence being compared. Without genetic information from intermediate taxa, it is difficult to know whether, or how much, saturation has occurred in an observed sequence. Genetic saturation occurs most rapidly in fast-evolving sequences, such as the hypervariable region of mitochondrial DNA, or in short tandem repeats such as those on the Y-chromosome. In phylogenetics, saturation effects result in long branch attraction, where the most distant lineages have misleadingly short branch lengths. It also decreases the phylogenetic information contained in the sequences.


Phylogenetic saturation


Multiple substitutions

Multiple substitutions take place when single nucleotides undergo multiple changes before reaching their final nucleotide identity. A sequence is said to be saturated when mutation has acted multiple times on the same nucleotides, so that the observed change in sequence is less than the historical change in sequence.
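The gap between historical and observed change can be illustrated with a small simulation. The sketch below is not taken from any cited method. It assumes a deliberately simple model in which every substitution hits a uniformly random site and replaces the base with any of the other three at equal probability. The function names (`evolve`, `observed_differences`, `saturation_curve`) and the parameter values are hypothetical.

```python
import random

BASES = "ACGT"

def evolve(seq, n_substitutions, rng):
    """Apply n_substitutions random point substitutions to a copy of seq.

    Each event picks a random site and replaces its base with a different
    base, so the same site can be hit more than once (a multiple hit).
    """
    seq = list(seq)
    for _ in range(n_substitutions):
        site = rng.randrange(len(seq))
        seq[site] = rng.choice([b for b in BASES if b != seq[site]])
    return "".join(seq)

def observed_differences(a, b):
    """Count sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def saturation_curve(length=1000, steps=(50, 200, 500, 1000, 3000, 10000), seed=0):
    """Return (actual substitutions, observed differences) pairs.

    Every descendant is evolved independently from the same random ancestor.
    """
    rng = random.Random(seed)
    ancestor = "".join(rng.choice(BASES) for _ in range(length))
    rows = []
    for n in steps:
        descendant = evolve(ancestor, n, rng)
        rows.append((n, observed_differences(ancestor, descendant)))
    return rows

if __name__ == "__main__":
    for actual, observed in saturation_curve():
        print(f"actual substitutions: {actual:6d}  observed differences: {observed:5d}")
```

Under this assumed model the observed count can never exceed the actual count, and with four equally likely bases it levels off near three quarters of the sequence length however many further substitutions occur.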


Detection

It is possible to estimate the amount of saturation that a sequence might have undergone by estimating the substitution rate of the sequence and how much time has passed since divergence. Divergence rates are estimated from a variety of sources, including ancestral DNA, fossil records and biogeographical events. This use of molecular clocks to determine divergence is controversial because of its potential for inaccuracy and the assumptions made in the model (such as a consistent mutation rate for all branches), and it is used mostly as an estimation tool. Genetic saturation can also be estimated by comparing the number of observed differences in nucleotide sequences between multiple pairs of species. The number of observed substitutions between sequences of different species can be compared with the number of substitutions inferred from branch length, to find the approximate point where the number of inferred substitutions surpasses the number of observed substitutions. This method can give researchers an idea of the level of saturation of a particular gene, but it is thought to underestimate the amount of saturation, especially for very long branches.
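The comparison of observed and inferred substitutions can be sketched as follows. This is only an illustration under a stated assumption: the inferred distance here comes from the Jukes-Cantor (JC69) correction, which the text above does not name, standing in for a branch length estimated from a tree. The function names and the example sequences are hypothetical.

```python
import math

def p_distance(a, b):
    """Proportion of aligned sites that differ (the observed divergence)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def jc69_distance(p):
    """Jukes-Cantor estimate of substitutions per site from a p-distance.

    Returns math.inf when p >= 0.75, where the correction is undefined and
    the pair is effectively fully saturated under this model.
    """
    if p == 0:
        return 0.0
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def saturation_table(seqs):
    """Return (name_a, name_b, observed, inferred) for every pair of sequences."""
    names = sorted(seqs)
    rows = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            p = p_distance(seqs[a], seqs[b])
            rows.append((a, b, p, jc69_distance(p)))
    return rows

if __name__ == "__main__":
    # Hypothetical aligned sequences, invented for illustration only.
    example = {
        "sp1": "ACGTACGTACGTACGTACGT",
        "sp2": "ACGTACGTACGTACGAACGT",
        "sp3": "TCGAACTTAGGTCCGAATGT",
    }
    for a, b, observed, inferred in saturation_table(example):
        print(f"{a}-{b}: observed {observed:.3f}, inferred {inferred:.3f}")
```

Plotting the observed column against the inferred column for many pairs gives a saturation plot: pairs lying well below the diagonal are those for which the observed differences understate the inferred number of substitutions.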


Impact on phylogenetics

In the field of molecular phylogenetics, the distances and relationships between species are investigated by looking at the DNA, RNA or amino acid sequences of an organism. When phylogenetic trees are constructed without considering possible saturation, multiple substitutions can cause the distance between taxa to appear much smaller than the true distance. Multiple sequence alignment, a common technique used to construct phylogenies, relies on the comparison of homologous sequences. It can easily be confounded by genetic saturation because the homologous loci under investigation give no indication of whether more than one substitution at a given nucleotide separates the taxa being described. Multiple substitutions decrease the amount of phylogenetic information that can be contained in sequences, especially when deep branches are involved. This is particularly evident in studies examining arthropod groups. Furthermore, saturation effects can lead to a gross underestimation of divergence time. This is mainly attributed to the randomization of the phylogenetic signal as mutations and substitutions accumulate. The effects of saturation can mask the true amount of divergence time, leading to inaccurate phylogenetic trees.


The principle of parsimony in genetic saturation analysis

Parsimony plays a fundamental role in genetic saturation analysis. This principle gives preference to the simplest explanation that can account for the data. With regard to genetic saturation, parsimony means that the preferred hypothesized relationship is the one that requires the smallest number of character changes. Using parsimony to analyze genetic saturation can lead to conflict when creating a phylogenetic tree. When only sequence data are used, it is possible to arrive at numerous phylogenetic trees that are equally parsimonious.
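How several trees can be equally parsimonious is easiest to see by counting. The sketch below scores the three possible groupings of four taxa with Fitch's algorithm, a standard way of counting the minimum number of changes on a fixed tree. The text above does not name a specific algorithm, so this choice is an assumption, and the taxon names and sequences are invented for illustration.

```python
def fitch(tree, states):
    """Return (state_set, cost) for one site using Fitch's algorithm.

    tree is either a taxon name (a leaf) or a 2-tuple of subtrees;
    states maps each taxon name to its base at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # The two subtrees can agree on a state, so no extra change is needed.
        return common, left_cost + right_cost
    # No shared state: at least one additional change is required here.
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_score(tree, alignment):
    """Sum the minimum number of changes over all sites of the alignment."""
    length = len(next(iter(alignment.values())))
    total = 0
    for i in range(length):
        states = {name: seq[i] for name, seq in alignment.items()}
        total += fitch(tree, states)[1]
    return total

def four_taxon_scores(alignment):
    """Score the three possible groupings of exactly four taxa."""
    a, b, c, d = sorted(alignment)
    topologies = {
        f"(({a},{b}),({c},{d}))": ((a, b), (c, d)),
        f"(({a},{c}),({b},{d}))": ((a, c), (b, d)),
        f"(({a},{d}),({b},{c}))": ((a, d), (b, c)),
    }
    return {label: parsimony_score(t, alignment) for label, t in topologies.items()}

if __name__ == "__main__":
    # Hypothetical four-site alignment, chosen so that two groupings tie.
    alignment = {"t1": "ACAT", "t2": "ATAG", "t3": "GCAG", "t4": "GTAG"}
    for label, score in four_taxon_scores(alignment).items():
        print(label, score)
```

In this invented alignment one site supports grouping t1 with t2 and another supports grouping t1 with t3, so those two trees need the same number of changes and parsimony alone cannot choose between them.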


Long branch attraction

Genetic saturation contributes to long branch attraction because it can scramble a sequence extensively without easily observable associated phenotypic changes. Long branch attraction occurs when two distantly related taxa appear to be closely related. The more substitutions that accumulate, the more likely it is for previously dissimilar sequences to share nucleotides by chance and, as a result, to appear homologous in phylogenetic tree calculations. Long branch attraction due to saturation has been proposed as the cause of links in ancient phylogenies, and it calls into question even some of the earliest relationships between eukaryotes, archaea, and eubacteria.


Other uses of "Saturation" in genetics


Gene site saturation mutagenesis

Gene site saturation mutagenesis (GSSM) is a mutagenesis technique applied to one or more codons in a gene to create a library of variants covering all other codons at that position. It is used in biochemistry and protein engineering to explore the functions and characteristics of specific amino acid sequences. This systematic identification of amino acid substitutions allows researchers to look at every possible variant of each position. It provides structural information about the protein of interest and identifies amino acid positions that are more vital to the function of the protein. Researchers often lean towards a one-step PCR-based approach to explore the specific effects of different variations in an amino acid of interest within a protein. With a one-step PCR-based approach, researchers create a primer whose two ends correspond to the sequence of the gene of interest. Only one codon of a three-codon amino acid sequence is substituted. The type of codon set determines the number of sequences that can be derived from GSSM. To determine which codon set to use, researchers need to check the library quality at the DNA level, which means that massive sequence data is needed. If all three positions of the codon can be substituted with each of the four different nucleotides, researchers can code for all 20 amino acids. Although this covers all 20 amino acids, it is not the most efficient method. A more efficient method is to use NNK codon degeneracy (N is any nucleotide and K is G or T), also known as a limited codon set. This results in only 32 codons rather than 64, while still encoding all 20 amino acids.
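The codon counts for a full and a limited codon set can be checked by enumeration. The sketch below assumes the standard genetic code and the IUPAC meanings of N (any base) and K (G or T). The helper names `expand` and `summarize` are hypothetical and do not come from any GSSM protocol.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC degenerate symbols used here: N = any base, K = G or T.
IUPAC = {"N": "ACGT", "K": "GT"}

def expand(degenerate_codon):
    """List every concrete codon matched by a degenerate codon such as 'NNK'."""
    choices = (IUPAC.get(symbol, symbol) for symbol in degenerate_codon)
    return ["".join(c) for c in product(*choices)]

def summarize(degenerate_codon):
    """Return (number of codons, number of amino acids, sorted stop codons)."""
    codons = expand(degenerate_codon)
    amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
    stops = sorted(c for c in codons if CODON_TABLE[c] == "*")
    return len(codons), len(amino_acids), stops

if __name__ == "__main__":
    for codon_set in ("NNN", "NNK"):
        n_codons, n_amino_acids, stops = summarize(codon_set)
        print(codon_set, n_codons, n_amino_acids, stops)
```

Enumerated this way, NNN gives 64 codons with three stop codons, while NNK gives 32 codons that still cover all 20 amino acids and include a single stop codon (TAG), which is why the limited set makes for a smaller library.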


Advantages of GSSM

In comparison to other techniques, GSSM offers unique advantages such as:
* A complete analysis of every position in a given gene, which can be helpful in identifying critical positions. Critical positions are identified by analyzing the magnitude of the effects of mutagenesis, both positive and negative. GSSM can also identify positions that are more flexible, as mutagenesis at these positions has less of an impact.
* A residue-specific analysis, which allows researchers to create a schematic representation of the amino acid. This allows for more complex and detailed genetic research in further studies.
* An ability to look at the effects of various amino acids without knowing any structural information about the protein. The data collected can then provide valuable insight into this area.
* Fast delivery times and cost-efficiency.

GSSM opened up a new frontier in genetic research. Before GSSM, researchers mutated DNA through radiation or with various chemicals. Both of these methods are imprecise.

