Microbial phylogenetics is the study of the manner in which various groups of microorganisms are genetically related. This helps to trace their evolution. To study these relationships, biologists rely on comparative genomics, as physiology and comparative anatomy are not possible methods.
History
1960s–1970s
Microbial phylogenetics emerged as a field of study in the 1960s, when scientists started to create genealogical trees based on differences in the order of amino acids of proteins and nucleotides of genes instead of using comparative anatomy and physiology.
One of the most important figures in the early stage of this field is Carl Woese, who in his research focused on Bacteria, looking at RNA instead of proteins. More specifically, he decided to compare the small subunit ribosomal RNA (16S rRNA) oligonucleotides. Matching oligonucleotides in different bacteria could be compared to one another to determine how closely the organisms were related. In 1977, after collecting and comparing 16S rRNA fragments for almost 200 species of bacteria, Woese and his team concluded that Archaebacteria were not part of Bacteria but completely independent organisms.
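The idea of scoring relatedness from matching oligonucleotides can be illustrated with a small sketch. The function below computes a Dice-style association coefficient between two oligonucleotide catalogs, weighting each shared oligomer by its length. The function name, the minimum-length threshold, and the use of sets (which ignores repeated oligomers) are assumptions made for illustration, not a reconstruction of the exact procedure used in the original studies.

```python
def association_coefficient(catalog_a, catalog_b, min_length=6):
    """Dice-style similarity between two oligonucleotide catalogs.

    Each catalog is an iterable of oligonucleotide strings. Only oligomers
    with at least `min_length` residues are counted (threshold is an
    assumption of this sketch). Returns a value between 0 and 1.
    """
    a = {oligo for oligo in catalog_a if len(oligo) >= min_length}
    b = {oligo for oligo in catalog_b if len(oligo) >= min_length}
    residues_a = sum(len(oligo) for oligo in a)
    residues_b = sum(len(oligo) for oligo in b)
    if residues_a + residues_b == 0:
        return 0.0
    # Residues contained in oligomers that occur in both catalogs.
    residues_shared = sum(len(oligo) for oligo in a & b)
    return 2 * residues_shared / (residues_a + residues_b)


# Hypothetical catalogs, not real rRNA fragments.
species_1 = ["AACCUG", "UUACCAG", "CCG"]
species_2 = ["AACCUG", "ACUUCAG"]
similarity = association_coefficient(species_1, species_2)
```

A coefficient near 1 indicates catalogs that share most of their longer oligomers, while values near 0 indicate distantly related organisms under this simple measure.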
1980s–1990s
In the 1980s microbial phylogenetics entered its golden age, as the techniques for sequencing RNA and DNA improved greatly. For example, comparison of the nucleotide sequences of whole genes was facilitated by the development of the means to clone DNA, making it possible to create many copies of sequences from minute samples. Of great impact for microbial phylogenetics was the invention of the polymerase chain reaction (PCR). All these new techniques led to the formal proposal of the three domains of life: Bacteria, Archaea (Woese himself proposed this name to replace the old designation Archaebacteria), and Eukarya, arguably one of the key moments in the history of taxonomy.
One of the intrinsic problems of studying microbial organisms was the dependence of the studies on pure culture in a laboratory. Biologists tried to overcome this limitation by sequencing rRNA genes obtained from DNA isolated directly from the environment. This technique made it possible to fully appreciate that bacteria not only have the greatest diversity but also constitute the greatest biomass on Earth.
In the late 1990s, sequencing of genomes from various microbial organisms began, and by 2005, 260 complete genomes had been sequenced, resulting in the classification of 33 eukaryotes, 206 eubacteria, and 21 archaea.
2000s
In the early 2000s, scientists started creating phylogenetic trees based not on rRNA but on other genes with different functions (for example, the gene for the enzyme RNA polymerase). The resulting genealogies differed greatly from the ones based on rRNA. These gene histories differed so much from one another that the only hypothesis that could explain the divergences was a major influence of horizontal gene transfer (HGT), a mechanism which permits a bacterium to acquire one or more genes from a completely unrelated organism. HGT explains why similarities and differences in some genes have to be carefully studied before being used as a measure of genealogical relationship for microbial organisms.
Studies aimed at understanding how widespread HGT is suggested that the ease with which genes are transferred among bacteria makes it impossible to apply the biological species concept to them.
Phylogenetic representation
Since Darwin, every phylogeny for every organism has been represented in the form of a tree. Nonetheless, due to the great role that HGT plays for microbes, some evolutionary microbiologists suggested abandoning this classical view in favor of a representation of genealogies more closely resembling a web, also known as a network. However, there are some issues with this network representation, such as the inability to precisely establish the donor organism for an HGT event and the difficulty of determining the correct path across organisms when multiple HGT events have happened. Therefore, there is still no consensus among biologists on which representation is a better fit for the microbial world.
Methods for Microbial Phylogenetic Analysis
Most microbial taxa have never been cultivated or experimentally characterized. Taxonomy and phylogeny are essential tools for organizing the diversity of life. Collecting gene sequences, aligning such sequences based on homologies, and then using models of mutation to infer evolutionary history are common methods to estimate microbial phylogenies.
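As a rough illustration of how a model of mutation turns aligned sequences into an evolutionary distance, the sketch below computes the observed proportion of differing sites and then applies the Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3). The choice of the Jukes-Cantor model, the function names, and the toy sequences are assumptions made for this example; real analyses typically use richer substitution models and dedicated software.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor_distance(seq_a, seq_b):
    """Expected substitutions per site under the Jukes-Cantor model."""
    p = p_distance(seq_a, seq_b)
    if p == 0:
        return 0.0
    if p >= 0.75:
        # The correction is undefined once sequences look saturated.
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical aligned fragments differing at one of ten sites.
observed = p_distance("ACGTACGTAC", "ACGTACGTAT")
corrected = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAT")
```

The corrected distance is slightly larger than the observed proportion because the model accounts for multiple substitutions at the same site. A matrix of such pairwise distances could then be passed to a tree-building method such as neighbor joining.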
The small subunit (SSU) rRNA gene has revolutionized microbial classification since the 1970s and has since become the most sequenced gene. Phylogenetic inferences depend on the genes chosen: for example, the 16S rRNA gene is commonly selected to investigate Bacteria and Archaea, while the 18S rRNA gene is most commonly used for microbial eukaryotes.
Phylogenetic comparative methods
Phylogenetic comparative methods (PCMs) are commonly used to compare multiple traits across organisms. PCMs are not yet common in microbiome studies; however, recent studies have been successful in identifying genes associated with colonization of the human gut. This challenge was addressed by measuring the statistical association between whether a species harbors the gene and the probability that the species is present in the gut microbiome. The analyses showcase the combination of shotgun metagenomics with phylogenetically aware models.
Ancestral state reconstruction
This method is commonly used to estimate the genetic and metabolic profiles of extant communities using a set of reference genomes. In microbiome studies it is commonly performed with PICRUSt (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States). PICRUSt is a computational approach capable of predicting the functional composition of a metagenome from marker gene data and a database of reference genomes. To predict which gene families are present, PICRUSt uses an extended ancestral-state reconstruction algorithm and then combines the gene families to estimate the composite metagenome.
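The final step described above, combining per-taxon gene family predictions into a composite metagenome, can be sketched as a weighted sum. This is a toy illustration and not PICRUSt's actual interface: the function name and data layout are hypothetical, the per-taxon copy numbers are assumed to have already been predicted by ancestral-state reconstruction, and dividing read counts by marker gene copy number is included as an assumption of the sketch.

```python
def predict_metagenome(marker_counts, marker_copies, gene_copies):
    """Estimate gene family abundances for one sample.

    marker_counts: {taxon: marker gene read count in the sample}
    marker_copies: {taxon: predicted marker gene copies per genome}
    gene_copies:   {taxon: {gene_family: predicted copies per genome}}
    """
    totals = {}
    for taxon, count in marker_counts.items():
        # Convert marker reads into an estimated number of organisms.
        organisms = count / marker_copies[taxon]
        for family, copies in gene_copies[taxon].items():
            totals[family] = totals.get(family, 0.0) + organisms * copies
    return totals


# Hypothetical taxa and gene families.
sample = {"taxon_1": 10, "taxon_2": 4}
copies_16s = {"taxon_1": 2, "taxon_2": 1}
predicted_genes = {
    "taxon_1": {"family_A": 1, "family_B": 3},
    "taxon_2": {"family_A": 2},
}
metagenome = predict_metagenome(sample, copies_16s, predicted_genes)
```

In this example taxon_1 contributes five estimated organisms and taxon_2 four, so family_A receives contributions from both taxa while family_B comes only from taxon_1.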
Analysis of phylogenetic variables and distances
Phylogenetic variables are variables constructed from features of the phylogeny to summarize and contrast data of species in the phylogenetic tree. Microbiome datasets can be simplified using phylogenetic variables by reducing the dimensions of the data to a few variables carrying biological information.
Recent methods such as PhILR and phylofactorization address the challenges of phylogenetic variables analysis. The PhILR transform combines statistical and phylogenetic models to overcome compositional data challenges; it is created by incorporating microbial evolutionary models into the isometric log-ratio transform. Phylofactorization is a dimensionality-reducing tool used to identify edges in the phylogeny from which putative functional ecological traits may have arisen.
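To make the isometric log-ratio idea concrete, the sketch below computes a single balance: the scaled log-ratio of the geometric means of two groups of taxa, such as the two clades descending from one internal node of the phylogeny. The function names and example abundances are hypothetical, zero counts are assumed to have been replaced beforehand (for example with a pseudocount), and any weighting of taxa or branches that a full implementation might apply is omitted.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def ilr_balance(left, right):
    """Isometric log-ratio balance between two groups of abundances.

    `left` and `right` hold the positive abundances of the taxa in the two
    clades below an internal node of the tree.
    """
    r, s = len(left), len(right)
    if r == 0 or s == 0:
        raise ValueError("both groups must contain at least one taxon")
    scale = math.sqrt(r * s / (r + s))
    return scale * math.log(geometric_mean(left) / geometric_mean(right))


# Hypothetical abundances for the two clades under one node.
balance = ilr_balance([2.0, 8.0], [1.0, 1.0])
# Multiplying every abundance by the same factor leaves the balance unchanged.
rescaled = ilr_balance([20.0, 80.0], [10.0, 10.0])
```

Because each balance depends only on ratios, it is unaffected by the total number of reads in a sample, which is the property that makes log-ratio coordinates suitable for compositional data.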
Challenges
Inference in phylogenetics requires the assumption of common ancestry, or homology; when this assumption is violated, the signal can be disrupted by noise.
Because of horizontal gene transfer, microbial traits may be unrelated to phylogeny, so the taxonomic composition may reveal little about the function of a system.
See also
* Comparative genomics
* Phylogenomics
* Multilocus sequence typing
* Bacterial taxonomy
* Computational phylogenetics
* History of molecular evolution
* Molecular phylogenetics
* Phylogenetics