16S RNA

16S ribosomal RNA (or 16S rRNA) is the RNA component of the 30S subunit of a prokaryotic ribosome (SSU rRNA). It binds to the Shine-Dalgarno sequence and provides most of the SSU structure. The genes coding for it are referred to as 16S rRNA genes and are used in reconstructing phylogenies, due to the slow rates of evolution of this region of the gene. Carl Woese and George E. Fox were two of the people who pioneered the use of 16S rRNA in phylogenetics in 1977. Multiple sequences of the 16S rRNA gene can exist within a single bacterium.


Terminology

The descriptor ''16S'' refers to the size of these ribosomal subunits as reflected indirectly by the speed at which they sediment when samples are centrifuged. Thus ''16S'' means 16 Svedberg units.


Functions

* Like the large (23S) ribosomal RNA, it has a structural role, acting as a scaffold defining the positions of the ribosomal proteins.
* The 3′-end contains the anti-Shine-Dalgarno sequence, which binds upstream of the AUG start codon on the mRNA. The 3′-end of 16S RNA also binds to the proteins S1 and S21, which are known to be involved in initiation of protein synthesis.
* Interacts with 23S, aiding in the binding of the two ribosomal subunits (50S and 30S).
* Stabilizes correct codon-anticodon pairing in the A-site by forming a hydrogen bond between the N1 atom of adenine residues 1492 and 1493 and the 2′-OH group of the mRNA backbone.
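
The base pairing between the anti-Shine-Dalgarno sequence and the mRNA can be illustrated with a short sketch. It assumes an ''E. coli''-like anti-Shine-Dalgarno core of CCUCC and an arbitrary 20-nucleotide search window upstream of the start codon; the function names, the window size and the example mRNA are hypothetical choices made for illustration and do not come from a specific tool.

```python
# Sketch: locate a Shine-Dalgarno-like site upstream of a start codon.
# ASSUMPTIONS: anti-SD core "CCUCC" (E. coli-like) and a 20-nt window.

COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]

ANTI_SD = "CCUCC"  # assumed core of the 16S rRNA 3'-end motif, written 5'->3'

def find_sd_site(mrna, start_codon_index, window=20):
    """Return the index of the motif nearest the start codon, or None."""
    motif = reverse_complement(ANTI_SD)  # sequence the rRNA can pair with
    lo = max(0, start_codon_index - window)
    upstream = mrna[lo:start_codon_index]
    pos = upstream.rfind(motif)
    return None if pos < 0 else lo + pos

# Hypothetical mRNA fragment used only for demonstration.
mrna = "GCUAAGGAGGUUAACCAUGGCU"
start = mrna.find("AUG")
site = find_sd_site(mrna, start)
```

Exact matching is a simplification: real ribosome binding sites often pair only partially with the anti-Shine-Dalgarno sequence, so a practical search would score mismatches rather than require a perfect motif.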


Structure


Universal primers

The 16S rRNA gene is used for phylogenetic studies as it is highly conserved between different species of bacteria and archaea. Carl Woese pioneered this use of 16S rRNA in 1977. It is suggested that the 16S rRNA gene can be used as a reliable molecular clock because 16S rRNA sequences from distantly related bacterial lineages are shown to have similar functionalities. Some thermophilic archaea (e.g. order Thermoproteales) contain 16S rRNA gene introns that are located in highly conserved regions and can impact the annealing of "universal" primers. Mitochondrial and chloroplastic rRNA are also amplified. The most common primer pair was devised by Weisburg ''et al.'' (1991) and is currently referred to as 27F and 1492R; however, for some applications shorter amplicons may be necessary, for example the primer pair 27F-534R, covering V1 to V3, for 454 sequencing with titanium chemistry. Often 8F is used rather than 27F. The two primers are almost identical, but 27F (AGAGTTTGATCMTGGCTCAG) has the degenerate base M (A or C) where 8F has a C.
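
The relationship between 27F and 8F can be made concrete by expanding the degenerate base. The sketch below uses the standard IUPAC nucleotide codes; the function names are hypothetical, and the 8F sequence is inferred from the statement that it carries a C where 27F carries an M.

```python
# Sketch: expand a degenerate primer into the concrete sequences it covers.
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_primer(primer):
    """List every non-degenerate sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]

def primer_matches(primer, site):
    """True if `site` is one of the sequences the degenerate primer allows."""
    return len(primer) == len(site) and all(
        s in IUPAC[p] for p, s in zip(primer.upper(), site.upper())
    )

PRIMER_27F = "AGAGTTTGATCMTGGCTCAG"
PRIMER_8F = "AGAGTTTGATCCTGGCTCAG"  # inferred: C at the position of M

variants = expand_primer(PRIMER_27F)
```

Under these assumptions 27F expands to two sequences, one of which is identical to 8F, so 27F anneals to every site 8F matches plus the variant carrying an A at that position.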


PCR and NGS applications

In addition to highly conserved primer binding sites, 16S rRNA gene sequences contain hypervariable regions that can provide species-specific signature sequences useful for identification of bacteria. As a result, 16S rRNA gene sequencing has become prevalent in medical microbiology as a rapid and cheap alternative to phenotypic methods of bacterial identification. Although it was originally used to identify bacteria, 16S sequencing was subsequently found to be capable of reclassifying bacteria into completely new species, or even genera. It has also been used to describe new species that have never been successfully cultured. With third-generation sequencing coming to many labs, simultaneous identification of thousands of 16S rRNA sequences is possible within hours, allowing metagenomic studies, for example of gut flora. In samples collected from patients with confirmed infections, 16S rRNA next-generation sequencing (NGS) demonstrated enhanced detection in 40% of cases compared to traditional culture methods; moreover, pre-sampling antibiotic consumption did not significantly affect the sensitivity of 16S NGS.


Hypervariable regions

The bacterial 16S gene contains nine hypervariable regions (V1–V9), ranging from about 30 to 100 base pairs long, that are involved in the secondary structure of the small ribosomal subunit. The degree of conservation varies widely between hypervariable regions, with more conserved regions correlating to higher-level taxonomy and less conserved regions to lower levels, such as genus and species. While the entire 16S sequence allows for comparison of all hypervariable regions, at approximately 1,500 base pairs long it can be prohibitively expensive for studies seeking to identify or characterize diverse bacterial communities. These studies commonly utilize the Illumina platform, which produces reads at rates 50-fold and 12,000-fold less expensive than 454 pyrosequencing and Sanger sequencing, respectively. While cheaper and allowing for deeper community coverage, Illumina sequencing only produces reads 75–250 base pairs long (up to 300 base pairs with Illumina MiSeq), and has no established protocol for reliably assembling the full gene in community samples. Full hypervariable regions can be assembled from a single Illumina run, however, making them ideal targets for the platform.

While 16S hypervariable regions can vary dramatically between bacteria, the 16S gene as a whole maintains greater length homogeneity than its eukaryotic counterpart (18S ribosomal RNA), which can make alignments easier. Additionally, the 16S gene contains highly conserved sequences between hypervariable regions, enabling the design of universal primers that can reliably produce the same sections of the 16S sequence across different taxa. Although no hypervariable region can accurately classify all bacteria from domain to species, some can reliably predict specific taxonomic levels. Many community studies select semi-conserved hypervariable regions like the V4 for this reason, as it can provide resolution at the phylum level as accurately as the full 16S gene. While less conserved regions struggle to classify new species when higher-order taxonomy is unknown, they are often used to detect the presence of specific pathogens. In one study by Chakravorty et al. in 2007, the authors characterized the V1–V8 regions of a variety of pathogens in order to determine which hypervariable regions would be most useful to include for disease-specific and broad assays. Amongst other findings, they noted that the V3 region was best at identifying the genus for all pathogens tested, and that V6 was the most accurate at differentiating species between all CDC-watched pathogens tested, including anthrax.

While 16S hypervariable region analysis is a powerful tool for bacterial taxonomic studies, it struggles to differentiate between closely related species. In the families ''Enterobacteriaceae'', ''Clostridiaceae'', and ''Peptostreptococcaceae'', species can share up to 99% sequence similarity across the full 16S gene. As a result, the V4 sequences can differ by only a few nucleotides, leaving reference databases unable to reliably classify these bacteria at lower taxonomic levels. By limiting 16S analysis to select hypervariable regions, these studies can fail to observe differences in closely related taxa and group them into single taxonomic units, therefore underestimating the total diversity of the sample. Furthermore, bacterial genomes can house multiple 16S genes, with the V1, V2, and V6 regions containing the greatest intraspecies diversity. While not the most precise method of classifying bacterial species, analysis of the hypervariable regions remains one of the most useful tools available to bacterial community studies.
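
The way near-identical sequences collapse into a single taxonomic unit can be sketched with a simple percent-identity calculation and a greedy grouping step. This is a minimal illustration, not the algorithm of any particular pipeline: it assumes the sequences are already aligned to equal length, the 97% threshold is a commonly used convention chosen here as an assumption, and the sequences and function names are made up for the example.

```python
# Sketch: percent identity of aligned sequences and threshold-based grouping.
# ASSUMPTIONS: inputs are pre-aligned (equal length, "-" for gaps); the 97%
# threshold and the toy sequences below are illustrative only.

def percent_identity(a, b):
    """Percent of compared columns that match; columns gapped in both are skipped."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0

def greedy_cluster(seqs, threshold):
    """Assign each sequence to the first cluster whose representative is similar enough."""
    clusters = []  # each cluster is a list of names; the first is the representative
    for name, seq in seqs.items():
        for cluster in clusters:
            if percent_identity(seq, seqs[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters

base = "ACGT" * 25                                  # 100-nt toy "region"
seqs = {
    "species_a": base,
    "species_b": base[:50] + "A" + base[51:],       # one substitution: 99% identical
    "species_c": "TTTT" * 25,                       # unrelated toy sequence
}

units_97 = greedy_cluster(seqs, 97.0)
units_100 = greedy_cluster(seqs, 100.0)
```

In this toy data the two species that differ by a single nucleotide fall into one unit at the 97% threshold and are separated only when exact identity is required, which mirrors how region-limited analyses can underestimate diversity.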


Promiscuity of 16S rRNA genes

Under the assumption that evolution is driven by vertical transmission, 16S rRNA genes have long been believed to be species-specific, and infallible as genetic markers inferring phylogenetic relationships among prokaryotes. However, a growing number of observations suggest the occurrence of horizontal transfer of these genes. In addition to observations of natural occurrence, transferability of these genes is supported experimentally using a specialized ''Escherichia coli'' genetic system. Using a null mutant of ''E. coli'' as host, growth of the mutant strain was shown to be complemented by foreign 16S rRNA genes that were phylogenetically distinct from ''E. coli'' at the phylum level. Such functional compatibility was also seen in ''Thermus thermophilus''. Furthermore, in ''T. thermophilus'', both complete and partial gene transfer was observed. Partial transfer resulted in spontaneous generation of apparently random chimeras between host and foreign bacterial genes. Thus, 16S rRNA genes may have evolved through multiple mechanisms, including vertical inheritance and horizontal gene transfer; the frequency of the latter may be much higher than previously thought.


16S ribosomal databases

The 16S rRNA gene is used as the standard for classification and identification of microbes, because it is present in most microbes and shows a suitable degree of variation. 16S rRNA gene sequences of type strains for most bacteria and archaea are available on public databases, such as NCBI. However, the quality of the sequences found on these databases is often not validated. Therefore, secondary databases that collect only 16S rRNA sequences are widely used.


SILVA

SILVA provides comprehensive, quality-checked and regularly updated datasets of aligned small (16S/18S, SSU) and large subunit (23S/28S, LSU) ribosomal RNA (rRNA) sequences for all three domains of life (Bacteria, Archaea and Eukarya), as well as a suite of search, primer-design and alignment tools.


GreenGenes

GreenGenes is a quality controlled, comprehensive 16S rRNA gene reference database and taxonomy based on a ''de novo'' phylogeny that provides standard operational taxonomic unit sets. In 2023, GreenGenes2 was released.


EzBioCloud

The EzBioCloud database, formerly known as EzTaxon, consists of a complete hierarchical taxonomic system containing 62,988 bacterial and archaeal species/phylotypes, which includes 15,290 validly published names as of September 2018. Based on phylogenetic relationships inferred with methods such as maximum likelihood and OrthoANI, all species/subspecies are represented by at least one 16S rRNA gene sequence. The EzBioCloud database is systematically curated and regularly updated, and it also includes novel candidate species. Moreover, the website provides bioinformatics tools such as an ANI calculator, ContEst16S and a 16S rRNA database for the QIIME and Mothur pipelines.


MIMt

MIMt is a compact, non-redundant 16S database for rapid identification of metagenomic samples. It is composed of 48,749 full 16S sequences belonging to 24,626 well-classified bacterial and archaeal species. All sequences were obtained from complete genomes deposited in NCBI, and the full taxonomic hierarchy is provided for each sequence. It contains no redundancy: only one representative of each species was considered, avoiding identical sequences from different strains, isolates or pathovars. The result is a very fast tool for microorganism identification that is compatible with any classification software (QIIME, Mothur, DADA, etc.).
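
The idea of keeping one representative per species can be sketched as follows. This is not MIMt's actual selection procedure, which is not described here; the rule of keeping the longest sequence, the record layout and the example entries are all assumptions made for illustration.

```python
# Sketch: reduce a redundant 16S collection to one sequence per species.
# ASSUMPTION: the longest sequence is kept as the representative; the real
# database may use a different criterion.

def one_per_species(records):
    """records: iterable of (species, strain, sequence) tuples.
    Returns {species: (strain, sequence)} with one entry per species."""
    best = {}
    for species, strain, seq in records:
        current = best.get(species)
        if current is None or len(seq) > len(current[1]):
            best[species] = (strain, seq)
    return best

# Made-up records used only for demonstration.
records = [
    ("Escherichia coli", "strain_1", "ACGTACGTAC"),
    ("Escherichia coli", "strain_2", "ACGTACGTACGT"),
    ("Thermus thermophilus", "strain_3", "GGCCGGCC"),
]

representatives = one_per_species(records)
```

A smaller, non-redundant reference set means fewer comparisons per query read, which is the source of the speed advantage described above.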


Ribosomal Database Project

The Ribosomal Database Project (RDP) was a curated database that offered ribosome data along with related programs and services. The offerings included phylogenetically ordered alignments of ribosomal RNA (rRNA) sequences, derived phylogenetic trees, rRNA secondary structure diagrams and various software packages for handling, analyzing and displaying alignments and trees. Due to its large size, the RDP database is often used as the basis for bioinformatic tool development and for creating manually curated databases. The RDP server was taken offline in 2023, but the software is still available for download. The RDP Classifier is available as a stand-alone tool at https://sourceforge.net/projects/rdp-classifier/ and is written in Java, so it will run on any computer that has Java installed. The RDPTools are available on GitHub and as a Docker image. See the installation instructions at https://john-quensen.com/tutorials/tutorial-1/ and https://jfq3.gitbook.io/rdptools-docker/rdptools-docker/readme for details. Instructions for downloading a SeqMatch database and running it from the command line are available at https://john-quensen.com/tutorials/seqmatch/ as well.


External links


University of Washington Laboratory Medicine: Molecular Diagnosis , Bacterial Sequencing

MIMt 16S database

The Ribosomal Database Project



SILVA rRNA database

Greengenes: 16S rDNA data and tools

EzBioCloud