
Bacterial genomes are generally smaller and less variable in size among species than the genomes of eukaryotes. Bacterial genomes range in size from about 130 kbp to over 14 Mbp. A study that included, but was not limited to, 478 bacterial genomes concluded that as genome size increases, the number of genes increases at a disproportionately slower rate in eukaryotes than in non-eukaryotes. Thus, the proportion of non-coding DNA goes up with genome size more quickly in non-bacteria than in bacteria. This is consistent with the fact that most eukaryotic nuclear DNA is non-coding, while the majority of prokaryotic, viral, and organellar DNA is coding. Genome sequences are currently available from 50 different bacterial phyla and 11 different archaeal phyla. Second-generation sequencing has yielded many draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing might eventually yield a complete genome in a few hours. The genome sequences reveal much diversity in bacteria. Analysis of over 2000 ''Escherichia coli'' genomes reveals an ''E. coli'' core genome of about 3100 gene families and a total of about 89,000 different gene families. This article contains quotations from this source, which is available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
Genome sequences show that parasitic bacteria have 500–1200 genes, free-living bacteria have 1500–7500 genes, and archaea have 1500–2700 genes. A striking discovery by Cole et al. described massive amounts of gene decay when comparing the leprosy bacillus to ancestral bacteria. Studies have since shown that several bacteria have smaller genomes than their ancestors did. Over the years, researchers have proposed several theories to explain the general trend of bacterial genome decay and the relatively small size of bacterial genomes. Compelling evidence indicates that the apparent degradation of bacterial genomes is due to a deletional bias.


Methods and techniques

As of 2014, over 30,000 sequenced bacterial genomes were publicly available, along with thousands of metagenome projects. Projects such as the Genomic Encyclopedia of Bacteria and Archaea (GEBA) intend to add more genomes. Single-gene comparison is now being supplanted by more general methods. These methods have yielded novel perspectives on genetic relationships that previously could only be estimated. A significant achievement in the second decade of bacterial genome sequencing was the production of metagenomic data, which covers all DNA present in a sample. Previously, only two metagenomic projects had been published.


Bacterial genomes

Bacteria possess a compact genome architecture distinct from that of eukaryotes in two important ways: bacteria show a strong correlation between genome size and the number of functional genes in a genome, and those genes are structured into operons. The main reason bacterial genomes are dense relative to eukaryotic genomes (especially those of multicellular eukaryotes) is that eukaryotic genomes contain large amounts of noncoding DNA in the form of intergenic regions and introns. Some notable exceptions include recently formed pathogenic bacteria. This was initially described in a study by Cole ''et al.'' in which ''Mycobacterium leprae'' was discovered to have a significantly higher ratio of pseudogenes to functional genes (~40%) than its free-living ancestors.
Furthermore, amongst species of bacteria, there is relatively little variation in genome size when compared with the genome sizes of other major groups of life. Genome size is of little relevance when considering the number of functional genes in eukaryotic species. In bacteria, however, the strong correlation between the number of genes and genome size makes the size of bacterial genomes an interesting topic for research and discussion.
The general trends of bacterial evolution indicate that bacteria started as free-living organisms. Evolutionary paths led some bacteria to become pathogens and symbionts. The lifestyles of bacteria play an integral role in their respective genome sizes. Free-living bacteria have the largest genomes of the three types of bacteria; however, they have fewer pseudogenes than bacteria that have recently acquired pathogenicity. Facultative and recently evolved pathogenic bacteria exhibit a smaller genome size than free-living bacteria, yet they have more pseudogenes than any other form of bacteria. Obligate bacterial symbionts or pathogens have the smallest genomes and the fewest pseudogenes of the three groups. The relationship between the lifestyles of bacteria and genome size raises questions as to the mechanisms of bacterial genome evolution. Researchers have developed several theories to explain the patterns of genome size evolution amongst bacteria.


Genome comparisons

As single-gene comparisons have largely given way to genome comparisons, the phylogeny of bacterial genomes has improved in accuracy. The Average Nucleotide Identity (ANI) method quantifies genetic distance between entire genomes by taking advantage of regions of about 10,000 bp. With enough data from the genomes of one genus, algorithms are executed to categorize species. This was done for the ''Pseudomonas avellanae'' species in 2013 and has been done for all sequenced bacteria and archaea since 2020. Observed ANI values among sequences appear to have an "ANI gap" at 85–95%, suggesting that a genetic boundary suitable for defining a species concept is present.
To extract information about bacterial genomes, core- and pan-genome sizes have been assessed for several strains of bacteria. In 2012, the number of core gene families was about 3000. However, by 2015, with an over tenfold increase in available genomes, the pan-genome had increased as well. There is roughly a positive correlation between the number of genomes added and the growth of the pan-genome. On the other hand, the core genome has remained static since 2012. Currently, the ''E. coli'' pan-genome is composed of about 90,000 gene families. About one-third of these exist only in a single genome. Many of these, however, are merely gene fragments and the result of calling errors. Still, there are probably over 60,000 unique gene families in ''E. coli''.
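
The core- and pan-genome bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each genome is represented as a set of gene family identifiers, and the function name `pan_and_core`, the strain names, and the family labels are all hypothetical toy values, not data from any study.

```python
from collections import Counter


def pan_and_core(genomes):
    """Return (pan-genome, core genome, single-genome families).

    `genomes` maps a genome name to a *set* of gene family IDs
    (a hypothetical representation chosen for this sketch).
    """
    families = list(genomes.values())
    # Pan-genome: every family seen in at least one genome (union).
    pan = set().union(*families)
    # Core genome: families present in every genome (intersection).
    core = set.intersection(*families) if families else set()
    # Families that occur in exactly one genome.
    counts = Counter(f for fams in families for f in fams)
    singletons = {f for f, n in counts.items() if n == 1}
    return pan, core, singletons


# Toy input, invented for illustration only.
toy = {
    "strain_A": {"f1", "f2", "f3", "f4"},
    "strain_B": {"f1", "f2", "f3", "f5"},
    "strain_C": {"f1", "f2", "f6", "f7"},
}
pan, core, singletons = pan_and_core(toy)
print(len(pan), len(core), len(singletons))  # prints: 7 2 4
```

Adding a genome can only grow the union and can only shrink (or leave unchanged) the intersection, which matches the pattern described above of a growing pan-genome alongside a core genome that does not grow.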


Theories of bacterial genome evolution

Bacteria lose a large number of genes as they transition from free-living or facultatively parasitic life cycles to permanent host-dependent life. Towards the lower end of the scale of bacterial genome size are the mycoplasmas and related bacteria. Early molecular phylogenetic studies revealed that mycoplasmas represented an evolutionarily derived state, contrary to prior hypotheses. Furthermore, it is now known that mycoplasmas are just one of many instances of genome shrinkage in obligately host-associated bacteria. Other examples are ''Rickettsia'', ''Buchnera aphidicola'', and ''Borrelia burgdorferi''. Small genome size in such species is associated with certain particularities, such as rapid evolution of polypeptide sequences and low GC content in the genome. The convergent evolution of these qualities in unrelated bacteria suggests that an obligate association with a host promotes genome reduction.
Given that over 80% of almost all fully sequenced bacterial genomes consists of intact ORFs, and that gene length is nearly constant at ~1 kb per gene, it is inferred that small genomes have few metabolic capabilities. While free-living bacteria, such as ''E. coli'', ''Salmonella'' species, or ''Bacillus'' species, usually have 1500 to 6000 proteins encoded in their DNA, obligately pathogenic bacteria often have as few as 500 to 1000 such proteins. One candidate explanation is that reduced genomes maintain genes that are necessary for vital processes pertaining to cellular growth and replication, in addition to those genes that are required to survive in the bacteria's ecological niche. However, sequence data contradict this hypothesis. The set of universal orthologs amongst eubacteria comprises only 15% of each genome. Thus, each lineage has taken a different evolutionary path to reduced size. Because universal cellular processes require over 80 genes, variation in genes implies that the same functions can be achieved by exploitation of nonhomologous genes.
Host-dependent bacteria are able to secure many compounds required for metabolism from the host's cytoplasm or tissue. They can, in turn, discard their own biosynthetic pathways and associated genes. This removal explains many of the specific gene losses. For example, the ''Rickettsia'' species, which rely on specific energy substrates from their host, have lost many of their native energy metabolism genes. Similarly, most small genomes have lost their amino acid biosynthesis genes, as these amino acids are supplied by the host instead. One exception is ''Buchnera'', an obligate maternally transmitted symbiont of aphids. It retains 54 genes for biosynthesis of crucial amino acids, but no longer has pathways for those amino acids that the host can synthesize. Pathways for nucleotide biosynthesis are gone from many reduced genomes. Those anabolic pathways that evolved through niche adaptation remain in particular genomes.
The hypothesis that unused genes are eventually removed does not explain why many of the removed genes would indeed remain helpful in obligate pathogens. For example, many eliminated genes code for products that are involved in universal cellular processes, including replication, transcription, and translation. Even genes supporting DNA recombination and repair are deleted from every small genome. However, some genes, such as those encoding the RecA protein, were found to be nearly ubiquitous, indicating that a large majority of bacterial genomes are probably capable of homologous recombination. In addition, small genomes have fewer tRNAs, utilizing one for several amino acids. So, a single anticodon pairs with multiple codons, which likely yields less-than-optimal translation machinery. It is unknown why obligate intracellular pathogens would benefit by retaining fewer tRNAs and fewer DNA repair enzymes.
Another factor to consider is the change in population that corresponds to an evolution towards an obligately pathogenic life. Such a shift in lifestyle often results in a reduction in the genetic population size of a lineage, since there is a finite number of hosts to occupy. The resulting genetic drift may lead to fixation of mutations that inactivate otherwise beneficial genes, or may decrease the efficiency of gene products. Hence, not only will useless genes be lost (as mutations disrupt them once the bacterium has settled into host dependency), but beneficial genes may also be lost if genetic drift renders purifying selection ineffective.
The number of universally maintained genes is small and inadequate for independent cellular growth and replication, so small-genome species must achieve such feats by means of varying genes. This is done partly through nonorthologous gene displacement. That is, the role of one gene is taken over by another gene that achieves the same function. Redundancy within the ancestral, larger genome is eliminated. The descendant small genome content depends on the content of chromosomal deletions that occur in the early stages of genome reduction. Even the very small genome of ''M. genitalium'' possesses dispensable genes. In a study in which single genes of this organism were inactivated using transposon-mediated mutagenesis, at least 129 of its 484 ORFs were not required for growth. A much smaller genome than that of ''M. genitalium'' is therefore feasible.
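
The two rules of thumb used in this section (over 80% of a bacterial genome is intact ORFs at roughly 1 kb per gene, and at least 129 of 484 ''M. genitalium'' ORFs are dispensable) lend themselves to a quick back-of-the-envelope calculation. The sketch below is illustrative only: the function names are hypothetical, the 0.8 coding fraction is the lower bound quoted above, and the 600,000 bp genome size is an invented round number, not a measured value.

```python
def estimated_gene_count(genome_bp, coding_fraction=0.8, mean_gene_bp=1000):
    """Rough gene count from genome size.

    Assumes ~80% of the genome is intact ORFs and ~1 kb per gene,
    the approximations quoted in the text.
    """
    return round(genome_bp * coding_fraction / mean_gene_bp)


def dispensable_fraction(dispensable_orfs, total_orfs):
    """Fraction of ORFs found not to be required for growth."""
    return dispensable_orfs / total_orfs


# Hypothetical 600,000 bp genome (illustrative size, not a measurement).
print(estimated_gene_count(600_000))             # prints: 480

# Figures quoted above for M. genitalium: at least 129 of 484 ORFs.
frac = dispensable_fraction(129, 484)
print(f"{frac:.1%} dispensable, {484 - 129} ORFs remaining")
# prints: 26.7% dispensable, 355 ORFs remaining
```

Because the text says "at least 129", the 355 remaining ORFs should be read as an upper bound on the required set under the conditions of that study.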


Doubling time

One theory predicts that bacteria have smaller genomes due to a selective pressure on genome size to ensure faster replication. The theory is based upon the logical premise that smaller bacterial genomes take less time to replicate. Consequently, smaller genomes would be selected preferentially due to enhanced fitness. A study done by Mira et al. indicated little to no correlation between genome size and doubling time. The data indicate that selection for faster replication is not a suitable explanation for the small sizes of bacterial genomes. Still, many researchers believe there is some selective pressure on bacteria to maintain a small genome size.


Deletional bias

Selection is but one process involved in evolution. Two other major processes (mutation and genetic drift) can account for the genome sizes of various types of bacteria. A study done by Mira et al. examined the size of insertions and deletions in bacterial pseudogenes. Results indicated that mutational deletions tend to be larger than insertions in bacteria in the absence of gene transfer or gene duplication. Insertions caused by horizontal or lateral gene transfer and gene duplication tend to involve transfer of large amounts of genetic material. Assuming a lack of these processes, genomes will tend to reduce in size in the absence of selective constraint. Evidence of a deletional bias is present in the respective genome sizes of free-living bacteria, facultative and recently derived parasites, and obligate parasites and symbionts.
Free-living bacteria tend to have large population sizes and are subject to more opportunity for gene transfer. As such, selection can effectively operate on free-living bacteria to remove deleterious sequences, resulting in a relatively small number of pseudogenes. Furthermore, additional selective pressure is evident, as free-living bacteria must produce all gene products independently of a host. Given that there is sufficient opportunity for gene transfer to occur and there are selective pressures against even slightly deleterious deletions, it is intuitive that free-living bacteria should have the largest genomes of all bacterial types.
Recently formed parasites undergo severe bottlenecks and can rely on host environments to provide gene products. As such, in recently formed and facultative parasites, there is an accumulation of pseudogenes and transposable elements due to a lack of selective pressure against deletions. The population bottlenecks reduce gene transfer and, as such, deletional bias ensures the reduction of genome size in parasitic bacteria.
Obligate parasites and symbionts have the smallest genome sizes due to prolonged effects of deletional bias. Parasites which have evolved to occupy specific niches are not exposed to much selective pressure. As such, genetic drift dominates the evolution of niche-specific bacteria. Extended exposure to deletional bias ensures the removal of most superfluous sequences. Symbionts occur in drastically lower numbers and undergo the most severe bottlenecks of any bacterial type. There is almost no opportunity for gene transfer for endosymbiotic bacteria, and thus genome compaction can be extreme. One of the smallest bacterial genomes ever to be sequenced is that of the endosymbiont ''Carsonella ruddii''. At 160 kbp, the genome of ''Carsonella'' is one of the most streamlined examples of a genome examined to date.
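
The core of the deletional-bias argument (deletions tend to be larger than insertions, so a genome with no selective constraint and no gene transfer shrinks over time) can be illustrated with a toy simulation. Everything here is an assumption made for the sketch: the function name, the equal probability of insertions and deletions, the uniform size distributions, and the mean sizes of 5 bp and 10 bp are hypothetical and are not taken from Mira et al.

```python
import random


def simulate_genome_size(start_bp, events, mean_ins_bp=5, mean_del_bp=10, seed=0):
    """Toy model of neutral indel accumulation (no selection, no gene transfer).

    Each event is an insertion or a deletion with equal probability.
    Sizes are uniform on 1..(2*mean - 1), so their average equals `mean`.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    size = start_bp
    for _ in range(events):
        if rng.random() < 0.5:
            size += rng.randint(1, 2 * mean_ins_bp - 1)   # insertion
        else:
            size -= rng.randint(1, 2 * mean_del_bp - 1)   # deletion
        size = max(size, 0)  # a genome cannot have negative length
    return size


start = 1_000_000
end = simulate_genome_size(start, events=10_000)
print(f"start: {start} bp, after 10,000 indel events: {end} bp")
```

With these parameters the expected change per event is 0.5 × 5 − 0.5 × 10 = −2.5 bp, so about 25,000 bp are lost on average over 10,000 events. Setting `mean_ins_bp` equal to `mean_del_bp` removes the bias and leaves the size fluctuating around its starting value.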


Genomic reduction

Molecular phylogenetics has revealed that every clade of bacteria with genome sizes under 2 Mb was derived from ancestors with much larger genomes, thus refuting the hypothesis that bacteria evolved by the successive doubling of small-genomed ancestors. Studies performed by Nilsson et al. examined the rates of genome reduction in obligate bacteria. Bacteria were cultured with frequent bottlenecks, and cells were grown in serial passage to reduce gene transfer, so as to mimic the conditions of endosymbiotic bacteria. The data predicted that bacteria with a one-day generation time lose as many as 1,000 kbp in as few as 50,000 years (a relatively short evolutionary time period). Furthermore, after deleting genes essential to the methyl-directed DNA mismatch repair (MMR) system, it was shown that the rate of bacterial genome size reduction increased by as much as 50 times. These results indicate that genome size reduction can occur relatively rapidly, and that loss of certain genes can speed up the process of bacterial genome compaction.
This is not to suggest that all bacterial genomes are reducing in size and complexity. While many types of bacteria have reduced in genome size from an ancestral state, there are still a huge number of bacteria that have maintained or increased genome size over ancestral states. Free-living bacteria experience huge population sizes, fast generation times, and a relatively high potential for gene transfer. While deletional bias tends to remove unnecessary sequences, selection can operate significantly amongst free-living bacteria, resulting in the evolution of new genes and processes.
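
The figures quoted above (a one-day generation time, up to 1,000 kbp lost in as little as 50,000 years, and an up to 50-fold speed-up without MMR) can be turned into a per-generation rate with simple arithmetic. The sketch below only restates those numbers; the function name and the 365-day year are assumptions made for the calculation, and because the text gives upper bounds, the results are upper bounds as well.

```python
DAYS_PER_YEAR = 365  # assumption: one generation per day, 365 days per year


def loss_per_generation(total_loss_bp, years, generations_per_year=DAYS_PER_YEAR):
    """Average DNA loss per generation, in base pairs."""
    return total_loss_bp / (years * generations_per_year)


base_rate = loss_per_generation(1_000_000, 50_000)  # 1,000 kbp in 50,000 years
mmr_deficient_rate = base_rate * 50                  # up to 50-fold faster

print(f"{50_000 * DAYS_PER_YEAR:,} generations")             # 18,250,000 generations
print(f"{base_rate:.3f} bp per generation")                  # 0.055 bp per generation
print(f"{mmr_deficient_rate:.2f} bp per generation (no MMR)")  # 2.74 bp per generation (no MMR)
```

In other words, the predicted loss averages out to roughly one base pair every 18 generations, which is small per generation but large over evolutionary time.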


Horizontal gene transfer

Unlike eukaryotes, which evolve mainly through the modification of existing genetic information, bacteria have acquired a large percentage of their genetic diversity by the horizontal transfer of genes. This creates quite dynamic genomes, in which DNA can be introduced into and removed from the chromosome. Bacteria have more variation in their metabolic properties, cellular structures, and lifestyles than can be accounted for by point mutations alone. For example, none of the phenotypic traits that distinguish ''E. coli'' from ''Salmonella enterica'' can be attributed to point mutation. On the contrary, evidence suggests that horizontal gene transfer has bolstered the diversification and speciation of many bacteria.
Horizontal gene transfer is often detected via DNA sequence information. DNA segments obtained by this mechanism often reveal a narrow phylogenetic distribution between related species. Furthermore, these regions sometimes display an unexpected level of similarity to genes from taxa that are assumed to be quite divergent. Although gene comparisons and phylogenetic studies are helpful in investigating horizontal gene transfer, the DNA sequences of genes are even more revelatory of their origin and ancestry within a genome. Bacterial species differ widely in overall GC content, although the genes in any one species' genome are roughly identical with respect to base composition, patterns of codon usage, and frequencies of di- and trinucleotides. As a result, sequences that are newly acquired through lateral transfer can be identified via their characteristics, which remain those of the donor. For example, many of the ''S. enterica'' genes that are not present in ''E. coli'' have base compositions that differ from the overall 52% GC content of the entire chromosome. Within this species, some lineages have more than a megabase of DNA that is not present in other lineages. The base compositions of these lineage-specific sequences imply that at least half of these sequences were captured through lateral transfer. Furthermore, the regions adjacent to horizontally obtained genes often have remnants of translocatable elements, transfer origins of plasmids, or known attachment sites of phage integrases. In some species, a large proportion of laterally transferred genes originate from plasmid-, phage-, or transposon-related sequences.
Although sequence-based methods reveal the prevalence of horizontal gene transfer in bacteria, the results tend to underestimate the magnitude of this mechanism, since sequences obtained from donors whose sequence characteristics are similar to those of the recipient will avoid detection. Comparisons of completely sequenced genomes confirm that bacterial chromosomes are amalgams of ancestral and laterally acquired sequences. The hyperthermophilic Eubacteria ''Aquifex aeolicus'' and ''Thermotoga maritima'' each have many genes that are similar in protein sequence to homologues in thermophilic Archaea. 24% of ''Thermotoga's'' 1,877 ORFs and 16% of ''Aquifex's'' 1,512 ORFs show high matches to an Archaeal protein, while mesophiles such as ''E. coli'' and ''B. subtilis'' have far smaller proportions of genes that are most like Archaeal homologues.
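
The idea of spotting laterally acquired genes by their atypical base composition can be sketched as a simple GC-content screen. This is a minimal illustration, not a published detection method: the function names, the 10-percentage-point tolerance, and the toy sequences are all hypothetical, and only the 52% genome-wide figure comes from the text above. Real analyses also use codon usage and di- and trinucleotide frequencies, which this sketch ignores.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of `seq`."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else 0.0


def flag_atypical(genes, genome_gc=0.52, tolerance=0.10):
    """Names of genes whose GC content deviates from the genome-wide value.

    `genome_gc` defaults to the 52% quoted for the S. enterica chromosome;
    `tolerance` is an arbitrary illustrative cut-off.
    """
    return [name for name, seq in genes.items()
            if abs(gc_content(seq) - genome_gc) > tolerance]


# Toy sequences invented for illustration; real genes are ~1 kb long.
toy_genes = {
    "native_like": "ATGC" * 6 + "GC",   # 14/26 GC, close to 52%
    "at_rich":     "AATT" * 5 + "GC",   # 2/22 GC
    "gc_rich":     "GGCC" * 5 + "AT",   # 20/22 GC
}
print(flag_atypical(toy_genes))  # prints: ['at_rich', 'gc_rich']
```

As noted above, a screen of this kind underestimates transfer: a gene from a donor with a GC content near 52% would pass unflagged.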


Mechanisms of lateral transfer

The genesis of new abilities through horizontal gene transfer has three requirements. First, there must be a route by which donor DNA can enter the recipient cell. Second, the acquired sequence must be integrated into the recipient genome. Finally, the integrated genes must benefit the recipient bacterium. The first two steps can be achieved via three mechanisms: transformation, transduction and conjugation.

Transformation involves the uptake of naked DNA from the environment. Through transformation, DNA can be transmitted between distantly related organisms. Some bacterial species, such as ''Haemophilus influenzae'' and ''Neisseria gonorrhoeae'', are continuously competent to accept DNA. Other species, such as ''Bacillus subtilis'' and ''Streptococcus pneumoniae'', become competent when they enter a particular phase in their lifecycle. Transformation in ''N. gonorrhoeae'' and ''H. influenzae'' is effective only if particular recognition sequences are found in the recipient genomes (5'-GCCGTCTGAA-3' and 5'-AAGTGCGGT-3', respectively). Although the existence of certain uptake sequences improves transformation capability between related species, many of the inherently competent bacterial species, such as ''B. subtilis'' and ''S. pneumoniae'', do not display sequence preference.

New genes may also be introduced into bacteria by a bacteriophage that has replicated within a donor, through generalized or specialized transduction. The amount of DNA that can be transmitted in one event is constrained by the size of the phage capsid (the upper limit is about 100 kilobases). While phages are numerous in the environment, the range of microorganisms that can be transduced depends on receptor recognition by the bacteriophage. Transduction does not require donor and recipient cells to be present at the same time or in the same place. Phage-encoded proteins both mediate the transfer of DNA into the recipient cytoplasm and assist integration of DNA into the chromosome.

Conjugation involves physical contact between donor and recipient cells and can mediate the transfer of genes between domains, such as between bacteria and yeast. DNA is transmitted from donor to recipient by either a self-transmissible or a mobilizable plasmid. Conjugation may also mediate the transfer of chromosomal sequences by plasmids that integrate into the chromosome.

Despite the multitude of mechanisms mediating gene transfer among bacteria, a transfer succeeds only if the received sequence is stably maintained in the recipient. Stable maintenance can be achieved through one of several processes: persistence as an episome, homologous recombination, or illegitimate incorporation through chance double-strand break repair.
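The uptake-sequence requirement described above for ''N. gonorrhoeae'' and ''H. influenzae'' can be illustrated with a minimal sketch that counts occurrences of a recognition sequence on both strands of a DNA string. The two motifs are the ones quoted in the text; the function names and the toy sequence are hypothetical, and real analyses would read a genome from a sequence file rather than a literal string.

```python
# Recognition sequences quoted in the text (5' to 3').
NEISSERIA_MOTIF = "GCCGTCTGAA"
HAEMOPHILUS_MOTIF = "AAGTGCGGT"

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_motif(seq: str, motif: str) -> int:
    """Count (possibly overlapping) occurrences of motif in seq."""
    seq, motif = seq.upper(), motif.upper()
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def count_uptake_sequences(seq: str, motif: str) -> int:
    """Count motif occurrences on both strands of a double-stranded sequence."""
    forward = count_motif(seq, motif)
    rc_motif = reverse_complement(motif)
    if rc_motif == motif.upper():
        # A palindromic motif would otherwise be counted twice.
        return forward
    return forward + count_motif(seq, rc_motif)


# Hypothetical toy sequence: one forward copy and one reverse-strand copy
# of the Neisseria motif, and no copy of the Haemophilus motif.
toy_sequence = "TTGCCGTCTGAAGG" + "ACGT" + "TTCAGACGGCAA"

print(count_uptake_sequences(toy_sequence, NEISSERIA_MOTIF))
print(count_uptake_sequences(toy_sequence, HAEMOPHILUS_MOTIF))
```

Searching both strands matters because a recognition sequence on either strand of the incoming double-stranded DNA is present in the same molecule; whether a given species recognises the motif in both orientations is not stated in the text and is treated here as an assumption.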


Natural transformation

Natural transformation is a DNA transfer process that depends on the expression of numerous bacterial genes. In order for a bacterium to bind, take up and recombine exogenous DNA into its chromosome, it must enter a special physiological state referred to as "competence". Competence development in the bacterium ''Bacillus subtilis'' requires expression of about 40 genes. In general, the DNA integrated into the host chromosome is (with rare exceptions) derived from another bacterium of the same species, and is therefore homologous to the resident chromosome. In ''B. subtilis'' the transferred DNA is more than 1 million bases long, is likely double-stranded, and is often more than a third of the total chromosome length of 4215 kb. Approximately 7-9% of the recipient cells take up an entire chromosome. The capacity for natural transformation appears to be common among prokaryotes, and thus far 67 prokaryotic species (in seven different phyla) are known to undergo this process. Competence for transformation is typically induced by high cell density and/or nutritional limitation, conditions associated with the stationary phase of bacterial growth. Competence is also specifically induced by conditions that damage DNA. For example, transformation is induced in ''Streptococcus pneumoniae'' by the DNA-damaging agents mitomycin C (a DNA cross-linking agent) and fluoroquinolone (a topoisomerase inhibitor that causes double-strand breaks). In ''Bacillus subtilis'', transformation is stimulated by exposure to UV light, a DNA-damaging agent. In ''Helicobacter pylori'', ciprofloxacin, an agent that interacts with DNA gyrase and causes double-strand breaks, induces expression of competence genes, thus increasing the frequency of transformation. Using ''Legionella pneumophila'', Charpentier et al. examined 64 toxic molecules to find out which of them induce competence. Of these toxic compounds, only six, all DNA-damaging agents, caused strong induction. Bacteria that are growing logarithmically differ from stationary-phase bacteria in the number of genome copies present in the cell, and this has implications for the ability to carry out an important
DNA repair process. During logarithmic growth, two or more copies of any particular region of the chromosome are ordinarily present in a bacterial cell, as cell division is not precisely matched with chromosome replication. Homologous recombinational repair is an important DNA repair process that is particularly effective for repairing double-strand damage, such as double-strand breaks. This process depends on a second homologous chromosome in addition to the damaged chromosome. During logarithmic growth, DNA damage in one chromosome may be removed by homologous recombinational repair using sequence information from the other homologous chromosome. However, when cells approach stationary phase they typically have just one copy of the chromosome, and homologous recombinational repair then requires input of a homologous template from outside the cell by transformation. To determine whether the adaptive function of transformation is repair of DNA damage, a series of experiments was performed using ''B. subtilis'' irradiated by UV light as the damaging agent (reviewed by Michod et al. and Bernstein et al.). These experiments produced results indicating that transforming DNA acts to repair potentially lethal DNA damage caused by UV light in the recipient DNA. The particular process likely responsible for repair was homologous recombinational repair. Thus transformation in bacteria can be regarded as a primitive sexual process, in the sense that it involves interaction of homologous DNA from two individuals to form recombinant DNA that is then passed on to succeeding generations. Transformation in prokaryotes may have been the ancestral process that evolved into meiotic sexual reproduction in eukaryotes (see Evolution of sexual reproduction; Meiosis).


Traits introduced through lateral gene transfer

Antimicrobial resistance genes allow an organism to expand its ecological niche, since it can now survive in the presence of previously lethal compounds. Because the benefit a bacterium gains from such genes is independent of time and place, highly mobile sequences are selected for. Plasmids are readily mobilized between taxa and are the most frequent means by which bacteria acquire antibiotic resistance genes. Adoption of a pathogenic lifestyle often yields a fundamental shift in an organism's ecological niche. The erratic phylogenetic distribution of pathogenic organisms implies that bacterial virulence is a consequence of the presence, or acquisition, of genes that are missing in avirulent forms. Evidence of this includes the discovery of large 'virulence' plasmids in pathogenic ''Shigella'' and ''Yersinia'', as well as the ability to confer pathogenic properties on ''E. coli'' via experimental exposure to genes from other species.


Computer-made form

In April 2019, scientists at ETH Zurich reported the creation of the world's first bacterial genome, named ''Caulobacter ethensis-2.0'', made entirely by a computer, although a related viable form of ''C. ethensis-2.0'' does not yet exist.


See also

* Comparative genomics
* Fungal genome

