Minimal Genome

The minimal genome is a concept which can be defined as the set of genes sufficient for life to exist and propagate under nutrient-rich and stress-free conditions. Alternatively, it may be defined as the gene set supporting life in an axenic cell culture in rich media, and it is thought that what makes up the minimal genome will depend on the environmental conditions that the organism inhabits. This minimal genome concept assumes that genomes can be reduced to a bare minimum, given that they contain many non-essential genes of limited or situational importance to the organism. Therefore, if a collection of all the essential genes were put together, a minimal genome could be created artificially in a stable environment. By adding more genes, an organism with desired characteristics could then be created.

The concept of the minimal genome arose from the observation that many genes do not appear to be necessary for survival. In order to create a new organism, a scientist must determine the minimal set of genes required for metabolism and replication. This can be achieved by experimental and computational analysis of the biochemical pathways needed to carry out basic metabolism and reproduction. A good model for a minimal genome is ''Mycoplasma genitalium'' due to its very small genome size. Most genes that are used by this organism are usually considered essential for survival; based on this concept a minimal set of 256 genes has been proposed.

Scientifically, minimal genome projects allow the identification of the most essential genes and the reduction of genetic complexity, making engineered strains more predictable. Industrially and agriculturally, they could be used to engineer plants to resist herbicides or harsh environments; bacteria to synthetically produce chemicals; or microbes to produce beneficial bio-products. Environmentally, they could be a source of clean energy or renewable chemicals, or help in carbon sequestration from the atmosphere.



By one early investigation, the minimal genome of a bacterium should include a virtually complete set of proteins for replication and translation, a transcription apparatus comprising four RNA polymerase subunits, including the sigma factor, rudimentary proteins sufficient for recombination and repair, several chaperone proteins, the capacity for anaerobic metabolism through glycolysis and substrate-level phosphorylation, transamination of glutamyl-tRNA to glutaminyl-tRNA, lipid (but no fatty acid) biosynthesis, eight cofactor enzymes, protein export machinery, and a limited metabolite transport network including membrane ATPases. Proteins involved in the minimal bacterial genome tend to be substantially more related to proteins found in archaea and eukaryotes than the average gene in the bacterial genome is, indicating a substantial number of universally (or near universally) conserved proteins. Minimal genomes reconstructed on the basis of existing genes do not preclude simpler systems in more primitive cells, such as an RNA world genome, which would not need the DNA replication machinery that is otherwise part of the minimal genome of current cells.

The genes which most frequently survive gene loss include those involved in DNA replication, transcription, and translation, although a number of exceptions are known. For example, loss is frequently seen in subunits of the DNA polymerase holoenzyme and in some DNA repair genes. The majority of ribosomal proteins are retained (though some, like RpmC, are sometimes missing). In some cases, some tRNA synthetases are lost. Gene loss is also seen in genes for components of the cellular envelope, biosynthesis of biomolecules like purines, energy metabolism, and more.

The minimal genome corresponds to small genome sizes, as bacterial genome size correlates with the number of protein-coding genes, typically one gene per kilobase. ''Mycoplasma genitalium'', with a 580 kb genome and 482 protein-coding genes, is a key model for minimal genomes.
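
The "one gene per kilobase" rule of thumb can be checked against the genome sizes and gene counts quoted in this article. The following is a minimal sketch in Python. The dictionary name, the function name and the choice of examples are illustrative assumptions, and the free-living figure uses the approximate "~1100 genes" value given in the text.

```python
# Genome size (kb) and protein-coding gene count, as quoted in this article.
# The selection of examples is illustrative, not exhaustive.
GENOMES = {
    "smallest free-living bacterium (approx.)": (1300, 1100),
    "Mycoplasma genitalium": (580, 482),
    "Carsonella ruddii": (160, 182),
    "Nasuia deltocephalinicola": (112, 137),
}


def genes_per_kb(size_kb: float, gene_count: int) -> float:
    """Return the protein-coding gene density in genes per kilobase."""
    return gene_count / size_kb


# Gene density for each example genome.
densities = {name: genes_per_kb(*pair) for name, pair in GENOMES.items()}

for name, density in densities.items():
    print(f"{name}: {density:.2f} genes per kb")
```

With these figures every density falls between roughly 0.8 and 1.25 genes per kilobase, which is consistent with the stated rule of thumb even for the most reduced symbiont genomes.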


In nature


Gene outsourcing

The smallest known genome of a free-living bacterium is 1.3 Mb with ~1100 genes. However, significantly more reduced genomes are commonly observed in naturally occurring symbiotic and parasitic organisms. Genome reduction driven by mutation and genetic drift in small and asexual populations with biases for gene deletion can be seen in symbionts and parasites, which commonly experience rapid evolution, codon reassignments, biases for AT nucleotide compositions, and elevated levels of protein misfolding, which results in a heavy dependence on molecular chaperones to ensure protein functionality. These effects, which coincide with the proliferation of mobile genetic elements, pseudogenes, genome rearrangements, and chromosomal deletions, are best studied and observed in more recently evolved symbionts.

Genome reduction occurs because the symbiont or parasite can outsource a usual cellular function to another cell and so, no longer needing to carry out this function for itself, subsequently loses its own genes for that function. The most extreme examples of genome reduction have been found in maternally transmitted endosymbionts which have experienced lengthy coevolution with their hosts and, in the process, lost a substantial amount of their cellular autonomy. Beneficial symbionts have a greater capacity for genome reduction than parasites do, as host co-adaptation allows them to lose additional crucial genes. Another important distinction is that parasites lose both the gene and its associated function, whereas endosymbionts often retain the function of the lost gene, since that function is taken over by the host.


Endosymbionts

For endosymbionts in some lineages, it is possible for the entire genome to be lost. For example, some mitosomes and hydrogenosomes (degenerate versions of the mitochondria known in some organisms) have experienced a total gene loss and have no remaining genes, whereas the human mitochondrion still retains some of its genome. The extant genome in the human mitochondrial organelle is 16.6 kb in length and contains 37 genes. Across organisms, the mitochondrial genome can code for between 3 and 67 proteins, with suggestions that the last eukaryotic common ancestor encoded a minimum of 70 genes in its genome. The smallest known mitochondrial genome is that of ''Plasmodium falciparum'', with a genome size of 6 kb containing three protein-coding genes and a few rRNA genes. (On the other hand, the largest known mitochondrial genome is 490 kb.) Genomes nearly as small can be found in related apicomplexans as well. In contrast, the mitochondrial genomes of land plants have expanded to over 200 kb, with the largest one (at over 11 Mb) exceeding the size of the genomes of bacteria and even the simplest eukaryotes.

Organelles known as plastids in plants (including chloroplasts, chromoplasts, and leucoplasts), once free-living cyanobacteria, typically retain longer genomes on the order of 100-200 kb with 80-250 genes. In one analysis of 15 chloroplast genomes, the analyzed chloroplasts had between 60 and 200 genes. Across these chloroplasts, a total of 274 distinct protein-coding genes were identified, and only 44 of them were found in all sequenced chloroplast genomes.

Examples of organisms which have experienced genome reduction include species of ''Buchnera'', ''Chlamydia'', ''Treponema'', ''Mycoplasma'', and many others. Comparisons of multiple sequenced genomes of endosymbionts in multiple isolates of the same species and lineages have confirmed that even long-time symbionts are still experiencing ongoing gene loss and transfer to the nucleus. Nuclear integrants of mitochondrial or plastid DNA have sometimes been termed "numts" and "nupts" respectively.


Cellular parasites and insect symbionts

A number of symbionts have now been discovered with genomes under 500 kb in length, the majority of them being bacterial symbionts of insects, typically from the taxa ''Pseudomonadota'' and ''Bacteroidota''. The parasitic archaeon ''Nanoarchaeum equitans'' has a genome 491 kb in length. In 2002, it was found that some species of the genus ''Buchnera'' have a reduced genome of only 450 kb in size. In 2021, the endosymbiont "''Candidatus'' Azoamicus ciliaticola" was found to have a genome 290 kb in length. The symbiont ''Zinderia insecticola'' was found to have a genome of 208 kb in 2010. In 2006, another endosymbiont, ''Carsonella ruddii'', was found with a reduced genome 160 kb in length encompassing 182 protein-coding genes. Surprisingly, it was found that gene loss in ''Carsonella'' symbionts is an ongoing process. Other intermediate stages in gene loss have been observed in other reduced genomes, including the transition of some genes into pseudogenes as a result of accumulating mutations that are not selected against, since the host carries out the function of that gene.

The genome of ''Candidatus'' Hodgkinia cicadicola, a symbiont of cicadas, was found to be 144 kb. In 2011, ''Tremblaya princeps'' was found to contain an intracellular endosymbiont with a genome of 139 kb, reduced to the point that even some translation genes had been lost. A 2013 study found bacterial symbionts of insects with even smaller genomes, the smallest to date. Specifically, two leafhopper symbionts contained highly reduced genomes: while ''Sulcia muelleri'' had a genome of 190 kb, ''Nasuia deltocephalinicola'' had a genome of only 112 kb containing 137 protein-coding genes. Combined, the genomes of these two symbionts can only synthesize ten amino acids, in addition to some of the machinery involved in DNA replication, transcription, and translation. The genes for ATP synthesis through oxidative phosphorylation have been lost, however.


Viruses and virus-like particles

Viruses and virus-like particles have the smallest genomes in nature. For instance, the genome of bacteriophage MS2 consists of only 3569 nucleotides (single-stranded RNA) and encodes just four proteins, which overlap to make efficient use of the genome space. Similarly, among eukaryotic viruses, porcine circoviruses are among the smallest. They encode only 2–3 open reading frames. Viroids are circular RNA molecules which do not have any protein-coding genes at all, although the RNA molecule itself acts as a ribozyme to help enable its replication. The genome of a viroid is between 200 and 400 nucleotides in length.


History


NASA collaboration

This concept arose as a result of a collaborative effort between the National Aeronautics and Space Administration (NASA) and two scientists: Harold Morowitz and Mark Tourtellotte. In the 1960s, NASA was searching for extraterrestrial life forms, assuming that if they existed they might be simple creatures. To attract attention, Morowitz published work describing mycoplasmas as the smallest and simplest self-replicating creatures. NASA and the two scientists joined forces and came up with the idea of assembling a living cell from the components of mycoplasmas. Mycoplasmas were selected as the best candidate for cell reassembly, since they are composed of a minimal set of components, such as a plasma membrane, ribosomes and a circular double-stranded DNA. Morowitz's major idea was to define the entire machinery of the mycoplasma cell at the molecular level. He announced that an international effort would help him accomplish this main objective.

The main plan consisted of:
1. Physical and functional mapping, with complete sequencing, of the mycoplasma genome
2. Determining the open reading frames (ORFs)
3. Determining the encoded amino acids
4. Understanding the functions of the genes
5. Final step: reassembling the mycoplasma's cellular machinery


Attempts

By the 1980s, Richard Herrmann's laboratory had fully sequenced and genetically characterized the 800 kb genome of ''M. pneumoniae''. Despite the small size of the genome, the process took three years. In 1995, another laboratory, the Institute for Genomic Research (TIGR) in Maryland, collaborated with teams at Johns Hopkins and the University of North Carolina. This group chose to sequence the genome of ''Mycoplasma genitalium'', which is only 580 kb long. This was completed in 6 months. Sequencing ''M. genitalium'' revealed conserved genes crucial for defining essential life functions in a minimal self-replicating cell, making it a key candidate for the minimal genome project.

Finding a minimal set of essential genes is usually done by selective inactivation or deletion of genes, followed by testing the effect of each change under a given set of conditions. The J. Craig Venter Institute conducted experiments of this type on ''M. genitalium'' and found 382 essential genes. The institute later started a project to create a synthetic organism named ''Mycoplasma laboratorium'', using the minimal set of genes identified from ''M. genitalium''.


Studies of orthologs

Reconstruction of a minimal genome is possible by using knowledge of existing genomes, from which the set of genes essential for living can also be determined. Once the set of essential genetic elements is known, one can proceed to define the key pathways and core players by modeling simulations and wet-lab genome engineering. As of 1999, the two organisms to which the 'minimal gene set for cellular life' approach had been applied were ''Haemophilus influenzae'' and ''M. genitalium''. A list of orthologous proteins was compiled in the hope that it would contain the proteins necessary for cell survival, as orthology analysis traces how two organisms evolved and shed non-essential genes. Since ''H. influenzae'' and ''M. genitalium'' are Gram-negative and Gram-positive bacteria, respectively, and are separated by a vast evolutionary distance, it was expected that their shared genes would be enriched in genes of universal importance. However, the 244 orthologs detected contained no parasitism-specific proteins.

The conclusion of this analysis was that similar biochemical functions might be performed by non-orthologous proteins. Even when the biochemical pathways of these two organisms were mapped, several pathways were present but many were incomplete. Proteins carrying out functions common to the two organisms were in some cases non-orthologous to each other. Much of the research focuses mainly on the ancestral genome and less on the minimal genome. Studies of these existing genomes have helped determine that orthologous genes found in these two species are not necessarily essential for survival; in fact, non-orthologous genes were found to be more important. It was also determined that proteins do not need to have the same sequence or a common three-dimensional fold in order to share the same function.

Distinguishing between orthologs and paralogs and detecting displacements of orthologs have been quite beneficial in reconstructing evolution and determining the minimal gene set required for cellular life. Instead of conducting a strict orthology study, comparing groups of orthologs and their occurrence in most clades, rather than in every species, helped identify genes that had been lost or displaced. Only completely sequenced genomes have enabled the study of orthologs among groups of organisms. Without a fully sequenced genome it would not be possible to determine the essential minimal gene set required for survival.


JCVI projects

The J. Craig Venter Institute (JCVI) conducted a study to find all the essential genes of ''M. genitalium'' through global transposon mutagenesis. As a result, they found that 382 out of 482 protein-coding genes were essential. Genes encoding proteins of unknown function constitute 28% of the essential protein-coding gene set. Before conducting this study, the JCVI had performed another study on the non-essential genes (genes not required for growth) of ''M. genitalium'', in which they reported the use of transposon mutagenesis. Although the non-essential genes have been identified, it has not been confirmed whether their products have any important biological functions. It was only through gene essentiality studies of bacteria that the JCVI has been able to compose hypothetical minimal gene sets.


1999 and 2005 publications

In JCVI's 1999 study of two organisms, ''M. genitalium'' and ''Mycoplasma pneumoniae'', they mapped around 2,200 transposon insertion sites and identified 130 putative non-essential genes among ''M. genitalium'' protein-coding genes or ''M. pneumoniae'' orthologs of ''M. genitalium'' genes. In their experiment they grew a set of Tn4001-transformed cells for many weeks and isolated the genomic DNA from this mixture of mutants. Amplicons were sequenced to detect the transposon insertion sites in the mycoplasma genomes. Genes that contained transposon insertions encoded hypothetical proteins or proteins considered non-essential. However, some of the disrupted genes initially considered non-essential turned out, after further analysis, to be essential. This error could have arisen because the genes tolerated the transposon insertions and so were not actually disrupted; because cells contained two copies of the same gene; or because the gene product was supplied by other cells in the mixed pools of mutants. A transposon insertion in a gene was taken to mean that the gene was disrupted and hence non-essential, but because the absence of the gene products was not confirmed, all disrupted genes were mistakenly classified as non-essential.

The 1999 study was later expanded, and the updated results were published in 2005. Some of the disrupted genes thought to be essential were the isoleucyl- and tyrosyl-tRNA synthetases (MG345 and MG455), the DNA replication gene ''dnaA'' (MG469), and DNA polymerase III subunit a (MG261). The key improvement in this study was isolating and characterizing ''M. genitalium'' Tn4001 insertions in individual colonies one by one, whereas previously many colonies containing a mixture of mutants had been isolated together. A filter cloning approach helped in separating the mixtures of mutants. The individual analysis of each colony gave more results and better estimates of the essential genes necessary for life.

The revised study reports a substantially different set of non-essential genes. Of the 130 non-essential genes first claimed, only 67 remained. Of the other 63 genes, 26 were disrupted only in ''M. pneumoniae'', which means that some ''M. genitalium'' orthologs of non-essential ''M. pneumoniae'' genes were actually essential. The authors consider that they have identified almost all of the non-essential genes in ''M. genitalium'': the number of newly disrupted genes reached a plateau as a function of the number of colonies analyzed, and they claim a total of 100 non-essential genes out of the 482 protein-coding genes in ''M. genitalium''.
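
The gene counts reported for the 1999 and 2005 studies can be cross-checked with simple arithmetic. The sketch below uses only figures stated in this article (482 protein-coding genes, 100 non-essential genes, 130 putative non-essential genes in 1999 of which 67 remained, 26 disrupted only in ''M. pneumoniae'', and 28% of essential genes being of unknown function). The variable names are illustrative, and the unknown-function count is an approximation derived here from the quoted percentage rather than a number taken from the studies.

```python
# Figures quoted in the article for M. genitalium.
TOTAL_PROTEIN_CODING = 482   # protein-coding genes
NON_ESSENTIAL_2005 = 100     # non-essential genes claimed in 2005
PUTATIVE_1999 = 130          # putative non-essential genes claimed in 1999
RETAINED_FROM_1999 = 67      # 1999 candidates still considered non-essential
ONLY_IN_M_PNEUMONIAE = 26    # dropped candidates disrupted only in M. pneumoniae
UNKNOWN_FRACTION = 0.28      # share of essential genes with unknown function

# Essential genes are those not on the non-essential list.
essential = TOTAL_PROTEIN_CODING - NON_ESSENTIAL_2005

# 1999 candidates that did not survive the 2005 re-analysis.
dropped = PUTATIVE_1999 - RETAINED_FROM_1999

# Dropped candidates not explained by M. pneumoniae-only disruptions.
dropped_other = dropped - ONLY_IN_M_PNEUMONIAE

# Approximate number of essential genes of unknown function (derived estimate).
unknown_function = round(UNKNOWN_FRACTION * essential)

print(f"essential genes: {essential}")
print(f"1999 candidates dropped by 2005: {dropped}")
print(f"  of which disrupted only in M. pneumoniae: {ONLY_IN_M_PNEUMONIAE}")
print(f"  remaining dropped candidates: {dropped_other}")
print(f"essential genes of unknown function (approx.): {unknown_function}")
```

The totals agree with the text: 482 minus 100 gives the 382 essential genes, and 130 minus 67 gives the 63 genes that lost their non-essential status.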


Mycoplasma laboratorium

The project has since culminated in an effort to construct a synthetic organism, ''Mycoplasma laboratorium'', based on the 387 protein-coding regions and 43 structural RNA genes found in ''M. genitalium''. This project is still ongoing.


First self-replicating synthetic cell

Researchers at the JCVI in 2010 successfully created a synthetic bacterial cell that was capable of replicating itself. The team synthesized a 1.08 million base pair chromosome of a modified ''Mycoplasma mycoides''. The synthetic cell is called ''Mycoplasma mycoides'' JCVI-syn1.0. The DNA was designed on a computer, synthesized, and transplanted into a cell from which the original genome had been removed. The existing molecules and ongoing reaction networks of the recipient cell then used the artificial DNA to generate daughter cells. These daughter cells are of synthetic origin and capable of further replication, controlled solely by the synthetic genome. The first half of the project took 15 years to complete. The team designed an accurate, digitized genome of ''M. mycoides''. A total of 1,078 cassettes were built, each 1,080 base pairs long. The cassettes were designed so that the ends of adjacent cassettes overlapped by 80 base pairs. The whole assembled genome was transferred into yeast cells and grown as a yeast artificial chromosome.
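The cassette figures above are consistent with the stated chromosome size, as a short Python calculation shows. This is an idealized estimate that assumes every cassette is exactly 1,080 base pairs long and every junction shares exactly 80 base pairs; the function name and parameters are illustrative, and the result is not the exact length of the published genome.

```python
def assembled_length(n_cassettes, cassette_len, overlap, circular=False):
    """Length of a sequence assembled from equally sized overlapping cassettes.

    Each junction between neighbouring cassettes is counted once, so the
    shared overlap is subtracted once per junction. A circular assembly has
    one extra junction that closes the ring.
    """
    junctions = n_cassettes if circular else n_cassettes - 1
    return n_cassettes * cassette_len - junctions * overlap


linear = assembled_length(1078, 1080, 80)
circular = assembled_length(1078, 1080, 80, circular=True)

print(linear)    # 1078080
print(circular)  # 1078000
```

Either way the estimate is about 1.08 million base pairs, matching the chromosome size given in the text.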


Future direction and uses

Based on JCVI's progress in the field of synthetic biology, it is possible that in the near future scientists will be able to transfer the ''M. genitalium'' genome, in the form of naked DNA, into recipient mycoplasma cells and replace their original genome with a synthetic one. Because mycoplasmas have no cell wall, the transfer of naked DNA into their cells is possible. The remaining requirement is a technique for installing the synthetic ''M. genitalium'' genome in mycoplasma cells. To some extent this has already become possible: the JCVI has developed the first replicating synthetic cell and is now working toward a synthetic organism consisting of a minimal number of essential genes. This breakthrough in synthetic biology offers a new approach to understanding biology, and the redesigning and prototyping of genomes may later benefit biotechnology companies by enabling them to engineer synthetic microbes that yield new, cheaper and better bio-products.


Minimal genome projects

A number of projects have attempted to identify the essential genes of a species. This number should approximate the "minimal genome". For instance, the genome of ''E. coli'' has been reduced by about 30%, demonstrating that this species can live with far fewer genes than the wild-type genome contains. The following table contains a list of such minimal genome projects (including the various techniques used).


Number of essential genes

The number of essential genes differs from organism to organism and, within a species, depends on which strain (or individual) is tested. In addition, the number depends on the conditions under which an organism is tested. In several bacteria (and other microbes such as yeast), all or most genes have been deleted individually to determine which genes are "essential" for survival. Such tests are usually carried out on rich media that contain all nutrients. However, if all nutrients are provided, genes required for the synthesis of nutrients are not "essential". When cells are grown on minimal media, many more genes are essential, as they may be needed to synthesize such nutrients (e.g. vitamins). The numbers provided in the following table have typically been collected using rich media (but consult the original references for details). The numbers of essential genes were collected from the Database of Essential Genes (DEG), except for ''B. subtilis'', where the data come from Genome News Network. The organisms listed in this table have been systematically tested for essential genes. For more information about the minimal genome, see also the section 'Other Genera' at 'Mycoplasma laboratorium'.
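The dependence of essentiality on growth conditions can be made concrete with a toy Python model. It assumes a simplified rule: a gene is essential if it performs a core function, or if it synthesizes a nutrient that the medium does not supply. The gene names and nutrient labels are hypothetical and serve only to illustrate the reasoning in the paragraph above.

```python
# Hypothetical genes (assumption): each maps to the nutrient it helps to
# synthesize, or to None when it performs a core function needed in any medium.
GENE_ROLES = {
    "core_replication_gene": None,
    "vitamin_synthesis_gene": "vitamin",
    "amino_acid_synthesis_gene": "amino_acid",
}


def essential_genes(gene_roles, nutrients_in_medium):
    """Return the genes that cannot be lost when growing on the given medium."""
    essential = set()
    for gene, nutrient in gene_roles.items():
        # Core genes are always required; biosynthesis genes are required
        # only when the medium does not already provide their product.
        if nutrient is None or nutrient not in nutrients_in_medium:
            essential.add(gene)
    return essential


rich_medium = {"vitamin", "amino_acid"}
minimal_medium = set()

print(sorted(essential_genes(GENE_ROLES, rich_medium)))
# ['core_replication_gene']
print(sorted(essential_genes(GENE_ROLES, minimal_medium)))
# ['amino_acid_synthesis_gene', 'core_replication_gene', 'vitamin_synthesis_gene']
```

The same gene set therefore gives one essential gene on the rich medium and three on the minimal medium, which is why essential-gene counts should always be read together with the conditions under which they were measured.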

