HOME

TheInfoList



OR:

Chloroplast DNA (cpDNA) is the DNA located in chloroplasts, which are photosynthetic organelles located within the cells of some eukaryotic organisms. Chloroplasts, like other types of
plastid The plastid (Greek: πλαστός; plastós: formed, molded – plural plastids) is a membrane-bound organelle found in the Cell (biology), cells of plants, algae, and some other eukaryotic organisms. They are considered to be intracellular endosy ...
, contain a genome separate from that in the cell nucleus. The existence of chloroplast DNA was identified biochemically in 1959, and confirmed by electron microscopy in 1962. The discoveries that the chloroplast contains ribosomes and performs protein synthesis revealed that the chloroplast is genetically semi-autonomous. The first complete chloroplast genome sequences were published in 1986, '' Nicotiana tabacum'' (tobacco) by Sugiura and colleagues and '' Marchantia polymorpha'' (liverwort) by Ozeki et al. Since then, a great number of chloroplast DNAs from various species have been sequenced.


Molecular structure

Chloroplast A chloroplast () is a type of membrane-bound organelle known as a plastid that conducts photosynthesis mostly in plant and algal cells. The photosynthetic pigment chlorophyll captures the energy from sunlight, converts it, and stores it in ...
DNAs are circular, and are typically 120,000–170,000
base pairs A base pair (bp) is a fundamental unit of double-stranded nucleic acids consisting of two nucleobases bound to each other by hydrogen bonds. They form the building blocks of the DNA double helix and contribute to the folded structure of both DNA ...
long. They can have a contour length of around 30–60 micrometers, and have a mass of about 80–130 million daltons. Most chloroplasts have their entire chloroplast genome combined into a single large ring, though those of
dinophyte algae The dinoflagellates (Greek δῖνος ''dinos'' "whirling" and Latin ''flagellum'' "whip, scourge") are a monophyletic group of single-celled eukaryotes constituting the phylum Dinoflagellata and are usually considered algae. Dinoflagellates are ...
are a notable exception—their genome is broken up into about forty small
plasmids A plasmid is a small, extrachromosomal DNA molecule within a cell that is physically separated from chromosomal DNA and can replicate independently. They are most commonly found as small circular, double-stranded DNA molecules in bacteria; how ...
, each 2,000–10,000
base pairs A base pair (bp) is a fundamental unit of double-stranded nucleic acids consisting of two nucleobases bound to each other by hydrogen bonds. They form the building blocks of the DNA double helix and contribute to the folded structure of both DNA ...
long. Each minicircle contains one to three genes, but blank plasmids, with no coding DNA, have also been found. Chloroplast DNA has long been thought to have a circular structure, but some evidence suggests that chloroplast DNA more commonly takes a linear shape. Over 95% of the chloroplast DNA in
corn Maize ( ; ''Zea mays'' subsp. ''mays'', from es, maíz after tnq, mahiz), also known as corn (North American and Australian English), is a cereal grain first domesticated by indigenous peoples in southern Mexico about 10,000 years ago. Th ...
chloroplasts has been observed to be in branched linear form rather than individual circles.


Inverted repeats

Many chloroplast DNAs contain two ''inverted repeats'', which separate a long single copy section (LSC) from a short single copy section (SSC). The inverted repeats vary wildly in length, ranging from 4,000 to 25,000
base pairs A base pair (bp) is a fundamental unit of double-stranded nucleic acids consisting of two nucleobases bound to each other by hydrogen bonds. They form the building blocks of the DNA double helix and contribute to the folded structure of both DNA ...
long each. Inverted repeats in plants tend to be at the upper end of this range, each being 20,000–25,000 base pairs long. The inverted repeat regions usually contain three ribosomal RNA and two tRNA genes, but they can be expanded or reduced to contain as few as four or as many as over 150 genes. While a given pair of inverted repeats are rarely completely identical, they are always very similar to each other, apparently resulting from concerted evolution. The inverted repeat regions are highly conserved among land plants, and accumulate few mutations. Similar inverted repeats exist in the genomes of cyanobacteria and the other two chloroplast lineages ( glaucophyta and rhodophyceæ), suggesting that they predate the chloroplast, though some chloroplast DNAs like those of peas and a few
red algae Red algae, or Rhodophyta (, ; ), are one of the oldest groups of eukaryotic algae. The Rhodophyta also comprises one of the largest phyla of algae, containing over 7,000 currently recognized species with taxonomic revisions ongoing. The majority ...
have since lost the inverted repeats. Others, like the red alga '' Porphyra'' flipped one of its inverted repeats (making them direct repeats). It is possible that the inverted repeats help stabilize the rest of the chloroplast genome, as chloroplast DNAs which have lost some of the inverted repeat segments tend to get rearranged more.


Nucleoids

Each chloroplast contains around 100 copies of its DNA in young leaves, declining to 15–20 copies in older leaves. They are usually packed into nucleoids which can contain several identical chloroplast DNA rings. Many nucleoids can be found in each chloroplast. Though chloroplast DNA is not associated with true histones, in
red algae Red algae, or Rhodophyta (, ; ), are one of the oldest groups of eukaryotic algae. The Rhodophyta also comprises one of the largest phyla of algae, containing over 7,000 currently recognized species with taxonomic revisions ongoing. The majority ...
, a histone-like chloroplast protein (HC) coded by the chloroplast DNA that tightly packs each chloroplast DNA ring into a
nucleoid The nucleoid (meaning ''nucleus-like'') is an irregularly shaped region within the prokaryotic cell that contains all or most of the genetic material. The chromosome of a prokaryote is circular, and its length is very large compared to the cell dim ...
has been found. In primitive
red algae Red algae, or Rhodophyta (, ; ), are one of the oldest groups of eukaryotic algae. The Rhodophyta also comprises one of the largest phyla of algae, containing over 7,000 currently recognized species with taxonomic revisions ongoing. The majority ...
, the chloroplast DNA nucleoids are clustered in the center of a chloroplast, while in green plants and
green algae The green algae (singular: green alga) are a group consisting of the Prasinodermophyta and its unnamed sister which contains the Chlorophyta and Charophyta/Streptophyta. The land plants (Embryophytes) have emerged deep in the Charophyte alga as ...
, the nucleoids are dispersed throughout the stroma.


Gene content and plastid gene expression

More than 5000 chloroplast genomes have been sequenced and are accessible via the NCBI organelle genome database. The first chloroplast genomes were sequenced in 1986, from tobacco (''Nicotiana tabacum'') and liverwort (''Marchantia polymorpha''). Comparison of the gene sequences of the cyanobacteria ''Synechocystis'' to those of the chloroplast genome of ''Arabidopsis'' provided confirmation of the endosymbiotic origin of the chloroplast. It also demonstrated the significant extent of gene transfer from the cyanobacterial ancestor to the nuclear genome. In most plant species, the chloroplast genome encodes approximately 120 genes. The genes primarily encode core components of the photosynthetic machinery and factors involved in their expression and assembly. Across species of land plants, the set of genes encoded by the chloroplast genome is fairly conserved. This includes four
ribosomal RNAs Ribosomal ribonucleic acid (rRNA) is a type of non-coding RNA which is the primary component of ribosomes, essential to all cells. rRNA is a ribozyme which carries out protein synthesis in ribosomes. Ribosomal RNA is transcribed from ribosomal ...
, approximately 30
tRNAs Transfer RNA (abbreviated tRNA and formerly referred to as sRNA, for soluble RNA) is an adaptor molecule composed of RNA, typically 76 to 90 nucleotides in length (in eukaryotes), that serves as the physical link between the mRNA and the amino a ...
, 21 ribosomal proteins, and 4 subunits of the plastid-encoded
RNA polymerase In molecular biology, RNA polymerase (abbreviated RNAP or RNApol), or more specifically DNA-directed/dependent RNA polymerase (DdRP), is an enzyme that synthesizes RNA from a DNA template. Using the enzyme helicase, RNAP locally opens the ...
complex that are involved in plastid gene expression. The large Rubisco subunit and 28 photosynthetic thylakoid proteins are encoded within the chloroplast genome.


Chloroplast genome reduction and gene transfer

Over time, many parts of the chloroplast genome were transferred to the nuclear genome of the host, a process called '' endosymbiotic gene transfer''. As a result, the chloroplast genome is heavily reduced compared to that of free-living cyanobacteria. Chloroplasts may contain 60–100 genes whereas cyanobacteria often have more than 1500 genes in their genome. Contrarily, there are only a few known instances where genes have been transferred to the chloroplast from various donors, including bacteria. Endosymbiotic gene transfer is how we know about the lost chloroplasts in many chromalveolate lineages. Even if a chloroplast is eventually lost, the genes it donated to the former host's nucleus persist, providing evidence for the lost chloroplast's existence. For example, while diatoms (a
heterokontophyte Stramenopile is a clade of organisms distinguished by the presence of stiff tripartite external hairs. In most species, the hairs are attached to flagella, in some they are attached to other areas of the cellular surface, and in some they have b ...
) now have a
red algal derived chloroplast A chloroplast () is a type of membrane-bound organelle known as a plastid that conducts photosynthesis mostly in plant and algal cells. The photosynthetic pigment chlorophyll captures the energy from sunlight, converts it, and stores it in t ...
, the presence of many
green algal The green algae (singular: green alga) are a group consisting of the Prasinodermophyta and its unnamed sister which contains the Chlorophyta and Charophyta/Streptophyta. The land plants (Embryophytes) have emerged deep in the Charophyte alga as ...
genes in the diatom nucleus provide evidence that the diatom ancestor (probably the ancestor of all chromalveolates too) had a
green algal derived chloroplast A chloroplast () is a type of membrane-bound organelle known as a plastid that conducts photosynthesis mostly in plant and algal cells. The photosynthetic pigment chlorophyll captures the energy from sunlight, converts it, and stores it i ...
at some point, which was subsequently replaced by the red chloroplast. In land plants, some 11–14% of the DNA in their nuclei can be traced back to the chloroplast, up to 18% in ''
Arabidopsis ''Arabidopsis'' (rockcress) is a genus in the family Brassicaceae. They are small flowering plants related to cabbage and mustard. This genus is of great interest since it contains thale cress (''Arabidopsis thaliana''), one of the model organi ...
'', corresponding to about 4,500 protein-coding genes. There have been a few recent transfers of genes from the chloroplast DNA to the nuclear genome in land plants.


Proteins encoded by the chloroplast

Of the approximately three-thousand proteins found in chloroplasts, some 95% of them are encoded by nuclear genes. Many of the chloroplast's protein complexes consist of subunits from both the chloroplast genome and the host's nuclear genome. As a result,
protein synthesis Protein biosynthesis (or protein synthesis) is a core biological process, occurring inside Cell (biology), cells, homeostasis, balancing the loss of cellular proteins (via Proteolysis, degradation or Protein targeting, export) through the product ...
must be coordinated between the chloroplast and the nucleus. The chloroplast is mostly under nuclear control, though chloroplasts can also give out signals regulating
gene expression Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product that enables it to produce end products, protein or non-coding RNA, and ultimately affect a phenotype, as the final effect. The ...
in the nucleus, called '' retrograde signaling''.


Protein synthesis

Protein synthesis within chloroplasts relies on an
RNA polymerase In molecular biology, RNA polymerase (abbreviated RNAP or RNApol), or more specifically DNA-directed/dependent RNA polymerase (DdRP), is an enzyme that synthesizes RNA from a DNA template. Using the enzyme helicase, RNAP locally opens the ...
coded by the chloroplast's own genome, which is related to RNA polymerases found in bacteria. Chloroplasts also contain a mysterious second RNA polymerase that is encoded by the plant's nuclear genome. The two RNA polymerases may recognize and bind to different kinds of promoters within the chloroplast genome. The
ribosome Ribosomes ( ) are macromolecular machines, found within all cells, that perform biological protein synthesis (mRNA translation). Ribosomes link amino acids together in the order specified by the codons of messenger RNA (mRNA) molecules to ...
s in chloroplasts are similar to bacterial ribosomes.


RNA editing in plastids

RNA editing is the insertion, deletion, and substitution of nucleotides in a mRNA transcript prior to translation to protein. The highly oxidative environment inside chloroplasts increases the rate of mutation so post-transcription repairs are needed to conserve functional sequences. The chloroplast editosome substitutes C -> U and U -> C at very specific locations on the transcript. This can change the codon for an amino acid or restore a non-functional pseudogene by adding an AUG start codon or removing a premature UAA stop codon. The editosome recognizes and binds to cis sequence upstream of the editing site. The distance between the binding site and editing site varies by gene and proteins involved in the editosome. Hundreds of different
PPR protein The pentatricopeptide repeat (PPR) is a 35-amino acid sequence motif. Pentatricopeptide-repeat-containing proteins are a family of proteins commonly found in the plant kingdom. They are distinguished by the presence of tandem degenerate PPR moti ...
s from the nuclear genome are involved in the RNA editing process. These proteins consist of 35-mer repeated amino acids, the sequence of which determines the cis binding site for the edited transcript. Basal land plants such as liverworts, mosses and ferns have hundreds of different editing sites while flowering plants typically have between thirty and forty. Parasitic plants such as
Epifagus virginiana ''Epifagus virginiana'', commonly called beech drops (or beech-drops), is an obligate parasitic plant which grows and subsists on the roots of American beech. It is a member of the family Orobanchaceae. The genus ''Epifagus'' is monotypic—cont ...
show a loss of RNA editing resulting in a loss of function for photosynthesis genes.


DNA replication


Leading model of cpDNA replication

The mechanism for chloroplast DNA (cpDNA) replication has not been conclusively determined, but two main models have been proposed. Scientists have attempted to observe chloroplast replication via
electron microscopy An electron microscope is a microscope that uses a beam of accelerated electrons as a source of illumination. As the wavelength of an electron can be up to 100,000 times shorter than that of visible light photons, electron microscopes have a hi ...
since the 1970s. The results of the microscopy experiments led to the idea that chloroplast DNA replicates using a double displacement loop (D-loop). As the D-loop moves through the circular DNA, it adopts a theta intermediary form, also known as a Cairns replication intermediate, and completes replication with a rolling circle mechanism. Replication starts at specific points of origin. Multiple replication forks open up, allowing replication machinery to replicate the DNA. As replication continues, the forks grow and eventually converge. The new cpDNA structures separate, creating daughter cpDNA chromosomes. In addition to the early microscopy experiments, this model is also supported by the amounts of deamination seen in cpDNA. Deamination occurs when an amino group is lost and is a mutation that often results in base changes. When adenine is deaminated, it becomes hypoxanthine (H). Hypoxanthine can bind to cytosine, and when the HC base pair is replicated, it becomes a GC (thus, an A → G base change). In cpDNA, there are several A → G deamination gradients. DNA becomes susceptible to deamination events when it is single stranded. When replication forks form, the strand not being copied is single stranded, and thus at risk for A → G deamination. Therefore, gradients in deamination indicate that replication forks were most likely present and the direction that they initially opened (the highest gradient is most likely nearest the start site because it was single stranded for the longest amount of time). This mechanism is still the leading theory today; however, a second theory suggests that most cpDNA is actually linear and replicates through homologous recombination. It further contends that only a minority of the genetic material is kept in circular chromosomes while the rest is in branched, linear, or other complex structures.


Alternative model of replication

One of the main competing models for cpDNA asserts that most cpDNA is linear and participates in homologous recombination and replication structures similar to bacteriophage T4. It has been established that some plants have linear cpDNA, such as maize, and that more still contain complex structures that scientists do not yet understand; however, the predominant view today is that most cpDNA is circular. When the original experiments on cpDNA were performed, scientists did notice linear structures; however, they attributed these linear forms to broken circles. If the branched and complex structures seen in cpDNA experiments are real and not artifacts of concatenated circular DNA or broken circles, then a D-loop mechanism of replication is insufficient to explain how those structures would replicate. At the same time, homologous recombination does not explain the multiple A → G gradients seen in plastomes. This shortcoming is one of the biggest for the linear structure theory.


Protein targeting and import

The movement of so many chloroplast genes to the nucleus means that many chloroplast
proteins Proteins are large biomolecules and macromolecules that comprise one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, respo ...
that were supposed to be translated in the chloroplast are now synthesized in the cytoplasm. This means that these proteins must be directed back to the chloroplast, and imported through at least two chloroplast membranes. Curiously, around half of the protein products of transferred genes aren't even targeted back to the chloroplast. Many became exaptations, taking on new functions like participating in cell division,
protein routing :''This article deals with protein targeting in eukaryotes unless specified otherwise.'' Protein targeting or protein sorting is the biological mechanism by which proteins are transported to their appropriate destinations within or outside the ce ...
, and even disease resistance. A few chloroplast genes found new homes in the mitochondrial genome—most became nonfunctional pseudogenes, though a few tRNA genes still work in the
mitochondrion A mitochondrion (; ) is an organelle found in the cells of most Eukaryotes, such as animals, plants and fungi. Mitochondria have a double membrane structure and use aerobic respiration to generate adenosine triphosphate (ATP), which is used ...
. Some transferred chloroplast DNA protein products get directed to the secretory pathway (though many
secondary plastids A chloroplast () is a type of membrane-bound organelle known as a plastid that conducts photosynthesis mostly in plant cell, plant and algae, algal cells. The photosynthetic pigment chlorophyll captures the energy from sunlight, converts it, ...
are bounded by an outermost membrane derived from the host's cell membrane, and therefore
topologically In mathematics, topology (from the Greek words , and ) is concerned with the properties of a geometric object that are preserved under continuous deformations, such as stretching, twisting, crumpling, and bending; that is, without closing ho ...
outside of the cell, because to reach the chloroplast from the cytosol, you have to cross the cell membrane, just like if you were headed for the extracellular space. In those cases, chloroplast-targeted proteins do initially travel along the secretory pathway). Because the cell acquiring a chloroplast already had
mitochondria A mitochondrion (; ) is an organelle found in the Cell (biology), cells of most Eukaryotes, such as animals, plants and Fungus, fungi. Mitochondria have a double lipid bilayer, membrane structure and use aerobic respiration to generate adenosi ...
(and peroxisomes, and a cell membrane for secretion), the new chloroplast host had to develop a unique
protein targeting system :''This article deals with protein targeting in eukaryotes unless specified otherwise.'' Protein targeting or protein sorting is the biological mechanism by which proteins are transported to their appropriate destinations within or outside the ce ...
to avoid having chloroplast proteins being sent to the wrong
organelle In cell biology, an organelle is a specialized subunit, usually within a cell, that has a specific function. The name ''organelle'' comes from the idea that these structures are parts of cells, as organs are to the body, hence ''organelle,'' the ...
.


Cytoplasmic translation and N-terminal transit sequences

Polypeptides, the precursors of
proteins Proteins are large biomolecules and macromolecules that comprise one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, respo ...
, are chains of
amino acids Amino acids are organic compounds that contain both amino and carboxylic acid functional groups. Although hundreds of amino acids exist in nature, by far the most important are the alpha-amino acids, which comprise proteins. Only 22 alpha am ...
. The two ends of a polypeptide are called the
N-terminus The N-terminus (also known as the amino-terminus, NH2-terminus, N-terminal end or amine-terminus) is the start of a protein or polypeptide, referring to the free amine group (-NH2) located at the end of a polypeptide. Within a peptide, the ami ...
, or ''amino end'', and the
C-terminus The C-terminus (also known as the carboxyl-terminus, carboxy-terminus, C-terminal tail, C-terminal end, or COOH-terminus) is the end of an amino acid chain (protein or polypeptide), terminated by a free carboxyl group (-COOH). When the protein is ...
, or ''carboxyl end''. For many (but not all) chloroplast proteins encoded by
nuclear Nuclear may refer to: Physics Relating to the nucleus of the atom: *Nuclear engineering *Nuclear physics *Nuclear power *Nuclear reactor *Nuclear weapon *Nuclear medicine *Radiation therapy *Nuclear warfare Mathematics *Nuclear space * Nuclear ...
genes, ''
cleavable transit peptides A signal peptide (sometimes referred to as signal sequence, targeting signal, localization signal, localization sequence, transit peptide, leader sequence or leader peptide) is a short peptide (usually 16-30 amino acids long) present at the N-ter ...
'' are added to the N-termini of the polypeptides, which are used to help direct the polypeptide to the chloroplast for import (N-terminal transit peptides are also used to direct polypeptides to plant
mitochondria A mitochondrion (; ) is an organelle found in the Cell (biology), cells of most Eukaryotes, such as animals, plants and Fungus, fungi. Mitochondria have a double lipid bilayer, membrane structure and use aerobic respiration to generate adenosi ...
). N-terminal transit sequences are also called ''presequences'' because they are located at the "front" end of a polypeptide— ribosomes synthesize polypeptides from the N-terminus to the C-terminus. Chloroplast transit peptides exhibit huge variation in length and amino acid sequence. They can be from 20 to 150 amino acids long—an unusually long length, suggesting that transit peptides are actually collections of domains with different functions. Transit peptides tend to be positively charged, rich in hydroxylated amino acids such as
serine Serine (symbol Ser or S) is an α-amino acid that is used in the biosynthesis of proteins. It contains an α-amino group (which is in the protonated − form under biological conditions), a carboxyl group (which is in the deprotonated − form un ...
,
threonine Threonine (symbol Thr or T) is an amino acid that is used in the biosynthesis of proteins. It contains an α-amino group (which is in the protonated −NH form under biological conditions), a carboxyl group (which is in the deprotonated −COO� ...
, and
proline Proline (symbol Pro or P) is an organic acid classed as a proteinogenic amino acid (used in the biosynthesis of proteins), although it does not contain the amino group but is rather a secondary amine. The secondary amine nitrogen is in the prot ...
, and poor in acidic amino acids like
aspartic acid Aspartic acid (symbol Asp or D; the ionic form is known as aspartate), is an α-amino acid that is used in the biosynthesis of proteins. Like all other amino acids, it contains an amino group and a carboxylic acid. Its α-amino group is in the pro ...
and
glutamic acid Glutamic acid (symbol Glu or E; the ionic form is known as glutamate) is an α-amino acid that is used by almost all living beings in the biosynthesis of proteins. It is a non-essential nutrient for humans, meaning that the human body can synt ...
. In an
aqueous solution An aqueous solution is a solution in which the solvent is water. It is mostly shown in chemical equations by appending (aq) to the relevant chemical formula. For example, a solution of table salt, or sodium chloride (NaCl), in water would be re ...
, the transit sequence forms a random coil. Not all chloroplast proteins include a N-terminal cleavable transit peptide though. Some include the transit sequence within the functional part of the protein itself. A few have their transit sequence appended to their
C-terminus The C-terminus (also known as the carboxyl-terminus, carboxy-terminus, C-terminal tail, C-terminal end, or COOH-terminus) is the end of an amino acid chain (protein or polypeptide), terminated by a free carboxyl group (-COOH). When the protein is ...
instead. Most of the polypeptides that lack N-terminal targeting sequences are the ones that are sent to the
outer chloroplast membrane Chloroplasts contain several important membranes, vital for their function. Like mitochondria, chloroplasts have a double-membrane envelope, called the chloroplast envelope, but unlike mitochondria, chloroplasts also have internal membrane structu ...
, plus at least one sent to the
inner chloroplast membrane Chloroplasts contain several important membranes, vital for their function. Like mitochondria, chloroplasts have a double-membrane envelope, called the chloroplast envelope, but unlike mitochondria, chloroplasts also have internal membrane str ...
.


Phosphorylation, chaperones, and transport

After a chloroplast
polypeptide Peptides (, ) are short chains of amino acids linked by peptide bonds. Long chains of amino acids are called proteins. Chains of fewer than twenty amino acids are called oligopeptides, and include dipeptides, tripeptides, and tetrapeptides. A p ...
is synthesized on a
ribosome Ribosomes ( ) are macromolecular machines, found within all cells, that perform biological protein synthesis (mRNA translation). Ribosomes link amino acids together in the order specified by the codons of messenger RNA (mRNA) molecules to ...
in the cytosol,
ATP ATP may refer to: Companies and organizations * Association of Tennis Professionals, men's professional tennis governing body * American Technical Publishers, employee-owned publishing company * ', a Danish pension * Armenia Tree Project, non ...
energy can be used to phosphorylate, or add a phosphate group to many (but not all) of them in their transit sequences.
Serine Serine (symbol Ser or S) is an α-amino acid that is used in the biosynthesis of proteins. It contains an α-amino group (which is in the protonated − form under biological conditions), a carboxyl group (which is in the deprotonated − form un ...
and
threonine Threonine (symbol Thr or T) is an amino acid that is used in the biosynthesis of proteins. It contains an α-amino group (which is in the protonated −NH form under biological conditions), a carboxyl group (which is in the deprotonated −COO� ...
(both very common in chloroplast transit sequences—making up 20–30% of the sequence) are often the
amino acids Amino acids are organic compounds that contain both amino and carboxylic acid functional groups. Although hundreds of amino acids exist in nature, by far the most important are the alpha-amino acids, which comprise proteins. Only 22 alpha am ...
that accept the phosphate group. The enzyme that carries out the phosphorylation is specific for chloroplast polypeptides, and ignores ones meant for
mitochondria A mitochondrion (; ) is an organelle found in the Cell (biology), cells of most Eukaryotes, such as animals, plants and Fungus, fungi. Mitochondria have a double lipid bilayer, membrane structure and use aerobic respiration to generate adenosi ...
or peroxisomes. Phosphorylation changes the polypeptide's shape, making it easier for
14-3-3 proteins 14-3-3 proteins are a family of conserved regulatory molecules that are expressed in all eukaryotic cells. 14-3-3 proteins have the ability to bind a multitude of functionally diverse signaling proteins, including kinases, phosphatases, and trans ...
to attach to the polypeptide. In plants,
14-3-3 proteins 14-3-3 proteins are a family of conserved regulatory molecules that are expressed in all eukaryotic cells. 14-3-3 proteins have the ability to bind a multitude of functionally diverse signaling proteins, including kinases, phosphatases, and trans ...
only bind to chloroplast preproteins. It is also bound by the heat shock protein Hsp70 that keeps the polypeptide from folding prematurely. This is important because it prevents chloroplast proteins from assuming their active form and carrying out their chloroplast functions in the wrong place—the cytosol. At the same time, they have to keep just enough shape so that they can be recognized and imported into the chloroplast. The heat shock protein and the 14-3-3 proteins together form a cytosolic guidance complex that makes it easier for the chloroplast polypeptide to get imported into the chloroplast. Alternatively, if a chloroplast preprotein's transit peptide is not phosphorylated, a chloroplast preprotein can still attach to a heat shock protein or Toc159. These complexes can bind to the TOC complex on the outer chloroplast membrane using GTP energy.


The translocon on the outer chloroplast membrane (TOC)

The TOC complex, or '' translocon on the outer chloroplast membrane'', is a collection of proteins that imports preproteins across the
outer chloroplast envelope Chloroplasts contain several important membranes, vital for their function. Like mitochondria, chloroplasts have a double-membrane envelope, called the chloroplast envelope, but unlike mitochondria, chloroplasts also have internal membrane str ...
. Five subunits of the TOC complex have been identified—two GTP-binding proteins Toc34 and Toc159, the protein import tunnel Toc75, plus the proteins Toc64 and
Toc12 TOC1 may refer to: *The TOC protocol, or Talk to OSCAR protocol, a computing protocol *TOC1 (gene) Timing of CAB expression 1 is a protein that in '' Arabidopsis thaliana'' is encoded by the TOC1 gene. TOC1 is also known as two-component respon ...
. The first three proteins form a core complex that consists of one Toc159, four to five Toc34s, and four Toc75s that form four holes in a disk 13 nanometers across. The whole core complex weighs about 500
kilodaltons The dalton or unified atomic mass unit (symbols: Da or u) is a non-SI unit of mass widely used in physics and chemistry. It is defined as of the mass of an unbound neutral atom of carbon-12 in its nuclear and electronic ground state and at ...
. The other two proteins, Toc64 and Toc12, are associated with the core complex but are not part of it.


Toc34 and 33

Toc34 is an integral protein in the outer chloroplast membrane that's anchored into it by its hydrophobic C-terminal tail. Most of the protein, however, including its large guanosine triphosphate (GTP)-binding domain projects out into the stroma. Toc34's job is to catch some chloroplast
preproteins A protein precursor, also called a pro-protein or pro-peptide, is an inactive protein (or peptide) that can be turned into an active form by post-translational modification, such as breaking off a piece of the molecule or adding on another molecule ...
in the cytosol and hand them off to the rest of the TOC complex. When GTP, an energy molecule similar to
ATP ATP may refer to: Companies and organizations * Association of Tennis Professionals, men's professional tennis governing body * American Technical Publishers, employee-owned publishing company * ', a Danish pension * Armenia Tree Project, non ...
attaches to Toc34, the protein becomes much more able to bind to many chloroplast preproteins in the cytosol. The chloroplast preprotein's presence causes Toc34 to break GTP into guanosine diphosphate (GDP) and inorganic phosphate. This loss of GTP makes the Toc34 protein release the chloroplast preprotein, handing it off to the next TOC protein. Toc34 then releases the depleted GDP molecule, probably with the help of an unknown GDP exchange factor. A domain of Toc159 might be the exchange factor that carry out the GDP removal. The Toc34 protein can then take up another molecule of GTP and begin the cycle again. Toc34 can be turned off through
phosphorylation In chemistry, phosphorylation is the attachment of a phosphate group to a molecule or an ion. This process and its inverse, dephosphorylation, are common in biology and could be driven by natural selection. Text was copied from this source, wh ...
. A protein kinase drifting around on the outer chloroplast membrane can use
ATP ATP may refer to: Companies and organizations * Association of Tennis Professionals, men's professional tennis governing body * American Technical Publishers, employee-owned publishing company * ', a Danish pension * Armenia Tree Project, non ...
to add a phosphate group to the Toc34 protein, preventing it from being able to receive another GTP molecule, inhibiting the protein's activity. This might provide a way to regulate protein import into chloroplasts. ''
Arabidopsis thaliana ''Arabidopsis thaliana'', the thale cress, mouse-ear cress or arabidopsis, is a small flowering plant native to Eurasia and Africa. ''A. thaliana'' is considered a weed; it is found along the shoulders of roads and in disturbed land. A winter a ...
'' has two
homologous Homology may refer to: Sciences Biology *Homology (biology), any characteristic of biological organisms that is derived from a common ancestor *Sequence homology, biological homology between DNA, RNA, or protein sequences * Homologous chrom ...
proteins, AtToc33 and AtToc34 (The ''At'' stands for ''Arabidopsis thaliana''), which are each about 60% identical in amino acid sequence to Toc34 in peas (called ''ps''Toc34). AtToc33 is the most common in ''Arabidopsis'', and it is the functional analogue of Toc34 because it can be turned off by phosphorylation. AtToc34 on the other hand cannot be phosphorylated.


Toc159

Toc159 is another GTP binding TOC
subunit Subunit may refer to: *Subunit HIV vaccine, a class of HIV vaccine *Protein subunit, a protein molecule that assembles with other protein molecules *Monomer, a molecule that may bind chemically to other molecules to form a polymer *Sub-subunit, a ...
, like Toc34. Toc159 has three domains. At the
N-terminal The N-terminus (also known as the amino-terminus, NH2-terminus, N-terminal end or amine-terminus) is the start of a protein or polypeptide, referring to the free amine group (-NH2) located at the end of a polypeptide. Within a peptide, the ami ...
end is the A-domain, which is rich in acidic amino acids and takes up about half the protein length. The A-domain is often cleaved off, leaving an 86 kilodalton fragment called Toc86. In the middle is its GTP binding domain, which is very similar to the
homologous Homology may refer to: Sciences Biology *Homology (biology), any characteristic of biological organisms that is derived from a common ancestor *Sequence homology, biological homology between DNA, RNA, or protein sequences * Homologous chrom ...
GTP-binding domain in Toc34. At the C-terminal end is the hydrophilic M-domain, which anchors the protein to the outer chloroplast membrane. Toc159 probably works a lot like Toc34, recognizing proteins in the cytosol using GTP. It can be regulated through
phosphorylation In chemistry, phosphorylation is the attachment of a phosphate group to a molecule or an ion. This process and its inverse, dephosphorylation, are common in biology and could be driven by natural selection. Text was copied from this source, wh ...
, but by a different protein kinase than the one that phosphorylates Toc34. Its M-domain forms part of the tunnel that chloroplast preproteins travel through, and seems to provide the force that pushes preproteins through, using the energy from GTP. Toc159 is not always found as part of the TOC complex—it has also been found dissolved in the cytosol. This suggests that it might act as a shuttle that finds chloroplast preproteins in the cytosol and carries them back to the TOC complex. There isn't a lot of direct evidence for this behavior though. A family of Toc159 proteins, Toc159, Toc132, Toc120, and Toc90 have been found in ''
Arabidopsis thaliana ''Arabidopsis thaliana'', the thale cress, mouse-ear cress or arabidopsis, is a small flowering plant native to Eurasia and Africa. ''A. thaliana'' is considered a weed; it is found along the shoulders of roads and in disturbed land. A winter a ...
''. They vary in the length of their A-domains, which is completely gone in Toc90. Toc132, Toc120, and Toc90 seem to have specialized functions in importing stuff like nonphotosynthetic preproteins, and can't replace Toc159.


Toc75

Toc75 is the most abundant protein on the outer chloroplast envelope. It is a transmembrane tube that forms most of the TOC pore itself. Toc75 is a
β-barrel In protein structures, a beta barrel is a beta sheet composed of tandem repeats that twists and coils to form a closed toroidal structure in which the first strand is bonded to the last strand (hydrogen bond). Beta-strands in many beta-barrels are ...
channel lined by 16
β-pleated sheets The beta sheet, (β-sheet) (also β-pleated sheet) is a common motif of the regular protein secondary structure. Beta sheets consist of beta strands (β-strands) connected laterally by at least two or three backbone hydrogen bonds, forming a gen ...
. The hole it forms is about 2.5 nanometers wide at the ends, and shrinks to about 1.4–1.6 nanometers in diameter at its narrowest point—wide enough to allow partially folded chloroplast preproteins to pass through. Toc75 can also bind to chloroplast preproteins, but is a lot worse at this than Toc34 or Toc159. ''
Arabidopsis thaliana ''Arabidopsis thaliana'', the thale cress, mouse-ear cress or arabidopsis, is a small flowering plant native to Eurasia and Africa. ''A. thaliana'' is considered a weed; it is found along the shoulders of roads and in disturbed land. A winter a ...
'' has multiple isoforms of Toc75 that are named by the chromosomal positions of the
genes In biology, the word gene (from , ; "...Wilhelm Johannsen coined the word gene to describe the Mendelian units of heredity..." meaning ''generation'' or ''birth'' or ''gender'') can have several different meanings. The Mendelian gene is a ba ...
that code for them. AtToc75 III is the most abundant of these.


The translocon on the inner chloroplast membrane (TIC)

The
TIC translocon The TIC and TOC complexes are translocons located in the chloroplast of a eukaryotic cell, that is, protein complexes that facilitate the transfer of proteins in and out through the chloroplast's membrane. It mainly transports proteins made in the ...
, or ''translocon on the inner chloroplast membrane translocon'' is another protein complex that imports proteins across the
inner chloroplast envelope Chloroplasts contain several important membranes, vital for their function. Like mitochondria, chloroplasts have a double-membrane envelope, called the chloroplast envelope, but unlike mitochondria, chloroplasts also have internal membrane structu ...
. Chloroplast polypeptide chains probably often travel through the two complexes at the same time, but the TIC complex can also retrieve preproteins lost in the intermembrane space. Like the TOC translocon, the TIC translocon has a large core complex surrounded by some loosely associated peripheral proteins like Tic110, Tic40, and
Tic21 A tic is a sudden, repetitive, nonrhythmic motor movement or vocalization involving discrete muscle groups.American Psychiatric Association (2000)DSM-IV-TR: Tourette's Disorder.''Diagnostic and Statistical Manual of Mental Disorders'', 4th ed., ...
. The core complex weighs about one million daltons and contains Tic214, Tic100, Tic56, and Tic20 I, possibly three of each.


Tic20

Tic20 A tic is a sudden, repetitive, nonrhythmic motor movement or vocalization involving discrete muscle groups.American Psychiatric Association (2000)DSM-IV-TR: Tourette's Disorder.''Diagnostic and Statistical Manual of Mental Disorders'', 4th ed., ...
is an integral protein thought to have four transmembrane α-helices. It is found in the 1 million dalton TIC complex. Because it is similar to bacterial amino acid transporters and the
mitochondrial A mitochondrion (; ) is an organelle found in the cells of most Eukaryotes, such as animals, plants and fungi. Mitochondria have a double membrane structure and use aerobic respiration to generate adenosine triphosphate (ATP), which is use ...
import protein Tim17 ('' translocase on the inner mitochondrial membrane''), it has been proposed to be part of the TIC import channel. There is no '' in vitro'' evidence for this though. In ''
Arabidopsis thaliana ''Arabidopsis thaliana'', the thale cress, mouse-ear cress or arabidopsis, is a small flowering plant native to Eurasia and Africa. ''A. thaliana'' is considered a weed; it is found along the shoulders of roads and in disturbed land. A winter a ...
'', it is known that for about every five Toc75 proteins in the outer chloroplast membrane, there are two Tic20 I proteins (the main
form Form is the shape, visual appearance, or configuration of an object. In a wider sense, the form is the way something happens. Form also refers to: * Form (document), a document (printed or electronic) with spaces in which to write or enter dat ...
of Tic20 in ''Arabidopsis'') in the inner chloroplast membrane. Unlike Tic214, Tic100, or Tic56, Tic20 has
homologous Homology may refer to: Sciences Biology *Homology (biology), any characteristic of biological organisms that is derived from a common ancestor *Sequence homology, biological homology between DNA, RNA, or protein sequences * Homologous chrom ...
relatives in
cyanobacteria Cyanobacteria (), also known as Cyanophyta, are a phylum of gram-negative bacteria that obtain energy via photosynthesis. The name ''cyanobacteria'' refers to their color (), which similarly forms the basis of cyanobacteria's common name, blu ...
and nearly all chloroplast lineages, suggesting it evolved before the first chloroplast endosymbiosis. Tic214, Tic100, and Tic56 are unique to
chloroplastidan Viridiplantae (literally "green plants") are a clade of eukaryotic organisms that comprise approximately 450,000–500,000 species and play important roles in both terrestrial and aquatic ecosystems. They are made up of the green algae, which ar ...
chloroplasts, suggesting that they evolved later.


Tic214

Tic214 is another TIC core complex protein, named because it weighs just under 214
kilodaltons The dalton or unified atomic mass unit (symbols: Da or u) is a non-SI unit of mass widely used in physics and chemistry. It is defined as of the mass of an unbound neutral atom of carbon-12 in its nuclear and electronic ground state and at ...
. It is 1786
amino acids Amino acids are organic compounds that contain both amino and carboxylic acid functional groups. Although hundreds of amino acids exist in nature, by far the most important are the alpha-amino acids, which comprise proteins. Only 22 alpha am ...
long and is thought to have six transmembrane domains on its
N-terminal The N-terminus (also known as the amino-terminus, NH2-terminus, N-terminal end or amine-terminus) is the start of a protein or polypeptide, referring to the free amine group (-NH2) located at the end of a polypeptide. Within a peptide, the ami ...
end. Tic214 is notable for being coded for by chloroplast DNA, more specifically the first open reading frame '' ycf1''. Tic214 and
Tic20 A tic is a sudden, repetitive, nonrhythmic motor movement or vocalization involving discrete muscle groups.American Psychiatric Association (2000)DSM-IV-TR: Tourette's Disorder.''Diagnostic and Statistical Manual of Mental Disorders'', 4th ed., ...
together probably make up the part of the one million dalton TIC complex that spans the entire membrane. Tic20 is buried inside the complex while Tic214 is exposed on both sides of the
inner chloroplast membrane Chloroplasts contain several important membranes, vital for their function. Like mitochondria, chloroplasts have a double-membrane envelope, called the chloroplast envelope, but unlike mitochondria, chloroplasts also have internal membrane str ...
.


Tic100

Tic100 is a nuclear encoded protein that's 871
amino acids Amino acids are organic compounds that contain both amino and carboxylic acid functional groups. Although hundreds of amino acids exist in nature, by far the most important are the alpha-amino acids, which comprise proteins. Only 22 alpha am ...
long. The 871 amino acids collectively weigh slightly less than 100 thousand daltons, and since the mature protein probably doesn't lose any amino acids when itself imported into the chloroplast (it has no cleavable transit peptide), it was named Tic100. Tic100 is found at the edges of the 1 million dalton complex on the side that faces the chloroplast intermembrane space.


Tic56

Tic56 is also a nuclear encoded protein. The
preprotein A protein precursor, also called a pro-protein or pro-peptide, is an inactive protein (or peptide) that can be turned into an active form by post-translational modification, such as breaking off a piece of the molecule or adding on another molecule ...
its gene encodes is 527 amino acids long, weighing close to 62 thousand daltons; the mature form probably undergoes processing that trims it down to something that weighs 56 thousand daltons when it gets imported into the chloroplast. Tic56 is largely embedded inside the 1 million dalton complex. Tic56 and Tic100 are highly conserved among land plants, but they don't resemble any protein whose function is known. Neither has any transmembrane domains.


See also

* List of sequenced plastomes *
Mitochondrial DNA Mitochondrial DNA (mtDNA or mDNA) is the DNA located in mitochondria, cellular organelles within eukaryotic cells that convert chemical energy from food into a form that cells can use, such as adenosine triphosphate (ATP). Mitochondrial D ...


References

{{Authority control Cell anatomy Chromosomes DNA Plant genes Photosynthesis