5-Methylcytosine

5-Methylcytosine (5mC) is a methylated form of the DNA base cytosine (C) that regulates gene transcription and takes several other biological roles. When cytosine is methylated, the DNA maintains the same sequence, but the expression of methylated genes can be altered (the study of this is part of the field of epigenetics). 5-Methylcytosine is incorporated in the nucleoside 5-methylcytidine.


Discovery

While trying to isolate the bacterial toxin responsible for tuberculosis, W.G. Ruppel isolated a novel nucleic acid named tuberculinic acid in 1898 from ''Tubercle bacillus''. The nucleic acid was found to be unusual, in that it contained, in addition to thymine, guanine and cytosine, a methylated nucleotide. In 1925, Johnson and Coghill successfully detected a minor amount of a methylated cytosine derivative as a product of hydrolysis of tuberculinic acid with sulfuric acid. This report was severely criticized because their identification was based solely on the optical properties of the crystalline picrate, and other scientists failed to reproduce the same result. But its existence was ultimately proven in 1948, when Hotchkiss separated the nucleic acids of DNA from calf thymus using paper chromatography, by which he detected a unique methylated cytosine, quite distinct from conventional cytosine and uracil. After seven decades, it turned out that it is also a common feature in different RNA molecules, although the precise role is uncertain.


''In vivo''

The function of this chemical varies significantly among species:
* In bacteria, 5-methylcytosine can be found at a variety of sites, and is often used as a marker to protect DNA from being cut by native methylation-sensitive restriction enzymes.
* In plants, 5-methylcytosine occurs at CpG, CpHpG and CpHpH sequences (where H = A, C or T).
* In fungi and animals, 5-methylcytosine predominantly occurs at CpG dinucleotides. Most eukaryotes methylate only a small percentage of these sites, but 70-80% of CpG cytosines are methylated in vertebrates. In mammalian cells, clusters of CpG at the 5' ends of genes are termed CpG islands. 1% of all mammalian DNA is 5mC.

While spontaneous deamination of cytosine forms uracil, which is recognized and removed by DNA repair enzymes, deamination of 5-methylcytosine forms thymine. This conversion of a DNA base from cytosine (C) to thymine (T) can result in a transition mutation. In addition, active enzymatic deamination of cytosine or 5-methylcytosine by the APOBEC family of cytosine deaminases could have beneficial implications for various cellular processes as well as for organismal evolution. The implications of deamination of 5-hydroxymethylcytosine, on the other hand, remain less understood.
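
The sequence contexts named in this section (CpG, CpHpG and CpHpH, with H = A, C or T) can be illustrated with a short Python sketch. The function names `cytosine_context` and `count_contexts` are hypothetical, and the sketch assumes a plain A/C/G/T string and looks at the forward strand only; it is not a substitute for a real methylation caller.

```python
from collections import Counter
from typing import Dict, Optional

# H is the IUPAC code for "A, C or T" (anything but G).
H = frozenset("ACT")


def cytosine_context(seq: str, i: int) -> Optional[str]:
    """Classify the cytosine at index i of seq (forward strand only).

    Returns 'CpG', 'CpHpG', 'CpHpH', or None when position i is not a
    cytosine or too few downstream bases are available to decide.
    """
    seq = seq.upper()
    if seq[i] != "C":
        return None
    nxt = seq[i + 1] if i + 1 < len(seq) else None
    nxt2 = seq[i + 2] if i + 2 < len(seq) else None
    if nxt == "G":
        return "CpG"
    if nxt in H and nxt2 == "G":
        return "CpHpG"
    if nxt in H and nxt2 in H:
        return "CpHpH"
    return None


def count_contexts(seq: str) -> Dict[str, int]:
    """Count how many cytosines fall into each classifiable context."""
    counts = Counter()
    for i in range(len(seq)):
        context = cytosine_context(seq, i)
        if context is not None:
            counts[context] += 1
    return dict(counts)


if __name__ == "__main__":
    print(count_contexts("CCGCAGCTT"))
```

Cytosines within the last two bases of the string are left unclassified unless they are followed by G, because the context cannot be determined without the downstream bases.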


''In vitro''

The NH2 group can be removed (deamination) from 5-methylcytosine to form thymine with use of reagents such as nitrous acid; cytosine deaminates to uracil (U) under similar conditions. 5-Methylcytosine is resistant to deamination by bisulfite treatment, which deaminates cytosine residues. This property is often exploited to analyze DNA cytosine methylation patterns with bisulfite sequencing.
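
The logic behind bisulfite sequencing can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a real analysis pipeline: conversion is assumed to be complete, only the forward strand is considered, reads are assumed to be error-free and already aligned to the reference, and uracil is written as T because that is how it is read after amplification. The function names are hypothetical.

```python
from typing import Dict, Set


def bisulfite_convert(seq: str, methylated_positions: Set[int]) -> str:
    """Simulate bisulfite treatment of one strand.

    Unmethylated cytosines are deaminated (C -> U, written here as T);
    cytosines whose index is in methylated_positions are 5mC and stay C.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference: str, converted_read: str) -> Dict[int, bool]:
    """Infer methylation by comparing a converted read to the reference.

    For every reference cytosine: still C in the read -> methylated (True),
    read as T -> unmethylated (False). Other mismatches are ignored.
    """
    calls = {}
    for i, (ref, obs) in enumerate(zip(reference.upper(), converted_read.upper())):
        if ref != "C":
            continue
        if obs == "C":
            calls[i] = True
        elif obs == "T":
            calls[i] = False
    return calls


if __name__ == "__main__":
    reference = "ACGTCCGA"
    read = bisulfite_convert(reference, methylated_positions={1})
    print(read)
    print(call_methylation(reference, read))
```

In this toy example only the cytosine at index 1 is protected, so it is the only reference cytosine that still reads as C after conversion.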


Addition and regulation with DNMTs (Eukaryotes)

5mC marks are placed on genomic DNA via DNA methyltransferases (DNMTs). There are five DNMTs in humans: DNMT1, DNMT2, DNMT3A, DNMT3B, and DNMT3L, and in algae and fungi three more are present (DNMT4, DNMT5, and DNMT6). DNMT1 contains the replication foci targeting sequence (RFTS) and the CXXC domain, which catalyze the addition of 5mC marks. RFTS directs DNMT1 to loci of DNA replication to assist in the maintenance of 5mC on daughter strands during DNA replication, whereas CXXC contains a zinc finger domain for ''de novo'' addition of methylation to the DNA. DNMT1 was found to be the predominant DNA methyltransferase in all human tissue. Primarily, DNMT3A and DNMT3B are responsible for ''de novo'' methylation, and DNMT1 maintains the 5mC mark after replication. DNMTs can interact with each other to increase methylating capability. For example, two DNMT3L molecules can form a complex with two DNMT3A molecules to improve interactions with the DNA, facilitating the methylation. Changes in the expression of DNMTs result in aberrant methylation. Overexpression produces increased methylation, whereas disruption of the enzyme decreases levels of methylation. The mechanism of the addition is as follows: first, a cysteine residue on the DNMT's PCQ motif makes a nucleophilic attack at carbon 6 of the cytosine nucleotide that is to be methylated. S-Adenosylmethionine then donates a methyl group to carbon 5. A base in the DNMT enzyme deprotonates the residual hydrogen on carbon 5, restoring the double bond between carbons 5 and 6 in the ring and producing the 5-methylcytosine base.


Demethylation

After a cytosine is methylated to 5mC, it can be reverted to its initial state via multiple mechanisms. Passive DNA demethylation by dilution eliminates the mark gradually through replication, owing to a lack of maintenance by DNMT. In active DNA demethylation, a series of oxidations converts it to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC), and the latter two are eventually excised by thymine DNA glycosylase (TDG), followed by base excision repair (BER) to restore the cytosine. TDG knockout produced a 2-fold increase of 5fC without any statistically significant change to levels of 5hmC, indicating 5mC must be iteratively oxidized at least twice before its full demethylation. The oxidation occurs through the TET (ten-eleven translocation) family dioxygenases (TET enzymes), which can convert 5mC, 5hmC, and 5fC to their oxidized forms. However, the enzyme has the greatest preference for 5mC, and the initial reaction rates for 5hmC and 5fC conversions with TET2 are 4.9-7.6 fold slower. TET requires Fe(II) as cofactor, and oxygen and α-ketoglutarate (α-KG) as substrates; the latter substrate is generated from isocitrate by the enzyme isocitrate dehydrogenase (IDH). Cancer, however, can produce 2-hydroxyglutarate (2HG), which competes with α-KG, reducing TET activity and in turn reducing conversion of 5mC to 5hmC.
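
Passive demethylation by dilution, described at the start of this section, lends itself to a small numerical sketch. The model below is a deliberate simplification and an assumption of this example, not something taken from the literature cited here: each round of replication pairs an old strand with a new, unmethylated strand, and a hypothetical `maintenance_efficiency` parameter (between 0 and 1) gives the fraction of marks copied onto the new strand. With no maintenance, the per-strand methylation level halves at every division.

```python
from typing import List


def methylation_after_divisions(
    initial_fraction: float,
    divisions: int,
    maintenance_efficiency: float = 0.0,
) -> List[float]:
    """Return the per-strand methylated fraction after each cell division.

    Toy model: after replication the old strand keeps its marks and the
    new strand receives a share `maintenance_efficiency` of them, so the
    average over both strands is f * (1 + maintenance_efficiency) / 2.
    """
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must be between 0 and 1")
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must be between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")

    fraction = initial_fraction
    history = [fraction]
    for _ in range(divisions):
        fraction = fraction * (1.0 + maintenance_efficiency) / 2.0
        history.append(fraction)
    return history


if __name__ == "__main__":
    # No maintenance: the mark is diluted by half per division.
    print(methylation_after_divisions(0.8, 3))
    # Perfect maintenance: the mark is preserved.
    print(methylation_after_divisions(0.8, 3, maintenance_efficiency=1.0))
```

The two extremes bracket the behaviour described in the text: full maintenance keeps the mark stable across divisions, while a complete lack of maintenance removes it geometrically.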


Role in humans


In cancer

In cancer, DNA can become both overly methylated, termed hypermethylation, and under-methylated, termed hypomethylation. CpG islands overlapping gene promoters are ''de novo'' methylated, resulting in aberrant inactivation of genes normally associated with growth inhibition of tumors (an example of hypermethylation). Compared with normal tissue, tumor tissue had elevated levels of the methyltransferases DNMT1, DNMT3A, and mostly DNMT3B, all of which are associated with the abnormal levels of 5mC in cancer. Repeat sequences in the genome, including satellite DNA, Alu, and long interspersed elements (LINE), are often seen hypomethylated in cancer, resulting in expression of these normally silenced sequences, and their methylation levels are often significant markers of tumor progression. It has been hypothesized that there is a connection between the hypermethylation and hypomethylation: overactivity of DNA methyltransferases that produce the abnormal ''de novo'' 5mC methylation may be compensated by the removal of methylation, a type of epigenetic repair. However, the removal of methylation is inefficient, resulting in an overshoot of genome-wide hypomethylation. The contrary may also be possible: overexpression caused by hypomethylation may be silenced by genome-wide hypermethylation. Cancer hallmark capabilities are likely acquired through epigenetic changes that alter the 5mC in both the cancer cells and the surrounding tumor-associated stroma within the tumor microenvironment. The anticancer drug cisplatin has been reported to react with 5mC.


As a biomarker of aging

"Epigenetic age" refers to the connection between chronological age and levels of DNA methylation in the genome. Coupling the levels of DNA methylation in specific sets of CpGs, called "clock CpGs", with algorithms that regress the typical levels of collective genome-wide methylation at a given chronological age allows for epigenetic age prediction. During youth (0–20 years old), changes in DNA methylation occur at a faster rate as development and growth progress, and the changes begin to slow down at older ages. Multiple epigenetic age estimators exist. Horvath's clock measures a multi-tissue set of 353 CpGs, half of which correlate positively with age and the other half negatively, to estimate the epigenetic age. Hannum's clock utilizes adult blood samples to calculate age based on an orthogonal basis of 71 CpGs. Levine's clock, known as DNAm PhenoAge, depends on 513 CpGs and surpasses the other age estimators in predicting mortality and lifespan, yet displays bias with non-blood tissues. There are reports of age estimators that use the methylation state of only one CpG in the gene ELOVL2. Estimation of age allows for prediction of lifespan through expectations of the age-related conditions that individuals may be subject to, based on their 5mC methylation markers.

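The general idea of an epigenetic clock, a regression from methylation levels at selected CpGs to an age estimate, can be sketched as a weighted sum. Everything specific in this example is hypothetical: the CpG identifiers, the coefficients and the intercept are invented for illustration and do not come from Horvath's, Hannum's or Levine's clocks. Published clocks are fitted models with their own coefficients and, in some cases, additional transformations, none of which is reproduced here.

```python
from typing import Dict


def predict_epigenetic_age(
    beta_values: Dict[str, float],
    coefficients: Dict[str, float],
    intercept: float,
) -> float:
    """Estimate age as intercept + sum(coefficient * methylation level).

    beta_values maps CpG identifiers to methylation levels in [0, 1];
    coefficients maps the clock CpGs to their regression weights.
    Positive weights model CpGs that gain methylation with age,
    negative weights model CpGs that lose it.
    """
    missing = [cpg for cpg in coefficients if cpg not in beta_values]
    if missing:
        raise KeyError(f"missing clock CpGs: {missing}")
    return intercept + sum(
        weight * beta_values[cpg] for cpg, weight in coefficients.items()
    )


if __name__ == "__main__":
    # Hypothetical two-CpG clock; real clocks use tens to hundreds of sites.
    toy_coefficients = {"cpg_a": 40.0, "cpg_b": -25.0}
    toy_intercept = 30.0
    sample = {"cpg_a": 0.5, "cpg_b": 0.2, "cpg_unused": 0.9}
    print(predict_epigenetic_age(sample, toy_coefficients, toy_intercept))
```

CpGs measured in the sample but absent from the coefficient table are simply ignored, which mirrors how a clock built on a fixed set of sites uses only those sites.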
