Proteinogenic amino acid

Proteinogenic amino acids are amino acids that are incorporated biosynthetically into proteins during translation from RNA. The word "proteinogenic" means "protein creating". Throughout known life, there are 22 genetically encoded (proteinogenic) amino acids: 20 in the standard genetic code and an additional 2 (selenocysteine and pyrrolysine) that can be incorporated by special translation mechanisms.

In contrast, non-proteinogenic amino acids are amino acids that are either not incorporated into proteins (like GABA, L-DOPA, or triiodothyronine), misincorporated in place of a genetically encoded amino acid, or not produced directly and in isolation by standard cellular machinery (like hydroxyproline). The latter often results from post-translational modification of proteins. Some non-proteinogenic amino acids are incorporated into nonribosomal peptides, which are synthesized by non-ribosomal peptide synthetases.

Both eukaryotes and prokaryotes can incorporate selenocysteine into their proteins via a nucleotide sequence known as a SECIS element, which directs the cell to translate a nearby UGA codon as selenocysteine (UGA is normally a stop codon). In some methanogenic prokaryotes, the UAG codon (normally a stop codon) can also be translated to pyrrolysine.

In eukaryotes, there are only 21 proteinogenic amino acids: the 20 of the standard genetic code, plus selenocysteine. Humans can synthesize 12 of these from each other or from other molecules of intermediary metabolism. The other nine must be consumed (usually as their protein derivatives), and so they are called essential amino acids. The essential amino acids are histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, and valine (i.e. H, I, L, K, M, F, T, W, V).

The proteinogenic amino acids have been found to be related to the set of amino acids that can be recognized by ribozyme autoaminoacylation systems. Thus, non-proteinogenic amino acids would have been excluded by the contingent evolutionary success of nucleotide-based life forms. Other reasons have been offered to explain why certain specific non-proteinogenic amino acids are not generally incorporated into proteins; for example, ornithine and homoserine cyclize against the peptide backbone and fragment the protein with relatively short half-lives, while others are toxic because they can be mistakenly incorporated into proteins, such as the arginine analog canavanine. The evolutionary selection of certain proteinogenic amino acids from the primordial soup has been suggested to be because of their better incorporation into a polypeptide chain as opposed to non-proteinogenic amino acids.


Structures

The following illustrates the structures and abbreviations of the 21 amino acids that are directly encoded for protein synthesis by the genetic code of eukaryotes. The structures given below are standard chemical structures, not the typical zwitterion forms that exist in aqueous solutions.

* L-Alanine (Ala / A)
* L-Arginine (Arg / R)
* L-Asparagine (Asn / N)
* L-Aspartic acid (Asp / D)
* L-Cysteine (Cys / C)
* L-Glutamic acid (Glu / E)
* L-Glutamine (Gln / Q)
* Glycine (Gly / G)
* L-Histidine (His / H)
* L-Isoleucine (Ile / I)
* L-Leucine (Leu / L)
* L-Lysine (Lys / K)
* L-Methionine (Met / M)
* L-Phenylalanine (Phe / F)
* L-Proline (Pro / P)
* L-Serine (Ser / S)
* L-Threonine (Thr / T)
* L-Tryptophan (Trp / W)
* L-Tyrosine (Tyr / Y)
* L-Valine (Val / V)

IUPAC/IUBMB now also recommends standard abbreviations for the following two amino acids:

* L-Selenocysteine (Sec / U)
* L-Pyrrolysine (Pyl / O)


Chemical properties

Following is a table listing the one-letter symbols, the three-letter symbols, and the chemical properties of the side chains of the standard amino acids. The masses listed are based on weighted averages of the elemental isotopes at their natural abundances. Forming a peptide bond results in elimination of a molecule of water. Therefore, the protein's mass is equal to the mass of the amino acids the protein is composed of minus 18.01524 Da per peptide bond.
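The subtraction rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `protein_average_mass` is hypothetical, the chain is assumed to be linear (so n amino acids form n - 1 peptide bonds), and the free amino acid masses used for glycine and alanine are commonly tabulated average values that are assumed here rather than taken from this text.

```python
WATER_AVERAGE_MASS = 18.01524  # Da, lost per peptide bond (value from the text)


def protein_average_mass(free_amino_acid_masses):
    """Average mass of a linear chain built from the given free amino acids.

    Assumes a linear chain: n amino acids are joined by n - 1 peptide bonds,
    and each bond eliminates one molecule of water.
    """
    masses = list(free_amino_acid_masses)
    if not masses:
        return 0.0
    peptide_bonds = len(masses) - 1
    return sum(masses) - peptide_bonds * WATER_AVERAGE_MASS


# Assumed average masses of the free amino acids, in Da (not from the text).
GLYCINE = 75.06714
ALANINE = 89.09404

dipeptide_mass = protein_average_mass([GLYCINE, ALANINE])
print(f"Gly-Ala average mass: {dipeptide_mass:.5f} Da")
```

A cyclic peptide would need one extra water subtracted, because closing the ring forms one more peptide bond than the linear case handled here.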


General chemical properties


Side-chain properties

§: Only ionizable residues have a meaningful pKa. Values for Asp, Cys, Glu, His, Lys and Tyr were determined using the amino acid residue placed centrally in an alanine pentapeptide. The value for Arg is from Pace ''et al.'' (2009). The value for Sec is from Byun & Kang (2011). Note: the pKa value of an amino-acid residue in a small peptide typically differs slightly from its value inside a protein. Protein pKa calculations are sometimes used to estimate this change in the pKa value of an amino-acid residue.


Gene expression and biochemistry

* UAG is normally the amber stop codon, but in organisms containing the biological machinery encoded by the pylTSBCD cluster of genes, the amino acid pyrrolysine is incorporated instead.
** UGA is normally the opal (or umber) stop codon, but encodes selenocysteine if a SECIS element is present.
The stop codon is not an amino acid, but is included for completeness.
†† UAG and UGA do not always act as stop codons (see above).
An essential amino acid cannot be synthesized in humans and must, therefore, be supplied in the diet. Conditionally essential amino acids are not normally required in the diet, but must be supplied exogenously to specific populations that do not synthesize them in adequate amounts.
& Occurrence of amino acids is based on 135 Archaea, 3775 Bacteria, 614 Eukaryota proteomes and human proteome (21 006 proteins) respectively.
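
The stop-codon recoding described in the notes above can be sketched as a toy translation routine. This is a deliberately simplified model and not how any real tool works: the function `translate` and its flags `secis_present` and `pyl_machinery` are hypothetical names, the codon table is a tiny excerpt of the standard genetic code, and a real SECIS element acts on a nearby UGA codon through RNA structure rather than as a global switch.

```python
STOP = "*"

# Small excerpt of the standard genetic code (RNA codons), for illustration only.
CODON_TABLE = {
    "AUG": "M", "UGG": "W", "UUU": "F", "GGC": "G",
    "UAA": STOP, "UAG": STOP, "UGA": STOP,
}


def translate(mrna, secis_present=False, pyl_machinery=False):
    """Translate an mRNA string, optionally recoding UGA and UAG.

    secis_present: treat UGA as selenocysteine (U) instead of stop.
    pyl_machinery: treat UAG as pyrrolysine (O) instead of stop.
    Both flags are simplifying assumptions of this sketch.
    """
    peptide = []
    usable_length = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and secis_present:
            residue = "U"
        elif codon == "UAG" and pyl_machinery:
            residue = "O"
        else:
            residue = CODON_TABLE[codon]  # KeyError for codons outside the excerpt
        if residue == STOP:
            break
        peptide.append(residue)
    return "".join(peptide)


print(translate("AUGUGAUUUUAA"))
print(translate("AUGUGAUUUUAA", secis_present=True))
print(translate("AUGUAGGGCUAA", pyl_machinery=True))
```

Without the flags, translation halts at the first UGA or UAG; with them, the same codon yields a residue and translation continues to the next stop codon.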


Mass spectrometry

In mass spectrometry of peptides and proteins, knowledge of the masses of the residues is useful. The mass of the peptide or protein is the sum of the residue masses plus the mass of water (monoisotopic mass = 18.01056 Da; average mass = 18.0153 Da). The residue masses are calculated from the tabulated chemical formulas and atomic weights. In mass spectrometry, ions may also include one or more protons (monoisotopic mass = 1.00728 Da; average mass* = 1.0074 Da).

*Strictly speaking, protons cannot have an average mass; the figure implicitly treats deuterons as an isotope of the proton, although they are a different species (see Hydron (chemistry)).
§ Monoisotopic mass
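
The relationship above (residue masses plus one water, plus one proton per charge) can be sketched as follows. The water and proton masses come from the text; the function names `peptide_monoisotopic_mass` and `mz` are hypothetical, and the two monoisotopic residue masses are commonly tabulated values assumed here, since the residue table is not reproduced in this text.

```python
WATER_MONOISOTOPIC = 18.01056   # Da (from the text)
PROTON_MONOISOTOPIC = 1.00728   # Da (from the text)

# Assumed monoisotopic residue masses in Da, for two residues only.
RESIDUE_MONOISOTOPIC = {
    "G": 57.02146,  # glycine residue
    "A": 71.03711,  # alanine residue
}


def peptide_monoisotopic_mass(sequence):
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MONOISOTOPIC[r] for r in sequence) + WATER_MONOISOTOPIC


def mz(sequence, charge):
    """Mass-to-charge ratio of the ion carrying `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    neutral = peptide_monoisotopic_mass(sequence)
    return (neutral + charge * PROTON_MONOISOTOPIC) / charge


print(f"GA neutral mass: {peptide_monoisotopic_mass('GA'):.5f} Da")
print(f"GA [M+H]+  m/z:  {mz('GA', 1):.5f}")
print(f"GA [M+2H]2+ m/z: {mz('GA', 2):.5f}")
```

Note that the single water is added once per chain, not once per residue, which is the counterpart of subtracting one water per peptide bond from the free amino acid masses.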


Stoichiometry and metabolic cost in cell

The table below lists the abundance of amino acids in ''E. coli'' cells and the metabolic cost (in ATP) of synthesizing each amino acid. Negative numbers indicate that the metabolic process is energetically favorable and does not cost the cell net ATP. The abundance figures include amino acids both in free form and in polymerized form (proteins).


Remarks


Catabolism

Amino acids can be classified according to the properties of their main products:
* Glucogenic, with the products having the ability to form glucose by gluconeogenesis
* Ketogenic, with the products not having the ability to form glucose: These products may still be used for ketogenesis or lipid synthesis.
* Amino acids catabolized into both glucogenic and ketogenic products
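
The three-way classification above can be sketched as a lookup on catabolic end products. This is an illustrative simplification: the function `classify` and the two product sets are assumptions of the sketch, based on the usual textbook grouping in which pyruvate and citric acid cycle intermediates can feed gluconeogenesis while acetyl-CoA and acetoacetate cannot. The example products for alanine, leucine, and phenylalanine are likewise textbook values, not taken from this text.

```python
# Assumed grouping of end products (textbook simplification).
GLUCOGENIC_PRODUCTS = {
    "pyruvate", "alpha-ketoglutarate", "succinyl-CoA", "fumarate", "oxaloacetate",
}
KETOGENIC_PRODUCTS = {"acetyl-CoA", "acetoacetate"}


def classify(products):
    """Classify an amino acid by the end products of its catabolism."""
    products = set(products)
    unknown = products - GLUCOGENIC_PRODUCTS - KETOGENIC_PRODUCTS
    if unknown:
        raise ValueError(f"unrecognized products: {sorted(unknown)}")
    glucogenic = bool(products & GLUCOGENIC_PRODUCTS)
    ketogenic = bool(products & KETOGENIC_PRODUCTS)
    if glucogenic and ketogenic:
        return "glucogenic and ketogenic"
    if glucogenic:
        return "glucogenic"
    if ketogenic:
        return "ketogenic"
    raise ValueError("no products given")


# Example end products (assumed textbook values).
print("alanine:", classify(["pyruvate"]))
print("leucine:", classify(["acetyl-CoA", "acetoacetate"]))
print("phenylalanine:", classify(["fumarate", "acetoacetate"]))
```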


See also

* Glucogenic amino acid
* Ketogenic amino acid




External links


The origin of the single-letter code for the amino acids