
Amino acids are organic compounds that contain both amino and carboxylic acid functional groups. Although over 500 amino acids exist in nature, by far the most important are the 22 α-amino acids incorporated into proteins. Only these 22 appear in the genetic code of life.
Amino acids can be classified according to the locations of the core structural functional groups (alpha- (α-), beta- (β-), gamma- (γ-) amino acids, etc.); other categories relate to polarity, ionization, and side-chain group type (aliphatic, acyclic, aromatic, polar, etc.). In the form of proteins, amino-acid ''residues'' form the second-largest component (water being the largest) of human muscles and other tissues. Beyond their role as residues in proteins, amino acids participate in a number of processes such as neurotransmitter transport and biosynthesis. It is thought that they played a key role in enabling life on Earth and its emergence.
Amino acids are formally named by the IUPAC-IUBMB Joint Commission on Biochemical Nomenclature in terms of the fictitious "neutral" structure shown in the illustration. For example, the systematic name of alanine is 2-aminopropanoic acid, based on the formula CH3−CH(NH2)−COOH. The Commission justified this approach as follows:
The systematic names and formulas given refer to hypothetical forms in which amino groups are unprotonated and carboxyl groups are undissociated. This convention is useful to avoid various nomenclatural problems but should not be taken to imply that these structures represent an appreciable fraction of the amino-acid molecules.
History
The first few amino acids were discovered in the early 1800s. In 1806, French chemists Louis-Nicolas Vauquelin and Pierre Jean Robiquet isolated a compound from asparagus that was subsequently named asparagine, the first amino acid to be discovered. Cystine was discovered in 1810, although its monomer, cysteine, remained undiscovered until 1884. Glycine and leucine were discovered in 1820. The last of the 20 common amino acids to be discovered was threonine in 1935 by William Cumming Rose, who also determined the essential amino acids and established the minimum daily requirements of all amino acids for optimal growth.
The unity of the chemical category was recognized by Wurtz in 1865, but he gave no particular name to it. The first use of the term "amino acid" in the English language dates from 1898, while the German term, ''Aminosäure'', was used earlier. Proteins were found to yield amino acids after enzymatic digestion or acid hydrolysis. In 1902, Emil Fischer and Franz Hofmeister independently proposed that proteins are formed from many amino acids, with bonds formed between the amino group of one amino acid and the carboxyl group of another, resulting in a linear structure that Fischer termed "peptide".
General structure
2-, alpha-, or α-amino acids have the generic formula H2NCHRCOOH in most cases, where R is an organic substituent known as a "side chain".
Of the many hundreds of described amino acids, 22 are proteinogenic ("protein-building"). It is these 22 compounds that combine to give a vast array of peptides and proteins assembled by ribosomes. Non-proteinogenic or modified amino acids may arise from post-translational modification or during nonribosomal peptide synthesis.
Chirality
The carbon atom next to the carboxyl group is called the α-carbon. In proteinogenic amino acids, it bears the amine and the R group or side chain specific to each amino acid, as well as a hydrogen atom. With the exception of glycine, for which the side chain is also a hydrogen atom, the α-carbon is stereogenic. All chiral proteinogenic amino acids have the L configuration. They are "left-handed" enantiomers, which refers to the stereoisomers of the alpha carbon.
A few D-amino acids ("right-handed") have been found in nature, e.g., in bacterial envelopes, as a neuromodulator (D-serine), and in some antibiotics. Rarely, D-amino acid residues are found in proteins, and are converted from the L-amino acid as a post-translational modification.
Side chains
Polar charged side chains
Five amino acids possess a charge at neutral pH. Often these side chains appear at the surfaces of proteins to enable their solubility in water, and side chains with opposite charges form important electrostatic contacts called salt bridges that maintain structures within a single protein or between interfacing proteins. Many proteins bind metal into their structures specifically, and these interactions are commonly mediated by charged side chains such as aspartate, glutamate and histidine. Under certain conditions, each ion-forming group can be charged, forming double salts.
The two negatively charged amino acids at neutral pH are aspartate (Asp, D) and glutamate (Glu, E). The anionic carboxylate groups behave as Brønsted bases in most circumstances. Enzymes in very low pH environments, like the aspartic protease pepsin in mammalian stomachs, may have catalytic aspartate or glutamate residues that act as Brønsted acids.
There are three amino acids with side chains that are cations at neutral pH: arginine (Arg, R), lysine (Lys, K) and histidine (His, H). Arginine has a charged guanidino group and lysine a charged alkyl amino group, and both are fully protonated at pH 7. Histidine's imidazole group has a pKa of 6.0, and is only around 10% protonated at neutral pH. Because histidine is easily found in both its basic and conjugate acid forms, it often participates in catalytic proton transfers in enzyme reactions.
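The roughly 10% figure for histidine follows from the Henderson–Hasselbalch relation. The following is a minimal Python sketch, assuming ideal behaviour of an isolated ionizable group. The function name `fraction_protonated` is illustrative, and the lysine and arginine side-chain pKa values (about 10.5 and 12.5) are typical textbook figures that are assumptions here, not values taken from the text above.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of an ionizable group present in its protonated form.

    From Henderson-Hasselbalch: [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Histidine imidazole, pKa 6.0 as stated in the text.
print(f"His at pH 7: {fraction_protonated(6.0, 7.0):.1%}")  # about 9.1%

# Assumed textbook side-chain pKa values (not from the text above).
ASSUMED_PKA = {"Lys": 10.5, "Arg": 12.5}
for name, pka in ASSUMED_PKA.items():
    print(f"{name} at pH 7: {fraction_protonated(pka, 7.0):.2%}")
```

With a pKa one unit below the pH, the protonated fraction is 1/11, which is where "around 10%" comes from; groups whose pKa lies several units above the pH are protonated almost completely.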
Polar uncharged side chains
The polar, uncharged amino acids serine (Ser, S), threonine (Thr, T), asparagine (Asn, N) and glutamine (Gln, Q) readily form hydrogen bonds with water and other amino acids. They do not ionize in normal conditions, a prominent exception being the catalytic serine in serine proteases. This is an example of severe perturbation, and is not characteristic of serine residues in general. Threonine has two chiral centers, not only the L (2''S'') chiral center at the α-carbon shared by all amino acids apart from achiral glycine, but also (3''R'') at the β-carbon. The full stereochemical specification is (2''S'',3''R'')-L-threonine.
Hydrophobic side chains
Nonpolar amino acid interactions are the primary driving force behind the processes that fold proteins into their functional three-dimensional structures. None of these amino acids' side chains ionize easily, and therefore they do not have pKas, with the exception of tyrosine (Tyr, Y). The hydroxyl of tyrosine can deprotonate at high pH, forming the negatively charged phenolate. Because of this, one could place tyrosine into the polar, uncharged amino acid category, but its very low solubility in water matches the characteristics of hydrophobic amino acids well.
Special case side chains
Several side chains are not described well by the charged, polar and hydrophobic categories. Glycine (Gly, G) could be considered a polar amino acid since its small size means that its solubility is largely determined by the amino and carboxylate groups. However, the lack of any side chain provides glycine with a unique flexibility among amino acids, with large ramifications for protein folding. Cysteine (Cys, C) can also form hydrogen bonds readily, which would place it in the polar amino acid category, though it can often be found in protein structures forming covalent bonds, called disulphide bonds, with other cysteines. These bonds influence the folding and stability of proteins, and are essential in the formation of antibodies. Proline (Pro, P) has an alkyl side chain and could be considered hydrophobic, but because the side chain joins back onto the alpha amino group it becomes particularly inflexible when incorporated into proteins. Similar to glycine, this influences protein structure in a way unique among amino acids. Selenocysteine (Sec, U) is a rare amino acid not directly encoded by DNA, but is incorporated into proteins via the ribosome. Selenocysteine has a lower redox potential compared to the similar cysteine, and participates in several unique enzymatic reactions. Pyrrolysine (Pyl, O) is another amino acid not encoded in DNA, but synthesized into protein by ribosomes. It is found in archaeal species where it participates in the catalytic activity of several methyltransferases.
β- and γ-amino acids
Amino acids with the structure NH3+−CXY−CXY−CO2−, such as β-alanine, a component of carnosine and a few other peptides, are β-amino acids. Ones with the structure NH3+−CXY−CXY−CXY−CO2− are γ-amino acids, and so on, where X and Y are two substituents (one of which is normally H).
Zwitterions
The common natural forms of amino acids have a zwitterionic structure, with −NH3+ (−NH2+− in the case of proline) and −CO2− functional groups attached to the same C atom. They are thus α-amino acids, and are the only ones found in proteins during translation in the ribosome.
In aqueous solution at pH close to neutrality, amino acids exist as zwitterions, i.e. as dipolar ions with both NH3+ and CO2− in charged states, so the overall structure is NH3+−CHR−CO2−. At physiological pH the so-called "neutral forms" are not present to any measurable degree. Although the two charges in the zwitterion structure add up to zero, it is misleading to call a species with a net charge of zero "uncharged".
In strongly acidic conditions (pH below 3), the carboxylate group becomes protonated and the structure becomes an ammonio carboxylic acid, NH3+−CHR−CO2H. This is relevant for enzymes like pepsin that are active in acidic environments such as the mammalian stomach and lysosomes, but does not significantly apply to intracellular enzymes. In highly basic conditions (pH greater than 10, not normally seen in physiological conditions), the ammonio group is deprotonated to give NH2−CHR−CO2−.
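The shift from cation to zwitterion to anion can be illustrated numerically. The sketch below is a simplified model that treats the carboxyl and ammonio groups as independent ionizable sites and computes the average net charge of an amino acid with a non-ionizable side chain. The function name `net_charge` and the glycine pKa values (2.34 and 9.60) are assumptions taken from typical textbook tables, not from the text above.

```python
def net_charge(ph: float, pka_carboxyl: float, pka_ammonium: float) -> float:
    """Average net charge of an amino acid with a non-ionizable side chain."""
    # Carboxyl group: 0 when protonated (-CO2H), -1 when deprotonated (-CO2-).
    carboxylate = -1.0 / (1.0 + 10.0 ** (pka_carboxyl - ph))
    # Ammonio group: +1 when protonated (-NH3+), 0 when deprotonated (-NH2).
    ammonium = 1.0 / (1.0 + 10.0 ** (ph - pka_ammonium))
    return carboxylate + ammonium


# Assumed textbook pKa values for glycine (not given in the text above).
GLY_PKA_CARBOXYL, GLY_PKA_AMMONIUM = 2.34, 9.60

for ph in (1.0, 7.0, 12.0):
    charge = net_charge(ph, GLY_PKA_CARBOXYL, GLY_PKA_AMMONIUM)
    print(f"pH {ph:4.1f}: average net charge {charge:+.3f}")
```

Under these assumptions the average charge is close to +1 in strong acid, close to zero near neutrality (where both groups are charged and cancel), and close to −1 in strong base.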
Although various definitions of acids and bases are used in chemistry, the only one that is useful for chemistry in aqueous solution is that of Brønsted: an acid is a species that can donate a proton to another species, and a base is one that can accept a proton. This criterion is used to label the groups in the above illustration. The carboxylate side chains of aspartate and glutamate residues are the principal Brønsted bases in proteins. Likewise, lysine, tyrosine and cysteine will typically act as Brønsted acids. Histidine under these conditions can act both as a Brønsted acid and as a base.
Isoelectric point
For amino acids with uncharged side-chains the zwitterion predominates at pH values between the two p''K''a values, but coexists in equilibrium with small amounts of net negative and net positive ions. At the midpoint between the two p''K''a values, the trace amount of net negative and trace of net positive ions balance, so that the average net charge of all forms present is zero. This pH is known as the isoelectric point p''I'', so p''I'' = ½(p''K''a1 + p''K''a2).
For amino acids with charged side chains, the p''K''a of the side chain is involved. Thus for aspartate or glutamate with negative side chains, the terminal amino group is essentially entirely in the charged form −NH3+, but this positive charge needs to be balanced by the state in which just one carboxylate group, the C-terminal one, is negatively charged. This occurs halfway between the two carboxylate p''K''a values: p''I'' = ½(p''K''a1 + p''K''a(R)), where p''K''a(R) is the side chain p''K''a.
Similar considerations apply to the other amino acids with ionizable side chains: glutamate (which behaves like aspartate), cysteine and tyrosine, and histidine, lysine and arginine, whose side chains are positively charged when protonated.
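The midpoint rule can be written as a short calculation. The sketch below is one possible way to generalize it, not a standard library routine: it sorts all p''K''a values and averages the two that bracket the species with zero net charge. The function name `isoelectric_point`, the `n_cationic` parameter (the number of groups that carry a +1 charge when protonated), and all p''K''a values used are assumptions drawn from typical textbook tables.

```python
from typing import Sequence


def isoelectric_point(pkas: Sequence[float], n_cationic: int) -> float:
    """Estimate pI as the mean of the two pKa values bracketing the neutral species.

    The fully protonated molecule has net charge +n_cationic, and each
    successive deprotonation lowers the charge by one, so the net-zero species
    lies between the n_cationic-th and (n_cationic + 1)-th pKa in sorted order.
    """
    ordered = sorted(pkas)
    if not 1 <= n_cationic < len(ordered):
        raise ValueError("n_cationic must be >= 1 and < number of pKa values")
    return 0.5 * (ordered[n_cationic - 1] + ordered[n_cationic])


# Assumed textbook pKa values (not given in the text above).
glycine = isoelectric_point([2.34, 9.60], n_cationic=1)            # uncharged side chain
aspartate = isoelectric_point([1.88, 3.65, 9.60], n_cationic=1)    # acidic side chain
lysine = isoelectric_point([2.18, 8.95, 10.53], n_cationic=2)      # basic side chain

print(f"Gly pI = {glycine:.2f}, Asp pI = {aspartate:.2f}, Lys pI = {lysine:.2f}")
```

For aspartate the two averaged values are the two carboxyl p''K''a values, matching p''I'' = ½(p''K''a1 + p''K''a(R)); for a basic amino acid such as lysine the same bookkeeping selects the two highest p''K''a values instead.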
Amino acids have zero mobility in electrophoresis at their isoelectric point, although this behaviour is more usually exploited for peptides and proteins than single amino acids. Zwitterions have minimum solubility at their isoelectric point, and some amino acids (in particular, with nonpolar side chains) can be isolated by precipitation from water by adjusting the pH to the required isoelectric point.
Physicochemical properties
The 20 canonical amino acids can be classified according to their properties. Important factors are charge, hydrophilicity or hydrophobicity, size, and functional groups. These properties influence protein structure and protein–protein interactions. The water-soluble proteins tend to have their hydrophobic residues (Leu, Ile, Val, Phe, and Trp) buried in the middle of the protein, whereas hydrophilic side chains are exposed to the aqueous solvent. (In biochemistry, a residue refers to a specific monomer ''within'' the polymeric chain of a polysaccharide, protein or nucleic acid.) The integral membrane proteins tend to have outer rings of exposed hydrophobic amino acids that anchor them in the lipid bilayer. Some peripheral membrane proteins have a patch of hydrophobic amino acids on their surface that sticks to the membrane. In a similar fashion, proteins that have to bind to positively charged molecules have surfaces rich in negatively charged amino acids such as glutamate and aspartate, while proteins binding to negatively charged molecules have surfaces rich in positively charged amino acids like lysine and arginine. For example, lysine and arginine are present in large amounts in the low-complexity regions of nucleic-acid binding proteins. There are various hydrophobicity scales of amino acid residues.
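As an illustration of how a hydrophobicity scale can be applied, the sketch below averages per-residue values over a sequence. It uses the Kyte–Doolittle hydropathy scale, which is one of several published scales and is chosen here only as an example; the function name `mean_hydropathy` is hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Return the average hydropathy of a one-letter amino acid sequence."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[r] for r in residues) / len(residues)


buried_like = mean_hydropathy("LIVFW")   # residues named above as typically buried
charged_like = mean_hydropathy("DEKR")   # charged residues typically exposed
```

On this scale the stretch of Leu, Ile, Val, Phe and Trp scores positive, while the stretch of Asp, Glu, Lys and Arg scores strongly negative, matching the buried-versus-exposed tendency described in the text.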
Some amino acids have special properties. Cysteine can form covalent disulfide bonds to other cysteine residues. Proline forms a cycle to the polypeptide backbone, and glycine is more flexible than other amino acids.
Glycine and proline are strongly present within low complexity regions of both eukaryotic and prokaryotic proteins, whereas the opposite is the case with cysteine, phenylalanine, tryptophan, methionine, valine, leucine, isoleucine, which are highly reactive, or complex, or hydrophobic.
Many proteins undergo a range of posttranslational modifications, whereby additional chemical groups are attached to the amino acid residue side chains, sometimes producing lipoproteins (that are hydrophobic), or glycoproteins (that are hydrophilic), allowing the protein to attach temporarily to a membrane. For example, a signaling protein can attach and then detach from a cell membrane, because it contains cysteine residues that can have the fatty acid palmitic acid added to them and subsequently removed.
Table of standard amino acid abbreviations and properties
Although one-letter symbols are included in the table, IUPAC–IUBMB recommend that "Use of the one-letter symbols should be restricted to the comparison of long sequences".
The one-letter notation was chosen by IUPAC-IUB based on the following rules:
* Initial letters are used where there is no ambiguity: C cysteine, H histidine, I isoleucine, M methionine, S serine, V valine,
* Where arbitrary assignment is needed, the structurally simpler amino acids are given precedence: A alanine, G glycine, L leucine, P proline, T threonine,
* F ''PH''enylalanine and R a''R''ginine are assigned by being phonetically suggestive,
* W tryptophan is assigned based on the double ring being visually suggestive to the bulky letter W,
* K lysine and Y tyrosine are assigned as alphabetically nearest to their initials L and T (note that U was avoided for its similarity with V, while X was reserved for undetermined or atypical amino acids); for tyrosine the mnemonic t''Y''rosine was also proposed,
* D aspartate was assigned arbitrarily, with the proposed mnemonic aspar''D''ic acid; E glutamate was assigned in alphabetical sequence being larger by merely one methylene –CH2– group,
* N asparagine was assigned arbitrarily, with the proposed mnemonic asparagi''N''e; Q glutamine was assigned in alphabetical sequence of those still available (note again that O was avoided due to similarity with D), with the proposed mnemonic ''Q''lutamine.
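The assignments listed above amount to a lookup table from one-letter to three-letter symbols for the 20 standard amino acids. The sketch below shows one way to hold that table in Python; the names `ONE_TO_THREE` and `to_three_letter` are hypothetical and chosen only for illustration.

```python
# One-letter to three-letter symbols for the 20 standard amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(sequence, separator="-"):
    """Convert a one-letter sequence into hyphen-separated three-letter symbols."""
    return separator.join(ONE_TO_THREE[symbol] for symbol in sequence.upper())


example = to_three_letter("GAVL")  # "Gly-Ala-Val-Leu"
```

A plain dictionary is enough here because the mapping is fixed and small; symbols outside the 20 standard letters raise a `KeyError` rather than being silently passed through.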
Two additional amino acids are in some species coded for by codons that are usually interpreted as stop codons:
In addition to the specific amino acid codes, placeholders are used in cases where chemical or crystallographic analysis of a peptide or protein cannot conclusively determine the identity of a residue. They are also used to summarize conserved protein sequence motifs. The use of single letters to indicate sets of similar residues is similar to the use of abbreviation codes for degenerate bases.
Unk is sometimes used instead of Xaa, but is less standard.
Ter or * (from termination) is used in notation for mutations in proteins when a stop codon occurs. It corresponds to no amino acid at all.
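To show how placeholder symbols can summarize a motif, the sketch below expands them into a regular expression. It assumes the commonly used IUPAC placeholders B (Asx: Asp or Asn), Z (Glx: Glu or Gln), J (Xle: Leu or Ile) and X (Xaa: any residue); the function name `motif_to_regex` and the example motif are hypothetical.

```python
import re

# Placeholder symbols and the residues each one may stand for (assumed set).
AMBIGUITY = {"B": "DN", "Z": "EQ", "J": "IL"}


def motif_to_regex(motif):
    """Compile a one-letter motif with placeholders into a regular expression."""
    parts = []
    for symbol in motif.upper():
        if symbol == "X":
            parts.append(".")                      # Xaa: any single residue
        elif symbol in AMBIGUITY:
            parts.append("[" + AMBIGUITY[symbol] + "]")
        else:
            parts.append(re.escape(symbol))        # ordinary residue symbol
    return re.compile("".join(parts))


pattern = motif_to_regex("NXS")
match = pattern.search("AANGSAA")  # finds "NGS" starting at index 2
```

The same expansion works for degenerate nucleotide codes if the table is swapped for the corresponding base sets.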
In addition, many nonstandard amino acids have a specific code. For example, several peptide drugs, such as Bortezomib and MG132, are artificially synthesized and retain their protecting groups, which have specific codes. Bortezomib is Pyz–Phe–boroLeu, and MG132 is Z–Leu–Leu–Leu–al. To aid in the analysis of protein structure, photo-reactive amino acid analogs are available. These include photoleucine (pLeu) and photomethionine (pMet).
Occurrence and functions in biochemistry
Proteinogenic amino acids
Amino acids are the precursors to proteins. They join by condensation reactions to form short polymer chains called peptides or longer chains called either polypeptides or proteins. These chains are linear and unbranched, with each amino acid residue within the chain attached to two neighboring amino acids. In nature, the process of making proteins encoded by RNA genetic material is called ''translation'' and involves the step-by-step addition of amino acids to a growing protein chain by a ribozyme that is called a ribosome. The order in which the amino acids are added is read through the genetic code from an mRNA template, which is an RNA derived from one of the organism's genes.
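Because each peptide bond forms by condensation, one water molecule is lost per bond, so the molar mass of a linear peptide is the sum of the free amino acid masses minus one water for every bond. The sketch below works through that arithmetic for a handful of residues. The function name `peptide_mass` is hypothetical, and the molar masses are approximate average values assumed for illustration.

```python
# Approximate average molar masses of free amino acids in g/mol (assumed values).
FREE_AMINO_ACID_MASS = {
    "G": 75.07,   # glycine
    "A": 89.09,   # alanine
    "S": 105.09,  # serine
    "V": 117.15,  # valine
    "L": 131.17,  # leucine
}
WATER_MASS = 18.015  # g/mol, lost once per peptide bond


def peptide_mass(sequence):
    """Molar mass of a linear peptide built by condensation of free amino acids."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    total = sum(FREE_AMINO_ACID_MASS[r] for r in residues)
    bonds = len(residues) - 1  # a linear chain of n residues has n - 1 peptide bonds
    return total - bonds * WATER_MASS


dipeptide = peptide_mass("GA")  # glycylalanine, about 146.1 g/mol
```

A single amino acid has no peptide bond, so `peptide_mass("G")` simply returns the mass of free glycine.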
Twenty-two amino acids are naturally incorporated into polypeptides and are called proteinogenic or natural amino acids. Of these, 20 are encoded by the universal genetic code. The remaining 2, selenocysteine and pyrrolysine, are incorporated into proteins by unique synthetic mechanisms. Selenocysteine is incorporated when the mRNA being translated includes a SECIS element, which causes the UGA codon to encode selenocysteine instead of a stop codon. Pyrrolysine is used by some methanogenic archaea in enzymes that they use to produce methane. It is coded for with the codon UAG, which is normally a stop codon in other organisms.
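The stop-codon reassignment described above can be sketched as a translation routine with an optional recoding table. This is a deliberate simplification: in cells the reinterpretation of UGA depends on a SECIS element elsewhere in the mRNA, which is reduced here to a caller-supplied dictionary. The function name `translate` and the `recode` parameter are hypothetical; U and O are the one-letter symbols for selenocysteine and pyrrolysine.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ("*" marks stop).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna, recode=None):
    """Translate an mRNA string codon by codon until the first stop codon.

    recode -- optional dict overriding specific codons, e.g. {"UGA": "U"}
              to read UGA as selenocysteine instead of stop.
    """
    recode = recode or {}
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].upper()
        residue = recode.get(codon, CODON_TABLE[codon])
        if residue == "*":
            break  # stop codon: terminate translation
        protein.append(residue)
    return "".join(protein)


standard = translate("AUGUGAUUUUAA")                  # UGA read as stop
with_sec = translate("AUGUGAUUUUAA", {"UGA": "U"})    # UGA read as Sec
with_pyl = translate("AUGUAGGGCUGA", {"UAG": "O"})    # UAG read as Pyl
```

Under the standard table the first message terminates right after the initial methionine; with the recoding table the same message reads through UGA and continues to the next stop codon.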
Several independent evolutionary studies have suggested that Gly, Ala, Asp, Val, Ser, Pro, Glu, Leu, Thr may belong to a group of amino acids that constituted the early genetic code, whereas Cys, Met, Tyr, Trp, His, Phe may belong to a group of amino acids that constituted later additions of the genetic code.
Standard vs nonstandard amino acids
The 20 amino acids that are encoded directly by the codons of the universal genetic code are called ''standard'' or ''canonical'' amino acids. A modified form of methionine (''N''-formylmethionine) is often incorporated in place of methionine as the initial amino acid of proteins in bacteria, mitochondria and plastids (including chloroplasts). Other amino acids are called ''nonstandard'' or ''non-canonical''. Most of the nonstandard amino acids are also non-proteinogenic (i.e. they cannot be incorporated into proteins during translation), but two of them are proteinogenic, as they can be incorporated translationally into proteins by exploiting information not encoded in the universal genetic code.
The two nonstandard proteinogenic amino acids are selenocysteine (present in many non-eukaryotes as well as most eukaryotes, but not coded directly by DNA) and pyrrolysine (found only in some archaea and at least one bacterium). The incorporation of these nonstandard amino acids is rare. For example, 25 human proteins include selenocysteine in their primary structure, and the structurally characterized enzymes (selenoenzymes) employ selenocysteine as the catalytic moiety in their active sites. Pyrrolysine and selenocysteine are encoded via variant codons. For example, selenocysteine is encoded by a stop codon together with a SECIS element.
''N''-formylmethionine (which is often the initial amino acid of proteins in bacteria, mitochondria, and chloroplasts) is generally considered as a form of methionine rather than as a separate proteinogenic amino acid. Codon–tRNA combinations not found in nature can also be used to "expand" the genetic code and form novel proteins known as alloproteins incorporating non-proteinogenic amino acids.
Non-proteinogenic amino acids
Aside from the 22 proteinogenic amino acids, many ''non-proteinogenic'' amino acids are known. Those either are not found in proteins (for example carnitine, GABA, levothyroxine) or are not produced directly and in isolation by standard cellular machinery. For example, hydroxyproline is synthesised from proline. Another example is selenomethionine.
Non-proteinogenic amino acids that are found in proteins are formed by post-translational modification. Such modifications can also determine the localization of the protein, e.g., the addition of long hydrophobic groups can cause a protein to bind to a phospholipid membrane. Examples:
* The carboxylation of glutamate allows for better binding of calcium cations.
* Hydroxyproline, generated by hydroxylation of proline, is a major component of the connective tissue collagen.
* Hypusine, in the translation initiation factor EIF5A, is formed by modification of a lysine residue.
Some non-proteinogenic amino acids are not found in proteins. Examples include 2-aminoisobutyric acid and the neurotransmitter gamma-aminobutyric acid. Non-proteinogenic amino acids often occur as intermediates in the metabolic pathways for standard amino acids – for example, ornithine and citrulline occur in the urea cycle, part of amino acid catabolism (see below). A rare exception to the dominance of α-amino acids in biology is the β-amino acid beta alanine (3-aminopropanoic acid), which is used in plants and microorganisms in the synthesis of pantothenic acid (vitamin B5), a component of coenzyme A.
In mammalian nutrition
Animals ingest amino acids in the form of protein. The protein is broken down into its constituent amino acids in the process of digestion. The amino acids are then used to synthesize new proteins and other nitrogenous biomolecules, or they are further catabolized through oxidation to provide a source of energy. The oxidation pathway starts with the removal of the amino group by a transaminase; the amino group is then fed into the urea cycle. The other product of transamination is a keto acid that enters the citric acid cycle. Glucogenic amino acids can also be converted into glucose, through gluconeogenesis.
Of the 20 standard amino acids, nine (His, Ile, Leu, Lys, Met, Phe, Thr, Trp and Val) are called essential amino acids because the human body cannot synthesize them from other compounds at the level needed for normal growth, so they must be obtained from food.
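As a minimal sketch, the list of essential amino acids above can be expressed as a simple lookup. The names `ESSENTIAL`, `STANDARD`, `is_essential` and `missing_essentials` are hypothetical helpers introduced for illustration only; the classification is the human one stated in the text and does not cover semi-essential or conditionally essential cases.

```python
# Three-letter codes of the nine amino acids listed above as essential in humans.
ESSENTIAL = frozenset({"His", "Ile", "Leu", "Lys", "Met", "Phe", "Thr", "Trp", "Val"})

# Three-letter codes of the 20 standard amino acids.
STANDARD = frozenset({
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
})


def is_essential(code: str) -> bool:
    """Return True if the standard amino acid `code` is essential in humans."""
    normalized = code.strip().capitalize()
    if normalized not in STANDARD:
        raise ValueError(f"not a standard amino acid code: {code!r}")
    return normalized in ESSENTIAL


def missing_essentials(available):
    """Return, sorted, the essential amino acids absent from `available`."""
    present = {c.strip().capitalize() for c in available}
    return sorted(ESSENTIAL - present)
```

For example, `missing_essentials(["Lys", "Met"])` would report the seven essential amino acids that a hypothetical food source containing only lysine and methionine fails to supply.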
Semi-essential and conditionally essential amino acids, and juvenile requirements
In addition, cysteine, tyrosine, and arginine are considered semi-essential amino acids, and taurine a semi-essential aminosulfonic acid, in children, in whom the metabolic pathways that synthesize these compounds are not fully developed. Some amino acids are conditionally essential for certain ages or medical conditions. Essential amino acids may also vary from species to species.
Non-protein functions
Many proteinogenic and non-proteinogenic amino acids have biological functions beyond being precursors to proteins and peptides. In humans, amino acids also have important roles in diverse biosynthetic pathways. Defenses against herbivores in plants sometimes employ amino acids. Examples:
Standard amino acids
* Tryptophan is a precursor of the neurotransmitter serotonin.
* Tyrosine and its precursor phenylalanine are precursors of the catecholamine neurotransmitters dopamine, epinephrine and norepinephrine, and of various trace amines.
* Phenylalanine is a precursor of phenethylamine and tyrosine in humans. In plants, it is a precursor of various phenylpropanoids, which are important in plant metabolism.
* Glycine is a precursor of porphyrins such as heme.
* Arginine is a precursor of nitric oxide.
* Ornithine and ''S''-adenosylmethionine are precursors of polyamines.
* Aspartate, glycine, and glutamine are precursors of nucleotides.
Roles for nonstandard amino acids
* Carnitine is used in lipid transport.
* Gamma-aminobutyric acid is a neurotransmitter.
* 5-HTP (5-hydroxytryptophan) is used for experimental treatment of depression.
* L-DOPA (L-dihydroxyphenylalanine) is used in the treatment of Parkinson's disease.
* Eflornithine inhibits ornithine decarboxylase and is used in the treatment of sleeping sickness (African trypanosomiasis).
* Canavanine, an analogue of arginine found in many legumes, is an antifeedant, protecting the plant from predators.
* Mimosine, found in some legumes, is another possible antifeedant. This compound is an analogue of tyrosine and can poison animals that graze on these plants.
However, not all of the functions of other abundant nonstandard amino acids are known.
Uses in industry
Animal feed
Amino acids are sometimes added to animal feed because some of the components of these feeds, such as soybeans, have low levels of some of the essential amino acids, especially lysine, methionine, threonine, and tryptophan. Amino acids are also used to chelate metal cations in order to improve the absorption of minerals from feed supplements.
Food
The food industry is a major consumer of amino acids, especially glutamic acid, which is used as a flavor enhancer, and aspartame (aspartylphenylalanine 1-methyl ester), which is used as an artificial sweetener. Amino acids are sometimes added to food by manufacturers to alleviate symptoms of mineral deficiencies, such as anemia, by improving mineral absorption and reducing negative side effects from inorganic mineral supplementation.
Chemical building blocks
Amino acids are low-cost feedstocks used in chiral pool synthesis as enantiomerically pure building blocks.
Amino acids are used in the synthesis of some cosmetics.
Aspirational uses
Fertilizer
The chelating ability of amino acids is sometimes used in fertilizers to facilitate the delivery of minerals to plants in order to correct mineral deficiencies, such as iron chlorosis. These fertilizers are also used to prevent deficiencies from occurring and to improve the overall health of the plants.
Biodegradable plastics
Amino acids have been considered as components of biodegradable polymers, which have applications as environmentally friendly packaging and, in medicine, in drug delivery and the construction of prosthetic implants. One example of such materials is polyaspartate, a water-soluble biodegradable polymer that may have applications in disposable diapers and agriculture. Due to its solubility and ability to chelate metal ions, polyaspartate is also being used as a biodegradable antiscaling agent and a corrosion inhibitor.
Synthesis
Chemical synthesis
The commercial production of amino acids usually relies on mutant bacteria that overproduce individual amino acids using glucose as a carbon source. Some amino acids are produced by enzymatic conversions of synthetic intermediates. 2-Aminothiazoline-4-carboxylic acid, for example, is an intermediate in one industrial synthesis of L-cysteine. Aspartic acid is produced by the addition of ammonia to fumarate using a lyase.
Biosynthesis
In plants, nitrogen is first assimilated into organic compounds in the form of glutamate, formed from alpha-ketoglutarate and ammonia in the mitochondrion. For other amino acids, plants use transaminases to move the amino group from glutamate to another alpha-keto acid. For example, aspartate aminotransferase converts glutamate and oxaloacetate to alpha-ketoglutarate and aspartate. Other organisms use transaminases for amino acid synthesis, too.
Nonstandard amino acids are usually formed through modifications to standard amino acids. For example, homocysteine is formed through the transsulfuration pathway or by the demethylation of methionine via the intermediate metabolite ''S''-adenosylmethionine, while hydroxyproline is made by a post-translational modification of proline.
Microorganisms and plants synthesize many uncommon amino acids. For example, some microbes make 2-aminoisobutyric acid and lanthionine, which is a sulfide-bridged derivative of alanine. Both of these amino acids are found in peptidic lantibiotics such as alamethicin. In plants, 1-aminocyclopropane-1-carboxylic acid is a small disubstituted cyclic amino acid that is an intermediate in the production of the plant hormone ethylene.
Primordial synthesis
The formation of amino acids and peptides is assumed to have preceded and perhaps induced the emergence of life on Earth. Amino acids can form from simple precursors under various conditions. Surface-based chemical metabolism of amino acids and very small compounds may have led to the build-up of amino acids, coenzymes and phosphate-based small carbon molecules. Amino acids and similar building blocks could have been elaborated into proto-peptides, with peptides being considered key players in the origin of life.
In the famous Miller–Urey experiment, the passage of an electric arc through a mixture of methane, hydrogen, and ammonia produces a large number of amino acids. Since then, scientists have discovered a range of ways and components by which the potentially prebiotic formation and chemical evolution of peptides may have occurred, such as condensing agents, the design of self-replicating peptides and a number of non-enzymatic mechanisms by which amino acids could have emerged and elaborated into peptides. Several hypotheses invoke the Strecker synthesis, whereby hydrogen cyanide, simple aldehydes, ammonia, and water produce amino acids.
According to a review, amino acids, and even peptides, "turn up fairly regularly in the primordial soup, various experimental broths that have been allowed to be cooked from simple chemicals. This is because nucleotides are far more difficult to synthesize chemically than amino acids." For a chronological order, it suggests that there must have been a 'protein world' or at least a 'polypeptide world', possibly later followed by the 'RNA world' and the 'DNA world'. Codon–amino acid mappings may be the biological information system at the primordial origin of life on Earth. While amino acids and consequently simple peptides must have formed under different experimentally probed geochemical scenarios, the transition from an abiotic world to the first life forms is to a large extent still unresolved.
Reactions
Amino acids undergo the reactions expected of the constituent functional groups.
Peptide bond formation
As both the amine and carboxylic acid groups of amino acids can react to form amide bonds, one amino acid molecule can react with another and become joined through an amide linkage. This polymerization of amino acids is what creates proteins. This condensation reaction yields the newly formed peptide bond and a molecule of water. In cells, this reaction does not occur directly; instead, the amino acid is first activated by attachment to a transfer RNA molecule through an ester bond. This aminoacyl-tRNA is produced in an ATP-dependent reaction carried out by an aminoacyl tRNA synthetase. This aminoacyl-tRNA is then a substrate for the ribosome, which catalyzes the attack of the amino group of the elongating protein chain on the ester bond. As a result of this mechanism, all proteins made by ribosomes are synthesized starting at their ''N''-terminus and moving toward their ''C''-terminus.
However, not all peptide bonds are formed in this way. In a few cases, peptides are synthesized by specific enzymes. For example, the tripeptide glutathione is an essential part of the defenses of cells against oxidative stress. This peptide is synthesized in two steps from free amino acids. In the first step, gamma-glutamylcysteine synthetase condenses cysteine and glutamate through a peptide bond formed between the side chain carboxyl of the glutamate (the gamma carbon of this side chain) and the amino group of the cysteine. This dipeptide is then condensed with glycine by glutathione synthetase to form glutathione.
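The mass bookkeeping of these condensations can be sketched in a few lines: each amide bond formed releases one molecule of water, so a chain of ''n'' amino acids weighs the sum of the free amino acids minus ''n'' − 1 waters. The helper names (`molar_mass`, `peptide_mass`) and the small formula table are illustrative assumptions, and the atomic weights are rounded standard values; this is a rough sketch, not a general-purpose chemistry tool.

```python
import re

# Rounded standard atomic weights in g/mol (only the elements needed here).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

# Molecular formulas of the free amino acids used in the examples.
FORMULA = {
    "Gly": "C2H5NO2",
    "Ala": "C3H7NO2",
    "Cys": "C3H7NO2S",
    "Glu": "C5H9NO4",
}


def molar_mass(formula: str) -> float:
    """Sum atomic weights for a simple formula such as 'C2H5NO2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total


WATER = molar_mass("H2O")


def peptide_mass(residues) -> float:
    """Mass of an unbranched peptide: free amino acids minus one water per bond."""
    residues = list(residues)
    if not residues:
        raise ValueError("need at least one amino acid")
    free = sum(molar_mass(FORMULA[r]) for r in residues)
    return free - (len(residues) - 1) * WATER


print(round(peptide_mass(["Glu", "Cys", "Gly"]), 2))  # glutathione, about 307.32 g/mol
```

The same arithmetic applies to glutathione even though its first bond uses the glutamate side-chain carboxyl, because that condensation also releases exactly one water.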
In chemistry, peptides are synthesized by a variety of reactions. One of the most-used methods, solid-phase peptide synthesis, uses the aromatic oxime derivatives of amino acids as activated units. These are added in sequence onto the growing peptide chain, which is attached to a solid resin support. Libraries of peptides are used in drug discovery through high-throughput screening.
The combination of functional groups allows amino acids to be effective polydentate ligands for metal–amino acid chelates.
The multiple side chains of amino acids can also undergo chemical reactions.
Catabolism
Degradation of an amino acid often involves deamination by moving its amino group to α-ketoglutarate, forming glutamate. This process involves transaminases, often the same as those used in amination during synthesis. In many vertebrates, the amino group is then removed through the urea cycle and is excreted in the form of urea. However, amino acid degradation can produce uric acid or ammonia instead. For example, serine dehydratase converts serine to pyruvate and ammonia. After removal of one or more amino groups, the remainder of the molecule can sometimes be used to synthesize new amino acids, or it can be used for energy by entering glycolysis or the citric acid cycle.
Complexation
Amino acids are bidentate ligands, forming transition metal amino acid complexes.
Chemical analysis
The total nitrogen content of organic matter is mainly formed by the amino groups in proteins. The total Kjeldahl nitrogen (TKN) is a measure of nitrogen widely used in the analysis of (waste) water, soil, food, feed and organic matter in general. As the name suggests, the Kjeldahl method is applied. More sensitive methods are available.
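In food and feed analysis, a measured nitrogen content is commonly converted to an estimated "crude protein" figure by multiplying by a fixed factor. The sketch below assumes the conventional general factor of 6.25, which corresponds to protein containing about 16% nitrogen by mass; that factor is not given in the text above, varies by material, and the function name `crude_protein_percent` is hypothetical.

```python
def crude_protein_percent(nitrogen_percent: float, factor: float = 6.25) -> float:
    """Estimate crude protein (% by mass) from Kjeldahl nitrogen (% by mass).

    The default factor of 6.25 is an assumption (protein taken as ~16% nitrogen);
    other factors are used for specific foods.
    """
    if nitrogen_percent < 0:
        raise ValueError("nitrogen content cannot be negative")
    return nitrogen_percent * factor


print(crude_protein_percent(2.0))  # a sample with 2.0% nitrogen -> 12.5
```

Because the estimate counts all Kjeldahl nitrogen as protein nitrogen, non-protein nitrogen in a sample inflates the result.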
See also
* Amino acid dating
* Beta-peptide
* Degron
* Erepsin
* Homochirality
* Hyperaminoacidemia
* Leucines
* Miller–Urey experiment
* Nucleic acid sequence
* RNA codon table