HOME

TheInfoList



OR:

Biomolecular structure is the intricate folded, three-dimensional shape that is formed by a
molecule A molecule is a group of two or more atoms held together by attractive forces known as chemical bonds; depending on context, the term may or may not include ions which satisfy this criterion. In quantum physics, organic chemistry, and b ...
of
protein Proteins are large biomolecules and macromolecules that comprise one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, res ...
, DNA, or RNA, and that is important to its function. The structure of these molecules may be considered at any of several length scales ranging from the level of individual
atom Every atom is composed of a nucleus and one or more electrons bound to the nucleus. The nucleus is made of one or more protons and a number of neutrons. Only the most common variety of hydrogen has no neutrons. Every solid, liquid, gas, a ...
s to the relationships among entire protein subunits. This useful distinction among scales is often expressed as a decomposition of molecular structure into four levels: primary, secondary, tertiary, and quaternary. The scaffold for this multiscale organization of the molecule arises at the secondary level, where the fundamental structural elements are the molecule's various
hydrogen bond In chemistry, a hydrogen bond (or H-bond) is a primarily electrostatic force of attraction between a hydrogen (H) atom which is covalently bound to a more electronegative "donor" atom or group (Dn), and another electronegative atom bearing a l ...
s. This leads to several recognizable ''domains'' of
protein structure Protein structure is the three-dimensional arrangement of atoms in an amino acid-chain molecule. Proteins are polymers specifically polypeptides formed from sequences of amino acids, the monomers of the polymer. A single amino acid monom ...
and
nucleic acid structure Nucleic acid structure refers to the structure of nucleic acids such as DNA and RNA. Chemically speaking, DNA and RNA are very similar. Nucleic acid structure is often divided into four different levels: primary, secondary, tertiary, and quater ...
, including such secondary-structure features as
alpha helix The alpha helix (α-helix) is a common motif in the secondary structure of proteins and is a right hand- helix conformation in which every backbone N−H group hydrogen bonds to the backbone C=O group of the amino acid located four residues ...
es and
beta sheet The beta sheet, (β-sheet) (also β-pleated sheet) is a common motif of the regular protein secondary structure. Beta sheets consist of beta strands (β-strands) connected laterally by at least two or three backbone hydrogen bonds, forming a ge ...
s for proteins, and hairpin loops, bulges, and internal loops for nucleic acids. The terms ''primary'', ''secondary'', ''tertiary'', and ''quaternary structure'' were introduced by Kaj Ulrik Linderstrøm-Lang in his 1951 Lane Medical Lectures at
Stanford University Stanford University, officially Leland Stanford Junior University, is a private research university in Stanford, California. The campus occupies , among the largest in the United States, and enrolls over 17,000 students. Stanford is conside ...
.


Primary structure

The primary structure of a
biopolymer Biopolymers are natural polymers produced by the cells of living organisms. Like other polymers, biopolymers consist of monomeric units that are covalently bonded in chains to form larger molecules. There are three main classes of biopolymers, ...
is the exact specification of its atomic composition and the chemical bonds connecting those atoms (including
stereochemistry Stereochemistry, a subdiscipline of chemistry, involves the study of the relative spatial arrangement of atoms that form the structure of molecules and their manipulation. The study of stereochemistry focuses on the relationships between stereoi ...
). For a typical unbranched, un-crosslinked
biopolymer Biopolymers are natural polymers produced by the cells of living organisms. Like other polymers, biopolymers consist of monomeric units that are covalently bonded in chains to form larger molecules. There are three main classes of biopolymers, ...
(such as a
molecule A molecule is a group of two or more atoms held together by attractive forces known as chemical bonds; depending on context, the term may or may not include ions which satisfy this criterion. In quantum physics, organic chemistry, and b ...
of a typical intracellular
protein Proteins are large biomolecules and macromolecules that comprise one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, res ...
, or of DNA or RNA), the primary structure is equivalent to specifying the sequence of its
monomer In chemistry, a monomer ( ; '' mono-'', "one" + '' -mer'', "part") is a molecule that can react together with other monomer molecules to form a larger polymer chain or three-dimensional network in a process called polymerization. Classification ...
ic subunits, such as
amino acids Amino acids are organic compounds that contain both amino and carboxylic acid functional groups. Although hundreds of amino acids exist in nature, by far the most important are the alpha-amino acids, which comprise proteins. Only 22 alpha am ...
or nucleotides. The primary structure of a protein is reported starting from the amino
N-terminus The N-terminus (also known as the amino-terminus, NH2-terminus, N-terminal end or amine-terminus) is the start of a protein or polypeptide, referring to the free amine group (-NH2) located at the end of a polypeptide. Within a peptide, the ami ...
to the carboxyl
C-terminus The C-terminus (also known as the carboxyl-terminus, carboxy-terminus, C-terminal tail, C-terminal end, or COOH-terminus) is the end of an amino acid chain (protein or polypeptide), terminated by a free carboxyl group (-COOH). When the protein i ...
, while the primary structure of DNA or RNA molecule is known as the
nucleic acid sequence A nucleic acid sequence is a succession of bases signified by a series of a set of five different letters that indicate the order of nucleotides forming alleles within a DNA (using GACT) or RNA (GACU) molecule. By convention, sequences are us ...
reported from the
5' end Directionality, in molecular biology and biochemistry, is the end-to-end chemical orientation of a single strand of nucleic acid. In a single strand of DNA or RNA, the chemical convention of naming carbon atoms in the nucleotide pentose-sugar- ...
to the 3' end. The nucleic acid sequence refers to the exact sequence of nucleotides that comprise the whole molecule. Often, the primary structure encodes
sequence motif In biology, a sequence motif is a nucleotide or amino-acid sequence pattern that is widespread and usually assumed to be related to biological function of the macromolecule. For example, an ''N''-glycosylation site motif can be defined as ' ...
s that are of functional importance. Some examples of such motifs are: the C/D and H/ACA boxes of
snoRNA In molecular biology, Small nucleolar RNAs (snoRNAs) are a class of small RNA molecules that primarily guide chemical modifications of other RNAs, mainly ribosomal RNAs, transfer RNAs and small nuclear RNAs. There are two main classes of snoRN ...
s, LSm binding site found in spliceosomal RNAs such as U1, U2, U4, U5, U6, U12 and U3, the Shine-Dalgarno sequence, the Kozak consensus sequence and the RNA polymerase III terminator.


Secondary structure

The secondary structure of a protein is the pattern of hydrogen bonds in a biopolymer. These determine the general three-dimensional form of ''local segments'' of the biopolymers, but does not describe the global structure of specific atomic positions in three-dimensional space, which are considered to be
tertiary structure Protein tertiary structure is the three dimensional shape of a protein. The tertiary structure will have a single polypeptide chain "backbone" with one or more protein secondary structures, the protein domains. Amino acid side chains may i ...
. Secondary structure is formally defined by the hydrogen bonds of the biopolymer, as observed in an atomic-resolution structure. In proteins, the secondary structure is defined by patterns of hydrogen bonds between backbone amine and carboxyl groups (sidechain–mainchain and sidechain–sidechain hydrogen bonds are irrelevant), where the DSSP definition of a hydrogen bond is used. The secondary structure of a nucleic acid is defined by the hydrogen bonding between the nitrogenous bases. For proteins, however, the hydrogen bonding is correlated with other structural features, which has given rise to less formal definitions of secondary structure. For example, helices can adopt backbone
dihedral angle A dihedral angle is the angle between two intersecting planes or half-planes. In chemistry, it is the clockwise angle between half-planes through two sets of three atoms, having two atoms in common. In solid geometry, it is defined as the un ...
s in some regions of the
Ramachandran plot In biochemistry, a Ramachandran plot (also known as a Rama plot, a Ramachandran diagram or a �,ψplot), originally developed in 1963 by G. N. Ramachandran, C. Ramakrishnan, and V. Sasisekharan, is a way to visualize energetically allowed regions ...
; thus, a segment of residues with such dihedral angles is often called a ''helix'', regardless of whether it has the correct hydrogen bonds. Many other less formal definitions have been proposed, often applying concepts from the
differential geometry Differential geometry is a mathematical discipline that studies the geometry of smooth shapes and smooth spaces, otherwise known as smooth manifolds. It uses the techniques of differential calculus, integral calculus, linear algebra and mult ...
of curves, such as
curvature In mathematics, curvature is any of several strongly related concepts in geometry. Intuitively, the curvature is the amount by which a curve deviates from being a straight line, or a surface deviates from being a plane. For curves, the can ...
and torsion. Structural biologists solving a new atomic-resolution structure will sometimes assign its secondary structure ''by eye'' and record their assignments in the corresponding
Protein Data Bank The Protein Data Bank (PDB) is a database for the three-dimensional structural data of large biological molecules, such as proteins and nucleic acids. The data, typically obtained by X-ray crystallography, NMR spectroscopy, or, increasingly, cr ...
(PDB) file. The secondary structure of a nucleic acid molecule refers to the
base pair A base pair (bp) is a fundamental unit of double-stranded nucleic acids consisting of two nucleobases bound to each other by hydrogen bonds. They form the building blocks of the DNA double helix and contribute to the folded structure of both D ...
ing interactions within one molecule or set of interacting molecules. The secondary structure of biological RNA's can often be uniquely decomposed into stems and loops. Often, these elements or combinations of them can be further classified, e.g. tetraloops,
pseudoknot __NOTOC__ A pseudoknot is a nucleic acid secondary structure containing at least two stem-loop structures in which half of one stem is intercalated between the two halves of another stem. The pseudoknot was first recognized in the Turnip yellow m ...
s and stem loops. There are many secondary structure elements of functional importance to biological RNA. Famous examples include the Rho-independent terminator stem loops and the
transfer RNA Transfer RNA (abbreviated tRNA and formerly referred to as sRNA, for soluble RNA) is an adaptor molecule composed of RNA, typically 76 to 90 nucleotides in length (in eukaryotes), that serves as the physical link between the mRNA and the amino ...
(tRNA) cloverleaf. There is a minor industry of researchers attempting to determine the secondary structure of RNA molecules. Approaches include both experimental and
computational Computation is any type of arithmetic or non-arithmetic calculation that follows a well-defined model (e.g., an algorithm). Mechanical or electronic devices (or, historically, people) that perform computations are known as ''computers''. An espe ...
methods (see also the List of RNA structure prediction software).


Tertiary structure

The ''
tertiary structure Protein tertiary structure is the three dimensional shape of a protein. The tertiary structure will have a single polypeptide chain "backbone" with one or more protein secondary structures, the protein domains. Amino acid side chains may i ...
'' of a
protein Proteins are large biomolecules and macromolecules that comprise one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, res ...
or any other
macromolecule A macromolecule is a very large molecule important to biophysical processes, such as a protein or nucleic acid. It is composed of thousands of covalently bonded atoms. Many macromolecules are polymers of smaller molecules called monomers. The ...
is its three-dimensional structure, as defined by the atomic coordinates. Proteins and nucleic acids fold into complex three-dimensional structures which result in the molecules' functions. While such structures are diverse and complex, they are often composed of recurring, recognizable tertiary structure motifs and domains that serve as molecular building blocks. Tertiary structure is considered to be largely determined by the biomolecule's
primary structure Protein primary structure is the linear sequence of amino acids in a peptide or protein. By convention, the primary structure of a protein is reported starting from the amino-terminal (N) end to the carboxyl-terminal (C) end. Protein biosynth ...
(its sequence of
amino acid Amino acids are organic compounds that contain both amino and carboxylic acid functional groups. Although hundreds of amino acids exist in nature, by far the most important are the alpha-amino acids, which comprise proteins. Only 22 alpha ...
s or
nucleotide Nucleotides are organic molecules consisting of a nucleoside and a phosphate. They serve as monomeric units of the nucleic acid polymers – deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), both of which are essential biomolecu ...
s).


Quaternary structure

The ''protein quaternary structure'' refers to the number and arrangement of multiple protein molecules in a multi-subunit complex. For nucleic acids, the term is less common, but can refer to the higher-level organization of DNA in
chromatin Chromatin is a complex of DNA and protein found in eukaryote, eukaryotic cells. The primary function is to package long DNA molecules into more compact, denser structures. This prevents the strands from becoming tangled and also plays important ...
, including its interactions with
histone In biology, histones are highly basic proteins abundant in lysine and arginine residues that are found in eukaryotic cell nuclei. They act as spools around which DNA winds to create structural units called nucleosomes. Nucleosomes in turn a ...
s, or to the interactions between separate RNA units in the
ribosome Ribosomes ( ) are macromolecular machines, found within all cells, that perform biological protein synthesis (mRNA translation). Ribosomes link amino acids together in the order specified by the codons of messenger RNA (mRNA) molecules to fo ...
or spliceosome.


Structure determination

Structure probing is the process by which biochemical techniques are used to determine biomolecular structure. This analysis can be used to define the patterns that can be used to infer the molecular structure, experimental analysis of molecular structure and function, and further understanding on development of smaller molecules for further biological research. Structure probing analysis can be done through many different methods, which include chemical probing, hydroxyl radical probing, nucleotide analog interference mapping (NAIM), and in-line probing.
Protein Proteins are large biomolecules and macromolecules that comprise one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, res ...
and
nucleic acid Nucleic acids are biopolymers, macromolecules, essential to all known forms of life. They are composed of nucleotides, which are the monomers made of three components: a 5-carbon sugar, a phosphate group and a nitrogenous base. The two main ...
structures can be determined using either nuclear magnetic resonance spectroscopy ( NMR) or
X-ray crystallography X-ray crystallography is the experimental science determining the atomic and molecular structure of a crystal, in which the crystalline structure causes a beam of incident X-rays to diffract into many specific directions. By measuring the angles ...
or single-particle cryo electron microscopy (
cryoEM Cryogenic electron microscopy (cryo-EM) is a cryomicroscopy technique applied on samples cooled to cryogenic temperatures. For biological specimens, the structure is preserved by embedding in an environment of vitreous ice. An aqueous sample so ...
). The first published reports for DNA (by
Rosalind Franklin Rosalind Elsie Franklin (25 July 192016 April 1958) was a British chemist and X-ray crystallographer whose work was central to the understanding of the molecular structures of DNA (deoxyribonucleic acid), RNA (ribonucleic acid), viruses, ...
and
Raymond Gosling Raymond George Gosling (15 July 1926 – 18 May 2015) was a British scientist. While a PhD student at King's College, London he worked under the supervision of Maurice Wilkins and Rosalind Franklin. The crystallographic experiments of Frankli ...
in 1953) of A-DNA X-ray diffraction patterns—and also B-DNA—used analyses based on Patterson function transforms that provided only a limited amount of structural information for oriented fibers of DNA isolated from calf
thymus The thymus is a specialized primary lymphoid organ of the immune system. Within the thymus, thymus cell lymphocytes or '' T cells'' mature. T cells are critical to the adaptive immune system, where the body adapts to specific foreign invaders ...
. An alternate analysis was then proposed by Wilkins et al. in 1953 for B-DNA X-ray diffraction and scattering patterns of hydrated, bacterial-oriented DNA fibers and trout sperm heads in terms of squares of
Bessel function Bessel functions, first defined by the mathematician Daniel Bernoulli and then generalized by Friedrich Bessel, are canonical solutions of Bessel's differential equation x^2 \frac + x \frac + \left(x^2 - \alpha^2 \right)y = 0 for an arbitrar ...
s. Although the ''B-DNA form' is most common under the conditions found in cells, it is not a well-defined conformation but a family or fuzzy set of DNA conformations that occur at the high hydration levels present in a wide variety of living cells. Their corresponding X-ray diffraction & scattering patterns are characteristic of molecular paracrystals with a significant degree of disorder (over 20%), and the structure is not tractable using only the standard analysis. In contrast, the standard analysis, involving only
Fourier transform A Fourier transform (FT) is a mathematical transform that decomposes functions into frequency components, which are represented by the output of the transform as a function of frequency. Most commonly functions of time or space are transformed ...
s of
Bessel function Bessel functions, first defined by the mathematician Daniel Bernoulli and then generalized by Friedrich Bessel, are canonical solutions of Bessel's differential equation x^2 \frac + x \frac + \left(x^2 - \alpha^2 \right)y = 0 for an arbitrar ...
s and DNA molecular models, is still routinely used to analyze A-DNA and Z-DNA X-ray diffraction patterns.


Structure prediction

Biomolecular structure prediction is the prediction of the three-dimensional structure of a
protein Proteins are large biomolecules and macromolecules that comprise one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, res ...
from its
amino acid Amino acids are organic compounds that contain both amino and carboxylic acid functional groups. Although hundreds of amino acids exist in nature, by far the most important are the alpha-amino acids, which comprise proteins. Only 22 alpha ...
sequence, or of a
nucleic acid Nucleic acids are biopolymers, macromolecules, essential to all known forms of life. They are composed of nucleotides, which are the monomers made of three components: a 5-carbon sugar, a phosphate group and a nitrogenous base. The two main ...
from its
nucleobase Nucleobases, also known as ''nitrogenous bases'' or often simply ''bases'', are nitrogen-containing biological compounds that form nucleosides, which, in turn, are components of nucleotides, with all of these monomers constituting the basi ...
(base) sequence. In other words, it is the prediction of secondary and tertiary structure from its primary structure. Structure prediction is the inverse of biomolecular design, as in rational design,
protein design Protein design is the rational design of new protein molecules to design novel activity, behavior, or purpose, and to advance basic understanding of protein function. Proteins can be designed from scratch (''de novo'' design) or by making calcul ...
,
nucleic acid design Nucleic acid design is the process of generating a set of nucleic acid base sequences that will associate into a desired conformation. Nucleic acid design is central to the fields of DNA nanotechnology and DNA computing. It is necessary because ...
, and biomolecular engineering. Protein structure prediction is one of the most important goals pursued by
bioinformatics Bioinformatics () is an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex. As an interdisciplinary field of science, bioinformatics combi ...
and theoretical chemistry. Protein structure prediction is of high importance in
medicine Medicine is the science and practice of caring for a patient, managing the diagnosis, prognosis, prevention, treatment, palliation of their injury or disease, and promoting their health. Medicine encompasses a variety of health care pr ...
(for example, in drug design) and
biotechnology Biotechnology is the integration of natural sciences and engineering sciences in order to achieve the application of organisms, cells, parts thereof and molecular analogues for products and services. The term ''biotechnology'' was first used ...
(for example, in the design of novel
enzyme Enzymes () are proteins that act as biological catalysts by accelerating chemical reactions. The molecules upon which enzymes may act are called substrates, and the enzyme converts the substrates into different molecules known as products ...
s). Every two years, the performance of current methods is assessed in the ''Critical Assessment of protein Structure Prediction'' ( CASP) experiment. There has also been a significant amount of
bioinformatics Bioinformatics () is an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex. As an interdisciplinary field of science, bioinformatics combi ...
research directed at the RNA structure prediction problem. A common problem for researchers working with RNA is to determine the three-dimensional structure of the molecule given only the nucleic acid sequence. However, in the case of RNA, much of the final structure is determined by the
secondary structure Protein secondary structure is the three dimensional form of ''local segments'' of proteins. The two most common secondary structural elements are alpha helices and beta sheets, though beta turns and omega loops occur as well. Secondary struct ...
or intra-molecular base-pairing interactions of the molecule. This is shown by the high conservation of
base pair A base pair (bp) is a fundamental unit of double-stranded nucleic acids consisting of two nucleobases bound to each other by hydrogen bonds. They form the building blocks of the DNA double helix and contribute to the folded structure of both D ...
ings across diverse species. Secondary structure of small nucleic acid molecules is determined largely by strong, local interactions such as
hydrogen bond In chemistry, a hydrogen bond (or H-bond) is a primarily electrostatic force of attraction between a hydrogen (H) atom which is covalently bound to a more electronegative "donor" atom or group (Dn), and another electronegative atom bearing a l ...
s and base stacking. Summing the free energy for such interactions, usually using a nearest-neighbor method, provides an approximation for the stability of given structure. The most straightforward way to find the lowest free energy structure would be to generate all possible structures and calculate the free energy for them, but the number of possible structures for a sequence increases exponentially with the length of the molecule. For longer molecules, the number of possible secondary structures is vast. Sequence covariation methods rely on the existence of a data set composed of multiple homologous RNA sequences with related but dissimilar sequences. These methods analyze the covariation of individual base sites in
evolution Evolution is change in the heritable characteristics of biological populations over successive generations. These characteristics are the expressions of genes, which are passed on from parent to offspring during reproduction. Variation ...
; maintenance at two widely separated sites of a pair of base-pairing
nucleotide Nucleotides are organic molecules consisting of a nucleoside and a phosphate. They serve as monomeric units of the nucleic acid polymers – deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), both of which are essential biomolecu ...
s indicates the presence of a structurally required hydrogen bond between those positions. The general problem of pseudoknot prediction has been shown to be NP-complete.


Design

Biomolecular design can be considered the inverse of structure prediction. In structure prediction, the structure is determined from a known sequence, whereas, in protein or nucleic acid design, a sequence that will form a desired structure is generated.


Other biomolecules

Other biomolecules, such as
polysaccharide Polysaccharides (), or polycarbohydrates, are the most abundant carbohydrates found in food. They are long chain polymeric carbohydrates composed of monosaccharide units bound together by glycosidic linkages. This carbohydrate can react with w ...
s,
polyphenol Polyphenols () are a large family of naturally occurring organic compounds characterized by multiples of phenol units. They are abundant in plants and structurally diverse. Polyphenols include flavonoids, tannic acid, and ellagitannin, some ...
s and
lipid Lipids are a broad group of naturally-occurring molecules which includes fats, waxes, sterols, fat-soluble vitamins (such as vitamins A, D, E and K), monoglycerides, diglycerides, phospholipids, and others. The functions of lipids in ...
s, can also have higher-order structure of biological consequence.


See also

* Biomolecular *
Comparison of nucleic acid simulation software This is a list of notable computer programs that are used for nucleic acid Nucleic acids are biopolymers, macromolecules, essential to all known forms of life. They are composed of nucleotides, which are the monomers made of three components: a ...
* Gene structure * List of RNA structure prediction software *
Non-coding RNA A non-coding RNA (ncRNA) is a functional RNA molecule that is not Translation (genetics), translated into a protein. The DNA sequence from which a functional non-coding RNA is transcribed is often called an RNA gene. Abundant and functionally im ...


Notes


References

{{DEFAULTSORT:Biomolecular Structure Biomolecules