
In molecular biology, protein fold classes are broad categories of protein tertiary structure topology. They describe groups of proteins that share similar amino acid and secondary structure proportions. Each class contains multiple, independent protein superfamilies (i.e. the members of a class are not necessarily evolutionarily related to one another).
Generally recognised classes
Four large classes of protein are generally agreed upon by the two main structure classification databases (SCOP and CATH).
all-α
All-α proteins are a class of structural domains in which the secondary structure is composed entirely of α-helices, with the possible exception of a few isolated β-sheets on the periphery. Common examples include the bromodomain, the globin fold and the homeodomain fold.
all-β
All-β proteins are a class of structural domains in which the secondary structure is composed entirely of β-sheets, with the possible exception of a few isolated α-helices on the periphery. Common examples include the SH3 domain, the beta-propeller domain, the immunoglobulin fold and the B3 DNA binding domain.
α+β
α+β proteins are a class of structural domains in which the secondary structure is composed of α-helices and β-strands that occur separately along the backbone. The β-strands are therefore mostly antiparallel. Common examples include the ferredoxin fold, ribonuclease A, and the SH2 domain.
α/β
α/β proteins are a class of structural domains in which the secondary structure is composed of alternating α-helices and β-strands along the backbone. The β-strands are therefore mostly parallel. Common examples include the flavodoxin fold, the TIM barrel and leucine-rich-repeat (LRR) proteins such as ribonuclease inhibitor.
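The distinctions between the four classes can be illustrated with a small sketch. It is a toy heuristic, not the procedure used by SCOP or CATH: the function name `classify_fold_class`, the 15% minimum fraction and the 0.5 alternation cutoff are all assumptions chosen for illustration. The input is assumed to be a per-residue secondary-structure string using the common DSSP-style letters `H` (helix) and `E` (strand), with any other character treated as coil.

```python
import re


def classify_fold_class(ss: str,
                        min_fraction: float = 0.15,
                        alternation_cutoff: float = 0.5) -> str:
    """Assign a fold class from a secondary-structure string (toy heuristic).

    ss: one letter per residue, 'H' = helix, 'E' = strand, other = coil.
    min_fraction: hypothetical threshold below which an element type is
        treated as "a few isolated" elements and ignored.
    alternation_cutoff: hypothetical threshold on how often consecutive
        segments switch between helix and strand.
    """
    ss = ss.upper()
    if not ss:
        raise ValueError("empty secondary-structure string")

    helix_fraction = ss.count("H") / len(ss)
    strand_fraction = ss.count("E") / len(ss)
    has_helix = helix_fraction >= min_fraction
    has_strand = strand_fraction >= min_fraction

    if has_helix and not has_strand:
        return "all-α"
    if has_strand and not has_helix:
        return "all-β"
    if not has_helix and not has_strand:
        return "unclassified"

    # Both element types are present: reduce the string to the order of
    # its segments along the backbone, e.g. "EEECCHHHH" -> ['E', 'H'].
    segments = [m.group(0)[0] for m in re.finditer(r"H+|E+", ss)]

    # Fraction of neighbouring segment pairs that switch type.
    # Alternating helices and strands give a value near 1 (α/β);
    # segregated helices and strands give a value near 0 (α+β).
    switches = sum(a != b for a, b in zip(segments, segments[1:]))
    alternation = switches / (len(segments) - 1)

    return "α/β" if alternation >= alternation_cutoff else "α+β"
```

For example, a string built from repeated strand-helix units such as `("EEEEE" + "CC" + "HHHHHHHHHH" + "CC") * 4` is labelled α/β, whereas four strands followed by two helices is labelled α+β. Real classification also relies on three-dimensional topology, which a one-dimensional string cannot capture.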
Additional classes
Membrane proteins
Membrane proteins interact with biological membranes either by inserting into them or by being tethered via a covalently attached lipid. They are one of the common types of protein along with soluble globular proteins, fibrous proteins, and disordered proteins. They are targets of over 50% of all modern medicinal drugs. It is estimated that 20–30% of all genes in most genomes encode membrane proteins.
Intrinsically disordered proteins
Intrinsically disordered proteins (IDPs) lack a fixed or ordered three-dimensional structure. IDPs cover a spectrum of states from fully unstructured to partially structured and include random coils, (pre-)molten globules, and large multi-domain proteins connected by flexible linkers. They constitute one of the main types of protein (alongside globular, fibrous and membrane proteins).
Coiled coil proteins
Coiled coil proteins form long, insoluble fibers involved in the extracellular matrix. There are many scleroprotein superfamilies including keratin, collagen, elastin, and fibroin. The roles of such proteins include protection and support, forming connective tissue, tendons, bone matrices, and muscle fiber.
Small proteins
Small proteins typically have a tertiary structure that is maintained by disulphide bridges (cysteine-rich proteins), metal ligands (metal-binding proteins), and/or cofactors such as heme.
Designed proteins
Numerous protein structures are the result of rational design and do not exist in nature. Proteins can be designed from scratch (de novo design) or by making calculated variations on a known protein structure and its sequence (known as protein redesign). Rational protein design approaches predict protein sequences that will fold to specific structures. These predicted sequences can then be validated experimentally through methods such as peptide synthesis, site-directed mutagenesis, or artificial gene synthesis.
See also
* Protein superfamily
* SCOP database
* CATH database
* FSSP database