Circular Permutation Proteins

A circular permutation is a relationship between proteins whereby the proteins have a changed order of amino acids in their peptide sequence. The result is a protein structure with different connectivity, but overall similar three-dimensional (3D) shape. In 1979, the first pair of circularly permuted proteins, concanavalin A and the lectin favin, was discovered; over 2000 such proteins are now known.

Circular permutation can occur as the result of evolutionary events, posttranslational modifications, or artificially engineered mutations. The two main models proposed to explain the evolution of circularly permuted proteins are ''permutation by duplication'' and ''fission and fusion''. Permutation by duplication occurs when a gene undergoes duplication to form a tandem repeat, before redundant sections of the protein are removed; this relationship is found between saposin and swaposin. Fission and fusion occurs when partial proteins fuse to form a single polypeptide, such as in nicotinamide nucleotide transhydrogenases.

Circular permutations are routinely engineered in the laboratory to improve a protein's catalytic activity or thermostability, or to investigate properties of the original protein.

Traditional algorithms for sequence alignment and structure alignment are not able to detect circular permutations between proteins. New non-linear approaches have been developed that overcome this and are able to detect topology-independent similarities.


History

In 1979, Bruce Cunningham and his colleagues discovered the first instance of a circularly permuted protein in nature. After determining the peptide sequence of the lectin protein favin, they noticed its similarity to a known protein, concanavalin A, except that the ends were circularly permuted. Later work confirmed the circular permutation between the pair and showed that concanavalin A is permuted post-translationally through cleavage and an unusual protein ligation.

After the discovery of a natural circularly permuted protein, researchers looked for a way to emulate this process. In 1983, David Goldenberg and Thomas Creighton were able to create a circularly permuted version of a protein by chemically ligating the termini to create a cyclic protein, then introducing new termini elsewhere using trypsin. In 1989, Karolin Luger and her colleagues introduced a genetic method for making circular permutations by carefully fragmenting and ligating DNA. This method allowed for permutations to be introduced at arbitrary sites.

Despite the early discovery of post-translational circular permutations and the suggestion of a possible genetic mechanism for evolving circular permutants, it was not until 1995 that the first circularly permuted pair of genes was discovered. Saposins are a class of proteins involved in sphingolipid catabolism and antigen presentation of lipids in humans. Chris Ponting and Robert Russell identified a circularly permuted version of a saposin inserted into plant aspartic proteinase, which they nicknamed swaposin. Saposin and swaposin were the first known case of two natural genes related by a circular permutation.

Hundreds of examples of protein pairs related by a circular permutation were subsequently discovered in nature or produced in the laboratory. As of February 2012, the Circular Permutation Database contained 2,238 circularly permuted protein pairs with known structures, and many more are known without structures. The CyBase database collects proteins that are cyclic, some of which are permuted variants of cyclic wild-type proteins. SISYPHUS is a database of manually curated alignments of proteins with non-trivial relationships, several of which have circular permutations.


Evolution

Two main models are currently used to explain the evolution of circularly permuted proteins: ''permutation by duplication'' and ''fission and fusion''. Both models have compelling examples supporting them, but the relative contribution of each model in evolution is still under debate. Other, less common, mechanisms have been proposed, such as "cut and paste" or "exon shuffling".


Permutation by duplication

The earliest model proposed for the evolution of circular permutations is the permutation by duplication mechanism. In this model, a precursor gene first undergoes a duplication and fusion to form a large tandem repeat. Next, start and stop codons are introduced at corresponding locations in the duplicated gene, removing redundant sections of the protein.

One surprising prediction of the permutation by duplication mechanism is that intermediate permutations can occur. For instance, the duplicated version of the protein should still be functional, since otherwise evolution would quickly select against such proteins. Likewise, partially duplicated intermediates where only one terminus was truncated should be functional. Such intermediates have been extensively documented in protein families such as DNA methyltransferases.
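
The two steps of this model can be illustrated on a toy sequence. The following Python sketch is only a string-level analogy, not a model of real genes: the function names and the eight-letter sequence are hypothetical, and the "start" and "stop" positions are simply slice boundaries on the tandem repeat.

```python
def circular_permutant_by_duplication(seq: str, new_start: int) -> str:
    """Return the circular permutant of `seq` that begins at index `new_start`."""
    if not 0 <= new_start < len(seq):
        raise ValueError("new_start must index into seq")
    tandem = seq + seq  # step 1: duplication and in-frame fusion (tandem repeat)
    # step 2: a new start at `new_start` and a new stop one full length later,
    # which discards the redundant sections at both ends
    return tandem[new_start:new_start + len(seq)]


def partially_truncated_intermediate(seq: str, new_start: int) -> str:
    """Intermediate in which only the N-terminal redundancy has been removed."""
    return (seq + seq)[new_start:]


original = "ABCDEFGH"  # toy sequence, one letter per residue
print(circular_permutant_by_duplication(original, 3))  # DEFGHABC
print(partially_truncated_intermediate(original, 3))   # DEFGHABCDEFGH
```

The second function corresponds to the partially duplicated intermediates mentioned above, in which only one terminus has been truncated.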


Saposin and swaposin

An example of permutation by duplication is the relationship between saposin and swaposin. Saposins are highly conserved glycoproteins, approximately 80 amino acid residues long and forming a four-alpha-helix structure. They have a nearly identical placement of cysteine residues and glycosylation sites. The cDNA sequence that codes for saposin is called prosaposin. It is a precursor for four cleavage products, the saposins A, B, C, and D. The four saposin domains most likely arose from two tandem duplications of an ancestral gene. This repeat suggests a mechanism for the evolution of the relationship with the plant-specific insert (PSI). The PSI is a domain exclusively found in plants, consisting of approximately 100 residues and found in plant aspartic proteases. It belongs to the saposin-like protein family (SAPLIP) and has the N- and C-termini "swapped", such that the order of helices is 3-4-1-2 compared with saposin, thus leading to the name "swaposin".


Fission and fusion

Another model for the evolution of circular permutations is the fission and fusion model. The process starts with two partial proteins. These may represent two independent polypeptides (such as two parts of a heterodimer), or may have originally been halves of a single protein that underwent a fission event to become two polypeptides.

The two proteins can later fuse together to form a single polypeptide. Regardless of which protein comes first, this fusion protein may show similar function. Thus, if a fusion between two proteins occurs twice in evolution (either between paralogues within the same species or between orthologues in different species) but in a different order, the resulting fusion proteins will be related by a circular permutation.

Evidence for a particular protein having evolved by a fission and fusion mechanism can be provided by observing the halves of the permutation as independent polypeptides in related species, or by demonstrating experimentally that the two halves can function as separate polypeptides.
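
Why two fusions in opposite order yield a circular permutation can be seen on toy strings. In the following Python sketch, fusion is modeled as plain concatenation; the fragment sequences and the helper names `fuse` and `rotate` are hypothetical and chosen only for illustration.

```python
def fuse(first: str, second: str) -> str:
    """Model a fusion event as simple concatenation of two polypeptides."""
    return first + second


def rotate(seq: str, k: int) -> str:
    """Circularly permute `seq` so that position `k` becomes the new start."""
    return seq[k:] + seq[:k]


part_1 = "MKTAYI"  # hypothetical toy fragments, not real proteins
part_2 = "QRLEVW"

fusion_in_lineage_a = fuse(part_1, part_2)  # part 1 comes first
fusion_in_lineage_b = fuse(part_2, part_1)  # same parts, opposite order

# Rotating one fusion product at the junction reproduces the other.
print(rotate(fusion_in_lineage_a, len(part_1)) == fusion_in_lineage_b)  # True
```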


Transhydrogenases

An example of the fission and fusion mechanism can be found in nicotinamide nucleotide transhydrogenases. These are membrane-bound enzymes that catalyze the transfer of a hydride ion between NAD(H) and NADP(H) in a reaction that is coupled to transmembrane proton translocation. They consist of three major functional units (I, II, and III) that can be found in different arrangements in bacteria, protozoa, and higher eukaryotes. Phylogenetic analysis suggests that the three groups of domain arrangements were acquired and fused independently.


Other processes that can lead to circular permutations


Post-translational modification

The two evolutionary models mentioned above describe ways in which genes may be circularly permuted, resulting in a circularly permuted mRNA after transcription. Proteins can also be circularly permuted via post-translational modification, without permuting the underlying gene. Circular permutations can happen spontaneously through autocatalysis, as in the case of concanavalin A. Alternately, permutation may require restriction enzymes and ligases.


Role in protein engineering

Many proteins have their termini located close together in 3D space. Because of this, it is often possible to design circular permutations of proteins. Today, circular permutations are generated routinely in the lab using standard genetics techniques. Although some permutation sites prevent the protein from folding correctly, many permutants have been created with nearly identical structure and function to the original protein.

The motivation for creating a circular permutant of a protein can vary. Scientists may want to improve some property of the protein, such as:

* Reduce proteolytic susceptibility. The rate at which proteins are broken down can have a large impact on their activity in cells. Since termini are often accessible to proteases, designing a circularly permuted protein with less-accessible termini can increase the lifespan of that protein in the cell.
* Improve catalytic activity. Circularly permuting a protein can sometimes increase the rate at which it catalyzes a chemical reaction, leading to more efficient proteins.
* Alter substrate or ligand binding. Circularly permuting a protein can result in the loss of substrate binding, but can occasionally lead to novel ligand binding activity or altered substrate specificity.
* Improve thermostability. Making proteins active over a wider range of temperatures and conditions can improve their utility.

Alternately, scientists may be interested in properties of the original protein, such as:

* Fold order. Determining the order in which different parts of a protein fold is challenging due to the extremely fast time scales involved. Circularly permuted versions of proteins will often fold in a different order, providing information about the folding of the original protein.
* Essential structural elements. Artificial circularly permuted proteins can allow parts of a protein to be selectively deleted. This gives insight into which structural elements are essential or not.
* Modify quaternary structure. Circularly permuted proteins have been shown to take on different quaternary structure than wild-type proteins.
* Find insertion sites for other proteins. Inserting one protein as a domain into another protein can be useful. For instance, inserting calmodulin into green fluorescent protein (GFP) allowed researchers to measure the activity of calmodulin via the fluorescence of the split-GFP. Regions of GFP that tolerate the introduction of circular permutation are more likely to accept the addition of another protein while retaining the function of both proteins.
* Design of novel biocatalysts and biosensors. Introducing circular permutations can be used to design proteins to catalyze specific chemical reactions, or to detect the presence of certain molecules using proteins. For instance, the GFP-calmodulin fusion described above can be used to detect the level of calcium ions in a sample.
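
At the sequence level, designing a permutant amounts to choosing a new cut site and joining the original termini. The Python sketch below is a minimal illustration under stated assumptions: the function name `design_permutant` is hypothetical, and the short glycine/serine linker used to bridge the old termini is an assumption of this example rather than something described above. Whether a given cut site still folds correctly has to be tested experimentally.

```python
def design_permutant(seq: str, cut: int, linker: str = "GGSGG") -> str:
    """Build a permutant whose new N-terminus is residue index `cut`.

    The original C- and N-termini are joined through `linker`
    (an assumed default; pass "" to join them directly).
    """
    if not 0 < cut < len(seq):
        raise ValueError("cut must fall strictly inside the sequence")
    return seq[cut:] + linker + seq[:cut]


toy_protein = "ABCDEFGH"  # toy sequence, one letter per residue

# Enumerate every candidate cut site for screening.
for cut in range(1, len(toy_protein)):
    print(cut, design_permutant(toy_protein, cut, linker=""))
```

Enumerating all cut sites in this way mirrors the idea of scanning a protein for positions that tolerate permutation, as in the GFP example above.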


Algorithmic detection

Many sequence alignment and protein structure alignment algorithms have been developed assuming linear data representations and as such are not able to detect circular permutations between proteins. Two examples of frequently used methods that have problems correctly aligning proteins related by circular permutation are dynamic programming and many hidden Markov models. As an alternative to these, a number of algorithms are built on top of non-linear approaches and are able to detect topology-independent similarities, or employ modifications allowing them to circumvent the limitations of dynamic programming. The table below is a collection of such methods.

The algorithms are classified according to the type of input they require. ''Sequence''-based algorithms require only the sequence of two proteins in order to create an alignment. Sequence methods are generally fast and suitable for searching whole genomes for circularly permuted pairs of proteins. ''Structure''-based methods require 3D structures of both proteins being considered. They are often slower than sequence-based methods, but are able to detect circular permutations between distantly related proteins with low sequence similarity. Some structural methods are ''topology independent'', meaning that they are also able to detect more complex rearrangements than circular permutation.
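
One simple way to work around the linear assumption in a sequence-based setting is to compare one sequence against every rotation of the other, which is equivalent to sliding it along a doubled copy. The Python sketch below illustrates that idea only; it is not the method of any particular published tool, the function name `best_rotation` is hypothetical, and it assumes equal-length sequences with no insertions or deletions, which real methods must handle.

```python
from __future__ import annotations


def best_rotation(a: str, b: str) -> tuple[int, float]:
    """Return (k, identity) for the rotation of `b` that best matches `a`.

    `k` is the index in `b` that becomes the new start; `identity` is the
    fraction of positions that agree after rotating.
    """
    if len(a) != len(b) or not a:
        raise ValueError("this toy version needs equal-length, non-empty sequences")
    doubled = b + b  # every rotation of b is a window of length len(b) in b + b
    best_k, best_identity = 0, -1.0
    for k in range(len(b)):
        rotated = doubled[k:k + len(b)]
        identity = sum(x == y for x, y in zip(a, rotated)) / len(a)
        if identity > best_identity:
            best_k, best_identity = k, identity
    return best_k, best_identity


a = "ABCDEFGHIJ"
b = "FGHIJABCXE"  # a rotated by five positions, with one substitution (D -> X)

# A purely linear, position-by-position comparison finds no similarity.
print(sum(x == y for x, y in zip(a, b)) / len(a))  # 0.0
# Searching over rotations recovers the relationship.
print(best_rotation(a, b))  # (5, 0.9)
```

This brute-force search is quadratic in sequence length, which is acceptable for a pair of proteins but would need a faster formulation for whole-genome searches.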


Further reading

* David Goodsell (April 2010). ''Concanavalin A and Circular Permutation''. Protein Data Bank (PDB) ''Molecule of the Month''.


External links

* PDBe-KB: structural information available in the PDB for UniProt P02866 (Concanavalin-A)