Protein Expression (biotechnology)

Protein production is the biotechnological process of generating a specific protein. It is typically achieved by the manipulation of gene expression in an organism such that it expresses large amounts of a recombinant gene. This includes the transcription of the recombinant DNA to messenger RNA (mRNA) and the translation of mRNA into polypeptide chains, which are ultimately folded into functional proteins and may be targeted to specific subcellular or extracellular locations. Protein production systems (also known as expression systems) are used in the life sciences, biotechnology, and medicine.

Molecular biology research uses numerous proteins and enzymes, many of which come from expression systems, particularly DNA polymerase for PCR, reverse transcriptase for RNA analysis, and restriction endonucleases for cloning. Expression systems are also used to make proteins that are screened in drug discovery as biological targets or as potential drugs themselves. There are also significant applications for expression systems in industrial fermentation, notably the production of biopharmaceuticals such as human insulin to treat diabetes, and the manufacture of enzymes.
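The transcription and translation steps described above can be illustrated with a minimal Python sketch. It is a toy model, not a bioinformatics tool: the function names `transcribe` and `translate` are hypothetical, the input is assumed to be the DNA coding strand containing only A, C, G and T, and folding, targeting and regulation are ignored entirely. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order ('*' = stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first stop codon.

    Returns the polypeptide in one-letter amino acid codes.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break  # stop codon ends the chain
        peptide.append(aa)
    return "".join(peptide)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTAA")
    print(mrna, translate(mrna))
```

For the hypothetical nine-base gene `ATGGCCTAA`, the sketch yields a start methionine followed by alanine before the stop codon is reached.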


Protein production systems

Commonly used protein production systems include those derived from bacteria, yeast, baculovirus/insect, mammalian cells, and more recently filamentous fungi such as ''Myceliophthora thermophila''. When biopharmaceuticals are produced with one of these systems, process-related impurities termed host cell proteins also end up in the final product in trace amounts.


Cell-based systems

The oldest and most widely used expression systems are cell-based and may be defined as the "''combination of an expression vector, its cloned DNA, and the host for the vector that provide a context to allow foreign gene function in a host cell, that is, produce proteins at a high level''". Overexpression is an abnormally and excessively high level of gene expression which produces a pronounced gene-related phenotype. There are many ways to introduce foreign DNA to a cell for expression, and many different host cells may be used for expression; each expression system has distinct advantages and liabilities. Expression systems are normally referred to by the host and the DNA source or the delivery mechanism for the genetic material. For example, common hosts are bacteria (such as ''E. coli'', ''B. subtilis''), yeast (such as ''S. cerevisiae'') or eukaryotic cell lines. Common DNA sources and delivery mechanisms are viruses (such as baculovirus, retrovirus, adenovirus), plasmids, artificial chromosomes and bacteriophage (such as lambda).

The best expression system depends on the gene involved; for example, ''Saccharomyces cerevisiae'' is often preferred for proteins that require significant posttranslational modification. Insect or mammal cell lines are used when human-like splicing of mRNA is required. Nonetheless, bacterial expression has the advantage of easily producing large amounts of protein, which is required for X-ray crystallography or nuclear magnetic resonance experiments for structure determination.

Because bacteria are prokaryotes, they are not equipped with the full enzymatic machinery to accomplish the required post-translational modifications or molecular folding. Hence, multi-domain eukaryotic proteins expressed in bacteria often are non-functional. Also, many proteins become insoluble as inclusion bodies that are difficult to recover without harsh denaturants and subsequent cumbersome protein refolding. To address these concerns, expression systems using cells of multicellular eukaryotes were developed for applications requiring proteins folded as in, or closer to, eukaryotic organisms: cells of plants (e.g. tobacco), insects or mammals (e.g. bovines) are transfected with genes and cultured in suspension, or even as tissues or whole organisms, to produce fully folded proteins. Mammalian ''in vivo'' expression systems, however, have low yield and other limitations (they are time-consuming and can be toxic to host cells). To combine the high yield, productivity and scalability of bacteria and yeast with the advanced epigenetic features of plant, insect and mammalian systems, other protein production systems have been developed using unicellular eukaryotes (e.g. non-pathogenic ''Leishmania'' cells).
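The trade-offs between hosts described above can be summarised as a small rule-of-thumb helper. This is only a sketch of the reasoning in this section: the function name `suggest_host`, its three boolean parameters, and the order in which the criteria are checked are assumptions made for illustration. Real host selection also depends on the specific protein, cost, and scale.

```python
def suggest_host(needs_humanlike_splicing: bool,
                 needs_significant_ptm: bool,
                 needs_large_amounts: bool) -> str:
    """Suggest a host class from the criteria named in this section.

    The priority order (splicing, then post-translational modification,
    then yield) is an illustrative assumption, not an established rule.
    """
    if needs_humanlike_splicing:
        # Insect or mammal cell lines are used for human-like mRNA splicing.
        return "insect or mammalian cell line"
    if needs_significant_ptm:
        # S. cerevisiae is often preferred when significant PTM is required.
        return "yeast (e.g. S. cerevisiae)"
    if needs_large_amounts:
        # Bacteria easily produce large amounts, e.g. for structure determination.
        return "bacteria (e.g. E. coli)"
    return "no preference from these criteria"


if __name__ == "__main__":
    # A protein for crystallography that needs no eukaryotic processing.
    print(suggest_host(False, False, True))
```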


Bacterial systems


= ''Escherichia coli'' =

''E. coli'' is one of the most widely used expression hosts, and DNA is normally introduced in a plasmid expression vector. The techniques for overexpression in ''E. coli'' are well developed and work by increasing the number of copies of the gene or by increasing the binding strength of the promoter region, thereby assisting transcription. For example, a DNA sequence for a protein of interest could be cloned or subcloned into a high copy-number plasmid containing the ''lac'' (often LacUV5) promoter, which is then transformed into the bacterium ''E. coli''. Addition of IPTG (a lactose analog) activates the lac promoter and causes the bacteria to express the protein of interest. ''E. coli'' BL21 and BL21(DE3) are two strains commonly used for protein production. As members of the B lineage, they lack the ''lon'' and ''OmpT'' proteases, which protects the produced proteins from degradation. The DE3 prophage found in BL21(DE3) provides T7 RNA polymerase (driven by the LacUV5 promoter), allowing vectors with the T7 promoter to be used instead.
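The induction logic described above can be written down as a simple truth-table model. It is a deliberately idealised sketch: the function name `expresses_target` and the string labels for strains and promoters are hypothetical, and the model ignores leaky (uninduced) expression, inducer concentration, and plasmid copy number.

```python
def expresses_target(strain: str, promoter: str, iptg_added: bool) -> bool:
    """Idealised on/off model of IPTG induction in E. coli BL21 strains.

    strain:   "BL21" or "BL21(DE3)"
    promoter: promoter on the expression vector, "lac" or "T7"
    """
    if strain not in ("BL21", "BL21(DE3)"):
        raise ValueError(f"unknown strain: {strain}")

    # Only the DE3 prophage carries the T7 RNA polymerase gene,
    # which itself sits behind the IPTG-inducible LacUV5 promoter.
    t7_polymerase_present = strain == "BL21(DE3)" and iptg_added

    if promoter == "lac":
        # The host RNA polymerase transcribes the gene once IPTG is added.
        return iptg_added
    if promoter == "T7":
        # A T7 promoter is only read by T7 RNA polymerase.
        return t7_polymerase_present
    raise ValueError(f"unknown promoter: {promoter}")


if __name__ == "__main__":
    for strain in ("BL21", "BL21(DE3)"):
        for promoter in ("lac", "T7"):
            for iptg in (False, True):
                print(strain, promoter, iptg,
                      expresses_target(strain, promoter, iptg))
```

In this model a T7-promoter vector is silent in plain BL21 regardless of IPTG, which reflects why such vectors are paired with BL21(DE3).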


= ''Corynebacterium'' =

Non-pathogenic species of the gram-positive ''Corynebacterium'' are used for the commercial production of various amino acids. The species ''C. glutamicum'' is widely used for producing glutamate and lysine, components of human food, animal feed and pharmaceutical products. Functionally active human epidermal growth factor has been expressed in ''C. glutamicum'', demonstrating a potential for industrial-scale production of human proteins. Expressed proteins can be targeted for secretion through either the general secretory pathway (Sec) or the twin-arginine translocation pathway (Tat). Unlike gram-negative bacteria, the gram-positive ''Corynebacterium'' lack lipopolysaccharides that function as antigenic endotoxins in humans.


= ''Pseudomonas fluorescens'' =

The non-pathogenic, gram-negative bacterium ''Pseudomonas fluorescens'' is used for high-level production of recombinant proteins, commonly for the development of biotherapeutics and vaccines. ''P. fluorescens'' is a metabolically versatile organism, allowing for high-throughput screening and rapid development of complex proteins. ''P. fluorescens'' is best known for its ability to rapidly and successfully produce high titers of active, soluble protein.


Eukaryotic systems


= Yeasts =

Expression systems using either ''S. cerevisiae'' or ''Pichia pastoris'' allow stable and lasting production of proteins that are processed similarly to those of mammalian cells, at high yield, in chemically defined media.


= Filamentous fungi =

Filamentous fungi, especially ''Aspergillus'' and ''Trichoderma'', have long been used to produce diverse industrial enzymes from their own genomes ("native", "homologous") and from recombinant DNA ("heterologous"). More recently, ''Myceliophthora thermophila'' C1 has been developed into an expression platform for screening and production of native and heterologous proteins. The expression system C1 shows a low-viscosity morphology in submerged culture, enabling the use of complex growth and production media. C1 also does not "hyperglycosylate" heterologous proteins, as ''Aspergillus'' and ''Trichoderma'' tend to do.


= ''Baculovirus''-infected cells =

Baculovirus-infected insect cells (Sf9, Sf21, High Five strains) or mammalian cells (HeLa, HEK 293) allow production of glycosylated or membrane proteins that cannot be produced using fungal or bacterial systems. This system is useful for production of proteins in high quantity. Genes are not expressed continuously because infected host cells eventually lyse and die during each infection cycle.


= Non-lytic insect cell expression =

Non-lytic insect cell expression is an alternative to the lytic baculovirus expression system. In non-lytic expression, vectors are transiently or stably transfected into the chromosomal DNA of insect cells for subsequent gene expression. This is followed by selection and screening of recombinant clones. The non-lytic system has been used to give higher protein yield and quicker expression of recombinant genes compared to baculovirus-infected cell expression. Cell lines used for this system include Sf9 and Sf21 from ''Spodoptera frugiperda'' cells, Hi-5 from ''Trichoplusia ni'' cells, and Schneider 2 and Schneider 3 cells from ''Drosophila melanogaster'' cells. With this system, cells do not lyse and several cultivation modes can be used. Additionally, protein production runs are reproducible and the system gives a homogeneous product. A drawback of this system is the requirement of an additional screening step for selecting viable clones.


= ''Excavata'' =

''Leishmania tarentolae'' (cannot infect mammals) expression systems allow stable and lasting production of proteins at high yield, in chemically defined media. Produced proteins exhibit fully eukaryotic post-translational modifications, including glycosylation and disulfide bond formation.


= Mammalian systems =

The most common mammalian expression systems are Chinese hamster ovary (CHO) and human embryonic kidney (HEK) cells.
* Chinese hamster ovary cell
* Mouse myeloma lymphoblastoid (e.g. NS0 cell)
* Fully human
** Human embryonic kidney cells (HEK-293)
** Human embryonic retinal cells (Crucell's Per.C6)
** Human amniocyte cells (Glycotope and CEVEC)


Cell-free systems

Cell-free production of proteins is performed ''in vitro'' using purified RNA polymerase, ribosomes, tRNA and ribonucleotides. These reagents may be produced by extraction from cells or from a cell-based expression system. Due to the low expression levels and high cost of cell-free systems, cell-based systems are more widely used.


See also

* Cellosaurus, a database of cell lines
* Gene expression
* Single-cell protein
* Protein purification
* Precision fermentation
* Host cell protein
* List of recombinant proteins

