Papain-like proteases (or papain-like (cysteine) peptidases; abbreviated PLP or PLCP) are a large
protein family
A protein family is a group of evolutionarily related proteins. In many cases, a protein family has a corresponding gene family, in which each gene encodes a corresponding protein with a 1:1 relationship. The term "protein family" should not be c ...
of
cysteine protease enzymes that share
structural and
enzymatic properties with the group's namesake member,
papain
Papain, also known as papaya proteinase I, is a cysteine protease () enzyme present in papaya (''Carica papaya'') and mountain papaya (''Vasconcellea cundinamarcensis''). It is the namesake member of the papain-like protease family.
It has wide ...
. They are found in all
domains of life
In Biology, biological Taxonomy (biology), taxonomy, a domain ( or ) (Latin: ''regio''), also dominion, superkingdom, realm, or empire, is the highest taxonomic rank of all organisms taken together. It was introduced in the three-domain system of ...
. In animals, the group is often known as cysteine
cathepsins or, in older literature,
lysosomal peptidases.
In the
MEROPS protease enzyme classification system, papain-like proteases form Clan CA.
Papain-like proteases share a common
catalytic dyad
A catalytic triad is a set of three coordinated amino acids that can be found in the active site of some enzymes. Catalytic triads are most commonly found in hydrolase and transferase enzymes (e.g. proteases, amidases, esterases, acylases, li ...
active site
In biology and biochemistry, the active site is the region of an enzyme where substrate molecules bind and undergo a chemical reaction. The active site consists of amino acid residues that form temporary bonds with the substrate (binding site) a ...
featuring a
cysteine
Cysteine (symbol Cys or C; ) is a semiessential proteinogenic amino acid with the formula . The thiol side chain in cysteine often participates in enzymatic reactions as a nucleophile.
When present as a deprotonated catalytic residue, sometime ...
amino acid residue
Protein structure is the molecular geometry, three-dimensional arrangement of atoms in an amino acid-chain molecule. Proteins are polymers specifically polypeptides formed from sequences of amino acids, the monomers of the polymer. A single ami ...
that acts as a
nucleophile
In chemistry, a nucleophile is a chemical species that forms bonds by donating an electron pair. All molecules and ions with a free pair of electrons or at least one pi bond can act as nucleophiles. Because nucleophiles donate electrons, they are ...
.
The
human genome encodes eleven cysteine cathepsins which have a broad range of physiological functions.
In some
parasites papain-like proteases have roles in
host
A host is a person responsible for guests at an event or for providing hospitality during it.
Host may also refer to:
Places
* Host, Pennsylvania, a village in Berks County
People
*Jim Host (born 1937), American businessman
* Michel Host ...
invasion, such as
cruzipain from ''
Trypanosoma cruzi''.
In plants, they are involved in host defense and in development.
Studies of papain-like proteases from
prokaryotes have lagged their
eukaryotic counterparts.
In cellular organisms they are synthesized as
preproenzymes that are not enzymatically active until mature, and their activities are tightly regulated, often by the presence of endogenous
protease inhibitors such as
cystatins.
In many
RNA virus
An RNA virus is a virusother than a retrovirusthat has ribonucleic acid (RNA) as its genetic material. The nucleic acid is usually single-stranded RNA ( ssRNA) but it may be double-stranded (dsRNA). Notable human diseases caused by RNA viruses ...
es, including significant human
pathogens such as the
coronavirus
Coronaviruses are a group of related RNA viruses that cause diseases in mammals and birds. In humans and birds, they cause respiratory tract infections that can range from mild to lethal. Mild illnesses in humans include some cases of the com ...
es
SARS-CoV and
SARS-CoV-2, papain-like protease
protein domains often have roles in processing of
polyproteins into mature
viral nonstructural protein In virology, a nonstructural protein is a protein encoded by a virus but that is not part of the viral particle. They typically include the various enzymes and transcription factors the virus uses to replicate itself, such as a viral protease ( 3CL ...
s.
Many papain-like proteases are considered potential
drug targets.
Classification
The
MEROPS system of protease enzyme classification defines clan CA as containing the papain-like proteases. They are thought to have a shared
evolutionary origin. As of 2021, the clan contained 45 families.
Structure

The structure of papain was among the earliest
protein structures experimentally determined by
X-ray crystallography.
Many papain-like protease enzymes function as
monomers, though a few, such as
cathepsin C (Dipeptidyl-peptidase I), are
homotetramers. The mature monomer structure is characteristically divided into two lobes or subdomains, known as the L-domain (
N-terminal
The N-terminus (also known as the amino-terminus, NH2-terminus, N-terminal end or amine-terminus) is the start of a protein or polypeptide, referring to the free amine group (-NH2) located at the end of a polypeptide. Within a peptide, the ami ...
) and the R-domain (
C-terminal), where the
active site
In biology and biochemistry, the active site is the region of an enzyme where substrate molecules bind and undergo a chemical reaction. The active site consists of amino acid residues that form temporary bonds with the substrate (binding site) a ...
is located between them.
The L-domain is primarily
helical while the R-domain contains
beta-sheets in a
beta-barrel-like shape, surrounded by a helix.
The
enzyme substrate interacts with both domains in an extended conformation.
Papain-like proteases are often
synthesized as
preproenzymes, or enzymatically inactive precursors. A
signal peptide
A signal peptide (sometimes referred to as signal sequence, targeting signal, localization signal, localization sequence, transit peptide, leader sequence or leader peptide) is a short peptide (usually 16-30 amino acids long) present at the N-ter ...
at the
N-terminus
The N-terminus (also known as the amino-terminus, NH2-terminus, N-terminal end or amine-terminus) is the start of a protein or polypeptide, referring to the free amine group (-NH2) located at the end of a polypeptide. Within a peptide, the ami ...
, which serves as a
subcellular localization The cells of eukaryotic organisms are elaborately subdivided into functionally-distinct membrane-bound compartments. Some major constituents of eukaryotic cells are: extracellular space, plasma membrane, cytoplasm, nucleus, mitochondria, Golgi ap ...
signal, is cleaved by
signal peptidase to form a
zymogen.
Post-translational modification in the form of
N-linked glycosylation also occurs in parallel.
The zymogen is still inactive due to the presence of a
propeptide
A protein precursor, also called a pro-protein or pro-peptide, is an inactive protein (or peptide) that can be turned into an active form by post-translational modification, such as breaking off a piece of the molecule or adding on another molecule ...
which functions as an inhibitor blocking access to the active site. The propeptide is removed by
proteolysis
Proteolysis is the breakdown of proteins into smaller polypeptides or amino acids. Uncatalysed, the hydrolysis of peptide bonds is extremely slow, taking hundreds of years. Proteolysis is typically catalysed by cellular enzymes called protease ...
to form the mature enzyme.
Catalytic mechanism
Papain-like proteases have a
catalytic dyad
A catalytic triad is a set of three coordinated amino acids that can be found in the active site of some enzymes. Catalytic triads are most commonly found in hydrolase and transferase enzymes (e.g. proteases, amidases, esterases, acylases, li ...
consisting of a
cysteine
Cysteine (symbol Cys or C; ) is a semiessential proteinogenic amino acid with the formula . The thiol side chain in cysteine often participates in enzymatic reactions as a nucleophile.
When present as a deprotonated catalytic residue, sometime ...
and a
histidine residue, which form an
ion pair through their charged
thiolate and
imidazolium side chains. The negatively charged cysteine thiolate functions as a
nucleophile
In chemistry, a nucleophile is a chemical species that forms bonds by donating an electron pair. All molecules and ions with a free pair of electrons or at least one pi bond can act as nucleophiles. Because nucleophiles donate electrons, they are ...
.
Additional neighboring residues -
aspartate
Aspartic acid (symbol Asp or D; the ionic form is known as aspartate), is an α-amino acid that is used in the biosynthesis of proteins. Like all other amino acids, it contains an amino group and a carboxylic acid. Its α-amino group is in the pro ...
,
asparagine, or
glutamine - position the catalytic residues;
in papain, the required catalytic residues cysteine, histidine, and aspartate are sometimes called the catalytic triad (similar to
serine proteases).
Papain-like proteases are usually
endopeptidases, but some members of the group are also, or even exclusively,
exopeptidases.
Some viral papain-like proteases, including those of
coronavirus
Coronaviruses are a group of related RNA viruses that cause diseases in mammals and birds. In humans and birds, they cause respiratory tract infections that can range from mild to lethal. Mild illnesses in humans include some cases of the com ...
es, can also cleave
isopeptide bonds and can function as
deubiquitinase
Deubiquitinating enzymes (DUBs), also known as deubiquitinating peptidases, deubiquitinating isopeptidases, deubiquitinases, ubiquitin proteases, ubiquitin hydrolases, ubiquitin isopeptidases, are a large group of proteases that cleave ubiquitin ...
s.
Function
Eukaryotes
Mammals
In animals, especially in mammalian biology, members of the papain-like protease family are usually referred to as cysteine cathepsins - that is, the
cysteine protease members of the group of proteases known as
cathepsins (which includes cysteine,
serine
Serine (symbol Ser or S) is an α-amino acid that is used in the biosynthesis of proteins. It contains an α-amino group (which is in the protonated − form under biological conditions), a carboxyl group (which is in the deprotonated − form un ...
, and
aspartic proteases). In humans, there are 11 cysteine cathepsins:
B,
C,
F,
H,
K,
L,
O,
S,
V,
X, and
W. Most cathepsins are
expressed throughout the body, but some have narrower
tissue distribution.

Although historically known as
lysosomal proteases and studied mainly for their role in protein
catabolism
Catabolism () is the set of metabolic pathways that breaks down molecules into smaller units that are either oxidized to release energy or used in other anabolic reactions. Catabolism breaks down large molecules (such as polysaccharides, lipids, ...
, cysteine cathepsins have since been identified playing major roles in a number of physiological processes and disease states. As part of normal physiological processes, they are involved in key steps of
antigen presentation as part of the
adaptive immune system, remodeling of the
extracellular matrix,
differentiation of
keratinocytes, and processing of
peptide hormones.
Cysteine cathepsins have been associated with
cancer and
tumor progression,
cardiovascular disease
Cardiovascular disease (CVD) is a class of diseases that involve the heart or blood vessels. CVD includes coronary artery diseases (CAD) such as angina and myocardial infarction (commonly known as a heart attack). Other CVDs include stroke, h ...
,
autoimmune disease
An autoimmune disease is a condition arising from an abnormal immune response to a functioning body part. At least 80 types of autoimmune diseases have been identified, with some evidence suggesting that there may be more than 100 types. Nearly a ...
, and other human health conditions.
Cathepsin K has a role in
bone resorption and has been studied as a
drug target for
osteoporosis
Osteoporosis is a systemic skeletal disorder characterized by low bone mass, micro-architectural deterioration of bone tissue leading to bone fragility, and consequent increase in fracture risk. It is the most common reason for a broken bone ...
.
Parasites
A number of
parasites, including
helminth
Parasitic worms, also known as helminths, are large macroparasites; adults can generally be seen with the naked eye. Many are intestinal worms that are soil-transmitted and infect the gastrointestinal tract. Other parasitic worms such as schi ...
s (parasitic worms), use papain-like proteases as mechanisms for invasion of their
hosts. Examples include ''
Toxoplasma gondii'' and ''
Giardia lamblia''. In many flatworms, there are very high levels of expression of cysteine cathepsins; in the
liver fluke ''
Fasciola hepatica'',
gene duplication
Gene duplication (or chromosomal duplication or gene amplification) is a major mechanism through which new genetic material is generated during molecular evolution. It can be defined as any duplication of a region of DNA that contains a gene. ...
s have produced over 20
paralogs of a
cathepsin L-like enzyme.
Cysteine cathepsins are also part of the normal life cycle of the unicellular parasite ''
Leishmania
''Leishmania'' is a parasitic protozoan, a single-celled organism of the genus '' Leishmania'' that are responsible for the disease leishmaniasis. They are spread by sandflies of the genus ''Phlebotomus'' in the Old World, and of the genus '' ...
'', where they function as
virulence factors.
The enzyme and potential
drug target cruzipain is important for the life cycle of the parasite ''
Trypanosoma cruzi'', which causes
Chagas' disease
Chagas disease, also known as American trypanosomiasis, is a tropical parasitic disease caused by ''Trypanosoma cruzi''. It is spread mostly by insects in the subfamily ''Triatominae'', known as "kissing bugs". The symptoms change over the cou ...
.
Plants

Members of the papain-like protease family play a number of important roles in
plant development, including
seed germination,
leaf senescence, and responding to
abiotic stress. Papain-like proteases are involved in regulation of
programmed cell death in plants, for example in
tapetum during development of
pollen
Pollen is a powdery substance produced by seed plants. It consists of pollen grains (highly reduced microgametophytes), which produce male gametes (sperm cells). Pollen grains have a hard coat made of sporopollenin that protects the gametophyt ...
. They are also important in
plant immunity
Plant disease resistance protects plants from pathogens in two ways: by pre-formed structures and chemicals, and by infection-induced responses of the immune system. Relative to a susceptible plant, disease resistance is the reduction of pathoge ...
providing defense against
pests
PESTS was an anonymous American activist group formed in 1986 to critique racism, tokenism, and exclusion in the art world. PESTS produced newsletters, posters, and other print material highlighting examples of discrimination in gallery represent ...
and
pathogens.
The relationship between plant papain-like proteases and pathogen responses - such as
cystatin inhibitors - have been described as an
evolutionary arms race.
Some PLP family members in plants have culinary and commercial applications. The family's namesake member,
papain
Papain, also known as papaya proteinase I, is a cysteine protease () enzyme present in papaya (''Carica papaya'') and mountain papaya (''Vasconcellea cundinamarcensis''). It is the namesake member of the papain-like protease family.
It has wide ...
, is a protease derived from
papaya
The papaya (, ), papaw, () or pawpaw () is the plant species ''Carica papaya'', one of the 21 accepted species in the genus ''Carica'' of the family Caricaceae. It was first domesticated in Mesoamerica, within modern-day southern Mexico and ...
, used as a
meat tenderizer.
Similar but less widely used plant products include
bromelain from
pineapple and
ficin from
figs.
Prokaryotes
Although papain-like proteases are found in all
domains of life
In Biology, biological Taxonomy (biology), taxonomy, a domain ( or ) (Latin: ''regio''), also dominion, superkingdom, realm, or empire, is the highest taxonomic rank of all organisms taken together. It was introduced in the three-domain system of ...
, they have been less well-studied in
prokaryotes than in
eukaryote
Eukaryotes () are organisms whose cells have a nucleus. All animals, plants, fungi, and many unicellular organisms, are Eukaryotes. They belong to the group of organisms Eukaryota or Eukarya, which is one of the three domains of life. Bacte ...
s.
Only a few prokaryotic PLP enzymes have been characterized by
X-ray crystallography or enzymatic studies, mostly from pathogenic bacteria, including
streptopain
Streptopain (, ''Streptococcus peptidase A'', ''streptococcal cysteine proteinase'', ''Streptococcus protease'') is an enzyme. This enzyme catalyses the following chemical reaction
: Preferential cleavage with hydrophobic residues at P2, P1 and P ...
from ''
Streptococcus pyogenes'';
xylellain, from the plant pathogen ''
Xylella fastidiosa'';
Cwp84 from ''
Clostridium difficile
''Clostridioides difficile'' (syn. ''Clostridium difficile'') is a bacterium that is well known for causing serious diarrheal infections, and may also cause colon cancer. Also known as ''C. difficile'', or ''C. diff'' (), is Gram-positive spec ...
'';
and
Lpg2622 LPG may refer to:
Science
* Liquefied petroleum gas, a flammable mixture of hydrocarbon gases
* Lipophosphoglycan, a class of molecule found on the surface of some eukaryotes, in particular protozoa
Music groups
* La Perdita Generacio, a Swedish a ...
from ''
Legionella pneumophila
''Legionella pneumophila'' is a thin, aerobic, pleomorphic, flagellated, non-spore-forming, Gram-negative bacterium of the genus ''Legionella''. ''L. pneumophila'' is the primary human pathogenic bacterium in this group and is the causative age ...
''.
Viruses

The papain-like protease family includes a number of
protein domains that are found in large
polyproteins expressed by
RNA virus
An RNA virus is a virusother than a retrovirusthat has ribonucleic acid (RNA) as its genetic material. The nucleic acid is usually single-stranded RNA ( ssRNA) but it may be double-stranded (dsRNA). Notable human diseases caused by RNA viruses ...
es.
Among the best studied viral PLPs are
nidoviral papain-like protease domain
The nidoviral papain-like protease (PLPro or PLP) is a papain-like protease protein domain encoded in the genomes of nidoviruses. It is expressed as part of a large polyprotein from the ORF1a gene and has cysteine protease enzyme, enzymatic acti ...
s from
nidoviruses, particularly those from
coronavirus
Coronaviruses are a group of related RNA viruses that cause diseases in mammals and birds. In humans and birds, they cause respiratory tract infections that can range from mild to lethal. Mild illnesses in humans include some cases of the com ...
es. These PLPs are responsible for several cleavage events that process a large polyprotein into
viral nonstructural protein In virology, a nonstructural protein is a protein encoded by a virus but that is not part of the viral particle. They typically include the various enzymes and transcription factors the virus uses to replicate itself, such as a viral protease ( 3CL ...
s, although they perform fewer cleavages than the
3C-like protease
The 3C-like protease (3CLpro) or main protease (Mpro), formally known as C30 endopeptidase or 3-chymotrypsin-like protease, is the main protease found in coronaviruses. It cleaves the coronavirus polyprotein at eleven conserved sites. It is a cy ...
(also known as the main protease).
Coronavirus PLPs are multifunctional enzymes that can also act as
deubiquitinase
Deubiquitinating enzymes (DUBs), also known as deubiquitinating peptidases, deubiquitinating isopeptidases, deubiquitinases, ubiquitin proteases, ubiquitin hydrolases, ubiquitin isopeptidases, are a large group of proteases that cleave ubiquitin ...
s (cleaving the
isopeptide bond to
ubiquitin) and "deISGylating enzymes" with analogous activity against the
ubiquitin-like protein ISG15.
In human pathogens including
SARS-CoV,
MERS-CoV, and
SARS-CoV-2, the PLP domain is
essential for
viral replication and is therefore considered a
drug target for the development of
antiviral drug
Antiviral drugs are a class of medication used for treating viral infections. Most antivirals target specific viruses, while a broad-spectrum antiviral is effective against a wide range of viruses. Unlike most antibiotics, antiviral drugs do n ...
s.
References
{{reflist, 30em
Proteases