Cleavage and polyadenylation specificity factor (CPSF) is involved in the cleavage of the 3' signaling region from a newly synthesized pre-messenger RNA (pre-mRNA) molecule in the process of gene transcription. In eukaryotes, messenger RNA precursors (pre-mRNA) are transcribed in the nucleus from DNA by the enzyme RNA polymerase II. The pre-mRNA must undergo post-transcriptional modifications, forming mature RNA (mRNA), before it can be transported into the cytoplasm for translation into proteins. The post-transcriptional modifications are: the addition of a 5' m7G cap, splicing of intronic sequences, and 3' cleavage and polyadenylation.
According to Schönemann et al., "CPSF recognizes the polyadenylation signal (PAS), providing sequence specificity in pre-mRNA cleavage and polyadenylation, and catalyzes pre-mRNA cleavage."
It is required to induce RNA polymerase pausing once it recognizes a functional PAS.
It is the first protein to bind to the signaling region near the cleavage site of the pre-mRNA, to which the poly(A) tail will be added by polynucleotide adenylyltransferase. The signaling region 10-30 nucleotides upstream of the cleavage site, the polyadenylation signal (PAS), has the canonical nucleotide sequence AAUAAA, which is highly conserved across the vast majority of pre-mRNAs. The cleavage site itself is usually defined by a cytosine/adenine (CA) dinucleotide, the preferred sequence immediately 5' to the site of the endonucleolytic cleavage.
A second signaling region, located approximately 40 nucleotides downstream from the cleavage site on the portion of the pre-mRNA that is cleaved off before polyadenylation, consists of a U/GU-rich region required for efficient processing. This downstream fragment is degraded. The mature mRNAs are transported into the cytoplasm, where they are translated into proteins.
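The spacing rules described above can be illustrated with a short script. The following is only a minimal sketch, not a validated poly(A) site predictor: the function name `find_cleavage_candidates`, its parameters, and the decision to measure the 10-30 nucleotide gap from the end of the hexamer to the C of the CA dinucleotide are all assumptions made for illustration.

```python
def find_cleavage_candidates(rna, signal="AAUAAA", min_gap=10, max_gap=30):
    """Return (signal_start, cleavage_index) pairs, using 0-based indices.

    signal_start is where the polyadenylation signal begins.
    cleavage_index is the position of the A in a CA dinucleotide whose C lies
    min_gap..max_gap nucleotides past the end of the signal (an assumed
    reading of "10-30 nucleotides upstream of the cleavage site").
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    start = rna.find(signal)
    while start != -1:
        signal_end = start + len(signal)
        first = signal_end + min_gap
        # The C may sit at most max_gap past the signal and needs a following base.
        last = min(signal_end + max_gap, len(rna) - 2)
        for i in range(first, last + 1):
            if rna[i:i + 2] == "CA":
                hits.append((start, i + 1))
        start = rna.find(signal, start + 1)
    return hits


# Toy sequence: AAUAAA, a 15-nucleotide spacer, a CA dinucleotide, then a GU-rich tail.
toy = "GG" + "AAUAAA" + "G" * 15 + "CA" + "GUGUUGUUU"
print(find_cleavage_candidates(toy))  # [(2, 24)]
```

A real pre-mRNA will often yield several candidates per signal, or none at all, which is one reason sequence scanning alone does not identify cleavage sites.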
Protein Structure & Interactions
In mammals, CPSF is a protein complex consisting of six subunits: CPSF-160 (CPSF1), CPSF-100 (CPSF2), CPSF-73 (CPSF3), and CPSF-30 (CPSF4), each named for its approximate mass in kDa, together with WDR33 and Fip1 (FIP1L1).

The subunits form two components: mammalian polyadenylation specificity factor (mPSF) and mammalian cleavage factor (mCF). The mPSF is made up of CPSF-160, WDR33, CPSF-30, and Fip1. It is necessary for PAS recognition and polyadenylation. The mCF is made up of CPSF-73, CPSF-100, and symplekin. It catalyzes the cleavage reaction by recognizing the histone mRNA 3' processing site.
CPSF-73 is a zinc-dependent hydrolase that cleaves the mRNA precursor at a CA dinucleotide just downstream of the polyadenylation signal sequence AAUAAA.
CPSF-100 contributes to the endonuclease activity of CPSF-73.
CPSF-160 (160 kDa) is the largest subunit of CPSF and directly binds to the AAUAAA polyadenylation signal.
CPSF-160 has three β-propeller domains and a C-terminal domain.
CPSF-30 (30 kDa) has five Cys-Cys-Cys-His (CCCH) zinc-finger motifs near the N terminus and a CCCH zinc knuckle at the C terminus. Two isoforms of CPSF-30 exist and can be found in CPSF complexes. The RNA binding activity of CPSF-30 is mediated by its zinc fingers 2 and 3. WDR33 (WD repeat domain 33, 146 kDa) has a WD40 domain near the N terminus, and this domain interacts with RNA. WDR33 and CPSF-30 recognize the polyadenylation signal (PAS) in pre-mRNA, which aids in defining the position of RNA cleavage. CPSF-30 recognizes the AU-rich hexamer region by a cooperative, metal-dependent binding mechanism.
Although CPSF-160 is the largest subunit of CPSF, a study by Schönemann et al. argues that WDR33, not CPSF-160 as previously believed, is responsible for recognizing the PAS. The study concluded that CPSF-160 had been credited with PAS recognition because the WDR33 subunit had not yet been discovered at the time of the claim.
Fip1 binds to U-rich RNAs by its arginine-rich C-terminus. It binds to RNA sequences upstream of the AAUAAA hexamer region in vitro. Fip1 and CPSF-160 recruit poly(A) polymerase (PAP) to the 3' processing site.
PAP is stimulated by nuclear poly(A)-binding protein 1 (PABPN1) to add the poly(A) tail, a stretch of non-templated adenosine residues, at the cleavage site.
CPSF-160, CPSF-30, Fip1, and WDR33 are necessary and sufficient to form an active CPSF subcomplex in AAUAAA-dependent polyadenylation. CPSF-73 and CPSF-100 are dispensable for this activity.
CPSF recruits proteins to the 3' region. Identified proteins that are coordinated by CPSF activity include cleavage stimulatory factor and the two poorly understood cleavage factors. The binding of the polynucleotide adenylyltransferase responsible for actually synthesizing the tail is a necessary prerequisite for cleavage, thus ensuring that cleavage and polyadenylation are tightly coupled processes.
Genes
* ''CPSF1'', ''CPSF2'', ''CPSF3'', ''CPSF4'', ''NUDT21'' (CPSF5), ''CPSF6'', ''CPSF7'', ''FIP1L1''
Alternative Polyadenylation (APA)
Alternative polyadenylation (APA) is a regulatory mechanism that generates multiple 3' ends on the mRNAs of a single gene.
APA isoforms from the same gene can encode different proteins and/or contain different 3' untranslated regions (UTRs). Deregulation of APA has been associated with a number of human diseases. Since longer UTRs have more binding sites for microRNAs and/or RNA-binding proteins than shorter UTRs, APA isoforms can differ in stability, translation efficiency, and/or intracellular localization.
Mammalian PASs have a number of key ''cis'' elements.
* A(A/U)UAAA hexamer
* U/GU-rich downstream element (DSE)
* U-rich upstream auxiliary elements (USEs)
* Upstream sequences conforming to the consensus UGUA
PAS sequences are variable, and many PASs lack one or more ''cis'' elements. PAS recognition is accomplished by protein-RNA interactions.
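Because many PASs lack one or more of these elements, it can be useful to record which ones are present around a given hexamer. The sketch below does this with regular expressions. It is illustrative only: the hexamer is read as AAUAAA or AUUAAA, the thresholds for "U-rich" (four or more consecutive U) and "U/GU-rich" (three or more consecutive U or GU units) are arbitrary assumptions, and the function name `annotate_cis_elements` is hypothetical.

```python
import re

# Patterns for the four cis elements; the thresholds are illustrative assumptions.
HEXAMER = re.compile(r"A[AU]UAAA")       # AAUAAA or AUUAAA
UGUA = re.compile(r"UGUA")               # upstream UGUA consensus
U_RICH = re.compile(r"U{4,}")            # assumed stand-in for a U-rich USE
GU_RICH = re.compile(r"(?:GU|U){3,}")    # assumed stand-in for a U/GU-rich DSE


def annotate_cis_elements(rna):
    """Report which cis elements occur around the first hexamer.

    Returns None when no hexamer is found, because the other elements are
    only scored relative to a hexamer in this simplified sketch.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hexamer = HEXAMER.search(rna)
    if hexamer is None:
        return None
    upstream = rna[:hexamer.start()]
    downstream = rna[hexamer.end():]
    return {
        "hexamer": True,
        "UGUA": bool(UGUA.search(upstream)),
        "USE": bool(U_RICH.search(upstream)),
        "DSE": bool(GU_RICH.search(downstream)),
    }


# Toy sequence that carries all four elements.
toy = "CCUGUACCUUUUCC" + "AUUAAA" + "C" * 16 + "A" + "GUGUGUCC"
print(annotate_cis_elements(toy))
```

Searching everything downstream of the hexamer for the DSE is a simplification, since the DSE lies downstream of the cleavage site rather than directly after the hexamer.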
CPSF and CstF bind synergistically to the AAUAAA hexamer and the downstream element (DSE), respectively. The CFI complex binds to the UGUA motifs. CPSF, CstF, and CFI bind directly to RNA. They also recruit other proteins such as CFII, symplekin, and the poly(A) polymerase (PAP) to assemble the mRNA 3' processing complex, also known as the cleavage and polyadenylation complex. The assembly of these factors is facilitated by the C-terminal domain (CTD) of the RNA polymerase II (RNAP II) large subunit. The CTD provides a landing pad for mRNA processing factors.
Other Protein Complexes in the Cleavage and Polyadenylation Complex
Symplekin (SYMPK) is a scaffolding protein that mediates the interaction between CPSF and CstF.
In mammals, both cleavage factor I (CFIm) and cleavage and polyadenylation specificity factor (CPSF) are required for cleavage and polyadenylation, whereas cleavage stimulation factor (CstF) is only essential for the cleavage step.
CPSF and CstF travel along with RNA polymerase II (RNAP II) during nascent gene transcription in search of the PAS.
Cleavage factor I (CFIm) is made of 25 (CPSF5), 59 (CPSF7), and 68 (CPSF6) kDa proteins. Cleavage factor II (CFIIm) is made of Pcf11, Clp1, and cleavage stimulation factor (CstF). CFIIm binds to the RNAP II C-terminal domain and other cleavage and polyadenylation (CpA) factors.
Cleavage stimulation factor (CstF) has three subunits: CstF77 (CstF3), CstF50 (CstF1), and CstF64 (CstF2 and CstF2T). CstF recognizes the signaling region located about 20 nucleotides downstream of the cleavage site, which is a GU-rich sequence motif followed by U-rich sequences. CstF contributes to the selection of the cleavage site, as well as alternative polyadenylation.
Coupled Processes
Coupling to RNA polymerase II (pol II) transcription can influence processing reactions in three ways.
# localization
#* positions mRNA processing factors at the elongation complex, which raises their local concentration in the vicinity of the nascent transcript
# kinetic coupling
#* the rate of transcription can have profound effects on RNA folding and the assembly of RNA-protein complexes
# allosteric
#* contact between the pol II elongation complex and mRNA processing factors can allosterically inhibit or activate mRNA processing factors