
In
molecular biology, a transcription factor (TF) (or sequence-specific DNA-binding factor) is a
protein that controls the rate of
transcription of
genetic information from
DNA to
messenger RNA
, by binding to a specific
DNA sequence
.
The function of TFs is to regulate—turn on and off—genes in order to make sure that they are
expressed in the desired
cells
at the right time and in the right amount throughout the life of the cell and the organism. Groups of TFs function in a coordinated fashion to direct
cell division,
cell growth
, and
cell death throughout life;
cell migration and organization (
body plan) during embryonic development; and intermittently in response to signals from outside the cell, such as a
hormone. There are up to 1600 TFs in the
human genome.
Transcription factors are members of the
proteome as well as
regulome.
TFs work alone or with other proteins in a complex, by promoting (as an
activator), or blocking (as a
repressor
) the recruitment of
RNA polymerase
(the enzyme that performs the
transcription of genetic information from DNA to RNA) to specific genes.
A defining feature of TFs is that they contain at least one
DNA-binding domain (DBD), which attaches to a specific sequence of DNA adjacent to the genes that they regulate.
TFs are grouped into classes based on their DBDs.
Other proteins such as
coactivators,
chromatin remodelers,
histone acetyltransferases,
histone deacetylase
s,
kinase
s, and
methylases are also essential to gene regulation, but lack DNA-binding domains, and therefore are not TFs.
TFs are of interest in medicine because TF mutations can cause specific diseases, and medications can be potentially targeted toward them.
Number
Transcription factors are essential for the regulation of gene expression and are, as a consequence, found in all living organisms. The number of transcription factors found within an organism increases with genome size, and larger genomes tend to have more transcription factors per gene.
There are approximately 2800 proteins in the
human genome that contain DNA-binding domains, and 1600 of these are presumed to function as transcription factors,
though other studies indicate it to be a smaller number. Therefore, approximately 10% of genes in the genome code for transcription factors, which makes this family the single largest family of human proteins. Furthermore, genes are often flanked by several binding sites for distinct transcription factors, and efficient expression of each of these genes requires the cooperative action of several different transcription factors (see, for example,
hepatocyte nuclear factors). Hence, the combinatorial use of a subset of the approximately 2000 human transcription factors easily accounts for the unique regulation of each gene in the human genome during
development.
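The combinatorial argument above can be checked with a few lines of arithmetic. The sketch below counts how many distinct unordered sets of transcription factors can be drawn from the roughly 2000 quoted in the text. The gene count `N_GENES` is an assumed round figure used only for comparison and does not come from the text, and the model ignores biological constraints such as which factors are co-expressed.

```python
from math import comb

N_TFS = 2000      # approximate number of human TFs quoted in the text
N_GENES = 20_000  # ASSUMPTION: rough count of human protein-coding genes


def tf_combinations(n_tfs: int, k: int) -> int:
    """Number of distinct unordered sets of k transcription factors."""
    return comb(n_tfs, k)


if __name__ == "__main__":
    for k in (1, 2, 3):
        n_sets = tf_combinations(N_TFS, k)
        print(f"sets of {k} TFs: {n_sets:,} (exceeds gene count: {n_sets > N_GENES})")
```

Under these assumptions, pairs of factors already outnumber the assumed gene count, which is the sense in which a modest number of factors can give each gene a distinct regulatory input.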
Mechanism
Transcription factors bind to either
enhancer or
promoter regions of DNA adjacent to the genes that they regulate. Depending on the transcription factor, the transcription of the adjacent gene is either
up- or down-regulated. Transcription factors use a variety of mechanisms for the regulation of gene expression.
These mechanisms include:
* stabilize or block the binding of RNA polymerase to DNA
* catalyze the
acetylation
or deacetylation of
histone proteins. The transcription factor can either do this directly or recruit other proteins with this catalytic activity. Many transcription factors use one or the other of two opposing mechanisms to regulate transcription:
**
histone acetyltransferase (HAT) activity – acetylates
histone proteins, which weakens the association of DNA with
histones and makes the DNA more accessible to transcription, thereby up-regulating transcription
**
histone deacetylase
(HDAC) activity – deacetylates
histone proteins, which strengthens the association of DNA with histones and makes the DNA less accessible to transcription, thereby down-regulating transcription
* recruit
coactivator or
corepressor proteins to the transcription factor DNA complex
Function
Transcription factors are one of the groups of proteins that read and interpret the genetic "blueprint" in the DNA. They bind to the DNA and help initiate a program of increased or decreased gene transcription. As such, they are vital for many important cellular processes. Below are some of the important functions and biological roles transcription factors are involved in:
Basal transcriptional regulation
In
eukaryote
s, an important class of transcription factors called
general transcription factors (GTFs) are necessary for transcription to occur.
Many of these GTFs do not actually bind DNA, but rather are part of the large
transcription preinitiation complex that interacts with
RNA polymerase
directly. The most common GTFs are
TFIIA,
TFIIB,
TFIID (see also
TATA binding protein),
TFIIE,
TFIIF, and
TFIIH.
The preinitiation complex binds to
promoter regions of DNA upstream of the gene that it regulates.
Differential enhancement of transcription
Other transcription factors differentially regulate the expression of various genes by binding to
enhancer regions of DNA adjacent to regulated genes. These transcription factors are critical to making sure that genes are expressed in the right cell at the right time and in the right amount, depending on the changing requirements of the organism.
Development
Many transcription factors in
multicellular organism
s are involved in development.
Responding to stimuli, these transcription factors turn on/off the transcription of the appropriate genes, which, in turn, allows for changes in cell
morphology or activities needed for
cell fate determination and
cellular differentiation
. The
Hox transcription factor family, for example, is important for proper
body pattern formation in organisms as diverse as fruit flies and humans.
Another example is the transcription factor encoded by the
sex-determining region Y (SRY) gene, which plays a major role in determining sex in humans.
Response to intercellular signals
Cells can communicate with each other by releasing molecules that produce
signaling cascades within another receptive cell. If the signal requires upregulation or downregulation of genes in the recipient cell, often transcription factors will be downstream in the signaling cascade.
Estrogen signaling is an example of a fairly short signaling cascade that involves the
estrogen receptor transcription factor: Estrogen is secreted by tissues such as the
ovaries and
placenta, crosses the
cell membrane of the recipient cell, and is bound by the estrogen receptor in the cell's
cytoplasm. The estrogen receptor then goes to the cell's
nucleus and binds to its
DNA-binding sites, changing the transcriptional regulation of the associated genes.
Response to environment
Not only do transcription factors act downstream of signaling cascades related to biological stimuli but they can also be downstream of signaling cascades involved in environmental stimuli. Examples include
heat shock factor (HSF), which upregulates genes necessary for survival at higher temperatures,
hypoxia inducible factor (HIF), which upregulates genes necessary for cell survival in low-oxygen environments,
and
sterol regulatory element binding protein
(SREBP), which helps maintain proper
lipid levels in the cell.
Cell cycle control
Many transcription factors, especially some that are
proto-oncogenes or
tumor suppressors, help regulate the
cell cycle and as such determine how large a cell will get and when it can divide into two daughter cells.
One example is the
Myc oncogene, which has important roles in
cell growth
and
apoptosis
.
Pathogenesis
Transcription factors can also be used to alter gene expression in a host cell to promote pathogenesis. A well-studied example is the family of transcription activator-like effectors (
TAL effectors) secreted by
Xanthomonas bacteria. When injected into plants, these proteins can enter the nucleus of the plant cell, bind plant promoter sequences, and activate transcription of plant genes that aid in bacterial infection.
TAL effectors contain a central repeat region in which there is a simple relationship between the identity of two critical residues in sequential repeats and sequential DNA bases in the TAL effector's target site.
This property likely makes it easier for these proteins to evolve in order to better compete with the defense mechanisms of the host cell.
Regulation
It is common in biology for important processes to have multiple layers of regulation and control. This is also true with transcription factors: Not only do transcription factors control the rates of transcription to regulate the amounts of gene products (RNA and protein) available to the cell but transcription factors themselves are regulated (often by other transcription factors). Below is a brief synopsis of some of the ways that the activity of transcription factors can be regulated:
Synthesis
Transcription factors (like all proteins) are transcribed from a gene on a chromosome into RNA, and then the RNA is translated into protein. Any of these steps can be regulated to affect the production (and thus activity) of a transcription factor. An implication of this is that transcription factors can regulate themselves. For example, in a
negative feedback
loop, the transcription factor acts as its own repressor: If the transcription factor protein binds the DNA of its own gene, it down-regulates the production of more of itself. This is one mechanism to maintain low levels of a transcription factor in a cell.
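The effect of such a negative feedback loop can be illustrated with a toy simulation. The sketch below is not taken from the text: it assumes a simple Hill-type repression model, dx/dt = beta / (1 + (x/K)^n) - gamma * x, where x is the transcription factor level, and compares it with an unregulated gene, dx/dt = beta - gamma * x. All parameter names and values (`beta`, `gamma`, `K`, `n`) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def simulate_tf_level(autorepressed: bool, beta: float = 10.0, gamma: float = 1.0,
                      K: float = 1.0, n: int = 2, dt: float = 0.01,
                      steps: int = 2000) -> float:
    """Euler-integrate the TF level from x = 0 and return the final value.

    ASSUMED model (illustrative only):
      autorepressed: dx/dt = beta / (1 + (x / K) ** n) - gamma * x
      unregulated:   dx/dt = beta - gamma * x
    """
    x = 0.0
    for _ in range(steps):
        if autorepressed:
            # Production falls as the factor accumulates and represses its own gene.
            production = beta / (1.0 + (x / K) ** n)
        else:
            production = beta
        x += dt * (production - gamma * x)
    return x


if __name__ == "__main__":
    print("unregulated level:  ", round(simulate_tf_level(False), 3))
    print("autorepressed level:", round(simulate_tf_level(True), 3))
```

With these hypothetical parameters the autorepressed gene settles at a much lower level than the unregulated one, which is the qualitative behavior the paragraph describes.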
Nuclear localization
In
eukaryote
s, transcription factors (like most proteins) are transcribed in the
nucleus but are then translated in the cell's
cytoplasm. Many proteins that are active in the nucleus contain
nuclear localization signals that direct them to the nucleus. But, for many transcription factors, this is a key point in their regulation.
Important classes of transcription factors such as some
nuclear receptor
s must first bind a
ligand while in the cytoplasm before they can relocate to the nucleus.
Activation
Transcription factors may be activated (or deactivated) through their signal-sensing domain by a number of mechanisms including:
*
ligand binding – Not only is ligand binding able to influence where a transcription factor is located within a cell but ligand binding can also affect whether the transcription factor is in an active state and capable of binding DNA or other cofactors (see, for example,
nuclear receptor
s).
*
phosphorylation
– Many transcription factors such as
STAT proteins must be
phosphorylated before they can bind DNA.
* interaction with other transcription factors (''e.g.'', homo- or hetero-
dimerization
) or
coregulatory proteins
Accessibility of DNA-binding site
In eukaryotes, DNA is organized with the help of
histones into compact particles called
nucleosome
s, where sequences of about 147 DNA base pairs make ~1.65 turns around histone protein octamers. DNA within nucleosomes is inaccessible to many transcription factors. Some transcription factors, so-called
pioneer factors, are still able to bind their DNA binding sites on the nucleosomal DNA. For most other transcription factors, the nucleosome must be actively unwound by molecular motors such as
chromatin remodelers. Alternatively, the nucleosome can be partially unwrapped by thermal fluctuations, allowing temporary access to the transcription factor binding site. In many cases, a transcription factor needs to
compete for binding to its DNA binding site with other transcription factors and histones or non-histone chromatin proteins. Pairs of transcription factors and other proteins can play antagonistic roles (activator versus repressor) in the regulation of the same
gene.
Availability of other cofactors/transcription factors
Most transcription factors do not work alone. Many large TF families form complex homotypic or heterotypic interactions through dimerization. For gene transcription to occur, a number of transcription factors must bind to DNA regulatory sequences. This collection of transcription factors, in turn, recruits intermediary proteins such as
cofactors that allow efficient recruitment of the
preinitiation complex and
RNA polymerase
. Thus, for a single transcription factor to initiate transcription, all of these other proteins must also be present, and the transcription factor must be in a state where it can bind to them if necessary.
Cofactors are proteins that modulate the effects of transcription factors. Cofactors are interchangeable between specific gene promoters; the protein complex that occupies the promoter DNA and the amino acid sequence of the cofactor determine its spatial conformation. For example, certain steroid receptors can exchange cofactors with
NF-κB, which is a switch between inflammation and cellular differentiation; thereby steroids can affect the inflammatory response and function of certain tissues.
Interaction with methylated cytosine
Transcription factors and methylated cytosines in DNA both have major roles in regulating gene expression. (Methylation of cytosine in DNA primarily occurs where cytosine is followed by guanine in the 5' to 3' DNA sequence, a
CpG site.) Methylation of CpG sites in a promoter region of a gene usually represses gene transcription,
while methylation of CpGs in the body of a gene increases expression.
TET enzymes play a central role in demethylation of methylated cytosines. Demethylation of CpGs in a gene promoter by TET enzyme activity increases transcription of the gene.
The
DNA binding sites of 519 transcription factors were evaluated.
Of these, 169 transcription factors (33%) did not have CpG dinucleotides in their binding sites, and 33 transcription factors (6%) could bind to a CpG-containing motif but did not display a preference for a binding site with either a methylated or unmethylated CpG. There were 117 transcription factors (23%) that were inhibited from binding to their binding sequence if it contained a methylated CpG site, 175 transcription factors (34%) that had enhanced binding if their binding sequence had a methylated CpG site, and 25 transcription factors (5%) were either inhibited or had enhanced binding depending on where in the binding sequence the methylated CpG was located.
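As a quick consistency check on the figures above, the short sketch below sums the five categories and recomputes each percentage from the counts quoted in the text. The category labels are paraphrases introduced here for readability.

```python
# Counts quoted in the text for the 519 evaluated transcription factors.
counts = {
    "no CpG in binding site": 169,
    "CpG present, no methylation preference": 33,
    "binding inhibited by methylated CpG": 117,
    "binding enhanced by methylated CpG": 175,
    "effect depends on CpG position": 25,
}

total = sum(counts.values())
percentages = {label: round(100 * n / total) for label, n in counts.items()}

if __name__ == "__main__":
    print("total:", total)
    for label, pct in percentages.items():
        print(f"{label}: {pct}%")
```

The categories add up to the 519 factors evaluated, and the rounded percentages match those given in the text.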
TET enzymes do not specifically bind to methylcytosine except when recruited (see
DNA demethylation). Multiple transcription factors important in cell differentiation and lineage specification, including
NANOG,
SALL4A,
WT1,
EBF1
,
PU.1, and
E2A, have been shown to recruit TET enzymes to specific genomic loci (primarily enhancers) to act on methylcytosine (mC) and convert it to hydroxymethylcytosine (hmC), in most cases marking these sites for subsequent complete demethylation to cytosine.
TET-mediated conversion of mC to hmC appears to disrupt the binding of 5mC-binding proteins including
MECP2
and MBD (
Methyl-CpG-binding domain) proteins, facilitating nucleosome remodeling and the binding of transcription factors, thereby activating transcription of those genes.
EGR1 is an important transcription factor in
memory formation. It has an essential role in
brain neuron epigenetic
reprogramming. The transcription factor
EGR1 recruits the
TET1 protein that initiates a pathway of
DNA demethylation. EGR1, together with TET1, is employed in programming the distribution of methylation sites on brain DNA during brain development and in
learning
(see
Epigenetics in learning and memory).
Structure

Transcription factors are modular in structure and contain the following
domains:
*
DNA-binding domain (DBD), which attaches to specific sequences of DNA (
enhancer or
promoter sequences) adjacent to regulated genes. DNA sequences that bind transcription factors are often referred to as
response elements.
*
Activation domain (AD), which contains binding sites for other proteins such as
transcription coregulators. These binding sites are frequently referred to as activation functions (AFs), transactivation domains (TADs), or trans-activating domains (not to be confused with topologically associating domains, which are also abbreviated TAD).
* An optional signal-sensing domain (SSD) (''e.g.'', a ligand-binding domain), which senses external signals and, in response, transmits these signals to the rest of the transcription complex, resulting in up- or down-regulation of gene expression. Also, the DBD and signal-sensing domains may reside on separate proteins that associate within the transcription complex to regulate gene expression.
DNA-binding domain

The portion (
domain) of the transcription factor that binds DNA is called its DNA-binding domain. Below is a partial list of some of the major families of DNA-binding domains/transcription factors:
Response elements
The DNA sequence that a transcription factor binds to is called a
transcription factor-binding site or
response element.
Transcription factors interact with their binding sites using a combination of
electrostatic (of which
hydrogen bond
s are a special case) and
Van der Waals forces. Due to the nature of these chemical interactions, most transcription factors bind DNA in a sequence specific manner. However, not all
bases in the transcription factor-binding site may actually interact with the transcription factor. In addition, some of these interactions may be weaker than others. Thus, transcription factors do not bind just one sequence but are capable of binding a subset of closely related sequences, each with a different strength of interaction.
For example, although the
consensus binding site for the
TATA-binding protein (TBP) is TATAAAA, the TBP transcription factor can also bind similar sequences such as TATATAT or TATATAA.
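One simple way to picture binding to "closely related sequences" is to scan a DNA string for windows that differ from the consensus at only a few positions. The sketch below does this with a plain mismatch count (Hamming distance). This is a simplification chosen for illustration: real binding strength is not a pure function of mismatch count, and the function names and the `max_mismatches` threshold are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def find_sites(sequence: str, consensus: str, max_mismatches: int):
    """Return (start, window, mismatches) for each window close to the consensus."""
    sequence = sequence.upper()
    k = len(consensus)
    hits = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        mismatches = hamming(window, consensus)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


if __name__ == "__main__":
    consensus = "TATAAAA"
    for variant in ("TATATAT", "TATATAA"):
        print(variant, "differs from the consensus at", hamming(consensus, variant), "positions")
    print(find_sites("GGGTATAAAAGGG", consensus, max_mismatches=1))
```

The two variants named in the text differ from the consensus at two positions and one position, so a scan that tolerates a couple of mismatches would report all three as candidate sites.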
Because transcription factors can bind a set of related sequences and these sequences tend to be short, potential transcription factor binding sites can occur by chance if the DNA sequence is long enough. It is unlikely, however, that a transcription factor will bind all compatible sequences in the
genome of the
cell. Other constraints, such as DNA accessibility in the cell or availability of
cofactors may also help dictate where a transcription factor will actually bind. Thus, given the genome sequence, it is still difficult to predict where a transcription factor will actually bind in a living cell.
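The point that short sites occur by chance in long sequences can be made concrete with a back-of-the-envelope estimate. The sketch below assumes a random sequence with independent, equally frequent bases, which real genomes are not, and counts one strand only. The genome length used in the example is an assumed round figure, not a value from the text.

```python
from math import comb


def n_sequences_within(k: int, max_mismatches: int) -> int:
    """Count DNA k-mers within a given Hamming distance of one fixed k-mer."""
    return sum(comb(k, j) * 3 ** j for j in range(max_mismatches + 1))


def expected_chance_matches(seq_len: int, k: int, max_mismatches: int = 0) -> float:
    """Expected matches on one strand of a random, uniform-base sequence."""
    windows = max(seq_len - k + 1, 0)
    p_match = n_sequences_within(k, max_mismatches) * 0.25 ** k
    return windows * p_match


if __name__ == "__main__":
    GENOME_LEN = 3_100_000_000  # ASSUMPTION: rough human genome length in base pairs
    for d in (0, 1):
        expected = expected_chance_matches(GENOME_LEN, 7, d)
        print(f"7-mer, up to {d} mismatch(es): about {expected:,.0f} chance sites")
```

Under these assumptions a 7-base site is expected about once every 16,384 positions (4 to the 7th power), so a genome-scale sequence contains far more compatible sequences than sites that are actually bound.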
Additional recognition specificity, however, may be obtained through the use of more than one DNA-binding domain (for example tandem DBDs in the same transcription factor or through dimerization of two transcription factors) that bind to two or more adjacent sequences of DNA.
Clinical significance
Transcription factors are of clinical significance for at least two reasons: (1) mutations can be associated with specific diseases, and (2) they can be targets of medications.
Disorders
Due to their important roles in development, intercellular signaling, and cell cycle, some human diseases have been associated with
mutations in transcription factors.
Many transcription factors are either
tumor suppressors or
oncogenes, and, thus, mutations in them or their aberrant regulation are associated with cancer. Three groups of transcription factors are known to be important in human cancer: (1) the
NF-kappaB and
AP-1 families, (2) the
STAT family and (3) the
steroid receptors
.
Below are a few of the better-studied examples:
Potential drug targets
Approximately 10% of currently prescribed drugs directly target the
nuclear receptor
class of transcription factors.
Examples include
tamoxifen and
bicalutamide for the treatment of
breast and
prostate cancer
, respectively, and various types of
anti-inflammatory and
anabolic steroid
s. In addition, transcription factors are often indirectly modulated by drugs through
signaling cascade
s. It might be possible to directly target other less-explored transcription factors such as
NF-κB with drugs.
Transcription factors outside the nuclear receptor family are thought to be more difficult to target with
small molecule therapeutics since it is not clear that they are
"druggable", but progress has been made on Pax2
and the
notch pathway.
Role in evolution
Gene duplications have played a crucial role in the
evolution of species. This applies particularly to transcription factors. Once a transcription factor gene has been duplicated, mutations can accumulate in one copy without negatively affecting the regulation of downstream targets. However, changes of the DNA binding specificities of the single-copy
Leafy transcription factor, which occurs in most land plants, have recently been elucidated. In that respect, a single-copy transcription factor can undergo a change of specificity through a promiscuous intermediate without losing function. Similar mechanisms have been proposed in the context of all alternative
phylogenetic hypotheses, and the role of transcription factors in the evolution of all species.
Role in biocontrol activity
Transcription factors have a role in
resistance activity, which is important for successful
biocontrol activity. In ''
Papiliotrema terrestris'' LS28, resistance to
oxidative stress and alkaline pH sensing depend on the transcription factors Yap1 and Rim101. Molecular tools have revealed the genetic mechanisms underlying this biocontrol activity, which supports
disease management programs based on biological and integrated control.
Analysis
There are different technologies available to analyze transcription factors. On the
genomic
level, DNA-
sequencing
and database research are commonly used.
The protein version of the transcription factor is detectable by using specific
antibodies
. The sample is detected on a
western blot. By using
electrophoretic mobility shift assay (EMSA),
the activation profile of transcription factors can be detected. A
multiplex approach for activation profiling is a TF chip system where several different transcription factors can be detected in parallel.
The most commonly used method for identifying transcription factor binding sites is
chromatin immunoprecipitation
(ChIP). This technique relies on chemical fixation of chromatin with
formaldehyde, followed by co-precipitation of DNA and the transcription factor of interest using an
antibody that specifically targets that protein. The DNA sequences can then be identified by microarray or high-throughput sequencing (
ChIP-seq) to determine transcription factor binding sites. If no antibody is available for the protein of interest,
DamID may be a convenient alternative.
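The idea behind reading binding sites out of ChIP-seq data can be illustrated with a toy calculation: regions where reads from the immunoprecipitated sample pile up relative to a control sample are candidate binding sites. The sketch below is only an illustration under simplifying assumptions. The function names, the fixed 100 bp windows, the fold-enrichment threshold, the pseudocount, and the read positions are all hypothetical, and real peak callers rely on statistical models rather than a fixed ratio.

```python
from collections import Counter


def window_counts(read_starts, window):
    """Count reads per fixed-size window, keyed by window index."""
    return Counter(pos // window for pos in read_starts)


def enriched_windows(chip_starts, control_starts, window=100,
                     min_fold=4.0, pseudocount=1.0):
    """Return (start, end, fold) for windows enriched in ChIP over control.

    Toy criterion (an assumption, not a published method):
    (chip + pseudocount) / (control + pseudocount) >= min_fold.
    """
    chip = window_counts(chip_starts, window)
    control = window_counts(control_starts, window)
    hits = []
    for idx in sorted(chip):
        # Counter returns 0 for windows with no control reads.
        fold = (chip[idx] + pseudocount) / (control[idx] + pseudocount)
        if fold >= min_fold:
            hits.append((idx * window, (idx + 1) * window, fold))
    return hits


# Made-up read start positions on a single chromosome.
chip_reads = [105, 110, 120, 130, 150, 160, 170, 180, 480, 910]
control_reads = [20, 130, 470, 900]

peaks = enriched_windows(chip_reads, control_reads)
for start, end, fold in peaks:
    print(f"{start}-{end}: fold enrichment {fold:.1f}")
```

With these made-up numbers, only the window covering positions 100 to 200 passes the threshold, because eight ChIP reads fall there against a single control read.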
Classes
As described in more detail below, transcription factors may be classified by their (1) mechanism of action, (2) regulatory function, or (3) sequence homology (and hence structural similarity) in their DNA-binding domains.
Mechanistic
There are two mechanistic classes of transcription factors:
* General transcription factors are involved in the formation of a preinitiation complex. The most common are abbreviated as TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH. They are ubiquitous and interact with the core promoter region surrounding the transcription start site(s) of all class II genes.
* Upstream transcription factors are proteins that bind somewhere upstream of the initiation site to stimulate or repress transcription. These are roughly synonymous with specific transcription factors, because they vary considerably depending on what recognition sequences are present in the proximity of the gene.
Functional
Transcription factors have been classified according to their regulatory function:
* I. constitutively active – present in all cells at all times – general transcription factors, Sp1, NF1, CCAAT
* II. conditionally active – requires activation
** II.A developmental (cell specific) – expression is tightly controlled, but, once expressed, requires no additional activation – GATA, HNF, PIT-1, MyoD, Myf5, Hox, Winged Helix
** II.B signal-dependent – requires external signal for activation
*** II.B.1 extracellular ligand (endocrine or paracrine)-dependent – nuclear receptors
*** II.B.2 intracellular ligand (autocrine)-dependent – activated by small intracellular molecules – SREBP, p53, orphan nuclear receptors
*** II.B.3 cell membrane receptor-dependent – second messenger signaling cascades resulting in the phosphorylation of the transcription factor
**** II.B.3.a resident nuclear factors – reside in the nucleus regardless of activation state – CREB, AP-1, Mef2
**** II.B.3.b latent cytoplasmic factors – inactive form resides in the cytoplasm, but, when activated, is translocated into the nucleus – STAT, R-SMAD, NF-κB, Notch, TUBBY, NFAT
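The functional scheme above behaves like a small decision tree, and it can be sketched in code. The following is a minimal illustration, not an established tool: the `TFProfile` fields, the signal labels, and the `functional_class` function are hypothetical names chosen for this example, and the assignments for Sp1, MyoD, CREB, and STAT simply follow the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TFProfile:
    """Hypothetical description of how a transcription factor becomes active."""
    constitutive: bool                       # present and active in all cells
    needs_signal: bool = False               # requires an external signal
    signal: Optional[str] = None             # "extracellular_ligand",
                                             # "intracellular_ligand",
                                             # or "membrane_receptor"
    nuclear_resident: Optional[bool] = None  # used only for membrane_receptor


def functional_class(tf: TFProfile) -> str:
    """Map a profile to a class label from the functional list."""
    if tf.constitutive:
        return "I"
    if not tf.needs_signal:
        return "II.A"        # developmental, no further activation needed
    if tf.signal == "extracellular_ligand":
        return "II.B.1"
    if tf.signal == "intracellular_ligand":
        return "II.B.2"
    if tf.signal == "membrane_receptor":
        if tf.nuclear_resident is None:
            return "II.B.3"  # location unknown, stop at the parent class
        return "II.B.3.a" if tf.nuclear_resident else "II.B.3.b"
    return "II.B"            # signal-dependent, signal type unspecified


examples = {
    "Sp1": TFProfile(constitutive=True),
    "MyoD": TFProfile(constitutive=False),
    "CREB": TFProfile(False, True, "membrane_receptor", True),
    "STAT": TFProfile(False, True, "membrane_receptor", False),
}
for name, profile in examples.items():
    print(name, functional_class(profile))
```

The function falls back to the nearest parent class when a detail is missing, which mirrors how the hierarchy is nested.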
Structural
Transcription factors are often classified based on the sequence similarity and hence the tertiary structure of their DNA-binding domains:
*1 Superclass: Basic Domains
**1.1 Class: Leucine zipper factors (bZIP)
***1.1.1 Family: AP-1(-like) components; includes c-Fos/c-Jun
***1.1.2 Family: CREB
***1.1.3 Family: C/EBP-like factors
***1.1.4 Family: bZIP / PAR
***1.1.5 Family: Plant G-box binding factors
***1.1.6 Family: ZIP only
**1.2 Class: Helix-loop-helix factors (bHLH)
***1.2.1 Family: Ubiquitous (class A) factors
***1.2.2 Family: Myogenic transcription factors (MyoD)
***1.2.3 Family: Achaete-Scute
***1.2.4 Family: Tal/Twist/Atonal/Hen
**1.3 Class: Helix-loop-helix / leucine zipper factors (bHLH-ZIP)
***1.3.1 Family: Ubiquitous bHLH-ZIP factors; includes USF (USF1, USF2); SREBP
***1.3.2 Family: Cell-cycle controlling factors; includes c-Myc
**1.4 Class: NF-1
***1.4.1 Family: NF-1 (A, B, C, X)
**1.5 Class: RF-X
***1.5.1 Family: RF-X (1, 2, 3, 4, 5, ANK)
**1.6 Class: bHSH
*2 Superclass: Zinc-coordinating DNA-binding domains
**2.1 Class: Cys4 zinc finger of nuclear receptor type
***2.1.1 Family: Steroid hormone receptors
***2.1.2 Family: Thyroid hormone receptor-like factors
**2.2 Class: diverse Cys4 zinc fingers
***2.2.1 Family: GATA-Factors
**2.3 Class: Cys2His2 zinc finger domain
***2.3.1 Family: Ubiquitous factors, includes TFIIIA, Sp1
***2.3.2 Family: Developmental / cell cycle regulators; includes Krüppel
***2.3.4 Family: Large factors with NF-6B-like binding properties
**2.4 Class: Cys6 cysteine-zinc cluster
**2.5 Class: Zinc fingers of alternating composition
*3 Superclass: Helix-turn-helix
**3.1 Class: Homeo domain
***3.1.1 Family: Homeo domain only; includes Ubx
***3.1.2 Family: POU domain factors; includes Oct
***3.1.3 Family: Homeo domain with LIM region
***3.1.4 Family: homeo domain plus zinc finger motifs
**3.2 Class: Paired box
***3.2.1 Family: Paired plus homeo domain
***3.2.2 Family: Paired domain only
**3.3 Class: Fork head / winged helix
***3.3.1 Family: Developmental regulators; includes forkhead
***3.3.2 Family: Tissue-specific regulators
***3.3.3 Family: Cell-cycle controlling factors
***3.3.0 Family: Other regulators
**3.4 Class: Heat Shock Factors
***3.4.1 Family: HSF
**3.5 Class: Tryptophan clusters
***3.5.1 Family: Myb
***3.5.2 Family: Ets-type
***3.5.3 Family: Interferon regulatory factors
**3.6 Class: TEA (transcriptional enhancer factor) domain
***3.6.1 Family: TEA (TEAD1, TEAD2, TEAD3, TEAD4)
*4 Superclass: beta-Scaffold Factors with Minor Groove Contacts
**4.1 Class: RHR (Rel homology region)
***4.1.1 Family: Rel/ankyrin; NF-kappaB
***4.1.2 Family: ankyrin only
***4.1.3 Family: NFAT (Nuclear Factor of Activated T-cells) (NFATC1, NFATC2, NFATC3)
**4.2 Class: STAT
***4.2.1 Family: STAT
**4.3 Class: p53
***4.3.1 Family: p53
**4.4 Class: MADS box
***4.4.1 Family: Regulators of differentiation; includes Mef2
***4.4.2 Family: Responders to external signals, SRF (serum response factor)
***4.4.3 Family: Metabolic regulators (ARG80)
**4.5 Class: beta-Barrel alpha-helix transcription factors
**4.6 Class: TATA binding proteins
***4.6.1 Family: TBP
**4.7 Class: HMG-box
***4.7.1 Family: SOX genes, SRY
***4.7.2 Family: TCF-1 (TCF1)
***4.7.3 Family: HMG2-related, SSRP1
***4.7.4 Family: UBF
***4.7.5 Family: MATA
**4.8 Class: Heteromeric CCAAT factors
***4.8.1 Family: Heteromeric CCAAT factors
**4.9 Class: Grainyhead
***4.9.1 Family: Grainyhead
**4.10 Class: Cold-shock domain factors
***4.10.1 Family: csd
**4.11 Class: Runt
***4.11.1 Family: Runt
*0 Superclass: Other Transcription Factors
**0.1 Class: Copper fist proteins
**0.2 Class: HMGI(Y) (HMGA1)
***0.2.1 Family: HMGI(Y)
**0.3 Class: Pocket domain
**0.4 Class: E1A-like factors
**0.5 Class: AP2/EREBP-related factors
***0.5.1 Family: AP2
***0.5.2 Family: EREBP
***0.5.3 Superfamily: AP2/B3
****0.5.3.1 Family: ARF
****0.5.3.2 Family: ABI
****0.5.3.3 Family: RAV
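Because the identifiers in the structural list are dotted and nested, each entry's parents can be recovered from its number alone. The sketch below shows one way to do that. The helper names `ancestors` and `lineage` and the small `NAMES` table are hypothetical, and the table holds only a few entries copied from the list above rather than the full classification.

```python
# A few identifiers and names copied from the structural list (not exhaustive).
NAMES = {
    "1": "Basic Domains",
    "1.1": "Leucine zipper factors (bZIP)",
    "1.1.1": "AP-1(-like) components",
    "3": "Helix-turn-helix",
    "3.1": "Homeo domain",
    "3.1.2": "POU domain factors",
}


def ancestors(identifier):
    """Return the identifier's own prefix chain, outermost level first."""
    parts = identifier.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def lineage(identifier, names=NAMES):
    """Pair each level of the chain with its name, or 'unknown' if not listed."""
    return [(level, names.get(level, "unknown")) for level in ancestors(identifier)]


for level, name in lineage("3.1.2"):
    print(level, name)
```

Depth alone does not fix the rank name: in the list above, 0.5.3 is labeled a superfamily and its children are families, so the sketch reports identifiers and names without inferring ranks.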
See also
* Cdx protein family
* DNA-binding protein
* Inhibitor of DNA-binding protein
* Mapper(2)
* Nuclear receptor, a class of ligand activated transcription factors
* Open Regulatory Annotation Database
* Phylogenetic footprinting
* TRANSFAC database
* YeTFaSCo
Further reading
* Carretero-Paulet, Lorenzo; Galstyan, Anahit; Roig-Villanova, Irma; Martínez-García, Jaime F.; Bilbao-Castro, Jose R. (July 2010). "Genome-Wide Classification and Evolutionary Analysis of the bHLH Family of Transcription Factors in Arabidopsis, Poplar, Rice, Moss, and Algae". ''Plant Physiology''. 153 (3): 1398–1412. doi:10.1104/pp.110.153593
External links
* Transcription factor database
* Plant Transcription Factor Database and Transcriptional Regulation Data and Analysis Platform