Upstream Binding Factor

In molecular biology, a transcription factor (TF) (or sequence-specific DNA-binding factor) is a protein that controls the rate of transcription of genetic information from DNA to messenger RNA by binding to a specific DNA sequence. The function of TFs is to regulate genes, turning them on and off, so that they are expressed in the desired cells at the right time and in the right amount throughout the life of the cell and the organism. Groups of TFs function in a coordinated fashion to direct cell division, cell growth, and cell death throughout life; cell migration and organization (body plan) during embryonic development; and, intermittently, responses to signals from outside the cell, such as a hormone. There are approximately 1600 TFs in the human genome. Transcription factors are members of the proteome as well as the regulome.

TFs work alone or with other proteins in a complex, by promoting (as an activator) or blocking (as a repressor) the recruitment of RNA polymerase (the enzyme that performs the transcription of genetic information from DNA to RNA) to specific genes.

A defining feature of TFs is that they contain at least one DNA-binding domain (DBD), which attaches to a specific sequence of DNA adjacent to the genes that they regulate. TFs are grouped into classes based on their DBDs. Other proteins such as coactivators, chromatin remodelers, histone acetyltransferases, histone deacetylases, kinases, and methylases are also essential to gene regulation, but lack DNA-binding domains, and therefore are not TFs.

TFs are of interest in medicine because TF mutations can cause specific diseases, and medications can potentially be targeted toward them.


Number

Transcription factors are essential for the regulation of gene expression and are, as a consequence, found in all living organisms. The number of transcription factors found within an organism increases with genome size, and larger genomes tend to have more transcription factors per gene.

There are approximately 2800 proteins in the human genome that contain DNA-binding domains, and 1600 of these are presumed to function as transcription factors, though other studies indicate a smaller number. Therefore, approximately 10% of genes in the genome code for transcription factors, which makes this family the single largest family of human proteins. Furthermore, genes are often flanked by several binding sites for distinct transcription factors, and efficient expression of each of these genes requires the cooperative action of several different transcription factors (see, for example, hepatocyte nuclear factors). Hence, the combinatorial use of a subset of the approximately 2000 human transcription factors easily accounts for the unique regulation of each gene in the human genome during development.
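
To give a rough sense of scale for the combinatorial argument, the following Python sketch counts how many distinct unordered sets of a few factors can be drawn from a pool of about 2000. The pool size is the figure quoted in the text; the set sizes (2 to 4) and the function name are illustrative assumptions, and real regulation is not a uniform draw from the pool.

```python
import math


def distinct_combinations(pool_size, set_size):
    """Number of unordered sets of `set_size` factors drawn from `pool_size` factors."""
    return math.comb(pool_size, set_size)


if __name__ == "__main__":
    POOL = 2000  # approximate number of human TFs quoted in the text
    for k in (2, 3, 4):  # hypothetical numbers of cooperating factors per gene
        print(k, distinct_combinations(POOL, k))
```

Even sets of three factors already give more than a billion possible combinations, which is the sense in which a limited pool of factors can specify many distinct regulatory programs.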


Mechanism

Transcription factors bind to either enhancer or promoter regions of DNA adjacent to the genes that they regulate, recognizing specific DNA motifs. Depending on the transcription factor, the transcription of the adjacent gene is either up- or down-regulated. Transcription factors use a variety of mechanisms to regulate gene expression. These mechanisms include:
* stabilizing or blocking the binding of RNA polymerase to DNA
* catalyzing the acetylation or deacetylation of histone proteins. The transcription factor can either do this directly or recruit other proteins with this catalytic activity. Many transcription factors use one or the other of two opposing mechanisms to regulate transcription:
** histone acetyltransferase (HAT) activity – acetylates histone proteins, which weakens the association of DNA with histones and makes the DNA more accessible to transcription, thereby up-regulating transcription
** histone deacetylase (HDAC) activity – deacetylates histone proteins, which strengthens the association of DNA with histones and makes the DNA less accessible to transcription, thereby down-regulating transcription
* recruiting coactivator or corepressor proteins to the transcription factor-DNA complex
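
Motif recognition can be illustrated with a small sequence scan. The Python sketch below is only a toy, assuming a binding site can be written as a consensus string in which `N` stands for any base; real binding preferences are graded rather than all-or-nothing. The function names and the example motif and sequences are hypothetical.

```python
import re

# Regex fragment for each supported symbol; N matches any base.
SYMBOLS = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(sequence, consensus):
    """Return sorted (position, strand) pairs where the consensus matches.

    Positions are 0-based on the given (forward) sequence. A "-" hit means
    the reverse complement of the consensus matches at that position.
    """
    hits = []
    for strand, motif in (("+", consensus), ("-", reverse_complement(consensus))):
        # Lookahead so that overlapping matches are also reported.
        pattern = "(?=" + "".join(SYMBOLS[base] for base in motif) + ")"
        for match in re.finditer(pattern, sequence):
            hits.append((match.start(), strand))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical promoter fragment and illustrative motif.
    print(find_motif("GGTATAAAGGCC", "TATAAA"))
```

Scanning both strands matters because a factor can bind a site in either orientation; a palindromic consensus is simply reported once per strand at the same position.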


Function

Transcription factors are one of the groups of proteins that read and interpret the genetic "blueprint" in the DNA. They bind to the DNA and help initiate a program of increased or decreased gene transcription. As such, they are vital for many important cellular processes. Below are some of the important functions and biological roles transcription factors are involved in:


Basal transcriptional regulation

In eukaryotes, an important class of transcription factors called general transcription factors (GTFs) is necessary for transcription to occur. Many of these GTFs do not actually bind DNA, but rather are part of the large transcription preinitiation complex that interacts with RNA polymerase directly. The most common GTFs are TFIIA, TFIIB, TFIID (see also TATA binding protein), TFIIE, TFIIF, and TFIIH. The preinitiation complex binds to promoter regions of DNA upstream of the gene that it regulates.


Differential enhancement of transcription

Other transcription factors differentially regulate the expression of various genes by binding to enhancer regions of DNA adjacent to regulated genes. These transcription factors are critical to making sure that genes are expressed in the right cell at the right time and in the right amount, depending on the changing requirements of the organism.


Development

Many transcription factors in multicellular organisms are involved in development. Responding to stimuli, these transcription factors turn on or off the transcription of the appropriate genes, which, in turn, allows for the changes in cell morphology or activities needed for cell fate determination and cellular differentiation. The Hox transcription factor family, for example, is important for proper body pattern formation in organisms ranging from fruit flies to humans. Another example is the transcription factor encoded by the sex-determining region Y (SRY) gene, which plays a major role in determining sex in humans.


Response to intercellular signals

Cells can communicate with each other by releasing molecules that produce signaling cascades within another receptive cell. If the signal requires upregulation or downregulation of genes in the recipient cell, often transcription factors will be downstream in the signaling cascade. Estrogen signaling is an example of a fairly short signaling cascade that involves the estrogen receptor transcription factor: estrogen is secreted by tissues such as the ovaries and placenta, crosses the cell membrane of the recipient cell, and is bound by the estrogen receptor in the cell's cytoplasm. The estrogen receptor then goes to the cell's nucleus and binds to its DNA-binding sites, changing the transcriptional regulation of the associated genes.


Response to environment

Not only do transcription factors act downstream of signaling cascades related to biological stimuli, but they can also be downstream of signaling cascades involved in environmental stimuli. Examples include heat shock factor (HSF), which upregulates genes necessary for survival at higher temperatures; hypoxia inducible factor (HIF), which upregulates genes necessary for cell survival in low-oxygen environments; and sterol regulatory element binding protein (SREBP), which helps maintain proper lipid levels in the cell.


Cell cycle control

Many transcription factors, especially some that are proto-oncogenes or tumor suppressors, help regulate the cell cycle and as such determine how large a cell will get and when it can divide into two daughter cells. One example is the Myc oncogene, which has important roles in cell growth and apoptosis.


Pathogenesis

Transcription factors can also be used to alter gene expression in a host cell to promote pathogenesis. A well-studied example is the transcription activator-like effectors (TAL effectors) secreted by Xanthomonas bacteria. When injected into plants, these proteins can enter the nucleus of the plant cell, bind plant promoter sequences, and activate transcription of plant genes that aid in bacterial infection. TAL effectors contain a central repeat region in which there is a simple relationship between the identity of two critical residues in sequential repeats and sequential DNA bases in the TAL effector's target site. This property likely makes it easier for these proteins to evolve in order to better compete with the defense mechanisms of the host cell.
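
The "simple relationship" can be pictured as a lookup table. The Python sketch below assumes the commonly reported pairing between the two critical residues of each repeat (often called the repeat-variable diresidue, RVD) and a DNA base. That table is not given in the text above, so treat it as an assumption; NN is simplified here to G although it is also reported to recognize A. The function name and the example repeat array are hypothetical.

```python
# Assumed pairing of repeat residues (RVDs) with the base each repeat prefers.
# NN is simplified to G; it is also reported to recognize A.
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def predict_target(rvds):
    """Translate an ordered list of RVDs into the predicted DNA target, one base per repeat."""
    try:
        return "".join(RVD_TO_BASE[rvd] for rvd in rvds)
    except KeyError as err:
        raise ValueError(f"no base assigned to RVD {err.args[0]!r} in this sketch") from None


if __name__ == "__main__":
    # Hypothetical repeat array, not taken from a real effector.
    print(predict_target(["NI", "HD", "NG", "NN", "HD"]))
```

Because each repeat contributes one base independently, changing a single repeat changes a single position of the target, which is one way to see why such proteins could adapt quickly.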


Regulation

It is common in biology for important processes to have multiple layers of regulation and control. This is also true with transcription factors: Not only do transcription factors control the rates of transcription to regulate the amounts of gene products (RNA and protein) available to the cell but transcription factors themselves are regulated (often by other transcription factors). Below is a brief synopsis of some of the ways that the activity of transcription factors can be regulated:


Synthesis

Transcription factors (like all proteins) are transcribed from a gene on a chromosome into RNA, and then the RNA is translated into protein. Any of these steps can be regulated to affect the production (and thus activity) of a transcription factor. An implication of this is that transcription factors can regulate themselves. For example, in a negative feedback loop, the transcription factor acts as its own repressor: if the transcription factor protein binds the DNA of its own gene, it down-regulates the production of more of itself. This is one mechanism to maintain low levels of a transcription factor in a cell.
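
The effect of such a negative feedback loop can be sketched numerically. The Python example below assumes a textbook-style toy model in which production is repressed by the factor's own concentration through a Hill-type term and the protein is removed at a constant rate. The equation, the parameter values, and the function name are illustrative assumptions, not measurements.

```python
def simulate_level(beta=10.0, alpha=1.0, K=1.0, n=2,
                   autoregulated=True, dt=0.01, steps=5000):
    """Euler-integrate dx/dt = production(x) - alpha * x, starting from x = 0.

    beta:  maximal production rate (hypothetical units)
    alpha: removal (degradation/dilution) rate
    K, n:  repression threshold and steepness of the assumed Hill term
    """
    x = 0.0
    for _ in range(steps):
        if autoregulated:
            # The factor represses its own gene: more x means less production.
            production = beta / (1.0 + (x / K) ** n)
        else:
            production = beta
        x += dt * (production - alpha * x)
    return x


if __name__ == "__main__":
    print("without feedback:", round(simulate_level(autoregulated=False), 3))
    print("with feedback:   ", round(simulate_level(autoregulated=True), 3))
```

With these made-up parameters the self-repressed factor settles at a clearly lower level than the unregulated one, consistent with the point that negative feedback keeps the concentration low.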


Nuclear localization

In eukaryotes, transcription factors (like most proteins) are transcribed in the nucleus but are then translated in the cell's cytoplasm. Many proteins that are active in the nucleus contain nuclear localization signals that direct them to the nucleus. For many transcription factors, however, this import step is a key point in their regulation. Important classes of transcription factors such as some nuclear receptors must first bind a ligand while in the cytoplasm before they can relocate to the nucleus.


Activation

Transcription factors may be activated (or deactivated) through their signal-sensing domain by a number of mechanisms including:
* ligand binding – Not only is ligand binding able to influence where a transcription factor is located within a cell, but ligand binding can also affect whether the transcription factor is in an active state and capable of binding DNA or other cofactors (see, for example, nuclear receptors).
* phosphorylation – Many transcription factors such as STAT proteins must be phosphorylated before they can bind DNA.
* interaction with other transcription factors (''e.g.'', homo- or hetero-dimerization) or coregulatory proteins


Accessibility of DNA-binding site

In eukaryotes, DNA is organized with the help of histones into compact particles called nucleosomes, where sequences of about 147 DNA base pairs make ~1.65 turns around histone protein octamers. DNA within nucleosomes is inaccessible to many transcription factors. Some transcription factors, so-called pioneer factors, are still able to bind their DNA binding sites on the nucleosomal DNA. For most other transcription factors, the nucleosome must be actively unwound by molecular motors such as chromatin remodelers. Alternatively, the nucleosome can be partially unwrapped by thermal fluctuations, allowing temporary access to the transcription factor binding site. In many cases, a transcription factor needs to compete for binding to its DNA binding site with other transcription factors and histones or non-histone chromatin proteins. Pairs of transcription factors and other proteins can play antagonistic roles (activator versus repressor) in the regulation of the same gene.


Availability of other cofactors/transcription factors

Most transcription factors do not work alone. Many large TF families form complex homotypic or heterotypic interactions through dimerization. For gene transcription to occur, a number of transcription factors must bind to DNA regulatory sequences. This collection of transcription factors, in turn, recruits intermediary proteins such as cofactors that allow efficient recruitment of the preinitiation complex and RNA polymerase. Thus, for a single transcription factor to initiate transcription, all of these other proteins must also be present, and the transcription factor must be in a state where it can bind to them if necessary.

Cofactors are proteins that modulate the effects of transcription factors. Cofactors are interchangeable between specific gene promoters; the protein complex that occupies the promoter DNA and the amino acid sequence of the cofactor determine its spatial conformation. For example, certain steroid receptors can exchange cofactors with NF-κB, which is a switch between inflammation and cellular differentiation; thereby steroids can affect the inflammatory response and function of certain tissues.


Interaction with methylated cytosine

Transcription factors and methylated cytosines in DNA both have major roles in regulating gene expression. (Methylation of cytosine in DNA primarily occurs where cytosine is followed by guanine in the 5' to 3' DNA sequence, a CpG site.) Methylation of CpG sites in a promoter region of a gene usually represses gene transcription, while methylation of CpGs in the body of a gene increases expression. TET enzymes play a central role in demethylation of methylated cytosines. Demethylation of CpGs in a gene promoter by TET enzyme activity increases transcription of the gene.

The DNA binding sites of 519 transcription factors were evaluated. Of these, 169 transcription factors (33%) did not have CpG dinucleotides in their binding sites, and 33 transcription factors (6%) could bind to a CpG-containing motif but did not display a preference for a binding site with either a methylated or unmethylated CpG. There were 117 transcription factors (23%) that were inhibited from binding to their binding sequence if it contained a methylated CpG site, 175 transcription factors (34%) that had enhanced binding if their binding sequence had a methylated CpG site, and 25 transcription factors (5%) that were either inhibited or had enhanced binding depending on where in the binding sequence the methylated CpG was located.

TET enzymes do not specifically bind to methylcytosine except when recruited (see DNA demethylation). Multiple transcription factors important in cell differentiation and lineage specification, including NANOG, SALL4A, WT1, EBF1, PU.1, and E2A, have been shown to recruit TET enzymes to specific genomic loci (primarily enhancers) to act on methylcytosine (mC) and convert it to hydroxymethylcytosine (hmC), in most cases marking these sites for subsequent complete demethylation to cytosine. TET-mediated conversion of mC to hmC appears to disrupt the binding of 5mC-binding proteins including MECP2 and MBD (methyl-CpG-binding domain) proteins, facilitating nucleosome remodeling and the binding of transcription factors, thereby activating transcription of those genes.

EGR1 is an important transcription factor in memory formation. It has an essential role in brain neuron epigenetic reprogramming. The transcription factor EGR1 recruits the TET1 protein, which initiates a pathway of DNA demethylation. EGR1, together with TET1, is employed in programming the distribution of methylation sites on brain DNA during brain development and in learning (see Epigenetics in learning and memory).
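The counts reported earlier in this section for the 519 evaluated transcription factors can be checked with a few lines of arithmetic. The sketch below is only an illustration: the category labels are shorthand chosen here, and the helper name `percentages` is hypothetical. It confirms that the five groups add up to 519 and that the quoted percentages are the counts rounded to the nearest whole percent.

```python
# Counts are copied from the text above; the labels are shorthand, not
# terminology from the underlying study.
counts = {
    "no CpG in binding site": 169,
    "CpG present, no methylation preference": 33,
    "binding inhibited by methylated CpG": 117,
    "binding enhanced by methylated CpG": 175,
    "inhibited or enhanced depending on CpG position": 25,
}

total = sum(counts.values())


def percentages(category_counts):
    """Return each category's share of the total, rounded to a whole percent."""
    n_total = sum(category_counts.values())
    return {label: round(100 * n / n_total) for label, n in category_counts.items()}


if __name__ == "__main__":
    print(f"total transcription factors: {total}")
    for label, pct in percentages(counts).items():
        print(f"{label}: {counts[label]} ({pct}%)")
```

Because each share is rounded independently, the rounded percentages need not sum to exactly 100.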


Structure

Transcription factors are modular in structure and contain the following domains:
* DNA-binding domain (DBD), which attaches to specific sequences of DNA (enhancer or promoter sequences) adjacent to regulated genes. DNA sequences that bind transcription factors are often referred to as response elements.
* Activation domain (AD), which contains binding sites for other proteins such as transcription coregulators. These binding sites are frequently referred to as activation functions (AFs), transactivation domains (TADs), or trans-activating domains (TADs), not to be confused with topologically associating domains (also abbreviated TADs).
* An optional signal-sensing domain (SSD) (''e.g.'', a ligand-binding domain), which senses external signals and, in response, transmits these signals to the rest of the transcription complex, resulting in up- or down-regulation of gene expression. Also, the DBD and signal-sensing domains may reside on separate proteins that associate within the transcription complex to regulate gene expression.


DNA-binding domain

The portion (domain) of the transcription factor that binds DNA is called its DNA-binding domain. Below is a partial list of some of the major families of DNA-binding domains/transcription factors:


Response elements

The DNA sequence that a transcription factor binds to is called a transcription factor-binding site or response element. Transcription factors interact with their binding sites using a combination of electrostatic forces (of which hydrogen bonds are a special case) and van der Waals forces. Due to the nature of these chemical interactions, most transcription factors bind DNA in a sequence-specific manner. However, not all bases in the transcription factor-binding site may actually interact with the transcription factor. In addition, some of these interactions may be weaker than others. Thus, transcription factors do not bind just one sequence but are capable of binding a subset of closely related sequences, each with a different strength of interaction. For example, although the consensus binding site for the TATA-binding protein (TBP) is TATAAAA, the TBP transcription factor can also bind similar sequences such as TATATAT or TATATAA.

Because transcription factors can bind a set of related sequences and these sequences tend to be short, potential transcription factor binding sites can occur by chance if the DNA sequence is long enough. It is unlikely, however, that a transcription factor will bind all compatible sequences in the genome of the cell. Other constraints, such as DNA accessibility in the cell or availability of cofactors, may also help dictate where a transcription factor will actually bind. Thus, given the genome sequence, it is still difficult to predict where a transcription factor will actually bind in a living cell. Additional recognition specificity, however, may be obtained through the use of more than one DNA-binding domain (for example, tandem DBDs in the same transcription factor, or dimerization of two transcription factors) that bind to two or more adjacent sequences of DNA.
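The point that short, degenerate binding sites turn up by chance can be illustrated with a small sketch. It rests on simplifying assumptions that are not from the text: a mismatch count against the consensus stands in for binding strength (real analyses typically use position weight matrices), and the chance estimate treats bases as independent and equally frequent. The function names are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def find_sites(sequence, consensus, max_mismatches=2):
    """Return (position, site, mismatches) for every window of the sequence
    that differs from the consensus at no more than max_mismatches positions."""
    k = len(consensus)
    hits = []
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        d = hamming(window, consensus)
        if d <= max_mismatches:
            hits.append((i, window, d))
    return hits


def expected_exact_matches(seq_length, k):
    """Expected number of exact matches to one fixed k-mer on a single strand
    of random DNA, assuming independent, equally frequent bases (an
    idealization; real genomes are not uniform)."""
    return (seq_length - k + 1) / 4 ** k


if __name__ == "__main__":
    consensus = "TATAAAA"  # TBP consensus quoted in the text
    for variant in ("TATATAT", "TATATAA"):
        print(variant, "differs from the consensus at", hamming(variant, consensus), "positions")
    print(find_sites("CCCCTATAAAACCCC", consensus, max_mismatches=0))
    print(expected_exact_matches(3_000_000_000, len(consensus)))
```

Under this idealized model a fixed 7-base sequence is expected about once every 4^7 = 16,384 positions, so a genome of billions of bases would contain a very large number of candidate sites, far more than are plausibly bound. This is consistent with the role the text assigns to DNA accessibility and cofactors.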


Clinical significance

Transcription factors are of clinical significance for at least two reasons: (1) mutations can be associated with specific diseases, and (2) they can be targets of medications.


Disorders

Due to their important roles in development, intercellular signaling, and the cell cycle, some human diseases have been associated with mutations in transcription factors. Many transcription factors are either tumor suppressors or oncogenes, and thus mutations in them or their aberrant regulation are associated with cancer. Three groups of transcription factors are known to be important in human cancer: (1) the NF-kappaB and AP-1 families, (2) the STAT family, and (3) the steroid receptors. Below are a few of the better-studied examples:


Potential drug targets

Approximately 10% of currently prescribed drugs directly target the nuclear receptor class of transcription factors. Examples include tamoxifen and bicalutamide for the treatment of breast and prostate cancer, respectively, and various types of anti-inflammatory and anabolic steroids. In addition, transcription factors are often indirectly modulated by drugs through signaling cascades. It might be possible to directly target other less-explored transcription factors such as NF-κB with drugs. Transcription factors outside the nuclear receptor family are thought to be more difficult to target with small molecule therapeutics, since it is not clear that they are "druggable", but progress has been made on Pax2 and the Notch pathway.


Role in evolution

Gene duplications have played a crucial role in the evolution of species. This applies particularly to transcription factors. Once a transcription factor gene exists as duplicates, mutations can accumulate in one copy without negatively affecting the regulation of downstream targets. However, changes in the DNA binding specificity of the single-copy Leafy transcription factor, which occurs in most land plants, have recently been elucidated. In that respect, a single-copy transcription factor can undergo a change of specificity through a promiscuous intermediate without losing function. Similar mechanisms have been proposed in the context of all alternative phylogenetic hypotheses, and for the role of transcription factors in the evolution of all species.


Role in biocontrol activity

Transcription factors have a role in stress resistance, which is important for successful biocontrol activity. In ''Papiliotrema terrestris'' LS28, resistance to oxidative stress and alkaline pH sensing depend on the transcription factors Yap1 and Rim101, respectively. Molecular tools have revealed the genetic mechanisms underlying this biocontrol activity, which supports disease management programs based on biological and integrated control.


Analysis

There are different technologies available to analyze transcription factors. On the genomic level, DNA sequencing and database research are commonly used. The transcription factor protein can be detected with specific antibodies, for example on a western blot. By using an electrophoretic mobility shift assay (EMSA), the activation profile of transcription factors can be detected. A multiplex approach for activation profiling is a TF chip system, where several different transcription factors can be detected in parallel.

The most commonly used method for identifying transcription factor binding sites is chromatin immunoprecipitation (ChIP). This technique relies on chemical fixation of chromatin with formaldehyde, followed by co-precipitation of DNA and the transcription factor of interest using an antibody that specifically targets that protein. The DNA sequences can then be identified by microarray or high-throughput sequencing (ChIP-seq) to determine transcription factor binding sites. If no antibody is available for the protein of interest, DamID may be a convenient alternative.


Classes

As described in more detail below, transcription factors may be classified by their (1) mechanism of action, (2) regulatory function, or (3) sequence homology (and hence structural similarity) in their DNA-binding domains. They are also classified by 3D structure of their DBD and the way it contacts DNA.


Mechanistic

There are two mechanistic classes of transcription factors:
* General transcription factors are involved in the formation of a preinitiation complex. The most common are abbreviated as TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH. They are ubiquitous and interact with the core promoter region surrounding the transcription start site(s) of all class II genes.
* Upstream transcription factors are proteins that bind somewhere upstream of the initiation site to stimulate or repress transcription. These are roughly synonymous with specific transcription factors, because they vary considerably depending on what recognition sequences are present in the proximity of the gene.


Functional

Transcription factors have been classified according to their regulatory function:
* I. Constitutive – present in all cells at all times, constantly active, all being activators. Very likely playing an important facilitating role in the transcription of many chromosomal genes, possibly in genes that seem to be always transcribed (e.g., structural proteins like tubulin and actin, and ubiquitous metabolic enzymes such as glyceraldehyde phosphate dehydrogenase (GAPDH)). E.g.: general transcription factors, Sp1, NF1, CCAAT
* II. Regulatory (conditionally active) – require activation.
** II.A Developmental (cell-type specific) – beginning in a fertilized egg. Once expressed, require no additional activation. E.g.: GATA, HNF, PIT-1, MyoD, Myf5, Hox, Winged Helix
** II.B Signal-dependent – may be either developmentally restricted in their expression or present in most or all cells, but all are inactive (or minimally active) until cells containing such proteins are exposed to the appropriate intra- or extracellular signal.
*** II.B.1 Extracellular ligand (endocrine or paracrine)-dependent – nuclear receptors.
*** II.B.2 Intracellular ligand (autocrine)-dependent – activated by small intracellular molecules. E.g.: SREBP, p53, orphan nuclear receptors.
*** II.B.3 Cell surface receptor-ligand interaction-dependent – activated by second messenger signaling cascades.
**** II.B.3.a Constitutive nuclear factors activated by serine phosphorylation – residing within the nucleus. The serine phosphorylation enzymes can be activated by two main routes:
***** G protein-coupled receptors upon ligand binding increase intracellular levels of second messengers (cAMP, IP3, DAG, calcium) which, in turn, activate protein serine-threonine kinase enzymes (such as PKA, PKC).
***** Receptor tyrosine kinases upon ligand binding trigger other pathways that finally terminate in serine phosphorylation of the abundant resident nuclear transcription factors.
***** Examples include: CREB, AP-1, Mef2
**** II.B.3.b Latent cytoplasmic factors – residing in the cytoplasm when inactive. Structurally and chemically very diverse group, and so are their activation pathways. E.g.: STAT, R-SMAD, NF-κB, Notch, TUBBY, NFAT


Structural

Transcription factors are often classified based on the sequence similarity and hence the tertiary structure of their DNA-binding domains. The following classification is based on the 3D structure of their DBD and the way it contacts DNA. It was first developed for human TFs and later extended to rodents and also to plants.
* 1 Superclass: Basic Domains
** 1.1 Class: Leucine zipper factors (bZIP)
*** 1.1.1 Family: AP-1(-like) components; includes c-Fos/c-Jun
*** 1.1.2 Family: CREB
*** 1.1.3 Family: C/EBP-like factors
*** 1.1.4 Family: bZIP / PAR
*** 1.1.5 Family: Plant G-box binding factors
*** 1.1.6 Family: ZIP only
** 1.2 Class: Helix-loop-helix factors (bHLH)
*** 1.2.1 Family: Ubiquitous (class A) factors
*** 1.2.2 Family: Myogenic transcription factors (MyoD)
*** 1.2.3 Family: Achaete-Scute
*** 1.2.4 Family: Tal/Twist/Atonal/Hen
** 1.3 Class: Helix-loop-helix / leucine zipper factors (bHLH-ZIP)
*** 1.3.1 Family: Ubiquitous bHLH-ZIP factors; includes USF (USF1, USF2); SREBP
*** 1.3.2 Family: Cell-cycle controlling factors; includes c-Myc
** 1.4 Class: NF-1
*** 1.4.1 Family: NF-1 (A, B, C, X)
** 1.5 Class: RF-X
*** 1.5.1 Family: RF-X (1, 2, 3, 4, 5, ANK)
** 1.6 Class: bHSH
* 2 Superclass: Zinc-coordinating DNA-binding domains
** 2.1 Class: Cys4 zinc finger of nuclear receptor type
*** 2.1.1 Family: Steroid hormone receptors
*** 2.1.2 Family: Thyroid hormone receptor-like factors
** 2.2 Class: diverse Cys4 zinc fingers
*** 2.2.1 Family: GATA-Factors
** 2.3 Class: Cys2His2 zinc finger domain
*** 2.3.1 Family: Ubiquitous factors, includes TFIIIA, Sp1
*** 2.3.2 Family: Developmental / cell cycle regulators; includes Krüppel
*** 2.3.4 Family: Large factors with NF-6B-like binding properties
** 2.4 Class: Cys6 cysteine-zinc cluster
** 2.5 Class: Zinc fingers of alternating composition
* 3 Superclass: Helix-turn-helix
** 3.1 Class: Homeo domain
*** 3.1.1 Family: Homeo domain only; includes Ubx
*** 3.1.2 Family: POU domain factors; includes Oct
*** 3.1.3 Family: Homeo domain with LIM region
*** 3.1.4 Family: Homeo domain plus zinc finger motifs
** 3.2 Class: Paired box
*** 3.2.1 Family: Paired plus homeo domain
*** 3.2.2 Family: Paired domain only
** 3.3 Class: Fork head / winged helix
*** 3.3.1 Family: Developmental regulators; includes forkhead
*** 3.3.2 Family: Tissue-specific regulators
*** 3.3.3 Family: Cell-cycle controlling factors
*** 3.3.0 Family: Other regulators
** 3.4 Class: Heat Shock Factors
*** 3.4.1 Family: HSF
** 3.5 Class: Tryptophan clusters
*** 3.5.1 Family: Myb
*** 3.5.2 Family: Ets-type
*** 3.5.3 Family: Interferon regulatory factors
** 3.6 Class: TEA (transcriptional enhancer factor) domain
*** 3.6.1 Family: TEA (TEAD1, TEAD2, TEAD3, TEAD4)
* 4 Superclass: beta-Scaffold Factors with Minor Groove Contacts
** 4.1 Class: RHR (Rel homology region)
*** 4.1.1 Family: Rel/ankyrin; NF-kappaB
*** 4.1.2 Family: ankyrin only
*** 4.1.3 Family: NFAT (Nuclear Factor of Activated T-cells) (NFATC1, NFATC2, NFATC3)
** 4.2 Class: STAT
*** 4.2.1 Family: STAT
** 4.3 Class: p53
*** 4.3.1 Family: p53
** 4.4 Class: MADS box
*** 4.4.1 Family: Regulators of differentiation; includes Mef2
*** 4.4.2 Family: Responders to external signals, SRF (
serum response factor Serum response factor, also known as SRF, is a transcription factor protein. Function Serum response factor is a member of the MADS (MCM1, Agamous, Deficiens, and SRF) box superfamily of transcription factors. This protein binds to the serum ...
) () *** 4.4.3 Family: Metabolic regulators (ARG80) ** 4.5 Class: beta-Barrel alpha-helix transcription factors ** 4.6 Class:
TATA binding protein The TATA-binding protein (TBP) is a general transcription factor that binds to a DNA sequence called the TATA box. This DNA sequence is found about 30 base pairs upstream of the transcription start site in some eukaryotic gene promoters. T ...
s *** 4.6.1 Family: TBP ** 4.7 Class: HMG-box *** 4.7.1 Family:
SOX genes ''SOX'' genes (''SRY''-related HMG-box genes) encode a family of transcription factors that bind to the minor groove in DNA, and belong to a super-family of genes characterized by a homology (biology), homologous sequence called the HMG-box (fo ...
, SRY *** 4.7.2 Family: TCF-1 ( TCF1) *** 4.7.3 Family: HMG2-related, SSRP1 *** 4.7.4 Family: UBF *** 4.7.5 Family: MATA ** 4.8 Class: Heteromeric CCAAT factors *** 4.8.1 Family: Heteromeric CCAAT factors ** 4.9 Class: Grainyhead *** 4.9.1 Family: Grainyhead ** 4.10 Class: Cold-shock domain factors *** 4.10.1 Family: csd ** 4.11 Class: Runt *** 4.11.1 Family: Runt * 0 Superclass: Other Transcription Factors ** 0.1 Class: Copper fist proteins ** 0.2 Class: HMGI(Y) ( HMGA1) *** 0.2.1 Family: HMGI(Y) ** 0.3 Class: Pocket domain ** 0.4 Class: E1A-like factors ** 0.5 Class: AP2/EREBP-related factors *** 0.5.1 Family: AP2 *** 0.5.2 Family: EREBP *** 0.5.3 Superfamily: AP2/B3 **** 0.5.3.1 Family: ARF **** 0.5.3.2 Family: ABI **** 0.5.3.3 Family: RAV
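
The dotted identifiers in the classification above encode the hierarchy: dropping the last component of an identifier gives its parent (4.7.1 sits under 4.7, which sits under 4). A minimal Python sketch of reading such entries follows. The function names `parse_entry` and `parent_id` are hypothetical helpers introduced only for illustration, and the sketch assumes each entry has the form "identifier Rank: name" as in the list.

```python
def parse_entry(line):
    """Split an entry such as '4.7.1 Family: SOX genes, SRY' into its parts.

    Assumes the form '<dotted id> <Rank>: <name>' used in the list above.
    """
    identifier, _, rest = line.strip().partition(" ")
    rank, _, name = rest.partition(":")
    return {
        "id": identifier,
        # Numeric path through the hierarchy, e.g. (4, 7, 1)
        "path": tuple(int(part) for part in identifier.split(".")),
        "rank": rank.strip(),
        "name": name.strip(),
    }


def parent_id(identifier):
    """Return the identifier one level up, or None for a top-level superclass."""
    head, sep, _ = identifier.rpartition(".")
    return head if sep else None


# A few entries copied from the classification list.
entries = [
    "4 Superclass: beta-Scaffold Factors with Minor Groove Contacts",
    "4.7 Class: HMG-box",
    "4.7.1 Family: SOX genes, SRY",
    "0.5.3.1 Family: ARF",
]

for entry in entries:
    parsed = parse_entry(entry)
    print(parsed["id"], "| parent:", parent_id(parsed["id"]),
          "|", parsed["rank"], "|", parsed["name"])
```

The rank is read from the label rather than inferred from depth, because the list is not uniform in this respect: 0.5.3 is labeled a Superfamily, so its children (such as 0.5.3.1) are Families at a depth of four.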


Transcription factor databases

Numerous databases catalog information about transcription factors, but their scope and utility vary widely. Some cover only the proteins themselves, while others cover their binding sites or their target genes. Examples include the following:
* footprintDB: a metadatabase of multiple databases, including JASPAR and others
* JASPAR: database of transcription factor binding sites for eukaryotes
* PlantTFD: Plant transcription factor database
* TcoF-DB: database of transcription co-factors and transcription factor interactions
* TFcheckpoint: database of human, mouse and rat TF candidates
* transcriptionfactor.org (now commercial, selling reagents)
* MethMotif.org: an integrative cell-specific database of transcription factor binding motifs coupled with DNA methylation profiles


See also

* Cdx protein family
* DNA-binding protein
* Inhibitor of DNA-binding protein
* Mapper(2)
* Nuclear receptor, a class of ligand-activated transcription factors
* Open Regulatory Annotation Database
* Phylogenetic footprinting
* TRANSFAC database
* YeTFaSCo


Further reading

* Carretero-Paulet, Lorenzo; Galstyan, Anahit; Roig-Villanova, Irma; Martínez-García, Jaime F.; Bilbao-Castro, Jose R. "Genome-Wide Classification and Evolutionary Analysis of the bHLH Family of Transcription Factors in Arabidopsis, Poplar, Rice, Moss, and Algae". ''Plant Physiology'', vol. 153, no. 3, July 2010, pp. 1398–1412. doi:10.1104/pp.110.153593.


External links

* Transcription factor database