''Cis''-regulatory elements (CREs) or ''cis''-regulatory modules (CRMs) are regions of non-coding DNA which regulate the transcription of neighboring genes. CREs are vital components of genetic regulatory networks, which in turn control morphogenesis, the development of anatomy, and other aspects of embryonic development, studied in evolutionary developmental biology. CREs are found in the vicinity of the genes that they regulate. CREs typically regulate gene transcription by binding to transcription factors. A single transcription factor may bind to many CREs, and hence control the expression of many genes (pleiotropy). The Latin prefix ''cis'' means "on this side", i.e. on the same molecule of DNA as the gene(s) to be transcribed.

CRMs are stretches of DNA, usually 100–1000 base pairs in length, where a number of transcription factors can bind and regulate the transcription rates of nearby genes. They are labeled as ''cis'' because they are typically located on the same DNA molecule as the genes they control, as opposed to ''trans'', which refers to effects on genes that are not located on the same molecule or are farther away, such as the effects of transcription factors. One ''cis''-regulatory element can regulate several genes, and conversely, one gene can have several ''cis''-regulatory modules. ''Cis''-regulatory modules carry out their function by integrating the active transcription factors and the associated co-factors at a specific time and place in the cell, where this information is read and an output is given. CREs are often, but not always, upstream of the transcription start site. CREs contrast with ''trans''-regulatory elements (TREs). TREs code for transcription factors.


Overview

The genome of an organism contains anywhere from a few hundred to thousands of different genes, each encoding one or more products. For numerous reasons, including organizational maintenance, energy conservation, and generating phenotypic variance, it is important that genes are only expressed when they are needed. The most efficient way for an organism to regulate gene expression is at the transcriptional level. CREs control transcription by acting near or within a gene. The best characterized types of CREs are enhancers and promoters. Both of these sequence elements are structural regions of DNA that serve as transcriptional regulators.

''Cis''-regulatory modules are one of several types of functional regulatory elements. Regulatory elements are binding sites for transcription factors, which are involved in gene regulation. ''Cis''-regulatory modules perform a large amount of developmental information processing. ''Cis''-regulatory modules are non-random clusters of transcription factor binding sites at their specified target site. The original definition presented ''cis''-regulatory modules as enhancers of ''cis''-acting DNA, which increased the rate of transcription from a linked promoter. However, this definition has changed to define ''cis''-regulatory modules as DNA sequences with transcription factor binding sites that are clustered into modular structures, including, but not limited to, locus control regions, promoters, enhancers, silencers, boundary control elements and other modulators. ''Cis''-regulatory modules can be divided into three classes: enhancers, which regulate gene expression positively; insulators, which work indirectly by interacting with other nearby ''cis''-regulatory modules; and silencers, which turn off expression of genes.

The design of ''cis''-regulatory modules is such that transcription factors and epigenetic modifications serve as inputs, and the output of the module is the command given to the transcription machinery, which in turn determines the rate of gene transcription or whether it is turned on or off. There are two types of transcription factor inputs: those that determine when the target gene is to be expressed and those that serve as functional ''drivers'', which come into play only in specific situations during development. These inputs can come from different time points, can represent different signal ligands, or can come from different domains or lineages of cells. However, much remains unknown. Additionally, the regulation of chromatin structure and nuclear organization also plays a role in determining and controlling the function of ''cis''-regulatory modules.

Gene-regulation functions (GRFs) thus provide a unique characteristic of a ''cis''-regulatory module (CRM), relating the concentrations of transcription factors (input) to the promoter activities (output). The challenge of predicting GRFs remains unsolved. In general, gene-regulation functions do not use Boolean logic, although in some cases the Boolean approximation is still very useful.


The Boolean logic assumption

Under the assumption of Boolean logic, the design of a module determines its regulatory function. In relation to development, these modules can generate both positive and negative outputs. The output of each module is a product of the various operations performed on it. Common operations include:

* the OR gate: an output is given when either input is present;
* the AND gate: two different regulatory factors are both necessary for a positive output;
* "toggle switches": when the signal ligand is absent while the transcription factor is present, the transcription factor acts as a dominant repressor; once the signal ligand is present, the transcription factor's role as repressor is eliminated and transcription can occur.

Other Boolean logic operations can occur as well, such as sequence-specific transcriptional repressors, which lead to an output of zero when they bind to the ''cis''-regulatory module. Besides the influence of the different logic operations, the output of a ''cis''-regulatory module is also influenced by prior events. ''Cis''-regulatory modules must also interact with other regulatory elements. For the most part, even where the ''cis''-regulatory modules of a gene overlap functionally, their inputs and outputs tend not to be the same.

While the assumption of Boolean logic is important for ''systems biology'', detailed studies show that in general the logic of gene regulation is not Boolean. This means, for example, that in the case of a ''cis''-regulatory module regulated by two transcription factors, experimentally determined gene-regulation functions cannot be described by any of the 16 possible Boolean functions of two variables. Non-Boolean extensions of the gene-regulatory logic have been proposed to correct for this issue.
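The Boolean operations described in this section can be illustrated with a minimal Python sketch. It is a toy model, not an implementation from the literature: the function names (`or_gate`, `and_gate`, `toggle_switch`, `truth_table`, `all_boolean_functions`, `is_boolean`) are hypothetical, and the toggle switch assumes that a module with no transcription factor bound is simply not repressed, which the text does not specify.

```python
from itertools import product

# All four input combinations for two regulators, in a fixed order:
# (False, False), (False, True), (True, False), (True, True).
INPUTS = list(product([False, True], repeat=2))


def or_gate(a: bool, b: bool) -> bool:
    """An output is given when either input is present."""
    return a or b


def and_gate(a: bool, b: bool) -> bool:
    """Both regulatory factors are required for a positive output."""
    return a and b


def toggle_switch(tf_present: bool, ligand_present: bool) -> bool:
    """Transcription is blocked only when the factor is present without its ligand.

    Assumption (not from the text): with the factor absent, nothing
    represses the module, so transcription is permitted.
    """
    return not (tf_present and not ligand_present)


def truth_table(gate) -> tuple:
    """Evaluate a two-input gate on every input combination."""
    return tuple(gate(a, b) for a, b in INPUTS)


def all_boolean_functions() -> list:
    """Enumerate every possible truth table over two inputs (2**4 of them)."""
    return list(product([False, True], repeat=len(INPUTS)))


def is_boolean(measured_outputs, tolerance: float = 0.05) -> bool:
    """Check whether normalized outputs (0..1) all lie close to 0 or 1."""
    return all(min(abs(v), abs(v - 1.0)) <= tolerance for v in measured_outputs)
```

Calling `len(all_boolean_functions())` reproduces the count of 16 Boolean functions of two variables. A set of normalized promoter activities with intermediate values, such as the made-up list `[0.0, 0.4, 0.7, 1.0]`, fails `is_boolean`, which is the sense in which a graded gene-regulation function cannot be matched by any of those 16 truth tables.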


Classification

''Cis''-regulatory modules can be characterized by the information processing that they encode and the organization of their transcription factor binding sites. Additionally, ''cis''-regulatory modules are characterized by the way they affect the probability, proportion, and rate of transcription. Highly cooperative and coordinated ''cis''-regulatory modules are classified as enhanceosomes. The architecture and the arrangement of the transcription factor binding sites are critical, because disruption of the arrangement could abolish the function. Functionally flexible ''cis''-regulatory modules are called billboards. Their transcriptional output is the summed effect of the bound transcription factors. Enhancers affect the probability of a gene being activated, but have little or no effect on rate. The binary response model acts like an on/off switch for transcription. This model increases or decreases the number of cells that transcribe a gene, but it does not affect the rate of transcription. The rheostatic response model describes ''cis''-regulatory modules as regulators of the initiation rate of transcription of their associated genes.
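The difference between the binary and rheostatic response models can be sketched numerically. The following Python snippet is an illustrative toy under stated assumptions, not a published model: module activity is reduced to a single number between 0 and 1, and the names `binary_response`, `rheostatic_response` and `mean_expression` are hypothetical.

```python
def _check_activity(activity: float) -> None:
    """Module activity is modeled as a fraction between 0 and 1 (an assumption)."""
    if not 0.0 <= activity <= 1.0:
        raise ValueError("activity must be between 0 and 1")


def binary_response(activity: float, max_rate: float = 1.0) -> tuple:
    """On/off switch: activity sets how many cells transcribe, not how fast.

    Returns (fraction_of_cells_on, rate_per_active_cell).
    """
    _check_activity(activity)
    return (activity, max_rate)


def rheostatic_response(activity: float, max_rate: float = 1.0) -> tuple:
    """Dial: every cell transcribes, and activity scales the initiation rate.

    Returns (fraction_of_cells_on, rate_per_active_cell).
    """
    _check_activity(activity)
    return (1.0, activity * max_rate)


def mean_expression(fraction_on: float, rate: float) -> float:
    """Population-average output: share of active cells times their rate."""
    return fraction_on * rate
```

In this sketch both models give the same population average for a given activity, so a bulk measurement alone would not separate them; they differ in how expression is distributed across individual cells.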


Promoter

Promoters are CREs consisting of relatively short sequences of DNA which include the site where transcription is initiated and the region approximately 35 base pairs (bp) upstream or downstream from the initiation site. In eukaryotes, promoters usually have the following four components: the TATA box, a TFIIB recognition site, an initiator, and the downstream core promoter element. It has been found that a single gene can contain multiple promoter sites. In order to initiate transcription of the downstream gene, a host of DNA-binding proteins called transcription factors (TFs) must bind sequentially to this region. Only once this region has been bound by the appropriate set of TFs, in the proper order, can RNA polymerase bind and begin transcribing the gene.


Enhancers

Enhancers are CREs that influence (enhance) the transcription of genes on the same molecule of DNA and can be found upstream, downstream, within the introns, or even relatively far away from the gene they regulate. Multiple enhancers can act in a coordinated fashion to regulate transcription of one gene. A number of genome-wide sequencing projects have revealed that enhancers are often transcribed to long non-coding RNA (lncRNA) or enhancer RNA (eRNA), whose changes in levels frequently correlate with those of the target gene mRNA.


Silencers

Silencers are CREs that can bind transcription regulation factors (proteins) called repressors, thereby preventing transcription of a gene. The term "silencer" can also refer to a region in the 3' untranslated region of messenger RNA that binds proteins which suppress translation of that mRNA molecule, but this usage is distinct from its use in describing a CRE.


Operators

Operators are CREs in prokaryotes and some eukaryotes that exist within operons, where they can bind proteins called repressors to affect transcription.


Evolutionary role

CREs have an important evolutionary role. The coding regions of genes are often well conserved among organisms; yet different organisms display marked phenotypic diversity. It has been found that polymorphisms occurring within non-coding sequences have a profound effect on phenotype by altering gene expression. Mutations arising within a CRE can generate expression variance by changing the way TFs bind. Tighter or looser binding of regulatory proteins will lead to up- or down-regulated transcription.
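The link between binding strength and transcription can be made concrete with a simple equilibrium-binding sketch. This is an assumption introduced for illustration, not a model taken from the article: the occupancy of a single site is modeled as [TF] / ([TF] + K_d), where a lower dissociation constant K_d means tighter binding, and the function names are hypothetical.

```python
def occupancy(tf_concentration: float, kd: float) -> float:
    """Fraction of time a single site is bound, for simple equilibrium binding.

    A lower dissociation constant (kd) means tighter binding.
    """
    if tf_concentration < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and kd must be > 0")
    return tf_concentration / (tf_concentration + kd)


def relative_transcription(tf_concentration: float, kd: float,
                           role: str = "activator") -> float:
    """Toy output in 0..1: an activator drives output when bound,
    a repressor blocks output when bound."""
    bound = occupancy(tf_concentration, kd)
    if role == "activator":
        return bound
    if role == "repressor":
        return 1.0 - bound
    raise ValueError("role must be 'activator' or 'repressor'")
```

Under these assumptions, a mutation that tightens binding (lower `kd`) raises output when the bound protein is an activator and lowers it when the bound protein is a repressor, so the direction of the expression change depends on the role of the factor as well as on the change in affinity.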


''Cis''-regulatory module in gene regulatory network

The function of a gene regulatory network depends on the architecture of the nodes, whose function is dependent on multiple ''cis''-regulatory modules. The layout of ''cis''-regulatory modules can provide enough information to generate spatial and temporal patterns of gene expression. During development, each domain of gene expression, where each domain represents a different spatial region of the embryo, will be under the control of different ''cis''-regulatory modules. The design of regulatory modules helps in producing feedback, feed-forward, and cross-regulatory loops.


Mode of action

''Cis''-regulatory modules can regulate their target genes over large distances. Several models have been proposed to describe the way that these modules may communicate with their target gene promoter. These include the DNA scanning model, the DNA sequence looping model and the facilitated tracking model. In the DNA scanning model, the transcription factor and cofactor complex forms at the ''cis''-regulatory module and then moves along the DNA sequence until it finds the target gene promoter. In the looping model, the transcription factor binds to the ''cis''-regulatory module, which then causes the ''looping'' of the DNA sequence and allows for the interaction with the target gene promoter. The transcription factor-''cis''-regulatory module complex slowly loops the DNA sequence towards the target promoter and forms a stable looped configuration. The facilitated tracking model combines parts of the two previous models.


Identification and computational prediction

Besides experimentally determining CRMs, there are various bioinformatics algorithms for predicting them. Most algorithms search for significant combinations of transcription factor binding sites (DNA binding sites) in promoter sequences of co-expressed genes. More advanced methods combine the search for significant motifs with correlation in gene expression datasets between transcription factors and target genes. Both methods have been implemented, for example, in ModuleMaster. Other programs created for the identification and prediction of ''cis''-regulatory modules include:

* INSECT 2.0 is a web server that allows users to search for ''cis''-regulatory modules in a genome-wide manner. The program relies on the definition of strict restrictions among the transcription factor binding sites (TFBSs) that compose the module in order to decrease the false-positive rate. INSECT is designed to be user-friendly: it allows automatic retrieval of sequences and provides several visualizations and links to third-party tools to help users find those instances that are more likely to be true regulatory sites. The INSECT 2.0 algorithm and the theory behind it were previously published.
* Stubb uses hidden Markov models to identify statistically significant clusters of transcription factor combinations. It also uses a second related genome to improve the prediction accuracy of the model.
* Bayesian networks use an algorithm that combines site predictions and tissue-specific expression data for transcription factors and target genes of interest. This model also uses regression trees to depict the relationship between the identified ''cis''-regulatory module and the possible binding set of transcription factors.
* CRÈME examines clusters of target sites for transcription factors of interest. This program uses a database of confirmed transcription factor binding sites that were annotated across the human genome. A search algorithm is applied to the data set to identify possible combinations of transcription factors that have binding sites close to the promoter of the gene set of interest. The possible ''cis''-regulatory modules are then statistically analyzed and the significant combinations are graphically represented.

Active ''cis''-regulatory modules in a genomic sequence have been difficult to identify. Problems in identification arise because scientists often have only a small set of known transcription factors to work with, which makes it harder to identify statistically significant clusters of transcription factor binding sites. Additionally, high costs limit the use of large whole-genome tiling arrays.
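The basic idea shared by these approaches, searching a sequence for clusters of binding sites for several transcription factors, can be sketched in a few lines of Python. This is a deliberately simplified illustration and does not reproduce the algorithm of any program named in this section: it uses exact string matching instead of position weight matrices, performs no statistical significance testing, and the function names, factor names and motifs in the usage note are hypothetical.

```python
from typing import Dict, List, Tuple


def find_sites(sequence: str, motifs: Dict[str, str]) -> List[Tuple[int, str]]:
    """Return sorted (position, factor_name) pairs for every exact motif match."""
    hits = []
    for name, motif in motifs.items():
        start = sequence.find(motif)
        while start != -1:
            hits.append((start, name))
            # Continue one base further so overlapping matches are also found.
            start = sequence.find(motif, start + 1)
    return sorted(hits)


def find_clusters(hits: List[Tuple[int, str]], window: int,
                  min_factors: int) -> List[Tuple[int, int, List[str]]]:
    """Report windows that contain sites for at least `min_factors` distinct factors.

    Each window starts at a site and spans `window` bases. The result holds
    (first_site_position, last_site_position, sorted_factor_names).
    """
    clusters = []
    for i, (start, _) in enumerate(hits):
        in_window = [(pos, name) for pos, name in hits[i:] if pos - start < window]
        factors = {name for _, name in in_window}
        if len(factors) >= min_factors:
            clusters.append((start, in_window[-1][0], sorted(factors)))
    return clusters
```

As a usage example, `find_clusters(find_sites(seq, {"TF_A": "TGACTCA", "TF_B": "GGGCGG"}), window=20, min_factors=2)` would report stretches of `seq` in which sites for both placeholder factors start within 20 bases of each other. A realistic predictor would additionally score each candidate cluster against a background model to decide whether it is statistically significant.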


Examples

An example of a ''cis''-acting regulatory sequence is the operator in the lac operon. This DNA sequence is bound by the lac repressor, which, in turn, prevents transcription of the adjacent genes on the same DNA molecule. The lac operator is thus considered to "act in ''cis''" on the regulation of the nearby genes. The operator itself does not code for any protein or RNA. In contrast, ''trans''-regulatory elements are diffusible factors, usually proteins, that may modify the expression of genes distant from the gene that was originally transcribed to create them. For example, a transcription factor that regulates a gene on chromosome 6 might itself have been transcribed from a gene on chromosome 11. The term ''trans-regulatory'' is constructed from the Latin root ''trans'', which means "across from". ''Cis''-regulatory elements are often binding sites for one or more ''trans''-acting factors. To summarize, ''cis''-regulatory elements are present on the same molecule of DNA as the gene they regulate, whereas ''trans''-regulatory elements can regulate genes distant from the gene from which they were transcribed.


Examples in RNA


See also

* DNA
** TATA box
** Pribnow box
** SOS box
** CAAT box
** CCAAT box
** Operator (biology)
** Upstream activation sequence
* RNA
** List of cis-regulatory RNA elements
** Polyadenylation signals, mRNA
** AU-rich element, mRNA
* Other
** Regulation of gene expression
** Cis-trans isomerism
** Gene regulatory network
** Operon
** Promoter
** Trans-acting factor
** Rfam
** Transterm


External links


Gene Regulation Info – manually curated lists of resources, reviews, community discussions

