Long non-coding RNAs (long ncRNAs, lncRNAs) are a type of RNA, generally defined as transcripts of more than 200 nucleotides that are not translated into protein. This arbitrary limit distinguishes long ncRNAs from small non-coding RNAs, such as microRNAs (miRNAs), small interfering RNAs (siRNAs), Piwi-interacting RNAs (piRNAs), small nucleolar RNAs (snoRNAs), and other short RNAs. Long intervening/intergenic noncoding RNAs (lincRNAs) are lncRNAs that do not overlap protein-coding genes. Long non-coding RNAs include intergenic lincRNAs, intronic ncRNAs, and sense and antisense lncRNAs, each type occupying a different genomic position in relation to genes and exons.
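
The positional categories named above (intergenic, intronic, sense, antisense) can be illustrated with a small interval-overlap sketch. This is a simplified illustration, not an annotation pipeline: the `Gene` and `Transcript` classes, the `classify` function, the single-chromosome closed-interval coordinates, and the decision rules (any opposite-strand overlap counts as antisense; same-strand overlap counts as sense only if an exon is touched) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Gene:
    """Hypothetical protein-coding gene on one chromosome (closed intervals)."""
    start: int
    end: int
    strand: str  # "+" or "-"
    exons: Tuple[Tuple[int, int], ...]


@dataclass(frozen=True)
class Transcript:
    """Hypothetical non-coding transcript on the same chromosome."""
    start: int
    end: int
    strand: str


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if the closed intervals [a_start, a_end] and [b_start, b_end] intersect."""
    return a_start <= b_end and b_start <= a_end


def classify(tx: Transcript, genes: Sequence[Gene]) -> str:
    """Assign a positional label relative to the first overlapping gene.

    Simplified rules (assumptions for illustration only):
      - no overlap with any gene            -> "intergenic"
      - overlap on the opposite strand      -> "antisense"
      - same strand, touches an exon        -> "sense"
      - same strand, lies only in introns   -> "intronic"
    """
    for gene in genes:
        if not overlaps(tx.start, tx.end, gene.start, gene.end):
            continue
        if tx.strand != gene.strand:
            return "antisense"
        touches_exon = any(overlaps(tx.start, tx.end, s, e) for s, e in gene.exons)
        return "sense" if touches_exon else "intronic"
    return "intergenic"


if __name__ == "__main__":
    gene = Gene(1000, 5000, "+", ((1000, 1500), (4000, 5000)))
    for tx in (Transcript(6000, 7000, "+"), Transcript(2000, 3000, "+"),
               Transcript(1200, 2500, "+"), Transcript(1200, 2500, "-")):
        print(tx, classify(tx, [gene]))
```

Real annotation schemes draw finer distinctions (for example, transcripts overlapping several genes), so the labels here should be read only as a rough guide to how the categories relate to gene and exon coordinates.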


Abundance

In 2007, a study found that only one-fifth of transcription across the human genome is associated with protein-coding genes, indicating at least four times more long non-coding than coding RNA sequences. Large-scale complementary DNA (cDNA) sequencing projects such as FANTOM reveal the complexity of this transcription. The FANTOM3 project identified ~35,000 non-coding transcripts that bear many signatures of messenger RNAs, including 5' capping, splicing, and polyadenylation, but have little or no open reading frame (ORF). This number represents a conservative lower estimate, since it omitted many singleton transcripts and non-polyadenylated transcripts (tiling array data show that more than 40% of transcripts are non-polyadenylated). Identifying ncRNAs within these cDNA libraries is challenging because it can be difficult to distinguish protein-coding transcripts from non-coding transcripts. Multiple studies have suggested that testis and neural tissues express the greatest amount of long non-coding RNAs of any tissue type. Using FANTOM5, 27,919 long ncRNAs have been identified in various human sources. Quantitatively, lncRNAs show ~10-fold lower abundance than mRNAs, which is explained by higher cell-to-cell variation in the expression levels of lncRNA genes compared with protein-coding genes. In general, the majority (~78%) of lncRNAs are characterized as tissue-specific, as opposed to only ~19% of mRNAs. In addition to higher tissue specificity, lncRNAs are characterized by higher developmental stage specificity and by cell subtype specificity in tissues such as the human neocortex and other parts of the brain, where they regulate correct brain development and function. In 2018, a comprehensive integration of lncRNAs from existing databases, published literature and novel RNA assemblies based on RNA-seq data analysis revealed that there are 270,044 lncRNA transcripts in humans. In comparison to mammals, relatively few studies have focused on the prevalence of lncRNAs in plants. However, an extensive study considering 37 higher plant species and six algae identified ~200,000 non-coding transcripts using an in-silico approach, and also established the associated Green Non-Coding Database (GreeNC), a repository of plant lncRNAs.
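
The two criteria mentioned in the text, a length above 200 nucleotides and "little or no" open reading frame, can be sketched as a naive filter. The sketch below is only an illustration of the idea and not the method used by FANTOM or any other project: the function names `longest_orf` and `looks_like_lncrna` are hypothetical, the ORF is defined here as an ATG followed in frame by a standard stop codon, and the `MAX_ORF_NT` cutoff of 300 nucleotides is an arbitrary assumption chosen for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons
MIN_LENGTH_NT = 200                  # length threshold taken from the text
MAX_ORF_NT = 300                     # hypothetical cutoff, assumed for illustration


def longest_orf(seq: str) -> int:
    """Length in nucleotides of the longest ATG...stop ORF (stop included).

    Only the three forward reading frames are scanned, because the input is
    assumed to be the transcript sequence itself. RNA input (with U) is
    converted to the DNA alphabet first.
    """
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None  # index of the first ATG of the ORF currently open
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def looks_like_lncrna(seq: str, max_orf_nt: int = MAX_ORF_NT) -> bool:
    """Naive check: longer than 200 nt and no ORF longer than max_orf_nt."""
    return len(seq) > MIN_LENGTH_NT and longest_orf(seq) <= max_orf_nt


if __name__ == "__main__":
    coding_like = "ATG" + "GCC" * 120 + "TAA"  # 366 nt, one long ORF
    print(longest_orf(coding_like), looks_like_lncrna(coding_like))
    print(looks_like_lncrna("A" * 250))
```

A filter this simple would misclassify many transcripts, which is consistent with the difficulty the text describes in separating protein-coding from non-coding transcripts.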


Genomic organization

In 2005, the landscape of the mammalian genome was described as numerous 'foci' of transcription that are separated by long stretches of intergenic space. While some long ncRNAs are located within the intergenic stretches, the majority are overlapping sense and antisense transcripts that often include protein-coding genes, giving rise to a complex hierarchy of overlapping isoforms. Genomic sequences within these transcriptional foci are often shared among a number of coding and non-coding transcripts in the sense and antisense directions. For example, 3012 out of 8961 cDNAs previously annotated as truncated coding sequences within FANTOM2 were later designated as genuine ncRNA variants of protein-coding cDNAs. While the abundance and conservation of these arrangements suggest they have biological relevance, the complexity of these foci frustrates easy evaluation. The GENCODE consortium has collated and analysed a comprehensive set of human lncRNA annotations and their genomic organisation, modifications, cellular locations and tissue expression profiles. Their analysis indicates that human lncRNAs show a bias toward two-exon transcripts.


Identification software


Translation

There has been considerable debate about whether lncRNAs have been misannotated and do in fact encode proteins. Several lncRNAs have been found to encode peptides with biologically significant functions. Ribosome profiling studies have suggested that anywhere from 40% to 90% of annotated lncRNAs are in fact translated, although there is disagreement about the correct method for analyzing ribosome profiling data. Additionally, it is thought that many of the peptides produced by lncRNAs may be highly unstable and without biological function.


Conservation

Initial studies into lncRNA conservation noted that, as a class, lncRNAs were enriched for conserved sequence elements, depleted in substitution and insertion/deletion rates, and depleted in rare frequency variants, indicative of purifying selection maintaining lncRNA function. However, further investigations into vertebrate lncRNAs revealed that while lncRNAs are conserved in sequence, they are not conserved in transcription. In other words, even when the sequence of a human lncRNA is conserved in another vertebrate species, there is often no transcription of a lncRNA in the orthologous genomic region. Some argue that these observations suggest non-functionality of the majority of lncRNAs, while others argue that they may be indicative of rapid species-specific adaptive selection. While the turnover of lncRNA transcription is much higher than initially expected, hundreds of lncRNAs are still conserved at the sequence level. There have been several attempts to delineate the different categories of selection signatures seen amongst lncRNAs, including: lncRNAs with strong sequence conservation across the entire length of the gene, lncRNAs in which only a portion of the transcript (e.g. 5′ end, splice sites) is conserved, and lncRNAs that are transcribed from syntenic regions of the genome but have no recognizable sequence similarity. Additionally, there have been attempts to identify conserved secondary structures in lncRNAs, though these studies have so far yielded conflicting results.


Functions

Despite accumulating evidence that the majority of long noncoding RNAs in mammals are likely to be functional, only a relatively small proportion has been demonstrated to be biologically relevant. Some lncRNAs have been functionally annotated in LncRNAdb (a database of literature-described lncRNAs), with the majority of these being described in humans. The functions of other lncRNAs with experimental evidence have been community-curated in LncRNAWiki (a wiki-based, publicly editable and open-content platform for community curation of human lncRNAs) with respect to their functional mechanisms and disease associations, which can also be accessed in LncBook. According to literature-based curation of the functional mechanisms of lncRNAs, lncRNAs are extensively reported to be involved in transcriptional regulation. A further large-scale sequencing study provides evidence that many transcripts thought to be lncRNAs may, in fact, be translated into proteins.


In the regulation of gene transcription


In gene-specific transcription

In eukaryotes, RNA transcription is a tightly regulated process. Noncoding RNAs act upon different aspects of this process, targeting transcriptional modulators, RNA polymerase (RNAP) II and even the DNA duplex to regulate gene expression. NcRNAs modulate transcription by several mechanisms, including functioning themselves as co-regulators, modifying transcription factor activity, or regulating the association and activity of co-regulators. For example, the noncoding RNA Evf-2 functions as a co-activator for the homeobox transcription factor Dlx2, which plays important roles in forebrain development and neurogenesis.
Sonic hedgehog induces transcription of Evf-2 from an ultra-conserved element located between the Dlx5 and Dlx6 genes during forebrain development. Evf-2 then recruits the Dlx2 transcription factor to the same ultra-conserved element, whereby Dlx2 subsequently induces expression of Dlx5. The existence of other similar ultra- or highly conserved elements within the mammalian genome that are both transcribed and fulfill enhancer functions suggests that Evf-2 may be illustrative of a generalised mechanism that regulates developmental genes with complex expression patterns during vertebrate growth. Indeed, the transcription and expression of similar non-coding ultraconserved elements were shown to be abnormal in human leukaemia and to contribute to apoptosis in colon cancer cells, suggesting their involvement in tumorigenesis.

Local ncRNAs can also recruit transcriptional programmes to regulate adjacent protein-coding gene expression. For example, divergent lncRNAs that are transcribed in the opposite direction to nearby protein-coding genes (~20% of total lncRNAs in mammalian genomes) possibly regulate the transcription of adjacent essential developmental regulatory genes in pluripotent cells. The RNA-binding protein TLS binds and inhibits the CREB binding protein and p300 histone acetyltransferase activities on a repressed gene target, cyclin D1. The recruitment of TLS to the promoter of cyclin D1 is directed by long ncRNAs expressed at low levels and tethered to 5' regulatory regions in response to DNA damage signals. Moreover, these local ncRNAs act cooperatively as ligands to modulate the activities of TLS. In the broad sense, this mechanism allows the cell to harness RNA-binding proteins, which make up one of the largest classes within the mammalian proteome, and integrate their function in transcriptional programs. Nascent long ncRNAs have been shown to increase the activity of CREB binding protein, which in turn increases the transcription of that ncRNA. A study found that a lncRNA in the antisense direction of the apolipoprotein A1 (APOA1) gene regulates the transcription of APOA1 through epigenetic modifications. Recent evidence has raised the possibility that transcription of genes that escape from X-inactivation might be mediated by expression of long non-coding RNA within the escaping chromosomal domains.


Regulating basal transcription machinery

NcRNAs also target general transcription factors required for the RNAP II transcription of all genes. These general factors include components of the initiation complex that assemble on promoters or are involved in transcription elongation. A ncRNA transcribed from an upstream minor promoter of the dihydrofolate reductase (DHFR) gene forms a stable RNA-DNA triplex within the major promoter of DHFR to prevent the binding of the transcriptional co-factor TFIIB. This novel mechanism of regulating gene expression may represent a widespread method of controlling promoter usage, as thousands of RNA-DNA triplexes exist in eukaryotic chromosomes. The U1 ncRNA can induce transcription by binding to and stimulating TFIIH to phosphorylate the C-terminal domain of RNAP II. In contrast, the ncRNA 7SK is able to repress transcription elongation: in combination with HEXIM1/2, it forms an inactive complex that prevents P-TEFb from phosphorylating the C-terminal domain of RNAP II, repressing global elongation under stressful conditions. These examples, which bypass specific modes of regulation at individual promoters, provide a means of quickly effecting global changes in gene expression.

The ability to quickly mediate global changes is also apparent in the rapid expression of non-coding repetitive sequences. The short interspersed nuclear (SINE) Alu elements in humans and the analogous B1 and B2 elements in mice have succeeded in becoming the most abundant mobile elements within their genomes, comprising ~10% of the human and ~6% of the mouse genome, respectively. These elements are transcribed as ncRNAs by RNAP III in response to environmental stresses such as heat shock, where they then bind to RNAP II with high affinity and prevent the formation of active pre-initiation complexes. This allows for the broad and rapid repression of gene expression in response to stress.

A dissection of the functional sequences within Alu RNA transcripts has outlined a modular structure analogous to the organization of domains in protein transcription factors. The Alu RNA contains two 'arms', each of which may bind one RNAP II molecule, as well as two regulatory domains that are responsible for RNAP II transcriptional repression in vitro. These two loosely structured domains may even be concatenated to other ncRNAs such as B1 elements to impart their repressive role. The abundance and distribution of Alu elements and similar repetitive elements throughout the mammalian genome may be partly due to these functional domains being co-opted into other long ncRNAs during evolution, with the presence of functional repeat sequence domains being a common characteristic of several known long ncRNAs including Kcnq1ot1, Xlsirt and Xist.

In addition to heat shock, the expression of SINE elements (including Alu, B1, and B2 RNAs) increases during cellular stress such as viral infection in some cancer cells, where they may similarly regulate global changes to gene expression. The ability of Alu and B2 RNA to bind directly to RNAP II provides a broad mechanism to repress transcription. Nevertheless, there are specific exceptions to this global response where Alu or B2 RNAs are not found at activated promoters of genes undergoing induction, such as the heat shock genes. This additional hierarchy of regulation that exempts individual genes from the generalised repression also involves a long ncRNA, heat shock RNA-1 (HSR-1). It was argued that HSR-1 is present in mammalian cells in an inactive state, but upon stress is activated to induce the expression of heat shock genes
. This activation involves a conformational alteration of HSR-1 in response to rising temperatures, permitting its interaction with the
transcriptional activator A transcriptional activator is a protein (transcription factor) that increases transcription of a gene or set of genes. Activators are considered to have ''positive'' control over gene expression, as they function to promote gene transcription and ...
HSF-1, which trimerizes and induces the expression of heat shock genes. In the broad sense, these examples illustrate a regulatory circuit nested within ncRNAs whereby Alu or B2 RNAs repress general
gene expression Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product that enables it to produce end products, protein or non-coding RNA, and ultimately affect a phenotype, as the final effect. T ...
, while other ncRNAs activate the expression of specific
gene In biology, the word gene (from , ; "...Wilhelm Johannsen coined the word gene to describe the Mendelian units of heredity..." meaning ''generation'' or ''birth'' or ''gender'') can have several different meanings. The Mendelian gene is a b ...
s.


Transcribed by RNA polymerase III

Many of the ncRNAs that interact with general transcription factors or RNAP II itself (including 7SK, Alu, B1 and B2 RNAs) are transcribed by RNAP III, uncoupling their expression from RNAP II, which they regulate. RNAP III also transcribes other ncRNAs, such as BC2, BC200 and some microRNAs and snoRNAs, in addition to housekeeping ncRNA genes such as tRNAs, 5S rRNAs and snRNAs. The existence of an RNAP III-dependent ncRNA transcriptome that regulates its RNAP II-dependent counterpart is supported by the finding of a set of ncRNAs transcribed by RNAP III with sequence homology to protein-coding genes. This prompted the authors to posit a 'cogene/gene' functional regulatory network, showing that one of these ncRNAs, 21A, regulates the expression of its antisense partner gene, CENP-F, in trans.


In post-transcriptional regulation

In addition to regulating transcription, ncRNAs also control various aspects of post-transcriptional mRNA processing. Similar to small regulatory RNAs such as microRNAs and snoRNAs, these functions often involve complementary base pairing with the target mRNA. The formation of RNA duplexes between complementary ncRNA and mRNA may mask key elements within the mRNA required to bind trans-acting factors, potentially affecting any step in post-transcriptional gene expression including pre-mRNA processing and splicing, transport, translation, and degradation.
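
The duplex-masking idea can be illustrated with a small sketch. The following Python example is only an illustration under simplifying assumptions: it considers perfect Watson-Crick pairing only (no G-U wobble, no mismatches, no secondary structure), and the function names, the `min_len` threshold and the sequences in the usage note are hypothetical and not taken from any published analysis. It reports which stretches of an mRNA could be covered by an antisense ncRNA.

```python
# Translation table for Watson-Crick RNA base pairs (assumption: no G-U wobble).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def find_masked_sites(mrna: str, ncrna: str, min_len: int = 8) -> list[tuple[int, int]]:
    """Return half-open (start, end) spans of the mRNA that could be masked.

    A window of the mRNA counts as maskable when it is perfectly
    complementary (antiparallel) to some stretch of the ncRNA, i.e. when it
    occurs in the reverse complement of the ncRNA. Overlapping or adjacent
    windows are merged into a single span.
    """
    mrna = mrna.upper()
    target = reverse_complement(ncrna)
    spans: list[tuple[int, int]] = []
    for start in range(len(mrna) - min_len + 1):
        if mrna[start:start + min_len] in target:
            end = start + min_len
            if spans and start <= spans[-1][1]:
                # Extend the previous span instead of opening a new one.
                spans[-1] = (spans[-1][0], max(spans[-1][1], end))
            else:
                spans.append((start, end))
    return spans
```

For a made-up mRNA `"AAAAGGUAAGUCCCCC"` and a made-up antisense transcript `"GACUUACC"`, `find_masked_sites` returns the single span `(4, 12)`, the region where a factor-binding element such as a splice site would be hidden inside the duplex. Real analyses would need to allow for mismatches, wobble pairs and RNA folding, which this sketch deliberately ignores.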


In splicing

The splicing of mRNA can induce its translation and functionally diversify the repertoire of proteins it encodes. The Zeb2 mRNA requires the retention of a 5'UTR intron that contains an internal ribosome entry site for efficient translation. The retention of the intron depends on the expression of an antisense transcript that complements the intronic 5' splice site. Therefore, the ectopic expression of the antisense transcript represses splicing and induces translation of the Zeb2 mRNA during mesenchymal development. Likewise, the expression of an overlapping antisense Rev-ErbAα2 transcript controls the alternative splicing of the thyroid hormone receptor ErbAα2 mRNA to form two antagonistic isoforms.


In translation

NcRNA may also apply additional regulatory pressures during translation, a property particularly exploited in neurons, where the dendritic or axonal translation of mRNA in response to synaptic activity contributes to changes in synaptic plasticity and the remodelling of neuronal networks. The RNAP III-transcribed BC1 and BC200 ncRNAs, which previously derived from tRNAs, are expressed in the mouse and human central nervous system, respectively. BC1 expression is induced in response to synaptic activity and synaptogenesis, and the RNA is specifically targeted to dendrites in neurons. Sequence complementarity between BC1 and regions of various neuron-specific mRNAs also suggests a role for BC1 in targeted translational repression. Indeed, it was recently shown that BC1 is associated with translational repression in dendrites to control the efficiency of dopamine D2 receptor-mediated transmission in the striatum, and BC1 RNA-deleted mice exhibit behavioural changes with reduced exploration and increased anxiety.


In siRNA-directed gene regulation

In addition to masking key elements within single-stranded RNA, the formation of double-stranded RNA duplexes can also provide a substrate for the generation of endogenous siRNAs (endo-siRNAs) in Drosophila and mouse oocytes. The annealing of complementary sequences, such as antisense or repetitive regions between transcripts, forms an RNA duplex that may be processed by Dicer-2 into endo-siRNAs. Also, long ncRNAs that form extended intramolecular hairpins may be processed into siRNAs, compellingly illustrated by the esi-1 and esi-2 transcripts. Endo-siRNAs generated from these transcripts seem particularly useful in suppressing the spread of mobile transposon elements within the genome in the germline. However, the generation of endo-siRNAs from antisense transcripts or pseudogenes may also silence the expression of their functional counterparts via RISC effector complexes, acting as an important node that integrates various modes of long and short RNA regulation, as exemplified by Xist and Tsix (see above).


In epigenetic regulation

Epigenetic modifications, including histone and DNA methylation, histone acetylation and sumoylation, affect many aspects of chromosomal biology, primarily including regulation of large numbers of genes by remodeling broad chromatin domains. While it has been known for some time that RNA is an integral component of chromatin, the means by which RNA is involved in pathways of chromatin modification have only recently begun to be appreciated. For example, Oplr16 epigenetically induces the activation of stem cell core factors by coordinating intrachromosomal looping and recruitment of the DNA demethylase TET2. In Drosophila, long ncRNAs induce the expression of the homeotic gene Ubx by recruiting and directing the chromatin-modifying functions of the trithorax protein Ash1 to Hox regulatory elements. Similar models have been proposed in mammals, where strong epigenetic mechanisms are thought to underlie the embryonic expression profiles of the Hox genes that persist throughout human development. Indeed, the human Hox genes are associated with hundreds of ncRNAs that are sequentially expressed along both the spatial and temporal axes of human development and define chromatin domains of differential histone methylation and RNA polymerase accessibility. One ncRNA, termed HOTAIR, that originates from the HOXC locus represses transcription across 40 kb of the HOXD locus by altering chromatin trimethylation state. HOTAIR is thought to achieve this by directing the action of Polycomb chromatin remodeling complexes in trans to govern the cells' epigenetic state and subsequent gene expression. Components of the Polycomb complex, including Suz12, EZH2 and EED, contain RNA binding domains that may potentially bind HOTAIR and probably other similar ncRNAs. This example illustrates a broader theme whereby ncRNAs recruit the function of a generic suite of chromatin-modifying proteins to specific genomic loci, underscoring the complexity of recently published genomic maps. Indeed, the prevalence of long ncRNAs associated with protein-coding genes may contribute to localised patterns of chromatin modifications that regulate gene expression during development. For example, the majority of protein-coding genes have antisense partners, including many tumour suppressor genes that are frequently silenced by epigenetic mechanisms in cancer. A recent study observed an inverse expression profile of the p15 gene and an antisense ncRNA in leukaemia. A detailed analysis showed the p15 antisense ncRNA (CDKN2BAS) was able to induce changes to heterochromatin and DNA methylation status of p15 by an unknown mechanism, thereby regulating p15 expression. Therefore, misexpression of the associated antisense ncRNAs may subsequently silence the tumour suppressor gene, contributing towards cancer.


Imprinting

Many emergent themes of ncRNA-directed chromatin modification were first apparent within the phenomenon of imprinting, whereby only one allele of a gene is expressed from either the maternal or the paternal chromosome. In general, imprinted genes are clustered together on chromosomes, suggesting the imprinting mechanism acts upon local chromosome domains rather than individual genes. These clusters are also often associated with long ncRNAs whose expression is correlated with the repression of the linked protein-coding gene on the same allele. Indeed, detailed analysis has revealed a crucial role for the ncRNAs Kcnq1ot1 and Igf2r/Air in directing imprinting. Almost all the genes at the Kcnq1 locus are maternally expressed, except the paternally expressed antisense ncRNA Kcnq1ot1. Transgenic mice with truncated Kcnq1ot1 fail to silence the adjacent genes, suggesting that Kcnq1ot1 is crucial to the imprinting of genes on the paternal chromosome. It appears that Kcnq1ot1 is able to direct the trimethylation of lysine 9 (H3K9me3) and lysine 27 (H3K27me3) of histone 3 to an imprinting centre that overlaps the Kcnq1ot1 promoter and actually resides within a Kcnq1 sense exon. Similar to HOTAIR (see above), Eed-Ezh2 Polycomb complexes are recruited to the Kcnq1 locus on the paternal chromosome, possibly by Kcnq1ot1, where they may mediate gene silencing through repressive histone methylation. A differentially methylated imprinting centre also overlaps the promoter of a long antisense ncRNA, Air, that is responsible for the silencing of neighbouring genes at the Igf2r locus on the paternal chromosome. The presence of allele-specific histone methylation at the Igf2r locus suggests Air also mediates silencing via chromatin modification.


Xist and X-chromosome inactivation

The inactivation of an X-chromosome in female placental mammals is directed by one of the earliest and best characterized long ncRNAs, Xist. The expression of Xist from the future inactive X-chromosome, and its subsequent coating of the inactive X-chromosome, occurs during early embryonic stem cell differentiation. Xist expression is followed by irreversible layers of chromatin modifications that include the loss of the histone H3K9 acetylation and H3K4 methylation that are associated with active chromatin, and the induction of repressive chromatin modifications including H4 hypoacetylation, H3K27 trimethylation, H3K9 hypermethylation and H4K20 monomethylation as well as H2AK119 monoubiquitylation. These modifications coincide with the transcriptional silencing of the X-linked genes. Xist RNA also localises the histone variant macroH2A to the inactive X-chromosome. Additional ncRNAs are also present at the Xist locus, including an antisense transcript, Tsix, which is expressed from the future active chromosome and is able to repress Xist expression by the generation of endogenous siRNA. Together these ncRNAs ensure that only one X-chromosome is active in female mammals.


Telomeric non-coding RNAs

Telomeres form the terminal region of mammalian chromosomes, are essential for stability and aging, and play central roles in diseases such as cancer. Telomeres were long considered transcriptionally inert DNA-protein complexes until it was shown in the late 2000s that telomeric repeats may be transcribed as telomeric RNAs (TelRNAs) or telomeric repeat-containing RNAs. These ncRNAs are heterogeneous in length, transcribed from several sub-telomeric loci and physically localise to telomeres. Their association with chromatin, which suggests an involvement in regulating telomere-specific heterochromatin modifications, is repressed by SMG proteins that protect chromosome ends from telomere loss. In addition, TelRNAs block telomerase activity in vitro and may therefore regulate telomerase activity. Although early, these studies suggest an involvement for telomeric ncRNAs in various aspects of telomere biology.


In regulation of DNA replication timing and chromosome stability

Asynchronously replicating autosomal RNAs (ASARs) are very long (~200 kb) non-coding RNAs that are non-spliced and non-polyadenylated, and are required for normal DNA replication timing and chromosome stability. Deletion of any one of the genetic loci containing ASAR6, ASAR15, or ASAR6-141 results in the same phenotype of delayed replication timing and delayed mitotic condensation (DRT/DMC) of the entire chromosome. DRT/DMC results in chromosomal segregation errors that lead to increased frequency of secondary rearrangements and an unstable chromosome. Similar to Xist, ASARs show random monoallelic expression and exist in asynchronous DNA replication domains. Although the mechanism of ASAR function is still under investigation, it is hypothesized that they work via mechanisms similar to those of the Xist lncRNA, but on smaller autosomal domains, resulting in allele-specific changes in gene expression. Incorrect repair of DNA double-strand breaks (DSBs), leading to chromosomal rearrangements, is one of the primary causes of oncogenesis. A number of lncRNAs are crucial at different stages of the main pathways of DSB repair in eukaryotic cells: nonhomologous end joining (NHEJ) and homology-directed repair (HDR). Gene mutations or variation in expression levels of such RNAs can lead to local DNA repair defects, increasing the chromosome aberration frequency. Moreover, it was demonstrated that some RNAs could stimulate long-range chromosomal rearrangements.


In aging and disease

The discovery that long ncRNAs function in various aspects of cell biology has led to research on their role in disease. Tens of thousands of lncRNAs are potentially associated with diseases based on multi-omics evidence. A handful of studies have implicated long ncRNAs in a variety of disease states and support their involvement and co-operation in neurological disease and cancer.

The first published report of an alteration in lncRNA abundance in aging and human neurological disease was provided by Lukiw et al. in a study using short post-mortem interval tissues from patients with Alzheimer's disease and non-Alzheimer's dementia (NAD); this early work was based on the prior identification, by Watson and Sutcliffe in 1987, of a primate brain-specific cytoplasmic transcript of the Alu repeat family known as BC200 (brain, cytoplasmic, 200 nucleotide).

While many association studies have identified unusual expression of long ncRNAs in disease states, there is little understanding of their role in causing disease. Expression analyses that compare
tumor cells and normal cells have revealed changes in the expression of ncRNAs in several forms of cancer. For example, in prostate tumours, PCGEM1 (one of two overexpressed ncRNAs) is correlated with increased proliferation and colony formation, suggesting an involvement in regulating cell growth. PRNCR1 was found to promote tumor growth in several malignancies, such as prostate cancer, breast cancer, non-small cell lung cancer, oral squamous cell carcinoma and colorectal cancer. MALAT1 (also known as NEAT2) was originally identified as an abundantly expressed ncRNA that is upregulated during metastasis of early-stage non-small cell lung cancer, and its overexpression is an early prognostic marker for poor patient survival rates. LncRNAs such as HEAT2 or KCNQ1OT1 have been shown to be regulated in the blood of patients with cardiovascular diseases such as heart failure or coronary artery disease and, moreover, to predict cardiovascular disease events. More recently, the highly conserved mouse homologue of MALAT1 was found to be highly expressed in hepatocellular carcinoma. Intronic antisense ncRNAs with expression correlated to the degree of tumor differentiation in prostate cancer samples have also been reported. Despite a number of long ncRNAs having aberrant expression in cancer, their function and potential role in tumourigenesis are relatively unknown. For example, the ncRNAs HIS-1 and BIC have been implicated in cancer development and growth control, but their function in normal cells is unknown.

In addition to cancer, ncRNAs also exhibit aberrant expression in other disease states. Overexpression of PRINS is associated with
psoriasis susceptibility, with PRINS expression being elevated in the uninvolved epidermis of psoriatic patients compared with both psoriatic lesions and healthy epidermis.

Genome-wide profiling revealed that many transcribed non-coding ultraconserved regions exhibit distinct profiles in various human cancer states. An analysis of chronic lymphocytic leukaemia, colorectal carcinoma and hepatocellular carcinoma found that all three cancers exhibited aberrant expression profiles for ultraconserved ncRNAs relative to normal cells. Further analysis of one ultraconserved ncRNA suggested it behaved like an oncogene by mitigating apoptosis and subsequently expanding the number of malignant cells in colorectal cancers. Many of these transcribed ultraconserved sites that exhibit distinct signatures in cancer are found at fragile sites and genomic regions associated with cancer. It seems likely that the aberrant expression of these ultraconserved ncRNAs within malignant processes results from important functions they fulfil in normal human
development.

Recently, a number of association studies have mapped single nucleotide polymorphisms (SNPs) associated with disease states to long ncRNAs. For example, SNPs that identified a susceptibility locus for myocardial infarction mapped to a long ncRNA, MIAT (myocardial infarction associated transcript). Likewise, genome-wide association studies identified a region associated with coronary artery disease that encompassed a long ncRNA, ANRIL. ANRIL is expressed in tissues and cell types affected by atherosclerosis, and its altered expression is associated with a high-risk haplotype for coronary artery disease.

The complexity of the transcriptome, and our evolving understanding of its structure, may inform a reinterpretation of the functional basis for many natural polymorphisms associated with disease states. Many SNPs associated with certain disease conditions are found within non-coding regions, and the complex networks of non-coding transcription within these regions make it particularly difficult to elucidate the functional effects of polymorphisms. For example, a SNP both within the truncated form of ZFAT and the promoter of an antisense transcript increases the expression of ZFAT not through increasing the
mRNA stability, but rather by repressing the expression of the antisense transcript. The ability of long ncRNAs to regulate associated protein-coding genes may contribute to disease if misexpression of a long ncRNA deregulates a protein-coding gene with clinical significance. In a similar manner, an antisense long ncRNA that regulates the expression of the sense BACE1 gene, which encodes a crucial enzyme in Alzheimer's disease aetiology, exhibits elevated expression in several regions of the brain in individuals with Alzheimer's disease. Alteration of the expression of ncRNAs may also mediate changes at an epigenetic level to affect gene expression and contribute to disease aetiology. For example, the induction of an antisense transcript by a genetic mutation led to DNA methylation and silencing of sense genes, causing β-thalassemia in a patient.

Alongside their role in mediating pathological processes, long noncoding RNAs play a role in the immune response to vaccination, as identified for both the influenza vaccine and the yellow fever vaccine.


See also

* List of long non-coding RNA databases
* NONCODE
* Pinc
* Sphinx (gene)
* VIS1
* ZNRD1-AS1


