LINE1

LINE1 (an abbreviation of long interspersed nuclear element-1, also known as L1 and LINE-1) is a family of related class I transposable elements in the DNA of many groups of eukaryotes, including animals and plants, classified with the long interspersed nuclear elements (LINEs). L1 transposons are most widespread in mammals, where they make up a significant fraction of the total genome length; for example, they comprise approximately 17% of the human genome. Active L1s can disrupt the genome through insertions, deletions, rearrangements, and copy number variations. L1 activity has contributed to the instability and evolution of genomes and is tightly regulated in the germline by DNA methylation, histone modifications, and piRNA. L1s can further affect genome variation through mispairing and unequal crossing over during meiosis, owing to their repetitive DNA sequences. L1 gene products are also required by many non-autonomous Alu and SVA SINE retrotransposons. Mutations induced by L1 and its non-autonomous counterparts have been found to cause a variety of heritable and somatic diseases. In 2011, human L1 was reportedly discovered in the genome of the gonorrhea bacterium, evidently having arrived there by horizontal gene transfer.


Structure

A typical L1 element is approximately 6,000 base pairs (bp) long and consists of two non-overlapping open reading frames (ORFs), which are flanked by untranslated regions (UTRs) and target site duplications. In humans, ORF2 is thought to be translated by an unconventional termination/reinitiation mechanism, while mouse L1s contain an internal ribosome entry site (IRES) upstream of each ORF.
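To make the idea of two non-overlapping ORFs concrete, the following is a minimal sketch of a forward-strand ORF scan. The function name find_orfs, the min_codons parameter, and the toy sequence are illustrative assumptions, not part of any L1 annotation tool, and real L1 annotation relies on curated consensus sequences rather than a naive ATG-to-stop scan.

```python
# Toy scan for open reading frames (ORFs) on the forward strand only.
# All names and the example sequence are hypothetical illustrations.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) spans, 0-based and end-exclusive, for ATG...stop ORFs.

    Each of the three forward reading frames is scanned independently.
    min_codons counts codons from the ATG through the stop codon, inclusive.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close this ORF and look for the next ATG
    return sorted(orfs)


# Toy sequence containing two short, non-overlapping ORFs.
toy = "CCATGAAATAAGGATGCCCGGGTGAT"
print(find_orfs(toy))
```

In this toy input the two ORFs lie in different reading frames and are separated by a short spacer, loosely mirroring the ORF1/ORF2 arrangement described above; the coordinates have no biological meaning.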


5' UTR

The 5' UTRs of mouse L1s contain a variable number of GC-rich, tandemly repeated monomers of around 200 bp, followed by a short non-monomeric region. Human 5' UTRs are ~900 bp in length and do not contain repeated motifs. All families of human L1s harbor, at their extreme 5' end, a binding motif for the transcription factor YY1. Younger families also have two binding sites for SOX-family transcription factors, and both the YY1 and SOX sites were shown to be required for human L1 transcription initiation and activation. Both mouse and human 5' UTRs also contain a weak antisense promoter of unknown function.


ORF1

The first ORF of L1 encodes a 500-amino acid, 40-kDa protein that lacks homology with any protein of known function. In vertebrates, it contains a conserved C-terminal domain and a highly variable coiled-coil N-terminus that mediates the formation of ORF1 trimeric complexes. ORF1 trimers have RNA-binding and nucleic acid chaperone activities that are necessary for retrotransposition.


ORF2

The second ORF of L1 encodes a protein that has endonuclease and reverse transcriptase activity. The encoded protein has a molecular weight of 150 kDa. The structure of the ORF2 protein was solved in 2023. Its protein core contains three domains of unknown function: the "tower/EN-linker" and the "wrist/RNA-binding domain", which bind the polyA tail of Alu RNA, and a C-terminal domain, which binds the Alu RNA stem loop. The nicking and reverse transcriptase activities of L1 ORF2p are boosted by single-stranded DNA structures that are likely present on active replication forks. Unlike viral RTs, L1 ORF2p can be primed by RNA, including RNA hairpin primers produced by the Alu element.
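As a rough illustration of what reverse transcriptase activity produces, the sketch below derives the first-strand cDNA sequence from an RNA template by base pairing alone. The function name reverse_transcribe and the toy RNA are hypothetical; the sketch ignores priming, nicking, and integration and is not a model of the ORF2p mechanism itself.

```python
# Base-pairing rule used when copying an RNA template into DNA.
# Names and the example template are hypothetical illustrations.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """Return the first-strand cDNA (written 5'->3') for an RNA written 5'->3'.

    The cDNA is antiparallel to its template, so the template is read
    from its 3' end while each base is replaced by its DNA complement.
    """
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna.upper()))


# A toy template ending in a short polyA tail.
print(reverse_transcribe("AUGGCAAAA"))
```

Because synthesis starts at the 3' end of the template, the polyA tail of the RNA corresponds to a run of T at the 5' end of the cDNA.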


Regulation

As with other transposable elements, the host organism keeps LINE1 under tight control to prevent it from becoming overly active. In the primitive eukaryote Entamoeba histolytica, ORF2 is massively expressed in antisense, resulting in no detectable amounts of its protein product.


Roles in disease


Cancer

L1 activity has been observed in numerous types of cancers, with particularly extensive insertions found in colorectal and lung cancers. It is currently unclear whether these insertions are causes or secondary effects of cancer progression. However, in at least two cases, somatic L1 insertions were found to be causative of cancer by disrupting the coding sequences of the genes APC and PTEN in colon and endometrial cancer, respectively. Quantification of L1 copy number by qPCR and of L1 methylation levels by bisulfite sequencing are used as diagnostic biomarkers in some types of cancers. L1 hypomethylation in colon tumor samples is correlated with cancer stage progression. Furthermore, less invasive blood assays for L1 copy number or methylation levels are indicative of breast or bladder cancer progression and may serve as methods for early detection.
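The two measurements mentioned above can be sketched numerically. The text does not say how the qPCR or bisulfite data are analysed, so this sketch assumes two common conventions: the 2^-ΔΔCt method for relative copy number (with a single-copy reference gene and roughly 100% amplification efficiency) and the fraction of unconverted cytosine reads at a CpG site for methylation level. All function names and input values are hypothetical.

```python
def relative_copy_number(ct_l1_sample: float, ct_ref_sample: float,
                         ct_l1_control: float, ct_ref_control: float) -> float:
    """Estimate L1 copy number in a sample relative to a control (2^-ddCt).

    Assumes a single-copy reference gene and ~100% amplification
    efficiency, so one cycle of difference means a two-fold change.
    """
    delta_sample = ct_l1_sample - ct_ref_sample
    delta_control = ct_l1_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


def methylation_fraction(methylated_reads: int, unmethylated_reads: int) -> float:
    """Fraction of reads at a CpG site that remain C after bisulfite treatment.

    Bisulfite converts unmethylated cytosine to uracil (read as T), while
    methylated cytosine is protected and still reads as C.
    """
    total = methylated_reads + unmethylated_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return methylated_reads / total


# Hypothetical values: L1 amplifies one cycle earlier in the sample than in
# the control, with identical reference-gene Ct values.
print(relative_copy_number(14.0, 24.0, 15.0, 24.0))
# Hypothetical read counts at one CpG in the L1 5' UTR.
print(methylation_fraction(30, 70))
```

Under these assumptions, a lower methylation fraction in tumor tissue than in matched normal tissue would correspond to the L1 hypomethylation described above.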


Neuropsychiatric disorders

Higher L1 copy numbers have been observed in the human brain compared to other organs. Studies of animal models and human cell lines have shown that L1s become active in neural progenitor cells (NPCs), and that experimental deregulation or overexpression of L1 increases somatic mosaicism. This phenomenon is negatively regulated by Sox2, which is downregulated in NPCs, and by MeCP2 and methylation of the L1 5' UTR. Human cell lines modeling the neurological disorder Rett syndrome, which carry MeCP2 mutations, exhibit increased L1 transposition, suggesting a link between L1 activity and neurological disorders. Current studies are aimed at investigating the potential roles of L1 activity in various neuropsychiatric disorders, including schizophrenia, autism spectrum disorders, epilepsy, bipolar disorder, Tourette syndrome, and drug addiction. L1s are also highly expressed in the octopus brain, suggesting a convergent mechanism in complex cognition.


Retinal disease

Increased RNA levels of Alu, which requires L1 proteins, are associated with a form of age-related macular degeneration, a neurological disorder of the eyes. The naturally occurring mouse retinal degeneration model rd7 is caused by an L1 insertion in the Nr2e3 gene.


COVID-19

In 2021, a study proposed that L1 elements may be responsible for potential endogenisation of the SARS-CoV-2 genome in Huh7 mutant cancer cells, which could explain why some patients test PCR positive for SARS-CoV-2 even after clearance of the virus. These results, however, have been criticized as "mechanistically plausible but likely very rare", misleading, and infrequent or artefactual.


See also

* L1Base, a database of functional annotations and predictions of active LINE1 elements

