LTR retrotransposons are class I transposable elements (TEs) characterized by the presence of long terminal repeats (LTRs) directly flanking an internal coding region. As retrotransposons, they mobilize through reverse transcription of their mRNA and integration of the newly created cDNA into another genomic location. Their mechanism of retrotransposition is shared with retroviruses, with the difference that LTR retrotransposons spread mainly by vertical transfer, passing active TE insertions to the progeny, and only at a much lower rate by horizontal transfer. LTR retrotransposons that form virus-like particles are classified under ''Ortervirales''. Their size ranges from a few hundred base pairs to 30 kb; the largest elements reported to date are members of the Burro retrotransposon family in ''Schmidtea mediterranea''. In plant genomes, LTR retrotransposons are the major class of repetitive sequence, constituting, for example, more than 75% of the maize genome. They make up about 8% of the human genome and approximately 10% of the mouse genome.


Structure and propagation

LTR retrotransposons have direct long terminal repeats that range from ~100 bp to over 5 kb in size. They are further sub-classified into the Ty1-''copia''-like (''Pseudoviridae''), Ty3-like (''Metaviridae'', formerly referred to as Gypsy-like, a name that is being considered for retirement), and BEL-Pao-like (''Belpaoviridae'') groups based on both their degree of sequence similarity and the order of encoded gene products. The Ty1-''copia'' and Ty3 (''Metaviridae'') groups of retrotransposons are commonly found in high copy number (up to a few million copies per haploid nucleus) in animal, fungal, protist, and plant genomes. BEL-Pao-like elements have so far been found only in animals.

All functional LTR retrotransposons encode a minimum of two genes, ''gag'' and ''pol'', that are sufficient for their replication. The ''gag'' gene encodes a polyprotein with a capsid and a nucleocapsid domain. Gag proteins form virus-like particles in the cytoplasm, inside which reverse transcription occurs. The ''pol'' gene produces three proteins: a protease (PR), a reverse transcriptase with an RT (reverse transcriptase) domain and an RNase H domain, and an integrase (IN).

Typically, LTR retrotransposon mRNAs are produced by the host RNA polymerase II acting on a promoter located in the 5' LTR. The ''gag'' and ''pol'' genes are encoded in the same mRNA. Depending on the host species, two different strategies can be used to express the two polyproteins: fusion into a single open reading frame (ORF) whose product is then cleaved, or the introduction of a frameshift between the two ORFs. In the latter case, occasional ribosomal frameshifting allows the production of both proteins while ensuring that much more Gag protein is produced to form virus-like particles.

Reverse transcription usually initiates at a short sequence located immediately downstream of the 5' LTR, termed the primer binding site (PBS). Specific host tRNAs bind to the PBS and act as primers for reverse transcription, which occurs in a complex, multi-step process and ultimately produces a double-stranded cDNA molecule. The cDNA is finally integrated into a new location, creating short target site duplications (TSDs) and adding a new copy to the host genome.
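
The final integration step can be illustrated with a toy string model. The sketch below is only an illustration under stated assumptions: the function names `build_element` and `integrate`, the example sequences, and the default duplication length of 5 bases are all hypothetical choices, and the model ignores the enzymatic details of integrase. It shows only the outcome described above: a new element copy, bounded by identical LTRs, flanked on both sides by a short duplicated stretch of the target site.

```python
def build_element(ltr: str, internal: str) -> str:
    """Return a toy LTR retrotransposon: identical LTRs flanking an internal region."""
    return ltr + internal + ltr


def integrate(genome: str, element: str, pos: int, tsd_len: int = 5) -> str:
    """Insert `element` at `pos` and duplicate the target site on both sides.

    The `tsd_len` bases starting at `pos` stand in for the target site.
    After insertion, one copy of that site sits on each side of the element,
    which is the target site duplication (TSD). The default length of 5 is
    an illustrative assumption, not a value taken from the text.
    """
    if tsd_len < 0 or pos < 0 or pos + tsd_len > len(genome):
        raise ValueError("target site does not fit inside the genome")
    target = genome[pos:pos + tsd_len]
    return genome[:pos] + target + element + target + genome[pos + tsd_len:]


# Hypothetical toy sequences, chosen only to make the result easy to read.
genome = "GGGGACGTACCCC"
element = build_element(ltr="TGAGCA", internal="NNNNNNNN")
new_genome = integrate(genome, element, pos=4)
print(new_genome)
```

In the resulting string, the target site `ACGTA` appears once before and once after the inserted element, so the genome grows by the element length plus the duplication length.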


Types


Ty1-''copia'' retrotransposons

Ty1-''copia'' retrotransposons are abundant in species ranging from single-cell algae to bryophytes, gymnosperms, and angiosperms. They encode four protein domains in the following order: protease, integrase, reverse transcriptase, and ribonuclease H.

At least two classification systems exist for the subdivision of Ty1-''copia'' retrotransposons into five lineages: ''Sireviruses''/Maximus, Oryco/Ivana, Retrofit/Ale, TORK (subdivided into Angela/Sto, TAR/Fourf, and GMR/Tork), and Bianca.

''Sireviruses''/Maximus retrotransposons contain an additional putative envelope gene. This lineage is named for the founder element SIRE1 in the ''Glycine max'' genome, and was later described in many species such as ''Zea mays'', ''Arabidopsis thaliana'', ''Beta vulgaris'', and ''Pinus pinaster''. Plant ''Sireviruses'' from many sequenced plant genomes are summarized in the MASIVEdb ''Sirevirus'' database.


Ty3-retrotransposons (formerly gypsy)

Ty3-retrotransposons are widely distributed in the plant kingdom, including both gymnosperms and angiosperms. They encode at least four protein domains in the order: protease, reverse transcriptase, ribonuclease H, and integrase. Based on structure, presence/absence of specific protein domains, and conserved protein sequence motifs, they can be subdivided into several lineages.

''Errantiviruses'' contain an additional defective envelope ORF with similarities to the retroviral envelope gene. First described as Athila elements in ''Arabidopsis thaliana'', they were later identified in many species, such as ''Glycine max'' and ''Beta vulgaris''.

''Chromoviruses'' contain an additional chromodomain (chromatin organization modifier domain) at the C-terminus of their integrase protein. They are widespread in plants and fungi and have probably retained their protein domains during the evolution of these two kingdoms. It is thought that the chromodomain directs retrotransposon integration to specific target sites. According to the sequence and structure of the chromodomain, chromoviruses are subdivided into four clades: CRM, Tekay, Reina, and Galadriel. Chromoviruses from each clade show distinctive integration patterns, e.g. into centromeres or into the rRNA genes.

Ogre elements are gigantic Ty3-retrotransposons reaching lengths of up to 25 kb. They were first described in ''Pisum sativum''.

''Metaviruses'' are conventional Ty3-''gypsy'' retrotransposons that do not contain additional domains or ORFs.

The Sushi family of Ty3 long terminal repeat retrotransposons was first identified in teleost fish, and Sushi-like neogenes were subsequently identified in mammals. Mammalian retrotransposon-derived transcripts (MARTs) cannot transpose but have retained open reading frames, show high levels of evolutionary conservation, and are subject to selective pressures, which suggests that some have become neofunctionalized genes with new cellular functions. Retrotransposon gag-like-3 (RTL3/ZCCHC5/MART3) is one of eleven Sushi-like neogenes identified in the human genome.
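
The two domain orders given for Ty1-''copia'' and Ty3 elements can be turned into a simple lookup. The following is a minimal sketch, assuming that the domains of an element have already been annotated and are supplied as a list of short labels. The labels (`PR`, `IN`, `RT`, `RH`) and the function name `classify_by_domain_order` are hypothetical. Since classification in practice also relies on sequence similarity, domain order alone should be read as a first-pass hint rather than a definitive assignment.

```python
# Domain orders as stated in the text; the short labels are illustrative.
COPIA_ORDER = ("PR", "IN", "RT", "RH")  # protease, integrase, RT, RNase H
TY3_ORDER = ("PR", "RT", "RH", "IN")    # protease, RT, RNase H, integrase

CORE_DOMAINS = {"PR", "IN", "RT", "RH"}


def classify_by_domain_order(domains):
    """Guess the group of an element from the order of its core pol domains.

    Labels outside the four core domains (for example a gag region, an
    envelope-like ORF, or a chromodomain) are ignored, so lineages with
    additional domains still match their parent group.
    """
    core = tuple(d for d in domains if d in CORE_DOMAINS)
    if core == COPIA_ORDER:
        return "Ty1-copia"
    if core == TY3_ORDER:
        return "Ty3"
    return "unclassified"


# Hypothetical annotations; "CHD" stands for a chromodomain after integrase.
print(classify_by_domain_order(["GAG", "PR", "IN", "RT", "RH"]))
print(classify_by_domain_order(["GAG", "PR", "RT", "RH", "IN", "CHD"]))
print(classify_by_domain_order(["GAG"]))
```

An element lacking one or more core domains falls through to `unclassified`, which is the intended behavior for truncated or non-autonomous copies in this sketch.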


BEL/pao family

The BEL/pao family is found in animals.


Endogenous retroviruses (ERV)

Although retroviruses are often classified separately, they share many features with LTR retrotransposons. A major difference from Ty1-''copia'' and Ty3-''gypsy'' retrotransposons is that retroviruses have an envelope protein (ENV). A retrovirus can be transformed into an LTR retrotransposon through inactivation or deletion of the domains that enable extracellular mobility. If such a retrovirus infects germ line cells and subsequently inserts itself into their genome, it may be transmitted vertically and become an endogenous retrovirus.


Terminal repeat retrotransposons in miniature (TRIMs)

Some LTR retrotransposons lack all of their coding domains. Due to their short size, they are referred to as terminal repeat retrotransposons in miniature (TRIMs). Nevertheless, TRIMs may still be able to retrotranspose by relying on the coding domains of autonomous Ty1-''copia'' or Ty3-''gypsy'' retrotransposons. Among the TRIMs, the Cassandra family stands out because it is unusually widespread among higher plants. In contrast to all other characterized TRIMs, Cassandra elements harbor a 5S rRNA promoter in their LTR sequence. Due to their short overall length and the relatively high contribution of the flanking LTRs, TRIMs are prone to rearrangements by recombination.

