
Retrotransposon

A retrotransposon is a mobile genetic element that transposes within a genome via an RNA intermediate, employing reverse transcriptase to synthesize a DNA copy that integrates into a new genomic location, thereby increasing its copy number through a "copy-and-paste" mechanism. These elements, classified as Class I transposable elements, are ubiquitous across eukaryotic genomes and constitute a major portion of repetitive DNA. Retrotransposons are broadly divided into two categories: those with long terminal repeats (LTRs), which resemble retroviruses in structure and include endogenous retroviruses (ERVs), and non-LTR retrotransposons, which lack these repeats and encompass autonomous elements like LINEs (long interspersed nuclear elements) and non-autonomous ones like SINEs (short interspersed nuclear elements). LTR retrotransposons integrate using an integrase, while non-LTR types employ target-primed reverse transcription.
In humans, LINE-1 (L1) elements, the most abundant non-LTR retrotransposons, comprise about 17% of the genome, with roughly 100 active copies per individual capable of mobilization. SINEs, such as Alu elements, rely on LINE machinery for transposition and make up another significant fraction; overall, approximately 42% of the human genome is derived from retrotransposons. These elements play pivotal roles in genome evolution by driving structural variation, gene duplication, and the creation of new regulatory sequences, though their activity can also cause insertional mutagenesis implicated in diseases such as cancer and neurological disorders.
To mitigate deleterious effects, retrotransposons are tightly regulated through epigenetic silencing mechanisms, including DNA methylation, histone modifications, and RNA interference via piRNAs and siRNAs. Despite such controls, their reactivation under cellular stress or during aging contributes to genomic instability, underscoring their dual role as evolutionary innovators and potential mutagens.

Introduction

Definition

Retrotransposons are a class of mobile genetic elements that transpose within eukaryotic genomes via an RNA intermediate, using reverse transcriptase to convert the RNA into DNA for insertion at new genomic locations. This process lets retrotransposons increase their copy number through a "copy-and-paste" mechanism. Unlike DNA transposons, which mobilize through direct "cut-and-paste" excision and reinsertion of a DNA segment, retrotransposons amplify without excising the original copy, contributing significantly to genome expansion and variability.
Retrotransposons are classified as autonomous or non-autonomous. Autonomous retrotransposons encode their own reverse transcriptase; those with long terminal repeats (LTRs) additionally encode an integrase for integration, while autonomous non-LTR elements encode an endonuclease, enabling independent retrotransposition. Non-autonomous elements lack these coding regions and rely on the enzymatic machinery provided by autonomous elements for mobility. Retrotransposons are highly prevalent in eukaryotic genomes, comprising approximately 42% of the human genome, where they are major drivers of genome evolution and structural variation.
Structurally, retrotransposons are divided into those with LTRs and those without (non-LTR). LTR retrotransposons carry identical direct repeats at both ends that facilitate transcription and integration, resembling retroviral proviruses, while non-LTR retrotransposons typically contain internal promoter sequences, lack terminal repeats, and rely on target-primed reverse transcription for insertion. This structural dichotomy underlies their diverse impacts on host genomes.
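The contrast between "copy-and-paste" and "cut-and-paste" mobilization can be illustrated with a toy model. The sketch below is only an illustration under strong simplifying assumptions: the genome is a Python list of labelled loci, and the function names `copy_and_paste` and `cut_and_paste` are hypothetical, not taken from any library.

```python
def copy_and_paste(genome, source, target):
    """Retrotransposon-like move: the donor copy stays in place and a new
    copy (standing in for the reverse-transcribed cDNA) is inserted at `target`."""
    element = genome[source]
    return genome[:target] + [element] + genome[target:]


def cut_and_paste(genome, source, target):
    """DNA-transposon-like move: the element is excised from `source` and
    reinserted at `target` (an index into the genome after excision)."""
    element = genome[source]
    remaining = genome[:source] + genome[source + 1:]
    return remaining[:target] + [element] + remaining[target:]


genome = ["geneA", "TE", "geneB", "geneC"]
after_copy = copy_and_paste(genome, source=1, target=3)
after_cut = cut_and_paste(genome, source=1, target=2)

# Copy number rises only in the copy-and-paste case.
print(after_copy.count("TE"), after_cut.count("TE"))
```

In this toy example the copy-and-paste genome ends up with two copies of the element, while the cut-and-paste genome still has one, which is the reason retrotransposons can expand a genome over time.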

History and Discovery

The discovery of retrotransposons grew out of early observations of mobile genetic elements in the mid-20th century. In the 1940s and 1950s, Barbara McClintock's pioneering cytogenetic studies on maize revealed mutable loci and chromosome breakage events caused by transposable elements, which she termed "controlling elements," demonstrating that genes could relocate within the genome and influence phenotypic variability. Her work, initially met with skepticism, established the concept of genomic mobility and provided the framework for the later recognition of retrotransposons as a subclass of these elements that transpose via an RNA intermediate. McClintock received the Nobel Prize in Physiology or Medicine in 1983 for this discovery.
A pivotal breakthrough came in 1970 with the independent discoveries of reverse transcriptase by Howard Temin and Satoshi Mizutani, and by David Baltimore, which provided the enzymatic mechanism for the RNA-to-DNA conversion essential to retrotransposition. Temin's hypothesis of a DNA intermediate in RNA tumor viruses, validated by these findings, overturned the assumption that genetic information flows only from DNA to RNA and enabled the identification of retrovirus-like elements in eukaryotic genomes. The enzyme's role was confirmed through assays showing synthesis of DNA from viral RNA templates, earning Temin and Baltimore the 1975 Nobel Prize in Physiology or Medicine (shared with Renato Dulbecco).
In the 1970s, specific retrotransposons were identified, beginning with the Ty1 element in budding yeast (Saccharomyces cerevisiae), recognized as a mobile sequence causing insertional mutations. Cameron et al. cloned and characterized Ty1 in 1979, revealing its long terminal repeats (LTRs) and similarity to retroviral proviruses; Ty1 went on to become the first LTR retrotransposon demonstrated to mobilize via an RNA intermediate in a non-viral context.
The 1980s saw further advances in mammalian systems: LINE-1 (L1) elements were established as autonomous non-LTR retrotransposons in humans, with Fanning and Singer's 1987 review synthesizing evidence for their structure, reverse transcriptase-like open reading frames, and transposition activity. Concurrently, the retrotransposon nature of Alu sequences, short interspersed elements (SINEs) first noted as repetitive DNA in the 1960s, was clarified; Jagadeeswaran et al. proposed in 1981 that Alu RNAs serve as templates for reverse transcription and integration, and their dependence on LINE-1 machinery helps explain their proliferation in primate genomes.
The publication of the draft human genome sequence in 2001 profoundly illuminated retrotransposon abundance, revealing that these elements constitute over 40% of the genome, with LINEs (~17%), SINEs including Alus (~11%), and LTR retrotransposons (~8%) dominating the repetitive fraction and influencing gene regulation.
Post-2020 research has leveraged CRISPR-Cas9 to dissect retrotransposon activity, enabling precise editing and activation studies; recent work has reprogrammed site-specific retrotransposon insertion for therapeutic genome engineering. These advances underscore ongoing efforts to harness and mitigate retrotransposon dynamics in disease and development.

Molecular Mechanism

Transposition Process

Retrotransposons propagate through a "copy-and-paste" mechanism: their DNA is transcribed into RNA, the RNA is reverse-transcribed into complementary DNA (cDNA), and the cDNA is integrated at a new genomic location. This RNA-mediated process distinguishes retrotransposons from DNA transposons and enables amplification within the host genome. The cycle begins in the nucleus with transcription of the retrotransposon by RNA polymerase II, producing a full-length RNA transcript that serves both as mRNA for protein synthesis and as the template for reverse transcription. The RNA is exported to the cytoplasm, where it associates with retrotransposon-encoded proteins to form a ribonucleoprotein complex. The details of reverse transcription and integration differ between LTR and non-LTR retrotransposons.
For long terminal repeat (LTR) retrotransposons, this complex often assembles into virus-like particles that package and protect the RNA. Reverse transcription occurs in the cytoplasm, generating a double-stranded cDNA copy from the RNA template; this step is catalyzed by the element's reverse transcriptase and is inherently error-prone, introducing mutations that contribute to sequence variation among progeny elements. The resulting cDNA is then transported back to the nucleus for integration.
For non-LTR retrotransposons such as LINE-1, by contrast, the ribonucleoprotein complex is imported into the nucleus, where reverse transcription and integration are coupled through target-primed reverse transcription (TPRT): the element's endonuclease nicks the target DNA at a preferred site, exposing a 3'-hydroxyl group that directly primes reverse transcription of the RNA template at the insertion locus. This process typically produces short target site duplications (up to about 20 bp for LINE-1) and can lead to minor deletions or inversions due to incomplete second-strand synthesis.
LTR retrotransposons use an integrase enzyme to process the LTR ends of the cDNA, cleave the target DNA, and insert the element; host-mediated gap repair then ligates the 5' ends of the LTRs to the target DNA and generates short target site duplications.
Insertion sites are non-random, with preferences influenced by sequence motifs and chromatin context. Non-LTR retrotransposons like LINE-1 favor AT-rich, gene-poor regions, such as intergenic spaces or low-recombination zones, which limits disruption of essential genes while allowing proliferation. LTR elements show broader integration patterns but often cluster in heterochromatic or repetitive regions. Error-prone reverse transcription, with rates comparable to those of retroviral reverse transcriptases (around 10⁻⁴ to 10⁻⁵ errors per nucleotide), promotes sequence diversification but also generates defective copies that accumulate as genomic fossils. This variability, combined with insertion biases, drives ectopic recombination and contributes to structural variation over time.
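To give a feel for what an error rate of 10⁻⁴ to 10⁻⁵ per nucleotide means for a single retrotransposition event, the following sketch computes the expected number of errors and the chance of an error-free copy. It assumes a roughly 6 kb element (the approximate length of a full-length L1) and treats errors as independent at every position, which is a simplification; the function names are hypothetical.

```python
def expected_errors(length_nt, error_rate):
    """Expected number of misincorporations in one reverse-transcribed copy."""
    return length_nt * error_rate


def prob_error_free(length_nt, error_rate):
    """Probability that a copy carries no errors, assuming independent positions."""
    return (1.0 - error_rate) ** length_nt


ELEMENT_LENGTH_NT = 6000  # assumed: approximate length of a full-length L1

for rate in (1e-4, 1e-5):
    errors = expected_errors(ELEMENT_LENGTH_NT, rate)
    clean = prob_error_free(ELEMENT_LENGTH_NT, rate)
    print(f"rate={rate:g}: ~{errors:.2f} errors per copy, "
          f"{clean:.0%} of copies error-free")
```

Under these assumptions, the higher rate gives on the order of one mutation every couple of copies, which is consistent with defective copies accumulating over many rounds of amplification.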

Key Enzymes and Components

Retrotransposons rely on a suite of specialized enzymes and structural proteins to replicate via an RNA intermediate. Reverse transcriptase (RT) is the central enzyme: it catalyzes the synthesis of complementary DNA (cDNA) from the retrotransposon RNA template. RT activity is typically paired with ribonuclease H (RNase H) activity; the polymerase domain extends the DNA primer using the RNA template, while the RNase H domain degrades the RNA strand in the RNA-DNA hybrid to enable second-strand DNA synthesis. This pairing is conserved in LTR retrotransposons and in some non-LTR lineages, whereas human LINE-1 lacks an RNase H domain of its own and is thought to depend on host RNase H activity.
In long terminal repeat (LTR) retrotransposons, the pol gene encodes additional core enzymes, including integrase, which mediates the covalent insertion of the double-stranded cDNA into the host genome by recognizing specific DNA ends and performing strand transfer. Protease, also derived from pol, processes the polyprotein precursors into mature functional forms, enabling particle assembly and maturation analogous to retroviral capsid formation. For non-LTR retrotransposons, such as long interspersed nuclear elements (LINEs), integrase is absent; instead, the ORF2 protein includes an endonuclease domain that nicks the target DNA to prime reverse transcription directly at the insertion site.
Accessory components include Gag-like proteins, which in LTR retrotransposons form virus-like particles that package the RNA genome and enzymes, facilitating intracellular transport and reverse transcription. In non-LTR elements like LINEs, the ORF1 protein serves a similar structural role, binding RNA to form ribonucleoprotein complexes that protect the template and chaperone it to the insertion site. RNase H activity, where present, is usually fused to RT and ensures efficient removal of the RNA strand during cDNA synthesis.
Retrotransposons are classified as autonomous or non-autonomous based on their coding capacity. Autonomous elements like LINEs produce their own enzymatic machinery, including the reverse transcriptase and endonuclease domains of ORF2, enabling independent retrotransposition. In contrast, non-autonomous elements such as short interspersed nuclear elements (SINEs) lack these enzymes and hijack the reverse transcriptase and other components from co-expressed LINEs to mobilize their own RNA.
To prevent deleterious insertions, retrotransposon activity is tightly regulated by host epigenetic mechanisms, primarily DNA methylation at CpG islands in their promoters, which represses transcription. Histone modifications, including H3K9 methylation and deacetylation, further enforce heterochromatin formation and transcriptional silencing, with these marks cooperating to maintain long-term repression across cell divisions.

LTR Retrotransposons

Structural Features

LTR retrotransposons are characterized by their symmetrical structure, featuring identical long terminal repeats (LTRs) at the 5' and 3' ends, which flank an internal coding domain. Each LTR is typically 250–600 base pairs in length and consists of three distinct regions: U3, which contains promoter and enhancer sequences essential for transcription initiation; R, which includes the polyadenylation signal; and U5, involved in the start of reverse transcription and in integration.
The internal domain between the LTRs encodes the proteins necessary for transposition. The gag open reading frame produces structural proteins that form virus-like particles for packaging the retrotransposon RNA. The pol polyprotein includes enzymatic domains such as protease (an aspartyl proteinase) for cleaving precursor proteins, reverse transcriptase for synthesizing DNA from RNA, and integrase for inserting the DNA into the host genome; some elements also feature an RNase H domain within reverse transcriptase. Certain LTR retrotransposons, particularly those related to endogenous retroviruses, contain an env gene encoding envelope proteins that confer infectivity, though this gene is absent or non-functional in most plant and fungal examples.
Solo-LTRs arise from homologous recombination between the 5' and 3' LTRs of a full-length element, which deletes the internal sequences and leaves a single LTR that can retain promoter activity and drive transcription of adjacent genes. Full-length LTR retrotransposons typically range in size from 1 to 12 kb, reflecting variability in LTR length and internal gene content, and exhibit high structural similarity to retroviral proviruses, differing primarily in the lack of a functional env gene for extracellular transmission in most cases.
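Solo-LTR formation can be pictured as a deletion between two identical repeats. The sketch below is a deliberately simplified string model, not a sequence-analysis tool: bracketed labels stand in for real sequence, and the function names `make_element` and `recombine_to_solo_ltr` are hypothetical.

```python
def make_element(ltr, internal):
    """Full-length element: 5' LTR, internal domain, 3' LTR."""
    return ltr + internal + ltr


def recombine_to_solo_ltr(locus, ltr):
    """Model recombination between the two LTRs of one element.

    Everything from the start of the first LTR up to the start of the
    second LTR is removed, so a single LTR remains at the locus.
    """
    first = locus.find(ltr)
    second = locus.find(ltr, first + len(ltr)) if first != -1 else -1
    if first == -1 or second == -1:
        raise ValueError("locus does not contain two LTR copies")
    return locus[:first] + locus[second:]


LTR = "[U3-R-U5]"        # placeholder for a 250-600 bp repeat
INTERNAL = "[gag-pol]"   # placeholder for the internal coding domain

locus = "aaaa" + make_element(LTR, INTERNAL) + "cccc"
solo = recombine_to_solo_ltr(locus, LTR)
print(locus)
print(solo)
```

The flanking sequence is unchanged and exactly one LTR remains, which matches the description of a solo-LTR as the remnant left after the internal domain is lost.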

Endogenous Retroviruses

Endogenous retroviruses (ERVs) are fossilized proviruses derived from ancient germline infections by exogenous retroviruses, which integrated into the host genome and were subsequently inherited across generations. In humans, these sequences, collectively known as human endogenous retroviruses (HERVs), constitute approximately 8% of the genome, with prominent families including HERV-K and HERV-H. ERVs retain structural similarities to modern retroviruses, including long terminal repeats (LTRs) flanking internal genes, but most have accumulated mutations that render them replication-defective.
HERVs are classified into three major classes (I, II, and III) based on their sequence relatedness to contemporary exogenous retroviruses: Class I (gammaretrovirus-like, e.g., HERV-H), Class II (betaretrovirus-like, e.g., HERV-K), and Class III (spumaretrovirus-like). Within these classes, most ERV loci carry mutated open reading frames (ORFs) in genes such as gag, pol, and env, which prevents the production of functional viral particles and leads to their designation as defective proviruses. This defectiveness has allowed ERVs to persist as genomic parasites while occasionally providing adaptive benefits through co-option of their genetic elements.
A notable example of ERV co-option involves the envelope (env) genes, which have been repurposed for essential physiological functions. Syncytin-1, derived from the HERV-W env gene, and syncytin-2, from HERV-FRD, mediate cell-cell fusion in trophoblast cells, facilitating the syncytiotrophoblast formation that is critical for placental development and nutrient exchange in humans and other primates. Additionally, certain HERV-encoded proteins function in immunity; for instance, superantigens from HERV-H and HERV-K can non-specifically activate T cells by binding MHC class II molecules and T-cell receptors, potentially modulating immune responses or contributing to autoimmune conditions. Although largely silenced by epigenetic mechanisms in healthy somatic cells, ERVs show transcriptional activity in specific contexts.
In early embryos, HERV-K expression influences cortical development, with dysregulation linked to impaired neuronal patterning. Studies from 2025 have also identified co-option of LTR7-HERVH elements in early human embryos for roles in pluripotency maintenance and defense against active retroelements. In cancers, including colorectal cancer, HERV-derived enhancers drive oncogenic transcriptional rewiring, promoting tumor evolution. Post-2020 research has highlighted HERV reactivation in neurological diseases such as multiple sclerosis, where elevated HERV-W and HERV-K expression correlates with disease progression.

Non-LTR Retrotransposons

LINEs

Long interspersed nuclear elements (LINEs) are autonomous non-LTR retrotransposons that constitute a significant portion of mammalian genomes and mobilize themselves through an RNA intermediate. A full-length L1 element is approximately 6 kb in size and features a 5' untranslated region (UTR) containing an internal promoter, two open reading frames (ORFs), and a 3' UTR ending in a polyadenylation signal followed by a variable-length poly-A tail. Unlike LTR retrotransposons, LINEs lack long terminal repeats and instead rely on the poly-A tail for 3' end processing and stability. The first ORF (ORF1) encodes a protein with RNA-binding and chaperone activities that facilitates the formation of the ribonucleoprotein particles essential for transposition, while the second ORF (ORF2) produces a multifunctional protein harboring the endonuclease and reverse transcriptase domains needed for target site cleavage and cDNA synthesis.
LINEs are classified into three major families in mammals: L1, L2, and L3, distinguished by sequence divergence and evolutionary age. The L1 family is the most abundant and active, accounting for about 17% of the human genome, whereas L2 and L3 are older, inactive relics that together comprise roughly 3-4% of the genome and are no longer capable of retrotransposition due to accumulated mutations. In humans, the L1 family alone includes over 500,000 copies, but the vast majority are truncated or mutated; only approximately 80-100 are full-length and retrotransposition-competent, with a subset of "hot" L1s driving the majority of ongoing insertions. These active human L1s, often referred to as L1Hs, belong to a younger subfamily that emerged around 40 million years ago and continue to propagate at a low but detectable rate.
Human L1 elements insert into the genome via target-primed reverse transcription (TPRT), a process in which the ORF2 endonuclease nicks the target DNA at a consensus sequence (5'-TTTT/AA-3'), exposing a 3' hydroxyl group that primes reverse transcription directly from the L1 RNA template.
This mechanism results in new insertions that are typically flanked by short target site duplications (TSDs) of 2-20 bp (duplications as short as 2 bp occur at some sites), reflecting the staggered nick created by the endonuclease. TPRT achieves integration without requiring a separate integrase, distinguishing LINEs from other transposons, and often leads to 5' truncation of the inserted copy due to incomplete reverse transcription.
The activity of LINEs, particularly L1, is tightly regulated to prevent excessive genomic instability, primarily through transcriptional and post-transcriptional mechanisms. Transcription initiates from the bidirectional promoter within the 5' UTR, which is responsive to specific transcription factors and upstream sequences, allowing sense-strand expression of the ORFs while the antisense strand may produce regulatory non-coding RNAs. In mammals, L1 expression is robust in the germline, where it contributes to heritable genetic variation, but is largely silenced in somatic cells via epigenetic modifications such as DNA methylation and histone modifications; however, occasional somatic retrotransposition occurs, notably in early embryogenesis, neural tissues, and certain cancers, leading to mosaicism. This differential regulation allows controlled propagation while minimizing deleterious insertions in differentiated cells.
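The link between the endonuclease nick and the target site duplication can be sketched in a few lines. This is a toy model under stated assumptions: DNA is a single-stranded Python string, strand orientation is ignored, the motif `TTTTAA` with the nick after the fourth T follows the consensus quoted above, and the 6 bp duplication length is an arbitrary value within the 2-20 bp range. The names `find_nick_sites` and `insert_with_tsd` are hypothetical.

```python
import re

CONSENSUS = "TTTTAA"  # 5'-TTTT/AA-3', nick assumed after the fourth T
NICK_OFFSET = 4


def find_nick_sites(target):
    """Return string indices where the nick would fall (between TTTT and AA).

    A lookahead is used so that overlapping motif occurrences are all reported.
    """
    pattern = f"(?={CONSENSUS})"
    return [m.start() + NICK_OFFSET for m in re.finditer(pattern, target)]


def insert_with_tsd(target, nick, tsd_len, element):
    """Insert `element` so that target[nick:nick + tsd_len] flanks it on both sides.

    This mimics a staggered cut: the bases between the two nicks are
    duplicated when the gaps are filled in.
    """
    if nick < 0 or nick + tsd_len > len(target):
        raise ValueError("nick and duplication must lie within the target")
    return target[:nick + tsd_len] + element + target[nick:]


target = "GCGCTTTTAAGGCCATGC"
sites = find_nick_sites(target)
product = insert_with_tsd(target, sites[0], tsd_len=6, element="[L1]")
print(sites)
print(product)
```

In the product, the six bases following the nick (`AAGGCC`) appear immediately before and after the inserted element, which is the signature used to recognize retrotransposon insertions in sequence data.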

SINEs

Short interspersed nuclear elements (SINEs) are non-autonomous non-LTR retrotransposons that mobilize via an RNA intermediate but lack the genes necessary for independent retrotransposition, relying instead on proteins provided by autonomous elements such as LINEs. These elements are typically short, ranging from 100 to 300 base pairs in length, and are transcribed by RNA polymerase III using an internal promoter consisting of A-box and B-box motifs. SINEs originate from various small RNAs, such as 7SL RNA (the source of the primate-specific Alu family) or tRNA (as in the mammalian-wide interspersed repeat, or MIR, family). Unlike autonomous retrotransposons, SINEs have no coding capacity and propagate parasitically by co-opting the enzymatic machinery of LINEs.
In the human genome, SINEs constitute approximately 13% of the total sequence, with the Alu family being the most abundant, comprising over 1 million copies and accounting for about 10-11% of the genome alone. Alu elements are divided into three main subfamilies based on diagnostic mutations and activity levels: AluJ (the oldest and least active), AluS (intermediate), and AluY (the youngest and most active in modern humans). These subfamilies reflect waves of amplification, with AluJ expanding primarily 65-55 million years ago, AluS between 35 and 20 million years ago, and AluY from about 20 million years ago to the present. Alu insertions are often found in GC-rich, gene-dense regions of the genome, though recent insertions show less bias toward such sites.
SINE retrotransposition begins with transcription by RNA polymerase III from the internal promoter, producing RNA transcripts that are then reverse-transcribed and integrated into the genome using the endonuclease and reverse transcriptase of LINE ORF2. New SINE insertions are typically flanked by short target site duplications of 7-20 base pairs and occur preferentially at AT-rich consensus sequences, facilitating their dispersal throughout the genome.
This target-primed reverse transcription process underscores the parasitic nature of SINEs: they encode no proteins of their own, yet still achieve high copy numbers through their partnership with LINEs. Evolutionarily, SINE amplification has occurred in distinct bursts, with Alu elements showing major expansions coinciding with primate divergence, contributing to genome size increase and structural variation over the past 65-130 million years. More recently, certain SINE-containing transcripts, such as SINEUPs (antisense long non-coding RNAs that carry an embedded SINE element), have been found to enhance translation of specific genes. These evolutionary dynamics highlight the role of SINEs in shaping genomic architecture without autonomous mobility.
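The quoted Alu share of the genome can be checked with back-of-the-envelope arithmetic. The sketch below assumes a haploid human genome of about 3.1 billion base pairs, about 1.1 million Alu copies (the text says "over 1 million"), and a mean length of 300 bp (the upper end of the SINE size range given above); all three values are illustrative assumptions, and `genome_fraction` is a hypothetical helper.

```python
HUMAN_GENOME_BP = 3.1e9  # assumed haploid genome size


def genome_fraction(copies, mean_length_bp, genome_bp=HUMAN_GENOME_BP):
    """Fraction of the genome occupied by `copies` elements of a given mean length."""
    return copies * mean_length_bp / genome_bp


alu_fraction = genome_fraction(copies=1_100_000, mean_length_bp=300)
print(f"Estimated Alu share of the genome: {alu_fraction:.1%}")
```

With these inputs the estimate falls between 10% and 11%, in line with the figure stated for the Alu family; truncated copies would lower the mean length and hence the estimate.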

SVA Elements

SVA (SINE-VNTR-Alu) elements are a family of composite non-LTR retrotransposons that are exclusive to hominoid primates and constitute the youngest known class of such elements in the human genome. These non-autonomous retrotransposons emerged approximately 25 million years ago during the early evolution of hominoids, distinguishing them from older retrotransposon families like LINEs and SINEs. Their recent origin is evidenced by the presence of full-length, potentially active copies and a lack of significant sequence divergence across species.
Structurally, SVA elements are chimeric sequences typically ranging from 1 to 4 kb in length, owing to variability in their central VNTR region. From the 5' end, they feature a short hexameric repeat (CCCTCT), followed by two Alu-like domains derived from Alu SINE components, a GC-rich variable number tandem repeat (VNTR) region with 1–50 repeats of a 35–50 bp motif, a SINE-R region derived from an HERV-K endogenous retrovirus, and a 3' polyadenylation signal with a variable poly-A tail. This modular architecture enables SVA elements to hijack the transcriptional and retrotransposition machinery of other elements while incorporating regulatory motifs.
In the human genome, there are approximately 2,700 full or partial SVA copies, accounting for about 0.1–0.2% of the total DNA content, with a notable enrichment in GC-rich, gene-dense regions rather than heterochromatic areas. Unlike older retrotransposons, SVAs show minimal 5' truncation and are predominantly intronic, reflecting their ongoing propagation. As the most recently evolved non-LTR family, their copy number continues to expand through active retrotransposition. SVA mobilization is entirely dependent on the enzymatic machinery of autonomous LINE-1 (L1) elements, including reverse transcriptase and endonuclease, which act on SVA transcripts in trans to carry out target-primed reverse transcription.
This activity persists in the germline, leading to de novo insertions that can disrupt gene function; for instance, a truncated SVA insertion accompanying a deletion in the NF1 gene has been implicated in atypical neurofibromatosis type 1 cases. Beyond mutagenesis, the VNTR domain harbors binding sites for transcription factors such as SP1, allowing SVA elements to act as cis-regulatory modules that modulate nearby gene expression in a lineage-specific manner.
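The contribution of the VNTR to SVA length variation follows directly from the repeat numbers given above. The short sketch below multiplies them out; the bounds (1–50 repeats of a 35–50 bp unit) come from the text, while the function name `vntr_length` is hypothetical.

```python
def vntr_length(repeats, unit_bp):
    """Length in bp of a VNTR with `repeats` copies of a `unit_bp` motif.

    Bounds follow the ranges stated in the text: 1-50 repeats, 35-50 bp units.
    """
    if not 1 <= repeats <= 50:
        raise ValueError("repeats must be between 1 and 50")
    if not 35 <= unit_bp <= 50:
        raise ValueError("unit_bp must be between 35 and 50")
    return repeats * unit_bp


shortest = vntr_length(repeats=1, unit_bp=35)
longest = vntr_length(repeats=50, unit_bp=50)
print(f"VNTR length range: {shortest}-{longest} bp")
```

The VNTR alone can therefore span from a few dozen base pairs to about 2.5 kb, which accounts for much of the 1 to 4 kb variation in overall element length.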

Biological Roles

Genome Evolution

Retrotransposons significantly influence genome evolution through dynamic processes of expansion and contraction. Waves of amplification, such as those observed with LINE-1 (L1) elements in mammals, have produced bursts of retrotransposition that increase copy number and genome size over evolutionary time. For instance, phylogenetic analyses of L1 families reveal multiple amplification events during mammalian evolution, contributing to the proliferation of these elements across mammalian lineages. Counteracting this expansion, recombination can excise and delete retrotransposon sequences, leading to genome contraction; unequal recombination between long terminal repeat (LTR) retrotransposons, for example, removes redundant copies and helps maintain genome stability in genomes with high repetitive content.
Structurally, retrotransposons reshape genomes by disrupting genes, facilitating exon shuffling, and providing novel regulatory elements such as promoters. L1-mediated retrotransposition can mobilize exons from donor genes to new locations, enabling the creation of chimeric transcripts and contributing to protein diversity during evolution. Additionally, insertions of Alu elements, a type of short interspersed nuclear element (SINE), into introns influence alternative splicing patterns, thereby modulating gene expression and isoform variation in primate genomes. L1 can also supply bidirectional promoters that drive transcription of adjacent genes, altering regulatory landscapes and fostering adaptive genetic innovations.
In speciation, retrotransposon insertions can act as evolutionary barriers by creating genetic differences between populations. Endogenous retroviruses (ERVs), which are LTR retrotransposons derived from ancient infections, exhibit distinct integration sites between humans and chimpanzees, with shared orthologous ERVs supporting common ancestry while species-specific insertions contribute to lineage divergence.
Horizontal transfer of retrotransposons, though rare in animals, has been documented in plants, where interspecies exchanges via intermediaries such as fungi introduce novel elements that accelerate genomic diversification. Insights from the 2020s highlight the roles of retrotransposons in polyploidy and hybrid vigor, particularly in plants, where they comprise a large portion of many genomes (up to 85% in some species, compared with about 42% in humans). In plant hybrids and polyploids, activation of LTR retrotransposons during hybridization leads to rapid amplification, enhancing the genome restructuring and epigenetic changes that underpin hybrid vigor and adaptation. These dynamics underscore retrotransposons as key drivers of evolutionary novelty in polyploid events.

Human Disease

Retrotransposons contribute to human disease primarily through insertional mutagenesis, in which their integration into the genome disrupts critical genes and causes loss-of-function mutations. In hemophilia A, de novo insertions of L1 elements into the factor VIII gene (F8) have been documented, causing severe bleeding disorders by interrupting coding sequences and preventing proper protein production. Similarly, Alu element insertions within introns of the F8 gene can induce aberrant splicing, resulting in truncated or non-functional factor VIII and manifesting as severe hemophilia A. In cancer, somatic L1 insertions into the adenomatous polyposis coli (APC) gene have been identified in colorectal tumors, inactivating APC and promoting tumorigenesis through the Wnt signaling pathway. Among neurological disorders, SVA retrotransposon insertions, particularly those involving hexameric repeat expansions in the TAF1 gene, are associated with X-linked dystonia-parkinsonism (XDP), a progressive condition featuring dystonia, parkinsonism, and ataxia-like symptoms due to altered TAF1 expression in medium spiny neurons.
Reactivation of retrotransposons, often triggered by epigenetic derepression or environmental stressors, exacerbates autoimmune and neurodegenerative diseases. Human endogenous retroviruses (HERVs), particularly HERV-W elements, show elevated expression in multiple sclerosis (MS) patients, where they correlate with inflammatory lesions and may drive neuroinflammation via superantigen-like activity or mimicry of myelin antigens. In systemic lupus erythematosus (SLE), HERV-E expression is upregulated in response to chronic inflammation, contributing to autoantibody production and immune dysregulation through integration near immune-related genes. L1 retrotransposons exhibit age-related upregulation in the brain, with increased expression observed in late-onset Alzheimer's disease (LOAD), where heightened L1 activity in microglia promotes neuroinflammation and neuronal loss by generating double-strand breaks and inflammatory transcripts.
APOBEC3A (A3A), a cytidine deaminase, serves as a host defense mechanism by restricting retrotransposon activity through deamination of single-stranded DNA intermediates during L1 integration, thereby introducing hypermutations that inactivate these elements. However, in cancer, dysregulated A3A expression leads to off-target hypermutation signatures, particularly at TpC motifs in hairpin loops, driving genomic instability and tumor evolution in various malignancies, including breast and lung cancers. This dual role highlights A3A's contribution both to protection against retrotransposition and to pathological mutagenesis in oncogenic contexts.
Therapeutic strategies targeting retrotransposons focus on inhibiting L1 reverse transcriptase to curb genomic instability in cancer. Nucleoside reverse transcriptase inhibitors (NRTIs), such as lamivudine repurposed from HIV therapy, suppress L1 retrotransposition and have shown stabilization of disease progression in 25% of patients with metastatic colorectal cancer by blocking viral-like replication processes. Other NRTIs, such as emtricitabine, exhibit potent inhibition of L1 activity in tumor cells, reducing DNA damage and immune evasion without affecting endogenous retroviral elements. These inhibitors are being evaluated in clinical contexts for solid tumors, with ongoing efforts to integrate them into combination therapies to mitigate retrotransposon-mediated resistance.

Biotechnology Applications

Retrotransposons have been harnessed as non-viral vectors for gene delivery, providing a safer alternative to viral systems by minimizing immunogenicity and the risks associated with viral integration. LINE-1-based vectors achieve stable integration through retrotransposition, enabling applications such as recombinant gene transfer in cells via regulated expression of the LINE-1 open reading frames. These vectors have also been adapted for delivering small interfering RNAs to induce stable gene silencing in mammalian cells, offering control over retrotransposition efficiency.
In single-cell genomics, retrotransposon barcoding supports lineage tracing by generating diverse genomic mutations that serve as unique identifiers for reconstructing cell histories. A Cas9-deaminase fusion targeting LINE-1 sequences creates high-diversity barcodes, allowing simultaneous readout of lineage and transcriptional states in complex tissues.
CRISPR-based tools enable activation of silenced retrotransposons to investigate their regulatory dynamics, bypassing epigenetic repression for functional studies. CRISPR activation (CRISPRa) efficiently reactivates young LINE-1 elements in human cell lines, uncovering cis-regulatory elements and transcriptional dependencies without altering the DNA sequence.
Synthetic retrotransposons further expand engineering capabilities for targeted modifications, integrating RNA-guided mechanisms to insert large DNA payloads. The CRISPR-Enabled Autonomous Transposable Element (CREATE) system merges CRISPR/Cas9 with engineered LINE-1 components for site-specific gene insertions, achieving high specificity in mammalian cells. Similarly, the STITCHR editor leverages retrotransposon machinery for scarless, large-scale DNA integrations using synthetic templates, demonstrating activity in primary cells for potential multiplex editing (as of April 2025).
Therapeutic strategies target dysregulated retrotransposons in cancer, where LINE-1 hyperactivity drives genomic instability and inflammation.
Lamivudine, a nucleoside reverse transcriptase inhibitor repurposed from HIV therapy, suppresses LINE-1 retrotransposition and stabilized disease in 25% of patients with metastatic colorectal cancer by blocking viral-like replication. Other analogs, such as emtricitabine, potently inhibit human LINE-1 activity in tumor cells, reducing DNA damage and immune evasion without affecting endogenous retroviral elements.

Endogenous retrovirus (ERV)-derived peptides serve as tumor-specific antigens for vaccine development, eliciting both humoral and cellular responses against ERV-expressing cancers. Adenovirus-based virus-like vaccines targeting ERV envelopes eliminate established colorectal tumors in preclinical models by inducing T-cell infiltration and tumor regression. ERV epitopes shared across low-mutational-burden tumors support personalized immunotherapy designs.

Recent innovations in retrotransposon applications include advanced prediction pipelines for insertion sites and transgene-free mobilization in plants. Tools like GraffiTE integrate structural variant detection to map polymorphic retrotransposon insertions genome-wide, aiding personalized genomics. In plant breeding, controlled retrotransposon activation generates novel allelic diversity for crop improvement while avoiding foreign DNA integration. Temporary inhibition of host silencing has been shown to mobilize endogenous retrotransposons in plants, with potential for producing heritable mutations that enhance traits like stress tolerance without stable transgenes. These approaches, combined with 2024-2025 editing platforms like STITCHR, underscore retrotransposons' growing role in precise, ethical biotechnological interventions.
