Protein splicing

Protein splicing is an autocatalytic posttranslational process in which an intervening protein sequence, termed an intein, is precisely excised from a precursor polypeptide, and the flanking external protein sequences, known as exteins, are ligated together via a native peptide bond. This phenomenon was first discovered in 1990 through independent studies on the VMA1 (also known as TFP1) gene encoding the catalytic subunit of the vacuolar H⁺-ATPase in the yeast Saccharomyces cerevisiae, where researchers observed an unexpected discrepancy between the predicted precursor size (approximately 120 kDa) and the mature protein (67-69 kDa), leading to the proposal of a novel self-processing mechanism.

The mechanism of protein splicing typically proceeds in four sequential steps for most inteins (class 1), beginning with a nucleophilic attack by the intein N-terminal cysteine or serine residue on the upstream peptide bond, forming a (thio)ester intermediate; this is followed by a transesterification to create a branched intermediate, asparagine cyclization to release the intein, and finally an S-to-N acyl shift to ligate the exteins. Variations exist, such as in class 2 and 3 inteins, which lack the N-terminal nucleophile and rely on a different initiating nucleophile, but all pathways ensure precise excision without external factors. Inteins, ranging from 360 to 600 amino acids, often contain endonuclease domains that promote their mobility as selfish genetic elements, and they are predominantly found in bacteria, archaea, unicellular eukaryotes, and some viruses, with thousands identified as of 2024. Biologically, inteins may serve as parasitic elements invading host genes or as regulators responding to environmental cues like temperature, pH, oxidative stress, or DNA damage, thereby modulating host protein function in microbes adapted to extreme conditions.

In biotechnology, engineered inteins enable powerful tools such as intein-mediated protein purification with an affinity chitin-binding tag (IMPACT), expressed protein ligation for semisynthesis of post-translationally modified proteins, protein trans-splicing for segmental isotope labeling in NMR studies, and conditional splicing for biosensors and therapeutic applications like proximity-dependent protein activation.

Fundamentals

Definition and Overview

Protein splicing is an autocatalytic post-translational process wherein an intervening protein sequence, termed the intein, is precisely excised from a precursor polypeptide, and the adjacent N- and C-terminal sequences, known as exteins, are ligated together via a native peptide bond to yield a functional mature protein. This process occurs without the need for additional enzymatic cofactors or cellular machinery beyond the intrinsic activity of the intein itself. The precursor protein is structured as a contiguous fusion of the N-extein, intein, and C-extein, translated from a single mRNA. Upon splicing, the intein is removed entirely, leaving no residual sequence in the ligated exteins, which distinguishes this event from other protein processing mechanisms like proteolysis or ubiquitination. Inteins act as the self-catalytic domains driving this rearrangement.

In contrast to RNA splicing, which removes introns from pre-mRNA transcripts at the RNA level to produce mature mRNA, protein splicing operates directly on the translated polypeptide chain without any RNA intermediates or reverse transcription steps. Although rare, protein splicing has been documented across all domains of life, including bacteria, archaea, and eukaryotes, with numerous known intein instances, primarily in microbial genomes. The phenomenon was first observed in the product of the VMA1 gene in Saccharomyces cerevisiae, encoding the catalytic subunit of the vacuolar H⁺-ATPase, where a 120-kDa precursor undergoes splicing to generate the 69-kDa mature enzyme.
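The net outcome described above can be sketched at the sequence level. This is only a toy model under stated assumptions: sequences are treated as plain strings, the function name `splice` and the example sequences are hypothetical, and none of the underlying chemistry is represented.

```python
from typing import NamedTuple


class SpliceProducts(NamedTuple):
    mature_protein: str  # ligated N-extein + C-extein
    intein: str          # excised intervening sequence


def splice(n_extein: str, intein: str, c_extein: str) -> SpliceProducts:
    """Model the net result of splicing an N-extein/intein/C-extein precursor."""
    precursor = n_extein + intein + c_extein
    start = len(n_extein)
    end = start + len(intein)
    # Excise the intein and join the flanks, leaving no intein residue behind.
    mature = precursor[:start] + precursor[end:]
    return SpliceProducts(mature_protein=mature, intein=precursor[start:end])


# Toy sequences (hypothetical, not real proteins).
products = splice("MKTAG", "CLSHN", "SGEEL")
print(products.mature_protein)
print(products.intein)
```

The mature product contains only extein residues, which is the property that separates splicing from simple cleavage of a precursor.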

Biological Significance

Protein splicing occurs naturally in bacteria, archaea, and unicellular eukaryotes, but is notably absent in multicellular eukaryotes. This process is particularly prevalent among extremophiles, such as hyperthermophilic archaea and thermophilic bacteria, where it accompanies the maturation of essential proteins under harsh environmental conditions like high temperatures. Intein-containing enzymes remain functional in these organisms, consistent with adaptation to extreme habitats. Over 500 inteins have been identified across approximately 100 different host genes, primarily in DNA replication, repair, and recombination pathways, underscoring their widespread but sporadic distribution in microbial genomes.

In terms of functional roles, inteins primarily act as selfish genetic elements that promote their own propagation through mobility, often inserting into host proteins without providing direct benefits to the host. However, they also contribute to host adaptation by enabling conditional protein splicing (CPS), which serves as a regulatory mechanism responsive to environmental cues. Examples include an intein that splices in response to DNA damage to activate recombination proteins, and the MoaA intein, which senses redox conditions to control molybdopterin biosynthesis. While inteins are generally non-essential for host viability (organisms can lose them without lethality), they can influence fitness by modulating protein activity during stress, potentially aiding phase variation or survival in fluctuating environments.

From an evolutionary perspective, inteins function as mobile endonucleases that drive horizontal gene transfer (HGT), allowing them to spread across microbial populations and even between domains of life. Comparative genomic analyses reveal their ancient origins, with evidence of intein invasion predating the divergence of major microbial lineages, as seen in conserved insertion sites in orthologous proteins like replicative polymerases. This mobility has led to dynamic patterns of gain and loss, with inteins persisting in lineages where they confer subtle selective advantages, such as enhanced adaptability in extremophilic niches, while being purged in others due to their parasitic nature.

Historical Development

Discovery

In the late 1980s, investigations into the VMA1 gene, which encodes the 69 kDa catalytic subunit of the vacuolar membrane H⁺-ATPase, revealed discrepancies between the predicted and observed protein products. The gene's open reading frame suggested a 119 kDa precursor protein, but biochemical analyses identified only a 69 kDa mature subunit alongside an unanticipated 50 kDa polypeptide corresponding to an internal segment of the precursor. These findings hinted at a novel post-translational processing event distinct from known mechanisms like RNA splicing or simple proteolysis.

The phenomenon of protein splicing was formally identified in 1990 through detailed expression studies of the VMA1 precursor by independent groups led by Kane et al. and Hirata et al. Kane et al. demonstrated that the 119 kDa protein undergoes precise excision of the 50 kDa intervening sequence, with concomitant ligation of the amino-terminal (24 kDa) and carboxyl-terminal (45 kDa) segments to yield the functional 69 kDa subunit. This processing occurred entirely at the protein level, as confirmed by analyses ruling out RNA-level interventions and by the absence of intermediate fragments indicative of stepwise proteolysis. Furthermore, the intervening sequence was shown to possess site-specific endonuclease activity, suggesting a role in genetic mobility akin to mobile introns.

Early experimental evidence for the autonomous nature of protein splicing came from in vitro translation assays. Cooper et al. expressed the VMA1 precursor in rabbit reticulocyte lysates and observed efficient splicing to produce the 69 kDa mature protein and excised 50 kDa intervening protein without requiring additional cellular factors or cofactors. The reaction proceeded rapidly at physiological pH and was unaffected by broad-spectrum protease inhibitors, clearly distinguishing self-splicing from external protease-dependent cleavage. These results established protein splicing as an intramolecular, autocatalytic process.

Independently, in 1992, Davis et al. reported protein splicing in a prokaryotic context, observing that the Mycobacterium tuberculosis recA gene encodes an 85 kDa precursor processed into a 38 kDa mature protein and a 47 kDa intervening sequence. This discovery extended the splicing mechanism beyond eukaryotes, highlighting its occurrence in diverse organisms and reinforcing the generality of the process.
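The size discrepancies that prompted the discovery reduce to a simple mass balance: the precursor should weigh about as much as the mature protein plus the excised segment. The short check below uses only the apparent masses quoted above; the function name and the 1 kDa tolerance are illustrative assumptions, since gel-derived masses are approximate.

```python
# Apparent masses in kDa, as quoted in the text above.
VMA1 = {"precursor": 119, "n_extein": 24, "intein": 50, "c_extein": 45, "mature": 69}
RECA = {"precursor": 85, "intein": 47, "mature": 38}


def mass_balance_ok(precursor: float, mature: float, intein: float,
                    tolerance: float = 1.0) -> bool:
    """Return True if precursor mass matches mature + intein within tolerance."""
    return abs(precursor - (mature + intein)) <= tolerance


# The two extein segments together should account for the mature subunit.
extein_sum = VMA1["n_extein"] + VMA1["c_extein"]
print("VMA1 exteins sum to mature subunit:", extein_sum == VMA1["mature"])
print("VMA1 balance:", mass_balance_ok(VMA1["precursor"], VMA1["mature"], VMA1["intein"]))
print("RecA balance:", mass_balance_ok(RECA["precursor"], RECA["mature"], RECA["intein"]))
```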

Key Milestones

In the 1990s, significant progress in understanding intein function built upon early observations of protein splicing. The endonuclease activity of inteins was identified in 1992 with the characterization of PI-SceI from the VMA1 gene, revealing its role in promoting intein mobility through site-specific DNA cleavage. This discovery highlighted how inteins function as selfish genetic elements. Concurrently, the first in vitro protein splicing assays were established in the early 1990s, enabling controlled studies of the autocatalytic excision and extein ligation processes without cellular interference. In 1999, the InBase database was launched by New England Biolabs as a comprehensive registry of known inteins, facilitating systematic cataloging and analysis of their sequences and hosts.

The late 1990s and 2000s marked expansions in intein diversity and utility. Natural mini-inteins, lacking the endonuclease domain and consisting primarily of the splicing domain, were first identified in the mid-1990s, with engineered mini-inteins developed in 1997 to simplify their application in biotechnology due to smaller size and reduced off-target effects. In 1998, engineered split inteins were developed to enable trans-splicing, where fragmented intein halves reassemble to ligate separate polypeptides, opening avenues for segmental isotope labeling. InBase continued to be updated through the decade, incorporating new entries from bacterial and archaeal genomes to track intein evolution and distribution.

During the 2010s and early 2020s, advancements shifted toward high-throughput and engineered approaches. High-throughput screening methods emerged around 2011, allowing directed evolution of inteins for optimized splicing under diverse conditions, such as temperature and extein compatibility. Engineered intein variants achieved faster splicing rates, with some split inteins like Cfa demonstrating exceptional efficiency and stability in 2016, completing trans-splicing in seconds rather than minutes. Post-2015, inteins were integrated with CRISPR systems, as in the development of split-Cas9 constructs using intein-mediated reassembly to overcome viral vector size limits for precise genome editing. InBase received updates into the 2010s, reflecting growing genomic data.

Recent developments up to 2025 have leveraged metagenomics to uncover novel inteins. Analyses of bacterial and archaeal metagenomes have revealed new insertion sites, such as eleven distinct positions in archaeal MCM helicases, expanding the known repertoire and suggesting untapped evolutionary diversity. Commercialization of intein-based tools, pioneered by New England Biolabs' IMPACT system in the early 2000s, has continued to evolve, providing scalable platforms for tagless protein purification and ligation workflows.

Molecular Mechanism

Overall Process

Protein splicing is an autocatalytic process that occurs within a precursor protein, where an intervening sequence known as the intein is precisely excised, and the flanking sequences, termed exteins, are joined together via a native peptide bond. The general pathway commences with the ribosomal translation of the precursor protein, which includes the N-extein, intein, and C-extein in a linear polypeptide chain. Upon folding or under appropriate conditions, the intein assembles its active site around the two splice junctions, triggering sequential bond rearrangements that lead to intein excision, extein ligation, and release of the intein as a byproduct, often in a cyclized form. This results in the production of a mature protein from the ligated exteins alongside the excised intein.

The reaction proceeds through distinct phases to ensure fidelity and efficiency. Initiation begins with an N-S acyl shift at the peptide bond between the N-extein and intein, converting it to a thioester linkage and enabling further reactivity. This is followed by intermediate formation, where a branched structure assembles through nucleophilic attack, temporarily linking the intein to both exteins. Termination completes the process via an S-N acyl shift, which forms the new amide bond between the exteins, accompanied by intein cyclization and its full release from the precursor. These phases occur in a coordinated, intramolecular manner without requiring external enzymes or cofactors.

Protein splicing typically takes place post-translationally as a spontaneous biochemical modification after the precursor polypeptide is fully synthesized, although some inteins may initiate co-translationally to optimize folding. The process is sensitive to environmental conditions: most natural systems show a distinct pH optimum and a temperature dependence that aligns with the host environment, for instance activation near 100°C in hyperthermophilic archaea or around 37°C in mesophilic bacteria. This overall sequence is highly conserved across diverse inteins, underscoring a shared mechanistic framework derived from the HINT (Hedgehog/INTein) domain superfamily, which facilitates reliable self-processing in various biological contexts.

Schematically, the precursor protein transforms stepwise: starting from the full chain (N-extein–intein–C-extein), progressing through the initial shift and branched intermediate, and culminating in the separated ligated exteins and cyclized intein products.
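The stepwise progression above can be written down as a small state machine. This is a minimal sketch, assuming the class 1 ordering given in the text; the state names and the `splicing_path` helper are hypothetical labels chosen for illustration, not standard terminology.

```python
from enum import Enum


class State(Enum):
    PRECURSOR = "linear precursor: N-extein-intein-C-extein"
    LINEAR_THIOESTER = "N-extein on intein residue 1 side chain"
    BRANCHED = "N-extein on C-extein +1 side chain, intein still attached"
    EXCISED = "intein released, exteins joined by a (thio)ester"
    SPLICED = "exteins joined by a native peptide bond"


# (state before, reaction, state after), in the order given in the text.
STEPS = [
    (State.PRECURSOR, "N-S acyl shift", State.LINEAR_THIOESTER),
    (State.LINEAR_THIOESTER, "transesterification", State.BRANCHED),
    (State.BRANCHED, "Asn cyclization", State.EXCISED),
    (State.EXCISED, "S-N acyl shift", State.SPLICED),
]


def splicing_path(start: State = State.PRECURSOR) -> list[State]:
    """Follow the transitions from `start` until no further step applies."""
    transitions = {before: after for before, _reaction, after in STEPS}
    path = [start]
    while path[-1] in transitions:
        path.append(transitions[path[-1]])
    return path


for before, reaction, after in STEPS:
    print(f"{before.name:>16} --{reaction}--> {after.name}")
```

Modeling the pathway as ordered transitions makes it easy to see where a side reaction would branch off, for example hydrolysis of the linear thioester before transesterification.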

Detailed Steps and Chemistry

Protein splicing in class 1 inteins proceeds through a series of four nucleophilic reactions, each facilitated by conserved residues at the splice junctions and within the intein core.

The process begins with an N-S (or N-O) acyl shift at the N-terminal splice junction, where the side chain of a cysteine or serine residue at the first position of the intein (residue 1) acts as a nucleophile to attack the carbonyl carbon of the upstream peptide bond between the N-extein residue at position -1 and the intein. This generates a linear thioester (or ester) intermediate, transferring the N-extein to the intein side chain while activating the bond for subsequent reactions. For a cysteine nucleophile the reaction can be represented as:

$$\ce{R-C(O)-NH-CH(CH2SH)-R' ->[\text{N-S acyl shift}] R-C(O)-S-CH2-CH(NH2)-R'}$$

where R-C(O) represents the N-extein acyl group, R' the remainder of the intein, and the sulfur (oxygen for serine) comes from the nucleophilic side chain of intein residue 1. Side reactions, such as premature hydrolysis of the thioester intermediate, can occur but are minimized by the intein's catalytic residues, including histidines in motifs B and F that stabilize the transition state.

In the second step, transesterification occurs as the side chain of a serine, cysteine, or threonine at the +1 position of the C-extein attacks the thioester carbonyl of the linear intermediate. This forms a branched intermediate in which the N-extein is transferred to the side chain of the C-extein +1 residue, while the intein remains attached through its C-terminal peptide bond to the alpha-amino group of that same residue. The conserved aspartate in motif F deprotonates the attacking nucleophile, enhancing reactivity, and this step is rate-limiting in some inteins due to steric constraints in the active site.

The third step involves cyclization of the asparagine residue at the C-terminal end of the intein (typically in motif G), whose side-chain amide nitrogen attacks its own backbone carbonyl, forming a five-membered succinimide ring and cleaving the intein from the C-extein. This releases the excised intein as a product, leaving the N-extein (thio)ester-linked to the side chain of the C-extein +1 residue, with histidines in motifs F and G facilitating proton transfers to lower the barrier. Succinimide formation marks this as a committed step, with an energy barrier of approximately 20-25 kcal/mol in model systems. Side reactions here include Asn cyclization uncoupled from the preceding steps, which yields cleavage products instead of ligated exteins.

Finally, the fourth step entails an S-N (or O-N) acyl shift in which the free alpha-amine of the C-extein +1 residue attacks the (thio)ester linking the N-extein to the side chain of the same residue, forming the native peptide bond between the exteins and releasing the side chain as a free thiol or hydroxyl. Concurrently, the succinimide at the intein C-terminus hydrolyzes non-enzymatically to an asparagine or isoasparagine residue, completing intein excision. This step is spontaneous once the intein has been released, with overall splicing kinetics varying by intein (e.g., full process rate constants ranging from $10^{-6}$ to $10^{-3}$ s$^{-1}$ depending on temperature and residues). Hydrolysis of intermediates competes as a side reaction, potentially yielding cleaved products if splicing efficiency is low.
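To give the quoted rate constants a more intuitive scale, they can be converted to half-lives. The sketch below assumes simple first-order kinetics for the overall reaction (a reasonable simplification for an intramolecular process, but an assumption nonetheless), so that the half-life is ln 2 divided by the rate constant; the function name is illustrative.

```python
import math


def half_life_seconds(rate_constant: float) -> float:
    """Half-life of a first-order process, t_1/2 = ln(2) / k, with k in s^-1."""
    if rate_constant <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / rate_constant


# Bounds of the range quoted in the text (s^-1).
for k in (1e-6, 1e-3):
    t = half_life_seconds(k)
    print(f"k = {k:.0e} s^-1 -> t1/2 = {t:,.0f} s ({t / 3600:.2f} h)")
```

Under this assumption the quoted range corresponds to half-lives from roughly twelve minutes at the fast end to about eight days at the slow end.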

Inteins

Structure and Naming Conventions

Inteins exhibit a modular architecture typically spanning 100 to 600 amino acid residues, organized into N- and C-terminal splicing domains that form the core Hedgehog/INTein (HINT) module, optionally separated by a central homing endonuclease domain (HED). The HINT domain adopts a characteristic horseshoe-shaped β-sheet fold, which positions the key catalytic residues close together to facilitate protein splicing. In full-length inteins, the HED is inserted between the splicing domains and often belongs to the LAGLIDADG family, featuring conserved sequence blocks that enable site-specific DNA cleavage for intein propagation.

The splicing activity relies on conserved motifs within the HINT domain, including blocks A (encompassing the N-terminal Cys, Ser, or Thr residue), B (His or Thr), F (Asp or His), and G (penultimate His and terminal Asn), which coordinate nucleophilic attacks during excision and ligation. A catalytic triad, typically comprising the N-terminal Cys (C1), a central His, and the C-terminal Asn (or Asp in some variants), drives the chemistry of splicing by stabilizing reaction intermediates. These motifs ensure precise autocatalytic processing, with the HED providing mobility without directly participating in splicing.

Standardized nomenclature was established in 1994. Intein-encoded homing endonucleases take the prefix "PI-" followed by a three-letter abbreviation of the host organism and a Roman numeral, such as PI-SceI for the endonuclease from the VMA1 intein in Saccharomyces cerevisiae. Inteins themselves are denoted by the host abbreviation, the gene name, and a numerical suffix for multiple occurrences (e.g., Sce VMA1 intein), with exteins labeled as N- or C-terminal flanks. The archived InBase database (last updated in 2010) provides these conventions, cataloging approximately 614 inteins and designations for variants like split inteins (e.g., Ssp DnaE for the split DnaE intein in Synechocystis sp. PCC6803). As of 2025, genomic surveys have identified thousands of putative inteins, though only around 200-300 are manually curated. This system supports systematic annotation and evolutionary analysis.
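The naming pattern can be illustrated with a small helper. This is a sketch under an assumption inferred from the examples in the text (Sce, Ssp): the three-letter abbreviation is taken as the first letter of the genus plus the first two letters of the species. The function names are hypothetical, and real database entries may deviate from this rule.

```python
from typing import Optional


def host_abbreviation(genus: str, species: str) -> str:
    """First letter of the genus + first two letters of the species, e.g. 'Sce'."""
    return genus[0].upper() + species[:2].lower()


def intein_name(genus: str, species: str, gene: str,
                index: Optional[int] = None) -> str:
    """Build a name like 'Sce VMA1 intein'; `index` marks multiple inteins per gene."""
    name = f"{host_abbreviation(genus, species)} {gene}"
    if index is not None:
        name += f"-{index}"
    return f"{name} intein"


def endonuclease_name(genus: str, species: str, numeral: str = "I") -> str:
    """Build a homing endonuclease name like 'PI-SceI'."""
    return f"PI-{host_abbreviation(genus, species)}{numeral}"


print(intein_name("Saccharomyces", "cerevisiae", "VMA1"))
print(endonuclease_name("Saccharomyces", "cerevisiae"))
print(intein_name("Synechocystis", "sp.", "DnaE"))
```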

Full and Mini Inteins

Inteins are classified into full-length and mini variants based on their size and domain composition, with full inteins typically ranging from 400 to 600 amino acids in length and containing a central homing endonuclease (HE) domain responsible for their mobility within genomes. This HE domain, often featuring motifs such as the LAGLIDADG in the PI-SceI intein from Saccharomyces cerevisiae VMA1 (454 amino acids), enables site-specific DNA cleavage to promote homing. In contrast, mini inteins are more compact, spanning 100 to 200 amino acids, and lack the HE domain and therefore autonomous mobility, while retaining full autocatalytic splicing activity, often with faster rates due to their streamlined structure.

Structurally, full inteins feature the HE domain inserted between the N- and C-terminal splicing domains, forming a modular architecture that separates endonuclease activity from protein splicing. Mini inteins, however, consist of fused HINT (Hedgehog/INTein) domains that directly integrate the splicing machinery without the intervening HE, resulting in a more contiguous core fold. X-ray crystallography has illuminated these differences, as seen in the structure of the mini Mxe GyrA intein from Mycobacterium xenopi (PDB ID: 1AM2), which reveals a horseshoe-shaped HINT scaffold essential for splicing without endonuclease elements. An example of a mini intein is the 138-amino-acid Tth PolII intein from Tetrahymena thermophila, exemplifying the compact form adapted for eukaryotic hosts.

Full inteins predominate in bacteria and archaea, where their mobility aids their genetic spread, whereas mini inteins are more common in unicellular eukaryotes such as ascomycetes, comprising approximately 20% of all known inteins. This distribution reflects evolutionary pressures, with mini forms likely arising from HE domain loss in eukaryotic lineages, enhancing splicing speed in compact genomes.

Split Inteins

Split inteins are inteins that have evolved to function in a fragmented form, consisting of an N-terminal fragment (IntN) and a C-terminal fragment (IntC), which associate non-covalently to mediate protein trans-splicing. Unlike cis-splicing inteins, split inteins enable the intermolecular ligation of two separate extein polypeptides into a single protein, with the intein fragments excising themselves during the process. This trans-splicing capability arises from the same core chemical mechanism as cis-splicing but occurs across separate polypeptide chains after the IntN and IntC fragments dock to reconstitute the active intein structure.

The mechanism of split intein-mediated trans-splicing begins with the rapid, non-covalent association of IntN and IntC, forming a transient intein complex that mimics the intact intein fold. Once assembled, the process follows the canonical splicing steps: an N-S acyl shift at the Cys/Ser residue at position 1 of the intein, transesterification to form a branched intermediate at the Cys+1 residue of the C-extein, and finally Asn cyclization at the C-terminus of the intein with concomitant extein ligation. A prototypical example is the Ssp DnaE split intein from Synechocystis sp. PCC6803, where the IntN (123 amino acids) and IntC (36 amino acids) fragments reassemble to splice the flanking exteins of the DnaE α subunit, as demonstrated in E. coli expression systems. This reconstitution enables efficient trans-splicing under physiological conditions, with the full process completing in minutes for optimized variants.

Structurally, split inteins feature compact, horseshoe-shaped folds dominated by β-sheets, with the IntN containing most of the catalytic residues and the short IntC providing key elements for association. The docking interface is notably compact, typically involving 4-10 residues that form hydrogen bonds and hydrophobic interactions, often centered on β-sheet extensions between the fragments. For instance, the NMR solution structure of the Npu DnaE split intein from Nostoc punctiforme (PDB: 2KEQ) reveals segregated charged surfaces and ion-pair clusters that drive rapid fragment association, with the interface stabilized by complementary β-strands. Crystal structures, such as that of the Ssp DnaE intein (PDB: 1ZDE), further highlight conserved motifs (e.g., blocks A, B, F, G) that position catalytic residues for splicing without an endonuclease domain.

Naturally occurring split inteins are rare, with approximately 50 identified sequences primarily from DnaE homologs in cyanobacteria and other microbes, reflecting an evolutionary adaptation for modular protein assembly in DNA polymerases. Examples include the Ssp DnaE and Npu DnaE inteins, which are encoded by separate genes and naturally perform trans-splicing in their host organisms. In contrast, most utilized split inteins are engineered by artificially dividing full-length cis-splicing inteins at permissive sites to create IntN and IntC fragments, enhancing splicing speed and tolerance; for instance, one optimized DnaE split variant, derived from a natural full intein, achieves half-times under 1 minute at 37°C.
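Trans-splicing differs from cis-splicing mainly in its inputs: two separate polypeptides go in and one ligated product comes out. The toy model below illustrates that bookkeeping only. It assumes sequences are plain strings, uses made-up fragment sequences, and represents fragment association as a simple check that each precursor carries the expected intein fragment; `trans_splice` is a hypothetical name.

```python
def trans_splice(n_precursor: str, int_n: str, c_precursor: str, int_c: str) -> str:
    """Ligate the exteins of an (N-extein + IntN) and an (IntC + C-extein) precursor."""
    if not int_n or not n_precursor.endswith(int_n):
        raise ValueError("N-precursor must end with the IntN fragment")
    if not int_c or not c_precursor.startswith(int_c):
        raise ValueError("C-precursor must start with the IntC fragment")
    # Both intein fragments are excised; only extein residues remain.
    n_extein = n_precursor[: len(n_precursor) - len(int_n)]
    c_extein = c_precursor[len(int_c):]
    return n_extein + c_extein


# Toy fragments (hypothetical, not real intein sequences).
INT_N, INT_C = "CLSYE", "VKIHN"
product = trans_splice("MKTAG" + INT_N, INT_N, INT_C + "CFNGT", INT_C)
print(product)
```

Note that the product begins its C-extein portion with Cys, mirroring the +1 nucleophile that the mechanism above requires at the ligation junction.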

Applications

Biotechnology and Protein Engineering

Intein-mediated protein purification enables the isolation of target proteins without the need for proteolytic enzymes, leveraging the self-cleaving activity of engineered inteins fused to affinity tags. In the IMPACT system, a target protein is expressed as a fusion with a mini-intein (such as the Ssp DnaB intein) and a chitin-binding domain in E. coli. The fusion binds to a chitin resin, and treatment with thiols like dithiothreitol at pH 8 induces N-terminal cleavage, releasing the native target protein while the intein-tag remains immobilized. This single-step process achieves high purity and yields typically exceeding 80% for various prokaryotic and eukaryotic proteins, avoiding protease-related artifacts such as non-specific cleavage or incomplete removal.

Expressed protein ligation (EPL) utilizes inteins to facilitate protein semisynthesis by joining recombinant N-terminal protein segments to synthetic C-terminal peptides or proteins. A recombinant N-extein is fused to the N-terminal fragment of a split intein (e.g., from the Synechocystis DnaE intein), generating a C-terminal thioester upon trans-splicing activation. This thioester then reacts chemoselectively with a synthetic C-extein bearing an N-terminal cysteine via native chemical ligation, forming a native peptide bond. Developed independently in 1998, this method allows incorporation of non-natural amino acids or modifications into large proteins, with ligation efficiencies often surpassing 90% under mild aqueous conditions.

Site-specific labeling exploits the cysteine residue generated at the ligation junction in EPL, enabling precise attachment of probes without disrupting protein function. Post-splicing, the thioester intermediate or ligated cysteine can be conjugated to fluorophores (e.g., via maleimide derivatives) or isotopic labels under controlled conditions, optimizing yield and minimizing side reactions. This approach supports applications in nuclear magnetic resonance (NMR) spectroscopy for structural studies and fluorescence tagging for imaging protein dynamics, as demonstrated with labeled variants of enzymes like Csk kinase achieving near-quantitative modification.

Representative examples include the production of ubiquitin conjugates and cyclic peptides. EPL with split inteins has been used to semisynthesize K63-linked diubiquitin, ligating recombinant ubiquitin thioesters to synthetic ubiquitin peptides with N-terminal cysteine, yielding functional conjugates for studying deubiquitinase activity. Similarly, intein-mediated cyclization in E. coli enables backbone closure of randomized peptides (5–12 residues) via split-intein ligation, producing libraries of stable cyclic peptides with cyclization efficiencies over 90% and facilitating extracellular secretion for screening.

Antimicrobial Development

Protein splicing, mediated by inteins, enables the production of antimicrobial peptides (AMPs) that are often toxic to host cells by fusing them to an intein tag, allowing controlled release through self-splicing without exogenous proteases. This strategy involves expressing the AMP-intein fusion in heterologous hosts like Escherichia coli, where the intein facilitates purification via affinity chromatography and subsequent splicing under specific conditions, such as thiol-induced cleavage, to yield the active peptide.

A representative example is the intein-mediated expression of cecropin A1, a potent AMP active against Gram-negative bacteria, in E. coli ER2566 using the Mxe GyrA mini-intein from the IMPACT system. The fusion protein is induced at 22°C for solubility, purified on chitin resin, and cleaved with dithiothreitol (DTT) to release cecropin A1 with a yield of approximately 2.5 mg per liter of culture, demonstrating micromolar-level antimicrobial activity against Vibrio ordalii, Vibrio alginolyticus, and E. coli. Similarly, β-defensins such as human β-defensin 2 (HBD2), rabbit β-defensin 1 (RBD1), and rat β-defensin 1b (rBin1b) have been produced using intein fusions in E. coli BL21-CodonPlus strains, achieving 17–25% soluble yields at reduced induction temperatures (22°C), with the released peptides showing broad-spectrum activity against E. coli K12D31 and Candida albicans SC5314.

This approach addresses key challenges in AMP production, such as host toxicity and low yield, by sequestering the peptide within the inactive fusion until splicing. It also avoids complex processing steps, making it economical for research and potential therapeutic development. Despite these benefits, challenges include incomplete splicing efficiency and peptide stability in vivo, which can limit therapeutic efficacy against resistant pathogens.

Post-2010 advances incorporate split inteins, such as DnaE inteins, for cyclization and ligation of circular bacteriocins like garvicin ML and enterocin AS-48, enabling combinatorial engineering and dual-peptide delivery to enhance antimicrobial potency while mitigating toxicity in production hosts like E. coli BL21(DE3). Activity of the novel peptides produced by these split-intein systems has been confirmed via biochemical characterization and bioassays, offering modular platforms to combat antimicrobial resistance.

Emerging Therapeutic Uses

Intein-mediated protein trans-splicing has emerged as a promising strategy in gene therapy for addressing packaging limitations of viral vectors, particularly for large genes like CFTR in cystic fibrosis models. By splitting the CFTR coding sequence and flanking the fragments with split intein halves, the full-length protein can be reconstituted post-translationally in target cells, bypassing size constraints of adeno-associated virus (AAV) vectors. Recent optimizations of split inteins, such as those from the NpuDnaE family, have demonstrated up to 80% splicing efficiency in transfected HEK293 cells and 55% in AAV-transduced cells when modeling protein assembly for therapeutic delivery, highlighting potential for correction of CFTR mutations.

In vaccine development, split inteins facilitate precise antigen display on nanoparticles, enhancing immunogenicity through controlled protein ligation. For instance, split intein trans-splicing has been employed to couple SARS-CoV-2 receptor-binding domain (RBD) variants onto hepatitis B core (HBc) nanoparticles, enabling ultrafast assembly and multivalent presentation that elicits robust neutralizing antibodies in preclinical models. Similarly, circular dimeric RBD vaccines against Delta-XBB.1.5 variants, formed via split intein mediation, have shown protection against multiple strains in mice by promoting stable conformation and T-cell responses. These approaches, developed between 2021 and 2024, leverage intein chemistry for rapid, modular prototyping without chemical linkers.

For cancer therapy, conditional protein splicing via split inteins enables tumor-specific activation of cytotoxic payloads, mimicking prodrug mechanisms to minimize off-target effects. In a 2020 study, receptor-targeted delivery of split intein fragments reconstituted a ribosome-inactivating protein in the cytosol of tumor cells, achieving selective cytotoxicity in mixed populations and xenograft models with minimal impact on healthy tissues. More recent engineering of caged inteins responsive to tumor-associated signals, such as SUMO protease, allows spatiotemporal control of splicing for prodrug-like release of therapeutic proteins, with applications demonstrated in photoactivatable cytotoxic constructs that selectively kill cancer cells upon light exposure. These strategies, advanced in 2023–2024 studies, exploit tumor-specific promoters or microenvironments to trigger intein activity.

As of 2025, intein technologies continue to advance toward clinical translation, with no FDA approvals for intein-engineered biologics yet achieved, though preclinical integration with mRNA platforms shows promise for conditional therapeutics. For example, caged inteins engineered into synthetic mRNA enable multi-input logic gates that regulate expression based on intracellular proteins, potentially reducing off-target effects by limiting protein expression to diseased cells. Challenges persist, including intein immunogenicity and splicing fidelity in vivo, but optimizations in intein variants have improved efficiency and reduced immune responses in mammalian systems. In June 2025, SpliceBio secured $135 million in Series B financing to advance intein-based protein splicing technologies for therapeutic applications, including gene therapy and targeted protein delivery. Additionally, a May 2025 study in Science demonstrated intracellular protein editing using pairs of split inteins to incorporate noncanonical amino acids into proteins, expanding therapeutic potential for customized biologics.
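The vector-packaging argument for splitting a large gene comes down to arithmetic that can be sketched in a few lines. Everything numeric here is an illustrative assumption rather than a value from the text: an AAV capacity of about 4,700 bp, a CFTR coding sequence of about 4,443 bp, and 1,500 bp of per-vector overhead for regulatory elements plus the intein fragment. The function name is hypothetical.

```python
from typing import Optional, Tuple


def valid_split_window(cds_bp: int, capacity_bp: int,
                       overhead_n_bp: int, overhead_c_bp: int) -> Optional[Tuple[int, int]]:
    """Return the (min, max) split positions in bp at which both halves fit, or None.

    N-half: split + overhead_n <= capacity        ->  split <= capacity - overhead_n
    C-half: (cds - split) + overhead_c <= capacity ->  split >= cds + overhead_c - capacity
    """
    lowest = max(0, cds_bp + overhead_c_bp - capacity_bp)
    highest = min(cds_bp, capacity_bp - overhead_n_bp)
    return (lowest, highest) if lowest <= highest else None


# Illustrative assumptions (not taken from the text).
CAPACITY_BP = 4700   # approximate AAV packaging limit
CFTR_CDS_BP = 4443   # approximate CFTR coding sequence length
OVERHEAD_BP = 1500   # promoter, polyA signal, intein fragment, etc.

single_vector_fits = CFTR_CDS_BP + OVERHEAD_BP <= CAPACITY_BP
window = valid_split_window(CFTR_CDS_BP, CAPACITY_BP, OVERHEAD_BP, OVERHEAD_BP)
print("fits in one vector:", single_vector_fits)
print("allowed split window (bp):", window)
```

In practice the split position would also need to be codon-aligned and placed so that the C-terminal half begins with a suitable +1 nucleophile (Cys, Ser, or Thr), which narrows the usable sites within this window.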

References

  1. [1]
    Inteins—mechanism of protein splicing, emerging regulatory roles ...
    Nov 8, 2023 · Protein splicing is a posttranslational process in which an intein segment excises itself from two flanking peptides, referred to as exteins.
  2. [2]
  3. [3]
    Protein Splicing Converts the Yeast TFP1 Gene Product to the 69 ...
    Kane et al., Protein Splicing Converts the Yeast TFP1 Gene Product to the 69-kD Subunit of the Vacuolar H+-Adenosine Triphosphatase. Science 250, 651-657 (1990).
  4. [4]
    Biological Applications of Protein Splicing - PMC - PubMed Central
    Protein splicing is an autocatalytic process in which an intervening protein domain (intein) excises itself from the polypeptide in which it is embedded, ...
  5. [5]
    Protein Splicing: How Inteins Escape from Precursor Proteins - PMC
    Protein splicing results in ligation of the N-extein (EN) and C-extein (EC), as directed by the intein (I). When inteins are mutated or inserted in heterologous ...
  6. [6]
    Enigmatic Distribution, Evolution, and Function of Inteins - PMC
    Inteins are mobile genetic elements capable of self-splicing post-translationally. They exist in all three domains of life including in viruses and ...
  7. [7]
  8. [8]
    Protein splicing converts the yeast TFP1 gene product to the 69-kD ...
    Protein splicing converts the yeast TFP1 gene product to the 69-kD subunit of the vacuolar H(+)-adenosine triphosphatase ... Science. 1990 Nov 2; ...
  9. [9]
    InBase, the New England Biolabs Intein Database - Oxford Academic
    The Intein Registry (Section 2A) lists all known inteins and their properties. Clicking on any intein name displays individual intein records containing: intein ...
  10. [10]
    Genetic definition of a protein-splicing domain: Functional mini ...
    Inteins are protein-splicing elements, most of which contain conserved sequence blocks that define a family of homing endonucleases.
  11. [11]
    InBase, the Intein Database | Nucleic Acids Research
    Received September 29, 1999; Accepted October 7, 1999. INTRODUCTION. Inteins are in-frame intervening sequences that disrupt the coding region of a host gene.
  12. [12]
    Article Directed Evolution of a Small-Molecule-Triggered Intein with ...
    To improve the splicing characteristics of the evolved 4-HT dependent inteins, we modified the high-throughput fluorescence-activated cell sorting (FACS) screen ...
  13. [13]
    Design of a Split Intein with Exceptional Protein Splicing Activity
    Feb 8, 2016 · Protein splicing is an autocatalytic process where an "intein" self-cleaves from a precursor and ligates the flanking N- and C-"extein" ...
  14. [14]
    Development of an intein-mediated split–Cas9 system for gene ...
    Jun 16, 2015 · We developed a split–Cas9 system, bypassing the packaging limit using split-inteins. Each Cas9 half was fused to the corresponding split-intein moiety.
  15. [15]
    InBase, The Old Intein Database and Registry
    InBase is a curated database devoted to inteins. When referencing InBase, please use the following InBase Reference: Perler, F. B. (2002).
  16. [16]
    Inteins at Eleven Distinct Insertion Sites in Archaeal Helicase ... - MDPI
    Results: In total, 11 active MCM intein insertion sites were identified, expanding on the previously known five. The insertion sites have varied invasion ...
  17. [17]
  18. [18]
    The mechanism of protein splicing and its modulation by mutation
    Protein splicing results in the expression of two mature proteins from a single gene. After synthesis of a precursor protein, an internal segment (the intein) ...
  19. [19]
    Biotechnological applications of protein splicing - PMC
    Apr 6, 2020 · The accepted canonical mechanism for protein splicing involves four steps (Fig. 3). First, the thiol/alcohol group of the first residue of the ...
  20. [20]
  21. [21]
    Structure of the branched intermediate in protein splicing - PNAS
    Inteins are autoprocessing domains that cut themselves out of host proteins in a traceless manner. This process, known as protein splicing, involves multiple ...
  22. [22]
    Inteins in Science: Evolution to Application - MDPI
    Dec 16, 2020 · Inteins are mobile genetic elements that apply standard enzymatic strategies to excise themselves post-translationally from the precursor protein via protein ...
  23. [23]
    Convergent evolution of the Hedgehog/Intein fold in protein splicing
    Mar 20, 2020 · The catalytic triad being at the interface of the two subdomains of the HINT fold resembles the common catalytic triad of serine/cysteine ...
  24. [24]
    (PDF) Protein splicing elements: Inteins and exteins - ResearchGate
    Aug 7, 2025 · INTRODUCTION Several archaeal, eubacterial and eucaryotic genes have been identified with in-frame insertions that are excised at the ...
  25. [25]
    InBase, the Intein Database - PMC - NIH
    InBase is the home of the INTEIN REGISTRY, which lists all known putative inteins. Inteins are sorted both by organism (Section 2A) and by insertion site in the ...
  26. [26]
    Structural Basis for the Propagation of Homing Endonuclease ...
    Mar 15, 2022 · Inteins make use of homing endonuclease domains for efficient invasion by directing sequence insertion via horizontal gene transfer (HGT) initiated by DNA- ...
  27. [27]
    1AM2: GYRA INTEIN FROM MYCOBACTERIUM XENOPI - RCSB PDB
    Jun 24, 1998 · 1AM2 is a GyrA intein from Mycobacterium xenopi, an internal protein that auto-splices and ligates to form a functional external protein.
  28. [28]
    The dynamic intein landscape of eukaryotes | Mobile DNA | Full Text
    Jan 24, 2018 · The first set of full-length precursors and intein protein sequences for eukaryotes was collected from the intein database, InBase [28], and the ...
  29. [29]
    Split Inteins: Nature's Protein Ligases - PMC - NIH
    Split inteins carry out a naturally occurring process known as protein trans-splicing, where two protein fragments bind to form a catalytically competent enzyme ...
  30. [30]
    Protein trans-splicing by a split intein encoded in a split DnaE gene ...
    Protein splicing is a post-translational event involving precise excision of the intein sequence and concomitant ligation of the flanking sequences (N- and C- ...
  31. [31]
    Naturally Split Inteins Assemble through a “Capture and Collapse ...
    Nov 15, 2013 · We show that one split intein fragment is partly folded, while the other is completely disordered. These polypeptides capture each other through their ...
  32. [32]
    2KEQ: Solution structure of DnaE intein from Nostoc punctiforme
    May 19, 2009 · Based on the NMR structure and the backbone dynamics of the single chain NpuDnaE intein, we designed a functional split variant of the ...
  33. [33]
  34. [34]
    Single-column purification of free recombinant proteins using a self ...
    Jun 19, 1997 · A novel protein purification system has been developed which enables purification of free recombinant proteins in a single chromatographic step.
  35. [35]
    Expressed protein ligation: A general method for protein engineering
    This work illustrates that expressed protein ligation is a simple and powerful new method in protein engineering to introduce sequences of unnatural amino acids ...
  36. [36]
    Semisynthesis of cytotoxic proteins using a modified protein splicing ...
    A novel semisynthetic approach that utilizes a protein splicing element, an intein, to generate a reactive thioester at the C-terminus of a recombinant protein.
  37. [37]
    Optimized Conjugation of a Fluorescent Label to Proteins via Intein ...
    Intein-mediated ligation provides a site-specific method for the attachment of molecular probes to proteins. The method is inherently flexible with regard ...
  38. [38]
    Genetically Directed Production of Recombinant, Isosteric and ...
    May 20, 2016 · The genetic incorporation of aminooxy functionality into recombinant proteins allows the production of ubiquitin conjugates by biorthogonal ...
  39. [39]
    Intein-mediated cyclization of randomized peptides in the periplasm ...
    Jul 16, 2010 · In this work we show that cysteine-mediated splicing can be performed in the oxidative environment of the periplasm of Escherichia coli.
  40. [40]
    Intein-mediated expression of cecropin in Escherichia coli
    Mar 15, 2012 · Our results show that the intein carrier can be used successfully to express cecropin in E. coli. The use of this technology using a self- ...
  41. [41]
    Intein-mediated expression is an effective approach in the study of β ...
    This approach overcomes the difficulties in β-defensin production and provides a convenient and economical peptide-production platform.
  42. [42]
    In vitro and in vivo production and split-intein mediated ligation ... - NIH
    This synthetic biology tool holds great potential for production, engineering, improving and testing the antimicrobial activity of circular bacteriocins.
  43. [43]
    Protein trans-splicing: optimization of intein-mediated GFP assembly ...
    This study discusses the key points of working with Ssp, Npu, and Ava inteins of the DnaE group, known as the most effective for assembly of large proteins.
  44. [44]
    Protein trans-splicing: optimization of intein-mediated GFP assembly ...
    Nov 20, 2024 · The amazing integration speed of intein-based protein trans splicing technology makes it a versatile tool for a variety of applications, albeit ...
  45. [45]
    Engineering Escherichia coli-Derived Nanoparticles for Vaccine ...
    Nov 18, 2024 · Therefore, split intein-mediated trans-splicing was implemented to couple antigens onto the HBc nanocarrier, enabling an ultrafast reaction ...
  46. [46]
    A Novel Circular Delta-XBB15 RBD Dimeric Protein Subunit Vaccine ...
    A Novel Circular Delta-XBB15 RBD Dimeric Protein Subunit Vaccine Mediated by Split Intein Elicits an Immune Response and Protection Against Multiple SARS-CoV-2 ...
  47. [47]
    Intein-mediated cytoplasmic reconstitution of a split toxin ... - PNAS
    Aug 24, 2020 · We designed a protocol for delivering the split-intein protein fragments to the cell cytoplasm via a receptor-specific toxin-based delivery machinery.
  48. [48]
    Conditional protein splicing triggered by SUMO protease
    Conditional protein splicing is a powerful biotechnological tool that can be used to post-translationally control the activity of target proteins.
  49. [49]
    (PDF) Spatio‐Temporal Photoactivation of Cytotoxic Proteins
    The platform was successfully demonstrated for two cytotoxic proteins to selectively kill cancer cells after photoactivation of intein splicing. This ...
  50. [50]
    Development of a new caged intein for multi-input conditional ...
    May 1, 2024 · We developed an mRNA-based logic gate that regulates translation based on the expression of multiple intracellular proteins.
  51. [51]
    (PDF) Development of a new caged intein for multi-input conditional ...
    To achieve such multi-input translational regulation of mRNA medicines, in this study, we engineered Rhodothermus marinus (Rma) DnaB intein to develop “caged ...