Genome editing
Genome editing encompasses methods for precisely altering DNA sequences in living organisms, enabling targeted insertions, deletions, replacements, or modifications of genetic material to study gene function or treat diseases.[1] These techniques rely on engineered nucleases that create double-strand breaks at specific genomic loci, which are then repaired by cellular mechanisms such as non-homologous end joining or homology-directed repair, often incorporating desired changes.[2] Early approaches included zinc-finger nucleases (ZFNs) developed in the 1990s and transcription activator-like effector nucleases (TALENs) around 2010, but the clustered regularly interspaced short palindromic repeats (CRISPR)-associated protein 9 (Cas9) system, adapted from bacterial adaptive immunity, demonstrated as a programmable nuclease in vitro in 2012, and applied to eukaryotic genome editing in early 2013, has dominated due to its simplicity, efficiency, and versatility.[3][4] Significant achievements include rapid gene knockout in model organisms for functional genomics, crop improvement for enhanced yield and resistance, and therapeutic applications in humans, such as the December 2023 FDA approval of CRISPR-based ex vivo editing for sickle cell disease, followed in January 2024 by approval for beta-thalassemia, marking the first regulatory approval of a CRISPR-based therapy.[5] These advances stem from empirical validation of editing precision in controlled trials, demonstrating causal links between edited genotypes and phenotypic improvements without widespread off-target effects in targeted contexts.[6] However, challenges persist, including unintended mutations from incomplete specificity and delivery inefficiencies in vivo, necessitating ongoing refinements like base editing and prime editing for higher fidelity.[5] Controversies arise primarily from germline editing, where changes are heritable, raising concerns over safety risks like mosaicism and long-term ecological impacts, as well as ethical debates on eugenics and inequality, though somatic applications avoid inheritance issues and focus on individual therapeutic benefits supported by clinical data.[7][8] The 2018 case of unauthorized human embryo editing by He Jiankui highlighted regulatory gaps but also underscored the technology's potential when responsibly applied, with scientific consensus favoring paused heritable use until risks are empirically mitigated.[4][9]
History
Pre-Engineered Nuclease Era
Homologous recombination (HR), a conserved cellular mechanism for repairing double-strand breaks using a homologous DNA template, formed the basis of early genome editing efforts before the advent of engineered site-specific nucleases. In this era, spanning the 1970s to late 1980s, researchers introduced linear DNA constructs with flanking homology arms matching the target genomic locus, relying on spontaneous, low-frequency HR events to achieve precise insertions, deletions, or replacements without inducing targeted DNA breaks. This approach yielded high-fidelity modifications when successful but suffered from extreme inefficiency, typically 10^{-4} to 10^{-6} in mammalian cells, dominated by random non-homologous integrations.[2] To counter this, positive-negative selection strategies were developed, using markers like neomycin resistance for enrichment and herpes simplex virus thymidine kinase for counterselection against random integrants.[2] Pioneering work in yeast, where HR is naturally more efficient, demonstrated feasibility early on. In 1979, Scherer and Davis achieved targeted chromosomal segment replacement in Saccharomyces cerevisiae by transfecting hybrid plasmids, marking one of the first instances of precise genomic alteration via HR.[10] This success in unicellular eukaryotes informed extensions to mammals. 
Oliver Smithies advanced the field in 1985 by reporting HR-mediated insertion of a functional gene into the human beta-globin locus in cultured mouse erythroleukemia cells, confirming targeted events at frequencies around 1 in 10^3 to 10^4 transformants under selection.[11] The integration of mouse embryonic stem (ES) cells, first isolated in 1981 by Martin Evans and Matthew Kaufman, enabled heritable modifications.[12] By 1987, Smithies and Mario Capecchi independently applied HR in mouse ES cells to disrupt specific genes, such as Hprt, using isogenic targeting vectors to boost efficiency.[13] Capecchi's group refined selection protocols, achieving targeted disruptions at rates of about 1 in 10^5 electroporated cells. These methods culminated in the first germline-transmissible gene knockouts in mice by 1989, allowing systematic functional analysis of genes via "knockout" models.[14] Despite these advances, the absence of DSB induction limited scalability, confining applications largely to tractable systems like yeast and mouse ES cells for basic research into gene function and disease modeling.[2]
Development of Site-Specific Nucleases
Site-specific nucleases emerged as tools for targeted genome editing by inducing double-strand breaks (DSBs) at predetermined DNA sequences, leveraging cellular DNA repair pathways for precise modifications. Early efforts focused on meganucleases, naturally occurring homing endonucleases from microbes with recognition sites of 12-40 base pairs, first characterized in the 1980s such as I-SceI from yeast mitochondria.[15] These enzymes demonstrated enhanced gene targeting via DSB-stimulated homologous recombination in yeast and mammalian cells by the early 1990s, but their rigid protein-DNA interfaces limited redesign for new specificities, restricting broad applicability.[13] Engineering attempts in the 2000s involved semi-rational design and directed evolution to alter specificity, yet success remained low due to coupled recognition and cleavage domains.[16] To overcome meganuclease limitations, researchers developed modular nucleases by fusing customizable DNA-binding domains to separate, non-specific nuclease modules. 
Zinc finger nucleases (ZFNs), invented in 1996 by fusing zinc finger proteins (discovered in 1985) with the FokI endonuclease cleavage domain, enabled programmable targeting of 9-18 base pair sites as dimers.[17][18] Initial demonstrations achieved targeted DSBs and gene disruption in mammalian cells by 1998, with therapeutic applications emerging in the 2000s, including HIV resistance via CCR5 knockout in human cells.[19] However, ZFNs required expertise in zinc finger assembly, often via phage display or structure-based design, and off-target effects arose from context-dependent binding affinities.[15] Transcription activator-like effector nucleases (TALENs) advanced programmability in 2010, following the 2009 deciphering of the TALE DNA-binding code from Xanthomonas bacteria, where repeat-variable di-residues (RVDs) specify one base pair each.[20] TALENs pair FokI domains flanking a central spacer for dimerization and cleavage, offering higher specificity than ZFNs due to independent modular recognition and reduced toxicity in applications like zebrafish and human cell editing.[21] First used for targeted gene knockouts and insertions in 2011, TALENs facilitated multiplex editing and expanded genome engineering to non-model organisms, though assembly of lengthy TALE arrays remained labor-intensive compared to later RNA-guided methods.[22] These innovations established DSB-based editing principles, paving the way for scalable technologies while highlighting trade-offs in specificity, ease of design, and delivery.[23]
CRISPR Breakthrough and Rapid Adoption
The CRISPR-Cas9 system emerged as a transformative tool for genome editing following a 2012 study by Emmanuelle Charpentier and Jennifer Doudna, who demonstrated in vitro that the Cas9 endonuclease from Streptococcus pyogenes, guided by a dual RNA structure (crRNA and tracrRNA, later simplified into single-guide RNA), could be reprogrammed to induce site-specific double-strand breaks in DNA.[24] This work, published online on June 28, 2012, in Science, repurposed the bacterial adaptive immune mechanism—previously characterized in the early 2000s—for precise nucleic acid targeting, offering advantages in simplicity, multiplexing potential, and cost over prior nucleases like ZFNs and TALENs.[25] The system's RNA-guided specificity stemmed from base-pairing between the guide RNA and target DNA, adjacent to a protospacer-adjacent motif (PAM) sequence, enabling predictable cleavage without protein engineering for each target.[26] Adaptation to cellular genome editing occurred rapidly, with demonstrations in prokaryotic and eukaryotic systems by early 2013. Independent studies by Feng Zhang's group at the Broad Institute and George Church's lab at Harvard achieved targeted modifications via non-homologous end joining (NHEJ) and homology-directed repair (HDR) in human and mouse cell lines, as reported in Science on January 3, 2013. 
These applications exploited the endogenous DNA repair pathways to introduce insertions, deletions, or precise substitutions, validating CRISPR-Cas9's efficiency in mammalian genomes where off-target effects, though present, were manageable compared to earlier tools.[27] Concurrently, Virginijus Šikšnys's group in Lithuania reported similar prokaryotic editing, underscoring the technology's versatility.[28] Adoption accelerated exponentially, evidenced by a surge in research output: CRISPR-related publications rose from fewer than 100 annually pre-2012 to over 3,900 by 2018, reflecting its integration into diverse fields like functional genomics and model organism engineering.[29] Patent filings intensified, with the University of California (representing Doudna's work) submitting the first provisional application in May 2012, followed by the Broad Institute's December 2012 filing, which was expedited to yield the initial U.S. patent in April 2014 for eukaryotic applications, sparking ongoing interference proceedings that highlighted competing claims but did not impede lab proliferation.[30] By 2015, CRISPR had supplanted prior methods in most academic and industrial workflows due to its accessibility, enabling high-throughput screens and multiplexed edits unattainable with protein-based nucleases.[5]
Post-2020 Advancements and Commercialization
Following the rapid adoption of CRISPR-Cas9 in the late 2010s, post-2020 developments emphasized enhanced precision, reduced off-target effects, and in vivo delivery to expand therapeutic applicability. Researchers introduced refined Cas variants, such as smaller Cas12 and Cas13 orthologs, enabling better packaging into viral vectors for systemic administration and multiplex editing capabilities. These variants improved editing efficiency in non-dividing cells, addressing limitations in earlier systems.[31][32] Base editing and prime editing matured as DSB-free alternatives, minimizing unwanted insertions or deletions. Base editors, which convert specific nucleotides via deaminase fusion, entered clinical trials post-2020 for conditions like alpha-1 antitrypsin deficiency, with early 2025 data showing successful single-base corrections in human subjects. Prime editing, which pairs a reverse transcriptase with a prime editing guide RNA (pegRNA), advanced to support diverse modifications including insertions up to dozens of bases, with preclinical models demonstrating up to 50% efficiency in therapeutically relevant genes by 2025. These tools expanded the editable genome fraction beyond traditional CRISPR's reach.[33][34][35] Delivery innovations accelerated in vivo applications, with nanoparticle and lipid-conjugated systems achieving tissue-specific targeting, as shown in 2024 studies editing subsets of neurons or hepatocytes in animal models without broad toxicity. Clinical translation progressed, with over 15 base-editing trials registered by mid-2025 targeting immunodeficiencies and metabolic disorders.[36][37] Commercialization gained momentum with regulatory approvals validating ex vivo CRISPR therapies.
In December 2023, the FDA approved Casgevy (exagamglogene autotemcel), a CRISPR-Cas9-edited autologous stem cell therapy from Vertex Pharmaceuticals and CRISPR Therapeutics, for sickle cell disease in patients aged 12 and older with recurrent vaso-occlusive crises; approval for transfusion-dependent beta thalassemia in the same age group followed in January 2024. Casgevy disrupts the BCL11A enhancer to reactivate fetal hemoglobin, achieving transfusion independence in 94% of beta thalassemia patients and reducing vaso-occlusive events by 91% in sickle cell cases across trials. This marked the first commercialization of a CRISPR-based therapy, though high costs exceeding $2 million per treatment and manufacturing complexities limited initial access.[38][39][40] By 2025, the sector saw expanded pipelines, with CRISPR Therapeutics prioritizing in vivo programs for cardiovascular and autoimmune diseases, alongside Phase 3 trials for hereditary angioedema. The global CRISPR gene-editing market reached $4.01 billion in 2024, projected to grow to $13.50 billion by 2033, driven by diagnostics, agriculture, and therapeutics, though intellectual property disputes and ethical concerns over germline editing persisted. Ongoing trials numbered over 50 worldwide, focusing on oncology, HIV, and rare diseases, signaling broader clinical maturation.[41][42][43]
Biological and Mechanistic Foundations
DNA Repair Mechanisms Exploited in Editing
Genome editing technologies, such as those employing site-specific nucleases, induce double-strand breaks (DSBs) in DNA that are subsequently repaired by cellular pathways, enabling targeted genetic modifications. The primary pathways exploited are non-homologous end joining (NHEJ) and homology-directed repair (HDR), with NHEJ predominating in most cell types due to its efficiency and lack of requirement for a homologous template.[44] NHEJ directly ligates DSB ends, often introducing small insertions or deletions (indels) at the junction, which frequently results in frameshift mutations that disrupt gene function and are harnessed for gene knockouts. This pathway operates throughout the cell cycle, making it suitable for editing in both dividing and quiescent cells, though its error-prone nature limits applications to loss-of-function edits.[44] In contrast, HDR utilizes a homologous donor template to accurately repair DSBs, facilitating precise insertions, deletions, or substitutions, and is thus exploited for corrective edits or transgene integration. HDR, including subpathways like synthesis-dependent strand annealing and double Holliday junction resolution, is restricted to S and G2 phases when sister chromatids are available as templates, so NHEJ typically outcompetes it in non-synchronized cells.[45] To enhance HDR, strategies such as inhibiting NHEJ factors (e.g., DNA-PK) or synchronizing cells to proliferative phases have been developed, though HDR remains challenging in primary and non-dividing cells.[46] An alternative DSB repair mechanism, microhomology-mediated end joining (MMEJ), serves as a backup pathway in which short homologous sequences (5-25 base pairs) flanking the break anneal to each other, producing predictable but mutagenic junctions that delete the intervening sequence.
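The microhomology annealing step described above can be sketched in a few lines of Python. This is a deliberately simplified model, not a validated outcome predictor: the function name `mmej_deletions`, its parameters, and the toy sequence are illustrative assumptions, and real repair outcomes also depend on end resection, chromatin context, and pathway competition. The sketch looks for identical stretches on either side of a break and reports the deletion that results when the two copies anneal and only one copy is retained.

```python
def mmej_deletions(seq, cut, min_mh=5, max_mh=25):
    """Enumerate deletions predicted by a naive microhomology model.

    seq : DNA sequence (one strand, 5'->3').
    cut : index of the double-strand break; the break lies between
          seq[cut - 1] and seq[cut].
    Returns a list of (microhomology, deletion_length, repaired_sequence),
    longest microhomology first. Hypothetical helper for illustration only.
    """
    seq = seq.upper()
    products = {}
    # Try long microhomologies first so each product keeps its longest match.
    for k in range(max_mh, min_mh - 1, -1):
        # Left copy must end at or before the break.
        for i in range(0, cut - k + 1):
            left = seq[i:i + k]
            # Right copy must start at or after the break.
            for j in range(cut, len(seq) - k + 1):
                if seq[j:j + k] == left:
                    # Keep one copy; drop the intervening sequence and the other copy.
                    repaired = seq[:i + k] + seq[j + k:]
                    products.setdefault(repaired, (left, j - i, repaired))
    return sorted(products.values(), key=lambda p: (-len(p[0]), p[1]))


# Toy locus: a 6-bp microhomology (ACGTAC) on each side of the break.
locus = "GGATCC" + "ACGTAC" + "TTTT" + "GGGG" + "ACGTAC" + "CCATGG"
for mh, deleted, repaired in mmej_deletions(locus, cut=16):
    print(mh, deleted, repaired)
```

In this toy locus the only predicted product retains a single ACGTAC copy and loses 14 bp. Because 14 is not a multiple of three, such a deletion inside a coding sequence would shift the reading frame.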
In genome editing, MMEJ is leveraged in approaches like precise integration into target chromosomes (PITCh) for scarless insertions without long homology arms, particularly useful when HDR is inefficient, though it can also contribute to unintended large deletions.[47][48] MMEJ activity increases under conditions suppressing classical NHEJ, such as in certain cancer cells deficient in NHEJ components, highlighting its role in editing outcomes influenced by cellular context.[49] These pathways' competition determines editing fidelity, with outcomes varying by locus, cell type, and DSB-end processing factors like end resection, which favors HDR over NHEJ.
Principles of Sequence-Specific Targeting
Sequence-specific targeting in genome editing fundamentally relies on engineered nucleases that bind to and cleave DNA at predetermined loci, exploiting endogenous repair pathways for modifications such as insertions, deletions, or substitutions.[50] This specificity is achieved through modular DNA-binding domains that recognize particular nucleotide sequences, typically 12–20 base pairs long, fused to a catalytic nuclease domain that induces double-strand breaks (DSBs).[51] The binding domains operate via direct interactions with DNA bases, ensuring localized nuclease activity while minimizing off-target effects, though imperfect specificity remains a challenge requiring ongoing optimization.[50] In protein-DNA recognition systems, such as those in zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs), specificity arises from arrays of protein modules that contact DNA through hydrogen bonds and hydrophobic interactions. Each zinc finger module in ZFNs typically recognizes a 3–4 base pair subsite, with 3–6 fingers forming a recognition arm of 9–18 bp, though binding affinity is influenced by adjacent fingers and spacer sequences of 5–7 bp between dimerizing monomers.[52] TALENs utilize TALE repeats from plant pathogens, where each 34-amino-acid repeat's repeat-variable di-residues (RVDs) specify a single nucleotide—e.g., NI for adenine—enabling straightforward programming of longer targets (12–20 bp) with spacers of 12–19 bp.[50] These systems often employ FokI nuclease domains, which require dimerization for cleavage, enhancing specificity by necessitating paired binding events.[51] RNA-guided mechanisms, exemplified by CRISPR-Cas systems, achieve targeting through Watson-Crick base pairing between a single-guide RNA (sgRNA) and the target DNA, typically spanning 20 nucleotides adjacent to a protospacer-adjacent motif (PAM) required for Cas nuclease activation, such as NGG for Streptococcus pyogenes Cas9.[50] This RNA-DNA 
hybridization simplifies programming compared to protein engineering, as only the sgRNA sequence needs alteration, but specificity depends on minimizing mismatches, with off-target cuts occurring at sites with partial complementarity.[51] Unlike protein-based dimers, Cas9 functions as a single polypeptide, scanning DNA for PAMs before R-loop formation and cleavage, though variants like high-fidelity Cas9 mutants reduce unintended activity by altering contact dynamics.[50] Across all approaches, target site accessibility, chromatin state, and cellular repair context influence editing efficiency, underscoring the need for empirical validation of specificity.[51]
Primary Editing Technologies
Meganucleases
Meganucleases, also known as homing endonucleases, are naturally occurring site-specific endonucleases derived primarily from microbial mobile genetic elements, such as introns and inteins.[53] These enzymes recognize extended DNA sequences, typically 12 to 40 base pairs in length, which enables highly precise cleavage with minimal off-target effects due to the rarity of such long motifs in genomes.[54] Unlike modular nucleases, meganucleases integrate DNA-binding and catalytic activities within a single polypeptide, often exhibiting a saddle-shaped structure that cradles the DNA helix.[55] The adaptation of meganucleases for genome editing began in the early 1990s, with natural variants like I-SceI from yeast mitochondria used to induce double-strand breaks (DSBs) in mammalian cells as early as 1994, demonstrating enhanced homologous recombination efficiency.[56] Engineered custom meganucleases emerged in the late 1990s and early 2000s, pioneered by groups including those at Cellectis, which developed variants through protein engineering starting around 1999.[13] Redesign strategies, such as semi-rational mutagenesis and in vitro recombination of monomeric domains from dimeric scaffolds like I-CreI (a 22-base-pair recognizer from Chlamydomonas reinhardtii), allow tailoring to novel targets, though success rates remain low owing to the coupled evolution of recognition and cleavage domains.[57][58] In editing applications, meganuclease-induced DSBs trigger cellular repair pathways, including error-prone non-homologous end joining for gene knockouts or homology-directed repair for precise insertions and corrections when donor templates are provided.[53] Their specificity arises from multiple hydrogen bonds and van der Waals contacts across the target, conferring advantages like reduced toxicity and immunogenicity compared to heterologous fusion proteins.[1] However, engineering challenges—such as the need for extensive screening to avoid partial specificities or 
catalytic inactivity—limit versatility, with redesign often requiring months of iterative optimization.[59][60] Early applications targeted therapeutic corrections, such as disrupting HIV proviral DNA or correcting mutations in severe combined immunodeficiency models, and agricultural modifications in plants.[16][13] Despite these proofs-of-concept, meganucleases have seen limited clinical translation due to design complexity, paving the way for successor technologies like zinc finger nucleases that offer modular assembly.[15] Ongoing refinements, including machine learning-assisted design, aim to enhance predictability for bespoke nucleases.[59]
Zinc Finger Nucleases
Zinc finger nucleases (ZFNs) are engineered restriction enzymes comprising zinc finger protein domains for sequence-specific DNA recognition fused to the non-specific DNA cleavage domain of the FokI endonuclease.[61] These modular proteins induce targeted double-strand breaks (DSBs) at predetermined genomic loci, exploiting cellular DNA repair pathways such as non-homologous end joining (NHEJ) for gene disruption or homology-directed repair (HDR) for precise insertions or corrections.[62] Each zinc finger module typically binds a 3-base-pair subsite, with arrays of 3–6 fingers providing specificity spanning 9–18 base pairs; FokI dimerization requires two adjacent ZFNs binding in a tail-to-tail orientation, spaced 4–6 base pairs apart, to generate the DSB.[61] Development of ZFNs began with the identification of zinc finger motifs in the transcription factor TFIIIA from Xenopus laevis in 1985, followed by demonstrations of their customizable DNA-binding properties in the early 1990s.[15] Pioneering work in the late 1990s and early 2000s by researchers including Carlos Barbas and David Liu enabled the fusion of zinc finger arrays to FokI, achieving the first targeted DSBs in mammalian cells around 2002–2005.[23] Key milestones include the 2009 demonstration of efficient ZFN-mediated editing in human cells via modular assembly, facilitating broader adoption for gene targeting.[63] Despite early promise as the first programmable genome editing tool, ZFN design proved labor-intensive due to context-dependent interactions between adjacent fingers, often requiring empirical selection or proprietary oligomerized assembly methods like OPEN or zinc finger phage display.[52] ZFNs have been applied in preclinical models for gene knockouts, insertions, and corrections, notably in disrupting the CCR5 gene for HIV resistance in human cells and hematopoietic stem cells (HSCs).[64] Clinically, Sangamo Therapeutics advanced ZFN-based therapies, with Phase 1/2 trials for HIV (SB-728) 
initiating in 2009, showing transient viral load reductions but limited long-term efficacy due to editing efficiency constraints.[65] For hemophilia B, an in vivo ZFN approach via AAV delivery (SB-525/ST-920) entered trials in 2018, aiming to insert a factor IX transgene into the albumin locus; a 2022 first-in-human study reported safe dosing up to 5×10^13 vg/kg with FIX activity increases, though no approvals have been granted as of 2025.[66] Recent studies in 2024 confirmed high-efficiency ZFN editing in HSCs for multilineage engraftment, underscoring persistent utility in ex vivo applications.[67] Advantages of ZFNs include proven clinical tolerability, reduced immunogenicity compared to some alternatives, and high specificity when optimized, with off-target effects mitigated by paired nuclease design and transient expression.[68] However, challenges persist: design complexity limits accessibility, FokI can be toxic at high expression levels, and off-target cleavage can occur at sites with partial homology, though rates are generally lower than early CRISPR iterations when using validated ZFNs.[52] Persistent plasmid expression risks promiscuous binding, prompting strategies like mRNA electroporation for transient activity.[69] While eclipsed by simpler tools like CRISPR-Cas9 post-2012, ZFNs remain relevant for applications demanding compact payloads or established safety profiles in viral vectors.[70]
TALENs
Transcription activator-like effector nucleases (TALENs) are engineered restriction enzymes consisting of a customizable DNA-binding domain derived from transcription activator-like (TAL) effectors of Xanthomonas bacteria fused to the nonspecific DNA cleavage domain of the FokI endonuclease.[22] TAL effectors, first characterized in 2009, contain tandem repeats with repeat-variable di-residues (RVDs) that confer nucleotide-specific DNA binding, where each RVD typically recognizes a single base pair.[71] The initial demonstration of TALEN-mediated genome editing was reported in 2010, with key publications in 2011 enabling targeted double-strand breaks (DSBs) in various organisms.[72][73] TALENs function by designing pairs of proteins that bind to adjacent DNA sequences separated by a spacer of 12-20 base pairs; the FokI domains dimerize across this spacer to generate a DSB, which is repaired via non-homologous end joining (NHEJ) for gene disruption or homology-directed repair (HDR) for precise edits when a donor template is provided.[22] This modular one-to-one RVD-nucleotide recognition simplifies target design compared to zinc finger nucleases (ZFNs), which rely on zinc finger modules recognizing three nucleotides each, often requiring empirical optimization due to context-dependent binding.[52] TALEN assembly, though initially labor-intensive via methods like Golden Gate cloning, has been streamlined with kits allowing construction in 1-2 days.[74] TALENs exhibit higher specificity than ZFNs, with studies showing reduced off-target cleavage at sites like CCR5; for instance, TALENs produced fewer unintended mutations than ZFNs targeting the same locus.[52] Relative to CRISPR-Cas9, TALENs demonstrate lower off-target activity in some contexts due to the absence of guide RNA mismatches and reliance on protein-DNA interactions, though CRISPR's ease of use has led to its dominance.[75][70] However, TALENs' larger size (around 3 kb per monomer) complicates delivery, 
particularly in viral vectors, and multiplexing multiple targets remains challenging without custom engineering.[76] Applications of TALENs span basic research and therapeutics, including gene knockouts in human pluripotent stem cells for disease modeling, such as generating CCR5 mutants resistant to HIV.[74] In agriculture, TALENs have conferred rice resistance to Xanthomonas oryzae by disrupting susceptibility genes and enabled the first genome-edited pigs in 2015 via embryo injection.[77] Therapeutically, TALEN-edited donor T cells were used in 2015 to induce remission in an infant with relapsed acute lymphoblastic leukemia, highlighting their clinical potential despite subsequent shifts toward CRISPR.[20] TALENs also facilitate mitochondrial DNA editing for diseases like Leber's hereditary optic neuropathy, exploiting their protein-only architecture, which avoids the need to import a guide RNA into mitochondria.[78]
CRISPR-Cas Systems
CRISPR-Cas systems derive from clustered regularly interspaced short palindromic repeats (CRISPR) and associated Cas proteins, which form an adaptive immune mechanism in bacteria and archaea to defend against invading bacteriophages and plasmids.[79] These systems acquire short DNA sequences from foreign invaders, integrate them as spacers into the host CRISPR array, and transcribe them into CRISPR RNAs (crRNAs) that guide Cas effector proteins to cleave matching nucleic acids during subsequent exposures.[80] The functional role was first demonstrated in 2007 when spacers from phage DNA conferred resistance in Streptococcus thermophilus.[81] Classified into two main classes, six types, and numerous subtypes, CRISPR-Cas systems vary in complexity and effectors; type II systems, prevalent in genome editing applications, rely on a single large Cas9 endonuclease.[82] In natural type II systems, such as from Streptococcus pyogenes, Cas9 forms a complex with crRNA and trans-activating crRNA (tracrRNA), which base-pairs with the target DNA to form an R-loop structure, enabling double-strand breaks (DSBs) adjacent to a protospacer adjacent motif (PAM), typically 5'-NGG-3'.[24] The dual crRNA-tracrRNA was simplified into a single guide RNA (sgRNA) for programmable targeting.[83] Adaptation for genome editing began with the 2012 demonstration that S. pyogenes Cas9 (SpCas9), guided by dual RNAs, cleaves plasmid DNA in vitro at sites specified by the crRNA spacer sequence, provided a PAM is present.[24] This RNA-guided nuclease activity was harnessed for eukaryotic genome engineering in early 2013, when Cong et al. reported targeted cleavage and homology-directed repair in human and mouse cells using SpCas9 and sgRNA expressed from plasmids, achieving up to 25% modification efficiency at select loci.[84] Independent work by Mali et al. confirmed multiplex editing capabilities, altering up to five endogenous sites simultaneously via NHEJ or HDR pathways. 
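The targeting rule described above (a roughly 20-nucleotide protospacer immediately 5' of an NGG PAM) can be illustrated with a short Python sketch that scans both strands of a sequence for candidate SpCas9 sites. The names `find_spcas9_sites`, `CandidateSite`, and `revcomp` are hypothetical helpers introduced here for illustration; the sketch performs no off-target scoring or efficiency prediction, which real guide-design tools require.

```python
import re
from typing import List, NamedTuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


class CandidateSite(NamedTuple):
    strand: str        # "+" or "-"
    start: int         # 0-based left edge of protospacer+PAM on the given (+) sequence
    protospacer: str   # guide-matching sequence, written 5'->3'
    pam: str           # the three PAM bases, written 5'->3'


def find_spcas9_sites(seq: str, guide_len: int = 20) -> List[CandidateSite]:
    """List every protospacer followed by an NGG PAM on either strand."""
    seq = seq.upper()
    footprint = guide_len + 3
    # A lookahead lets overlapping candidates be reported as well.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    sites = []
    for strand, template in (("+", seq), ("-", revcomp(seq))):
        for match in pattern.finditer(template):
            pos = match.start()
            # Map minus-strand hits back to plus-strand coordinates.
            start = pos if strand == "+" else len(seq) - pos - footprint
            sites.append(CandidateSite(strand, start, match.group(1), match.group(2)))
    return sites


# Toy example: one protospacer with a TGG PAM on the plus strand.
example = "AT" + "GATTACAGATTACAGATTAC" + "TGG" + "AT"
for site in find_spcas9_sites(example):
    print(site)
```

Scanning the reverse complement of the same toy sequence reports the identical protospacer on the minus strand. A site qualifies on either strand because Cas9 reads the PAM on the strand that the guide does not pair with.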
The editing mechanism exploits Cas9-induced DSBs repaired by non-homologous end joining (NHEJ), often introducing insertions/deletions (indels) for gene disruption, or homology-directed repair (HDR) with donor templates for precise insertions or substitutions.[85] Targeting specificity stems from ~20-nucleotide sgRNA-DNA complementarity, though mismatches can reduce efficiency; off-target effects arise from partial hybridization at non-canonical sites with compatible PAMs.[86] SpCas9 requires a 3' PAM, limiting accessible targets to ~12.5% of the human genome, prompting variants like SpCas9-NG (recognizing 5'-NG-3') or smaller Cas12a (Cpf1) from Francisella novicida, which uses 5'-TTV-3' PAM and generates staggered cuts for scarless cloning.[87][88] Cas13 variants, such as LwaCas13a, target and cleave single-stranded RNA rather than DNA, enabling transcript knockdown or editing without genomic alterations, though with collateral RNA cleavage upon activation.[89] These systems' simplicity, multiplexing potential, and low cost relative to protein-based nucleases like ZFNs or TALENs drove rapid adoption, with over 10,000 publications by 2017 citing CRISPR for editing.[90] Challenges include immunogenicity of bacterial Cas proteins and delivery barriers in vivo, addressed through humanized variants or alternative Cas orthologs.[91]
Base Editing and Prime Editing
Base editing, developed by Alexis Komor and colleagues in David Liu's laboratory and first reported in 2016, enables the precise conversion of a target cytosine (C) to thymine (T) in DNA without generating double-strand breaks (DSBs). This approach fuses a catalytically impaired Cas9 protein, either a nickase variant (nCas9) that creates a single-strand nick or a dead Cas9 (dCas9) lacking nuclease activity, with a cytidine deaminase enzyme, such as APOBEC1, to form a cytosine base editor (CBE).[92] A single-guide RNA (sgRNA) directs the complex to the target site, where the deaminase chemically modifies cytosine to uracil (U) within a narrow editing window of 4-5 nucleotides; during replication or repair, U is read as T, resulting in a C·G to T·A conversion.[93] This DSB-free mechanism substantially reduces insertions/deletions (indels) compared to traditional CRISPR-Cas9 editing, which relies on error-prone non-homologous end joining (NHEJ), though CBEs can produce bystander edits at adjacent cytosines and exhibit some off-target activity.[94] In 2017, Gaudelli et al. 
extended base editing to adenine (A) bases with an adenine base editor (ABE), fusing an evolved tRNA adenosine deaminase (TadA*) to nCas9, enabling programmable A·T to G·C changes via deamination of A to inosine (I), which is templated as G during replication.[95] Subsequent optimizations, including second- and third-generation editors with uracil glycosylase inhibitor (UGI) fusions to suppress base excision repair pathways that could revert edits, have improved efficiency to over 50% in mammalian cells for many targets while minimizing indels to below 1%.[94] Base editors have demonstrated utility in correcting pathogenic point mutations, such as those in sickle cell disease models, but limitations persist, including restricted transition types (only C·G→T·A and A·T→G·C), PAM sequence constraints from Cas9, and potential RNA off-targeting from deaminase activity.[96] Prime editing, introduced by Andrew Anzalone and colleagues in David Liu's group in 2019, represents an advanced iteration that permits "search-and-replace" modifications, including all four transition types, small insertions (up to 44 nucleotides), and deletions (up to 80 nucleotides), without DSBs or donor DNA templates.[97] The system employs a prime editor protein—a fusion of nCas9 and a Moloney murine leukemia virus reverse transcriptase (M-MLV RT)—guided by a prime editing guide RNA (pegRNA) that extends beyond standard sgRNAs to include a reverse transcriptase template (RTT) specifying the desired edit and a primer binding site (PBS) for reverse transcription initiation.[98] Upon binding, nCas9 nicks the target strand (typically the non-template strand), the exposed 3' flap hybridizes to the PBS, and RT copies the RTT into a new DNA flap, which ligases into the genome after flap resolution, displacing the original sequence.[97] Initial efficiencies reached 20-50% for transitions in human cells with minimal indels (<1-5%), outperforming homology-directed repair (HDR) in non-dividing cells, 
though prime editing historically suffers from lower yields for insertions/deletions and sensitivity to pegRNA design.[98] Engineered variants, such as PE2 with an improved RT and PE3 incorporating an additional sgRNA for nicking the non-edited strand to bias repair, have boosted efficiencies up to 2.3-fold, while recent ePE and ProPE systems further expand the editing window and reduce byproducts.[99] Prime editing's versatility addresses base editing's limitations by enabling transversions indirectly via multi-step edits or hybrid approaches, with off-target rates comparable to or lower than Cas9 due to the requirement for precise RT priming.[98] However, challenges include pegRNA production complexity, potential cellular toxicity from RT activity, and efficiencies still lagging behind DSB-based methods for some large edits, prompting ongoing refinements like smaller Cas variants for delivery.[97] Both technologies, recognized with the 2025 Breakthrough Prize in Life Sciences awarded to David Liu, exemplify a shift toward DSB-independent editing to enhance safety and precision in therapeutic contexts.[100]Novel and Hybrid Approaches
Novel approaches in genome editing extend beyond conventional nuclease-based systems by incorporating elements such as transposases, integrases, and retrons to enable precise insertions, fewer double-strand breaks (DSBs), and multiplexing. Hybrid systems, which fuse CRISPR-Cas components with other molecular machinery, aim to mitigate off-target effects and DSB-associated risks like indels or chromosomal rearrangements, while facilitating integration of payloads of several kilobases or more. These innovations, emerging prominently since the early 2020s, prioritize DNA repair-independent mechanisms to enhance efficiency in therapeutic and research applications.[101][102]
CRISPR-associated transposases (CASTs) represent a hybrid class that couples type I CRISPR RNA-guided targeting with transposase activity for programmable DNA insertion. Unlike DSB-dependent methods, CASTs catalyze strand transfer to insert payloads without breaks, achieving efficiencies up to 40% in bacterial systems and demonstrating adaptability to eukaryotic cells through laboratory evolution. For instance, evoCAST variants, optimized via directed evolution, have enabled precise integrations in human cell lines with minimal off-target activity. These systems, identified in diverse prokaryotes, bypass homology-directed repair limitations and support cargo sizes exceeding 10 kb, positioning them for gene therapy applications that require stable, large-scale modifications.[101][103]
Programmable addition via site-specific targeting elements (PASTE) exemplifies a hybrid nickase-integrase fusion, employing a CRISPR-Cas9 nickase linked to a reverse transcriptase and a serine integrase for DSB-free insertion of large sequences. Developed in 2022, PASTE uses a prime editing-style pegRNA to write an integrase attachment site at the nicked target, followed by integrase-mediated insertion of the donor DNA at that site, yielding up to 25% efficiency and accommodating inserts of up to 36 kb in human cells. This approach excels at replacing entire defective genes, for example inserting micro-dystrophin cassettes in models of Duchenne muscular dystrophy, and avoids DSB toxicity, though delivery challenges persist for in vivo use.[102]
Retron-based editing leverages bacterial retrons (RNA-templated reverse transcriptases that produce multi-copy single-stranded DNA, msDNA) as donor templates for precise homology-directed repair, often combined with CRISPR-Cas for targeting. In a 2025 advancement, retron systems corrected large disease-related mutations in vertebrate models by excising defective regions and inserting healthy sequences, achieving higher fidelity than traditional donors because the ssDNA donor is generated in situ. Efficiencies exceed 50% in mammalian cells when paired with Cas9, and retrons enable multiplex edits via parallel msDNA production, though optimization for payload size and host compatibility continues. The method's independence from cell-cycle phase broadens its utility across kingdoms.[104][105][106]
Multiplex automated genome engineering (MAGE), a non-nuclease hybrid relying on recombineering with short ssDNA oligos and phage-derived recombinases, facilitates simultaneous edits at hundreds of loci in prokaryotes. Introduced around 2009 and later refined for eukaryotes, MAGE cycles oligonucleotide electroporation with selection to evolve genomes rapidly, as seen in recoding E. coli with over 300 changes for non-canonical amino acid incorporation. While it does not depend on sequence-specific nucleases, its integration with CRISPR hybrids enhances scalability for synthetic biology, though eukaryotic efficiencies lag at under 10% per site without further engineering.[107][108]
Delivery Systems and Implementation Strategies
Viral and Non-Viral Delivery Methods
Viral vectors leverage the natural infectivity of viruses to deliver genome editing components, such as CRISPR-Cas nucleases and guide RNAs, into target cells with high efficiency. Adeno-associated viruses (AAVs), particularly serotypes like AAV2 and AAV9, are favored for their non-pathogenic nature, ability to transduce post-mitotic cells, and episomal persistence without genomic integration, supporting transient or long-term expression depending on the application. AAVs have a packaging limit of about 4.7-5 kb, restricting delivery to compact editing systems like SaCas9 or base editors, and had been used in over 150 gene therapy clinical trials by 2023, including the FDA-approved Luxturna (voretigene neparvovec) in 2017 for RPE65-mediated retinal dystrophy, where subretinal AAV delivery achieved sustained vision improvement. However, pre-existing neutralizing antibodies against AAV are found in up to 50-70% of humans, potentially reducing efficacy, and high doses may trigger innate immune responses or hepatotoxicity, as observed in fatal adverse events in a 2020 high-dose AAV trial for a neuromuscular disease. Lentiviral vectors, derived from HIV-1, provide larger cargo capacity (up to 9 kb) and integrate into the host genome for stable expression, making them suitable for ex vivo modification of hematopoietic stem cells and T cells; they underpinned the FDA approval of Kymriah (tisagenlecleucel) in 2017 for leukemia, in which a lentiviral vector delivers a CD19-directed chimeric antigen receptor into patient T cells. Drawbacks include risks of insertional oncogenesis, evidenced by leukemia cases in early SCID trials that used related gammaretroviral vectors, and production scalability issues, although integrase-defective variants that remain episomal have been developed to mitigate genotoxicity. Adenoviral vectors offer high transient expression and larger payloads but provoke strong inflammatory responses, limiting their use to short-term editing in non-immunoprivileged tissues.[109][110]
Non-viral delivery systems circumvent viral immunogenicity and integration risks by employing synthetic or physical carriers for editing components, often as mRNA, plasmids, or ribonucleoproteins (RNPs), whose transient activity curtails prolonged off-target editing. Lipid nanoparticles (LNPs), composed of ionizable lipids, cholesterol, and PEG-lipids, encapsulate Cas9 mRNA and guide RNA for systemic in vivo delivery, combining biodegradability with endosomal escape; they enabled the first systemically administered in vivo CRISPR therapy in humans, NTLA-2001 for transthyretin amyloidosis, in which a single dose yielded a mean 87% (up to 96%) reduction in serum TTR protein in phase 1 data reported in 2021. LNPs excel in scalability (billions of doses were produced for COVID-19 mRNA vaccines by 2021) and carry lower mutagenesis risk, but suffer from hepatic tropism, transient expression (hours to days), and potential lipid toxicity at high doses, with editing efficiencies often below 50% in extrahepatic tissues without targeting ligands. Polymer-based nanoparticles, such as polyethyleneimine (PEI) or poly(lactic-co-glycolic acid) (PLGA), offer customizable surface modifications for tissue specificity and protection against nuclease degradation, demonstrating 30-70% editing in mouse glioblastoma models via intracranial injection in 2021 studies. Physical methods like electroporation apply electric pulses to transiently permeabilize membranes, achieving high ex vivo efficiencies (up to 90%) in hard-to-transfect cells like primary T lymphocytes or stem cells without chemical additives, as in Casgevy (exagamglogene autotemcel), approved in 2023 for sickle cell disease following electroporation-mediated editing of the BCL11A enhancer. Yet electroporation induces cytotoxicity (10-30% cell death) and is impractical for in vivo use due to tissue damage, while hydrodynamic injection and ultrasound-mediated delivery remain experimental with variable yields. Microinjection and nucleofection variants enhance precision in embryos or organoids but scale poorly for therapeutics.[109][111][112]

| Delivery Type | Key Advantages | Key Disadvantages | Typical Applications |
|---|---|---|---|
| Viral (e.g., AAV, Lentiviral) | High transduction efficiency (50-90% in vivo); natural tropism for tissues like liver, retina, CNS | Immunogenicity; limited cargo size (AAV); insertional risks (lentiviral); manufacturing complexity | In vivo therapeutics (e.g., ocular, hepatic editing); ex vivo stem cell modification |
| Non-Viral (e.g., LNPs, Electroporation) | Reduced immunogenicity; transient expression minimizing off-targets; scalable production; no replication risk | Lower efficiency (10-70%); poor in vivo targeting without modifications; potential cytotoxicity | Ex vivo cell therapies (e.g., CAR-T); emerging in vivo mRNA/RNP delivery for systemic diseases |
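
The cargo-size constraint discussed above lends itself to a quick back-of-the-envelope check. The following is a minimal sketch, not a design tool: the vector capacities are the figures quoted in this section, the nuclease sizes are approximate coding-sequence lengths (SpCas9 about 1,368 codons, SaCas9 about 1,053 codons), and the `Cargo` class, the `compatible_vectors` helper, and the 600 bp and 400 bp allowances for regulatory elements and the guide cassette are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass

# Packaging limits quoted in the text, in base pairs.
VECTOR_CAPACITY_BP = {
    "AAV": 4700,         # text gives ~4.7-5 kb; the conservative end is used
    "lentiviral": 9000,  # text gives "up to 9 kb"
}

# Approximate coding-sequence lengths (assumptions for illustration).
SPCAS9_CDS_BP = 1368 * 3  # 4104 bp
SACAS9_CDS_BP = 1053 * 3  # 3159 bp


@dataclass
class Cargo:
    """Hypothetical single-vector payload: nuclease plus supporting elements."""
    name: str
    nuclease_bp: int
    regulatory_bp: int = 600      # assumed promoter + polyA allowance
    guide_cassette_bp: int = 400  # assumed guide promoter + sgRNA allowance

    @property
    def total_bp(self) -> int:
        return self.nuclease_bp + self.regulatory_bp + self.guide_cassette_bp


def compatible_vectors(cargo: Cargo) -> list[str]:
    """Return the vectors whose packaging limit can hold the whole cargo."""
    return [v for v, cap in VECTOR_CAPACITY_BP.items() if cargo.total_bp <= cap]


spcas9 = Cargo("SpCas9 + sgRNA", SPCAS9_CDS_BP)
sacas9 = Cargo("SaCas9 + sgRNA", SACAS9_CDS_BP)

for cargo in (spcas9, sacas9):
    print(cargo.name, cargo.total_bp, compatible_vectors(cargo))
# SpCas9 + sgRNA 5104 ['lentiviral']
# SaCas9 + sgRNA 4159 ['AAV', 'lentiviral']
```

Under these assumed allowances, an all-in-one SpCas9 cassette exceeds the AAV limit while SaCas9 fits, which is consistent with the preference for compact editors in AAV delivery noted above. Real constructs vary in promoter, tag, and terminator sizes, so the numbers should be treated as rough.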
Ex Vivo versus In Vivo Applications
Ex vivo genome editing involves extracting cells from a patient, modifying their genomes in a controlled laboratory environment using tools such as CRISPR-Cas9, and subsequently reintroducing the edited cells into the patient.[114] This approach allows for precise manipulation under optimized conditions, including electroporation or viral transduction for delivery, followed by selection or expansion of successfully edited cells to achieve high purity before transplantation.[115] It is particularly suited for accessible cell types like hematopoietic stem and progenitor cells (HSPCs) or T cells, enabling applications in blood disorders and immunotherapies.[116]
A primary advantage of ex vivo editing is the ability to mitigate off-target effects and immune responses by editing in isolation, with post-editing validation and enrichment ensuring only viable, correctly modified cells are used.[117] For instance, in the treatment of sickle cell disease (SCD) and transfusion-dependent beta-thalassemia, ex vivo CRISPR-Cas9 editing of patient-derived HSPCs to disrupt the BCL11A enhancer has led to approved therapies like Casgevy (exagamglogene autotemcel), authorized by the FDA in December 2023, demonstrating durable fetal hemoglobin induction and symptom amelioration in clinical trials with over 90% reduction in vaso-occlusive crises.[118] However, challenges include scalability of manufacturing, engraftment efficiency post-infusion, and restriction to cell types that can be harvested and re-engrafted, which limits broader use for non-hematopoietic conditions.[119]
In contrast, in vivo genome editing delivers editing components, typically via lipid nanoparticles, adeno-associated virus (AAV) vectors, or other systemic or local methods, directly into the patient's body to target cells in situ.[120] This method holds potential for treating organs like the liver, retina, or muscle, where cell extraction is impractical, by achieving site-specific modifications without removing cells from the body.[121] Delivery innovations, such as liver-tropic AAVs or nanoparticle-encapsulated Cas9 ribonucleoproteins, have enabled transient expression to reduce prolonged off-target risks.[120]
Key benefits of in vivo approaches include broader tissue applicability and avoidance of ex vivo processing complexities, as evidenced by NTLA-2001, an in vivo CRISPR-Cas9 therapy targeting the TTR gene in hereditary transthyretin amyloidosis (hATTR), which achieved a mean 87% reduction in serum TTR at the higher dose in phase 1 data reported in June 2021, using intravenous lipid nanoparticle delivery.[122] Early trials for Leber congenital amaurosis type 10 (LCA10) have also used subretinal AAV-CRISPR injections to edit CEP290 mutations, following restoration of partial visual function in preclinical models, with initial human dosing in 2020.[123] Nonetheless, hurdles persist, including inefficient targeting of non-dividing cells, potential immunogenicity of bacterial-derived Cas proteins, and amplified safety concerns from systemic distribution, with most trials still in early phases compared to ex vivo successes.[120] Ongoing refinements in delivery specificity aim to bridge these gaps for scalable clinical translation.[124]

| Aspect | Ex Vivo | In Vivo |
|---|---|---|
| Primary Applications | Hematologic disorders (e.g., SCD, beta-thalassemia), T-cell therapies for cancer | Liver diseases (e.g., hATTR), ocular disorders (e.g., LCA10), neuromuscular conditions |
| Delivery Methods | Electroporation, lentiviral/retroviral vectors in vitro | AAV vectors, lipid nanoparticles, direct injection |
| Advantages | High editing purity via selection; controlled environment reduces immunogenicity | Targets inaccessible tissues; no cell extraction needed |
| Challenges | Limited to harvestable cells; manufacturing and engraftment variability | Delivery efficiency; off-target effects in vivo; immune clearance of editors |
| Clinical Status (as of 2025) | Multiple approvals (e.g., Casgevy, 2023); dozens of trials | Phase 1/2 trials dominant (e.g., NTLA-2001, 2021 onward); no approvals yet |