A frameshift mutation is a genetic mutation caused by the insertion or deletion, in a DNA sequence, of a number of nucleotides that is not divisible by three. The change alters the reading frame of the genetic code and leads to a completely different translation of the genetic information from the mutation site onward.[1] This disruption was first demonstrated experimentally in 1961 by Francis Crick, Sydney Brenner, and colleagues through studies on bacteriophage T4, in which combinations of insertions and deletions restored function, confirming the triplet nature of the genetic code.[2] Mechanistically, the mRNA is read in codons of three nucleotides during translation; an insertion or deletion of one or two bases shifts this frame, changing all subsequent codons and typically producing a garbled amino acid sequence in the resulting protein, often ending at a premature stop codon that truncates the polypeptide.[3] Such mutations are usually deleterious, rendering proteins nonfunctional and contributing to a wide array of genetic disorders, including Crohn's disease via a frameshift in the NOD2 gene,[4] certain forms of cystic fibrosis,[5] and Tay-Sachs disease.[6] Despite their predominantly harmful effects, frameshift mutations play a role in evolutionary processes by potentially generating novel protein functions or regulatory elements, though they are far more often associated with loss-of-function phenotypes in human health.[7]
Background and Fundamentals
Definition and characteristics
A frameshift mutation is a genetic mutation caused by the insertion or deletion, in a DNA sequence, of a number of nucleotides that is not divisible by three, which shifts the reading frame of the genetic code and alters the translation of all subsequent codons into amino acids.[8] The shift occurs because the genetic code is read in triplets, known as codons, each specifying a particular amino acid or stop signal during protein synthesis.[9]
Frameshift mutations were first demonstrated experimentally in 1961 by Francis Crick, Sydney Brenner, and colleagues through their work on proflavin-induced mutations in bacteriophage T4, which provided key evidence for the triplet nature of the genetic code and the disruptive effects of such shifts.
Key characteristics of frameshift mutations include their tendency to produce a cascade of incorrect amino acid substitutions downstream of the mutation site, often introducing a premature stop codon that truncates the protein.[10] These alterations typically result in non-functional or loss-of-function proteins, as the changed amino acid sequence disrupts the protein's structure and activity.[1]
Frameshift mutations differ from in-frame insertions or deletions, which involve multiples of three nucleotides and thus add or remove whole codons without altering the downstream reading frame, potentially preserving partial protein function depending on the affected residues.[8]
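The divisibility rule that separates frameshift from in-frame indels can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `classify_indel` and the convention of passing reference and alternate alleles with a shared leading anchor base are assumptions made for the example, not part of any cited method.

```python
def classify_indel(ref: str, alt: str) -> str:
    """Classify a variant by the net length change between two alleles.

    Assumption: ref and alt are written with a shared leading anchor base,
    so the net change in length equals the number of inserted or deleted bases.
    """
    net = len(alt) - len(ref)
    if net == 0:
        # Equal lengths: no net insertion or deletion (simplified label).
        return "substitution"
    kind = "insertion" if net > 0 else "deletion"
    # Only a net change divisible by three preserves the reading frame.
    frame = "in-frame" if net % 3 == 0 else "frameshift"
    return f"{frame} {kind}"


# A 4-base insertion (the size of the TATC insertion discussed for HEXA).
print(classify_indel("A", "ATATC"))
# A 3-base deletion removes one whole codon's worth of sequence.
print(classify_indel("ACTT", "A"))
```

The check looks only at length; whether an in-frame indel is tolerated still depends on which residues it adds or removes.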
The genetic code and reading frames
The central dogma of molecular biology outlines the unidirectional flow of genetic information within cells, proceeding from DNA to messenger RNA (mRNA) through transcription, and subsequently from mRNA to proteins via translation.[11] This framework, first proposed by Francis Crick, establishes that genetic instructions encoded in DNA are copied into mRNA, which serves as the template for synthesizing polypeptide chains that fold into functional proteins.[12]
The genetic code operates on a triplet basis, where sequences of three consecutive nucleotides, known as codons, specify individual amino acids or termination signals during translation. With four possible nucleotides (adenine, cytosine, guanine, and uracil in mRNA), there are 64 possible codons (4³ = 64), which encode the 20 standard amino acids and three stop signals (UAA, UAG, and UGA) that halt protein synthesis.[13] This degeneracy, or redundancy, means most amino acids are represented by multiple codons, allowing for flexibility while minimizing the impact of certain mutations.[14] The code is read in a non-overlapping manner by ribosomes, which scan mRNA from a defined starting point to assemble amino acids in the precise order dictated by successive codons.
In any given DNA or mRNA sequence, three possible reading frames exist, corresponding to the three alternative ways to group nucleotides into triplets starting from different positions (e.g., positions 1-3, 2-4, or 3-5).[13] The correct reading frame, often termed the open reading frame (ORF), begins at an initiation codon (typically AUG in mRNA, coding for methionine) and extends continuously to a stop codon without intervening stops, ensuring accurate translation of the intended protein sequence. Maintenance of this frame is essential, as any disruption, such as the insertion or deletion of nucleotides not in multiples of three, shifts the reading frame, misaligning all subsequent codons and typically resulting in a completely altered amino acid sequence downstream, often culminating in a premature stop codon.[13] Frameshift mutations exemplify this vulnerability, scrambling the genetic message beyond the alteration site.
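The effect of a shifted frame can be made concrete with a short Python sketch that translates a toy coding sequence before and after a single-base deletion. The sequence is invented for illustration, the function name `translate` is an assumption, and the sequence is written as coding-strand DNA (T in place of the U found in mRNA); the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate coding-strand DNA from the given offset until the first stop codon."""
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy gene: ATG GCT GAA CTG ACC AAA TAA
wild_type = "ATGGCTGAACTGACCAAATAA"
# Delete the single base at index 3; every downstream codon is regrouped.
mutant = wild_type[:3] + wild_type[4:]

print(translate(wild_type))  # MAELTK
print(translate(mutant))     # MLN
```

In this toy case the deletion regroups the bases as ATG CTG AAC TGA, so the shifted frame reaches a stop codon after only three residues.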
Causes
Genetic mechanisms
Frameshift mutations primarily arise from small insertions or deletions (indels) of nucleotides that are not multiples of three, disrupting the reading frame of the genetic code. These indels often occur during DNA replication through a process known as replication slippage, where DNA polymerase temporarily dissociates from the template strand and reanneals out of register, particularly in repetitive DNA sequences such as microsatellites. Microsatellites, consisting of short tandem repeats, are especially prone to this slippage because the repetitive nature facilitates misalignment, leading to the addition or loss of repeat units that manifest as frameshifts.[15][16]
DNA polymerase errors contribute significantly to frameshift generation, as replicative polymerases like Pol δ and Pol ε exhibit varying fidelity during synthesis, with lower accuracy in repetitive regions where proofreading may fail to correct slippage-induced mismatches. Error-prone polymerases, such as those involved in translesion synthesis, further increase indel rates by incorporating nucleotides imprecisely during replication stress. Deficiencies in mismatch repair (MMR) pathways amplify these errors, as MMR normally excises and replaces mismatched bases or small loops formed by slippage; impaired MMR, as seen in conditions like Lynch syndrome caused by mutations in the MLH1 or MSH2 genes, results in a markedly elevated frequency of frameshift indels, particularly in microsatellite loci.[17][18]
Recombination processes also generate frameshifts through error-prone mechanisms. Unequal crossing-over during homologous recombination in regions with tandem repeats can produce net insertions or deletions by misaligning homologous chromosomes, shifting the reading frame in the recombinant products. Similarly, non-homologous end joining (NHEJ), a pathway for repairing double-strand breaks, often introduces small indels at junctions due to imprecise ligation of blunt or overhanging ends, converting breaks into frameshift mutations when they occur within coding sequences.[19][20]
Indels whose length is not a multiple of three are found at higher frequencies in non-coding regions of the genome than in coding exons, where strong purifying selection limits their fixation because of their predominantly deleterious effects on protein function. While most frameshifts are harmful, reducing fitness by producing truncated or aberrant proteins, they can play an evolutionary role by generating genetic diversity, as evidenced by compensatory frameshifts that restore reading frames and contribute to protein evolution in vertebrates and insects. Environmental factors can amplify these intrinsic mechanisms by inducing replication stress, but the core processes remain rooted in cellular replication and repair fidelity.[21]
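Because slippage concentrates in repetitive tracts, a simple way to illustrate the idea is to scan a sequence for mononucleotide runs and then simulate the loss of one repeat unit. The following Python sketch does this under stated assumptions: the function names, the toy sequence, and the minimum run length of 8 are all hypothetical choices for the example, not thresholds taken from the literature.

```python
import re


def find_homopolymers(seq: str, min_len: int = 8):
    """Return (start, base, length) for each run of one base at least min_len long."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0))) for m in pattern.finditer(seq)]


def slip(seq: str, start: int, delta: int) -> str:
    """Simulate slippage in a run beginning at start.

    delta > 0 adds that many copies of the repeated base; delta < 0 removes them.
    The caller is assumed to keep a negative delta within the run length.
    """
    base = seq[start]
    if delta >= 0:
        return seq[:start] + base * delta + seq[start:]
    return seq[:start] + seq[start - delta:]


# Hypothetical fragment containing a ten-adenine tract.
fragment = "GC" + "A" * 10 + "GCCT"
runs = find_homopolymers(fragment)
print(runs)  # [(2, 'A', 10)]

# Losing one repeat unit changes the length by one base, which is not
# divisible by three, so the downstream reading frame would shift.
slipped = slip(fragment, runs[0][0], -1)
print((len(slipped) - len(fragment)) % 3 != 0)  # True
```

Real microsatellites also include di- to hexanucleotide repeats; the scan is restricted to single-base runs only to keep the example short.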
Environmental and mutagenic factors
Chemical mutagens, particularly intercalating agents such as acridines (e.g., proflavin), induce frameshift mutations by inserting between DNA base pairs, which distorts the DNA helix and promotes the addition or deletion of nucleotides during replication.[22] These agents stabilize slipped DNA structures, leading to misalignment and indel formation in repetitive sequences.[23]
Physical mutagens like ultraviolet (UV) radiation generate cyclobutane pyrimidine dimers, especially thymine dimers, which block replication forks and trigger error-prone translesion synthesis, resulting in insertions or deletions that cause frameshifts.[24] Ionizing radiation produces double-strand breaks that, during non-homologous end joining repair, often lead to small insertions or deletions, manifesting as frameshift mutations in surviving cells.[25] These physical agents are prevalent in environmental exposures such as sunlight and cosmic rays.
Biological factors, including viral integrations, can insert genetic material into host genomes, disrupting coding sequences and introducing frameshifts if the insertion length is not a multiple of three.[26] Transposon activity similarly causes frameshift mutations through the excision or insertion of transposable elements within exons, altering the reading frame and protein function.[27]
Occupational exposure to chemical mutagens, such as certain chemotherapy drugs (e.g., alkylating agents), heightens risk by promoting DNA adducts that lead to indel formation during replication.[28] In other human exposure contexts, smoking introduces polycyclic aromatic hydrocarbons and other chemicals that elevate somatic mutation rates, including frameshifts, in lung epithelial cells through oxidative damage and replication errors.[29] Air pollution, particularly particulate matter like PM2.5, correlates with increased somatic mutations in lung tissues by inducing chronic inflammation and DNA damage, amplifying mutation burdens in non-smokers.[30] These environmental factors interact with genetic repair deficiencies, worsening frameshift accumulation in vulnerable populations.[31]
Molecular Consequences
Effects on mRNA and protein synthesis
Frameshift mutations, which involve the insertion or deletion of nucleotides not divisible by three in the DNA coding sequence, are faithfully transcribed by RNA polymerase II into messenger RNA (mRNA), resulting in a corresponding indel in the mRNA transcript that shifts the reading frame downstream of the mutation site. This alteration does not directly impair the transcription process itself, such as initiation, elongation, or termination by RNA polymerase, but instead modifies the template sequence, leading to an mRNA with a disrupted codon alignment for subsequent translation.[32][33]
During translation, the ribosome decodes the mRNA from the start codon in the original reading frame until it reaches the mutation site, after which the shifted frame causes all subsequent codons to be read incorrectly, often producing a polypeptide with an entirely different amino acid sequence. This frameshift frequently introduces a premature stop codon (UGA, UAA, or UAG in the new frame) shortly downstream, truncating the protein and halting synthesis prematurely. For instance, a +1 frameshift can realign the sequence to create a novel stop codon, preventing the full-length protein from being assembled.[34]
Many frameshift-induced premature stop codons trigger nonsense-mediated decay (NMD), a surveillance mechanism that identifies and degrades mRNAs with termination codons located more than 50-55 nucleotides upstream of an exon-exon junction, thereby preventing the production of potentially harmful truncated proteins. NMD involves factors like UPF1, which recruit exonucleases to degrade the aberrant mRNA, reducing mutant transcript levels relative to wild-type, typically by more than 90% in efficient cases, and so sharply diminishing protein output from the affected allele.[35][36]
The combined effects of frameshifting and NMD typically result in a near-complete loss of functional protein from the mutated allele, leaving heterozygous cells with approximately 50% of normal protein levels, all of it supplied by the wild-type allele. In genes where one functional copy is insufficient for normal dosage, this can lead to pronounced phenotypic effects due to haploinsufficiency, as the wild-type allele alone cannot compensate fully. Experimental studies, such as those using minigene constructs, confirm that inhibiting NMD partially restores protein expression but often yields nonfunctional products due to the underlying frameshift.[35][37]
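The positional rule for NMD described above lends itself to a small worked check. The sketch below is a deliberate simplification: it assumes positions are given in spliced-mRNA coordinates, uses the lower bound of the 50-55 nucleotide window as a default, and ignores every other determinant of NMD efficiency. The function name `predicts_nmd` and the coordinates are hypothetical.

```python
from typing import Sequence


def predicts_nmd(stop_pos: int, exon_junctions: Sequence[int],
                 min_distance: int = 50) -> bool:
    """Apply the distance rule for nonsense-mediated decay.

    stop_pos: position of the premature stop codon in the spliced mRNA.
    exon_junctions: positions of exon-exon junctions in the same coordinates.
    Returns True if any junction lies more than min_distance nucleotides
    downstream of the stop codon.
    """
    return any(junction - stop_pos > min_distance for junction in exon_junctions)


# Hypothetical transcript with exon-exon junctions at positions 200 and 500.
junctions = [200, 500]
print(predicts_nmd(300, junctions))  # True: 200 nt upstream of the last junction
print(predicts_nmd(480, junctions))  # False: only 20 nt upstream
print(predicts_nmd(600, junctions))  # False: stop lies in the last exon
```

Under this rule, a frameshift whose new stop codon falls in the final exon, or close to the last junction, would be expected to escape decay and yield a truncated protein rather than a depleted transcript.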
Types of resulting proteins
Frameshift mutations disrupt the reading frame of the genetic code, typically resulting in proteins that are dysfunctional due to altered amino acid sequences downstream of the mutation site. The predominant type of aberrant protein produced is a truncated polypeptide, arising from the frequent introduction of a premature termination codon in the new reading frame. These shortened proteins often lack critical functional domains necessary for proper structure, stability, or interactions; they are frequently scarce because their transcripts are degraded by nonsense-mediated decay, and those that are made tend to be too unstable to perform effective cellular roles.[34][38]
In contrast, elongated proteins emerge in rare instances where the frameshift mutation shifts the reading frame in a way that bypasses the original stop codon, permitting translation to extend into the 3' untranslated region and generate an extended chain with a novel C-terminal sequence. Such proteins may incorporate additional amino acids that alter subcellular localization, dimerization capabilities, or regulatory motifs, though they generally exhibit diminished or absent biological activity compared to the wild-type form. This outcome is less common because most shifted frames encounter a stop codon before reaching the position of the original one.[38]
Beyond length alterations, the shifted codon usage often produces proteins prone to misfolding, where the incorrect amino acid string disrupts secondary and tertiary structures, promoting aggregation into insoluble complexes or triggering endoplasmic reticulum stress responses. These misfolded or aggregated forms can confer toxic gain-of-function properties, such as sequestering chaperone proteins or wild-type counterparts, thereby exacerbating cellular dysfunction independent of simple loss of function.[34]
The pathological impact of these mutant proteins varies by mechanism. Haploinsufficiency occurs when loss of output from the mutant allele leaves total protein levels (approximately 50% of normal in heterozygotes) below the threshold required for normal physiology, because the wild-type allele alone cannot compensate. Alternatively, dominant-negative effects arise if the aberrant protein actively interferes with wild-type function, for example by co-assembling into heteromeric complexes that impair activity or by forming aggregates that deplete available functional monomers. Frameshift-derived truncated proteins are particularly associated with dominant-negative interference in multimeric assemblies.[39][40]
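Whether a frameshift yields a truncated or an elongated product depends on where the first stop codon falls in the new frame. The Python sketch below compares translated lengths before and after a single-base deletion in two invented transcripts. The sequences, function names, and outcome labels are assumptions for illustration; the codon table is the standard genetic code, written as coding-strand DNA.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def protein_length(transcript: str) -> int:
    """Count residues translated from position 0 to the first stop (or the end)."""
    count = 0
    for i in range(0, len(transcript) - 2, 3):
        if CODON_TABLE[transcript[i:i + 3]] == "*":
            return count
        count += 1
    return count  # no stop found: translation ran off the end of the sequence


def classify_outcome(wild_type: str, mutant: str) -> str:
    """Label the mutant product relative to the wild-type protein length."""
    before, after = protein_length(wild_type), protein_length(mutant)
    if after < before:
        return "truncated"
    if after > before:
        return "extended"
    return "same length"


def delete_base(seq: str, index: int) -> str:
    """Remove the single base at the given index."""
    return seq[:index] + seq[index + 1:]


# Case 1: a deletion early in the coding sequence creates a premature stop.
early = "ATGGCTGAACTGACCAAATAA"
print(classify_outcome(early, delete_base(early, 3)))  # truncated

# Case 2: a deletion just before the stop codon bypasses it, so translation
# continues into the (hypothetical) 3' untranslated region.
late = "ATGGCTGAATAAGCCTTGA"
print(classify_outcome(late, delete_base(late, 8)))  # extended
```

The comparison uses length alone; it says nothing about folding, stability, or whether the transcript survives nonsense-mediated decay.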
Detection and Diagnosis
Molecular detection techniques
Frameshift mutations, caused by insertions or deletions of nucleotides not divisible by three, are detected at the molecular level using techniques that identify sequence alterations in DNA or RNA. These methods focus on laboratory-based approaches to pinpoint indels, often combining direct sequencing with electrophoretic or fluorescent assays for confirmation. High-throughput strategies have become essential for genome-wide screening, while targeted tools provide precision for specific loci.[41]
DNA sequencing remains the cornerstone for detecting frameshift mutations. Sanger sequencing is widely used for targeted validation of indels, offering high accuracy in resolving small insertions or deletions through direct chromatogram analysis, where frameshifts appear as shifts in the sequence trace. This method is particularly effective for confirming suspected mutations in diploid genomes, with simple post-sequencing tools like wildcard searches enhancing indel identification efficiency. For broader detection, next-generation sequencing (NGS), including whole-exome sequencing (WES), enables high-throughput analysis of coding regions. WES has identified frameshift mutations in numerous studies of genetic disorders, highlighting NGS's role in scaling detection across thousands of genes. Large-scale WES efforts, such as those in population genomics, routinely catalog millions of indels, many of which are frameshifting, providing a comprehensive view of variant landscapes.[42][43][44][45]
Long-read sequencing technologies, such as Pacific Biosciences (PacBio) HiFi and Oxford Nanopore, have emerged as complementary methods for accurate detection of frameshift mutations, particularly for resolving complex or repetitive indels that challenge short-read NGS. As of 2025, these approaches improve diagnostic yield by 10-15% in undiagnosed cases by providing phased variant calls and structural context, enabling better characterization of frameshift consequences in clinical genomics.[46][47]
PCR-based methods offer sensitive, cost-effective screening for frameshift mutations by exploiting physicochemical changes induced by indels. Denaturing gradient gel electrophoresis (DGGE) separates PCR-amplified fragments based on melting behavior differences; insertions or deletions alter DNA stability, causing mobility shifts under denaturing conditions, with detection rates approaching 100% for small indels like those in polyadenine tracts. DGGE is especially useful for heterogeneous samples, as it distinguishes mutant from wild-type alleles even in low-abundance scenarios. Similarly, single-strand conformation polymorphism (SSCP) detects frameshifts through altered electrophoretic mobility of single-stranded DNA, where indels disrupt secondary structure folding. SSCP has successfully identified frameshift mutations in coding repeats, such as in the BAX gene, with sensitivity up to 90% under optimized conditions, making it a staple for initial mutation scanning before sequencing. These techniques are often combined with PCR to amplify suspect regions, providing rapid preliminary evidence of frameshifts.[48][49][50][51]
Fluorescence-based techniques allow visualization of frameshift mutations, particularly in cellular or in vivo contexts. Fluorescent in situ hybridization (FISH) can assess chromosomal copy number in genome editing workflows like CRISPR, aiding interpretation of indel-induced disruptions, as demonstrated in cell lines where frameshift efficiency was quantified post-transfection. More commonly, reporter assays using green fluorescent protein (GFP) provide a functional readout; in these systems, an upstream stop codon or out-of-frame sequence is placed before the GFP open reading frame, such that indels restore the frame and enable GFP expression, resulting in detectable fluorescence. This approach has been refined for high-sensitivity detection in model organisms like zebrafish, where injected mutant reporters confirm frameshift alleles in embryos with low background noise. Such assays are quantitative, allowing flow cytometry-based sorting of frameshift-positive cells.[52][53][54]
Bioinformatics tools complement wet-lab methods by predicting and analyzing frameshift mutations from sequence data. Algorithms employing sequence alignment, such as those in SIFT Indel, score indels for their likelihood to cause frameshifts and predict functional impacts by assessing changes in protein sequence downstream of the variant. Open reading frame (ORF) analysis tools like ORFfinder scan DNA sequences to identify disrupted ORFs, flagging potential frameshifts where translation would prematurely terminate or produce aberrant proteins. These computational approaches integrate with NGS pipelines, using alignment software to detect indels via gap penalties and then applying ORF prediction to evaluate frameshift consequences, as seen in tools like TransPPMP for pathogenicity assessment. In large datasets, such methods facilitate automated filtering of frameshifting variants from millions of candidates.[41][55][56]
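The ORF-scanning idea behind such tools can be illustrated with a minimal Python sketch. It is not a reimplementation of ORFfinder or any other named tool: it scans only the three forward frames of a toy sequence, and the function name `find_orfs`, the `min_codons` parameter, and the sequences are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 1):
    """Find ATG-to-stop open reading frames in the three forward frames.

    Returns (start, end, frame) tuples, where end is exclusive and includes
    the stop codon, and min_codons counts codons before the stop.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Hypothetical reference: an ORF (ATG GCT GAA TAA) beginning at index 2.
reference = "CCATGGCTGAATAAGG"
print(find_orfs(reference))  # [(2, 14, 2)]

# Deleting one base inside the ORF shifts the frame; in this short toy
# sequence the shifted frame never reaches a stop, so no complete ORF remains.
shifted = reference[:5] + reference[6:]
print(find_orfs(shifted))  # []
```

A production pipeline would also scan the reverse complement and compare the predicted ORFs of reference and variant sequences rather than inspecting each in isolation.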
Clinical diagnostic methods
Clinical diagnostic methods for frameshift mutations primarily involve targeted genetic testing in at-risk populations and routine screening programs to identify carriers or affected individuals early in life. Genetic testing panels utilize next-generation sequencing (NGS) to detect insertions or deletions (indels) in disease-associated genes, such as the CFTR gene for cystic fibrosis, where frameshift mutations account for about 16% of known pathogenic variants. These panels are recommended for confirmatory diagnosis following initial biochemical screens and for carrier screening, particularly in high-prevalence groups like Ashkenazi Jews, where testing for HEXA gene frameshifts, such as the 1278insTATC insertion responsible for Tay-Sachs disease, detects over 90% of carriers. The American College of Medical Genetics (ACMG) endorses panels covering at least 23 common CFTR variants, including indels, for broad clinical utility in diagnosing autosomal recessive disorders.[57][58][59][60]
Prenatal and newborn screening programs further enhance early detection of frameshift mutations through invasive and non-invasive approaches. Prenatal testing, such as amniocentesis performed between 15 and 20 weeks of gestation, involves NGS analysis of fetal DNA to identify indels in genes like CFTR when both parents are known carriers, allowing for informed reproductive decisions. Newborn screening, standard in many countries including all U.S. states for cystic fibrosis, begins with a heel-prick blood test measuring immunoreactive trypsinogen (IRT) levels; elevated IRT prompts confirmatory genetic testing via NGS to detect CFTR frameshifts among other mutations. These methods have significantly reduced morbidity by enabling timely interventions, though they focus on high-risk or screened conditions rather than genome-wide indel detection.[61][62][63]
Frameshift mutations underlie certain inherited disorders at frequencies around 1 in 10,000 births, depending on the condition and population; for instance, in cystic fibrosis, which has an overall incidence of about 1 in 3,000 live births, frameshifts represent a substantial subset of cases. In oncology, somatic frameshift mutations are prevalent in the approximately 15% of colorectal tumors that exhibit high microsatellite instability, where they drive oncogenesis through loss of function in tumor suppressor genes. These estimates highlight the clinical relevance of screening in both germline and somatic contexts.[64][65]
Distinguishing pathogenic frameshift mutations from benign indels poses significant diagnostic challenges, addressed through standardized frameworks like the ACMG/AMP guidelines. Null variants such as frameshifts in genes where loss of function is a known disease mechanism receive a "very strong" pathogenic score (PVS1), but interpretation requires integrating population data, computational predictions, and functional evidence to avoid misclassification, especially for novel indels in regions of variable tolerance. Variants of uncertain significance (VUS) are common in indel testing, necessitating periodic re-evaluation as databases grow to ensure accurate risk assessment in clinical settings.[66][67]
Programmed Ribosomal Frameshifting
Mechanism in viruses and eukaryotes
Programmed ribosomal frameshifting (PRF) is a regulated translational recoding event in which the ribosome slips by one nucleotide in the 5' (-1 PRF) or 3' (+1 PRF) direction during mRNA decoding, allowing the production of distinct protein isoforms from a single mRNA transcript.[68] This mechanism contrasts with error-prone frameshifts caused by mutations, as PRF is precisely controlled by cis-acting mRNA elements to ensure accurate alternative protein synthesis.[69]
In viruses, PRF is essential for expressing polyproteins that combine structural and enzymatic functions, optimizing compact genomes. A prominent example is human immunodeficiency virus type 1 (HIV-1), where -1 PRF at the gag-pol overlap site produces the Gag-Pol fusion protein, incorporating protease, reverse transcriptase, and integrase domains necessary for viral replication.[70] The efficiency of this -1 PRF in HIV-1 is approximately 5%, maintaining an optimal Gag-to-Gag-Pol ratio of about 20:1 for virion assembly, and is tightly regulated to prevent overproduction of enzymatic components.[71] Similar -1 PRF mechanisms occur in other retroviruses and many positive-strand RNA viruses, enabling the translation of replicase genes.[72]
Eukaryotic organisms also employ PRF, though less frequently than viruses, to regulate gene expression in cellular contexts. In yeast (Saccharomyces cerevisiae), +1 PRF is used in the EST3 gene, which encodes a component of the telomerase complex; frameshifting produces the full-length Est3p protein required for telomere maintenance, with an efficiency of 75-90%.[73] Additionally, in yeast retrotransposons like Ty3, PRF efficiency increases more than twofold during amino acid starvation, a stress response that enhances transposon activity.[74] In Drosophila melanogaster, +1 PRF occurs in the ornithine decarboxylase antizyme (OAZ) gene, generating a full-length protein that regulates polyamine levels, a process linked to cellular stress adaptation; the frameshift is modulated by upstream open reading frames and polyamine concentrations.[75]
The core structural elements driving PRF include slippery heptanucleotide sequences in the mRNA, such as XXXYYYZ for -1 PRF (where X, Y, and Z represent specific nucleotides allowing tRNA slippage, e.g., UUUUUUA in HIV-1), paired with downstream stimulatory RNA structures like pseudoknots or stem-loops that pause the ribosome and promote frameshifting.[68] In +1 PRF cases, such as in yeast and Drosophila, motifs often involve P-site codon-anticodon pairing disruptions and upstream Shine-Dalgarno-like sequences, with pseudoknots or hairpins enhancing efficiency by stabilizing the paused ribosomal state.[76] These elements ensure frameshifting occurs at specific sites with high fidelity, typically without requiring trans-acting factors in basal conditions.[69]
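The XXXYYYZ heptanucleotide pattern for -1 PRF can be searched for with a regular expression, and the HIV-1 stoichiometry quoted above can be checked with one line of arithmetic. The Python sketch below makes two assumptions that go beyond the text: it restricts Y to A or U and Z to A, C, or U, a commonly cited form of the motif, and it ignores the downstream stimulatory structure that a functional site also needs. The function names and the test sequence are hypothetical.

```python
import re

# Lookahead so that overlapping candidate sites are all reported.
# Group 2 = X (any base, three times), group 3 = Y (A or U, three times), then Z.
SLIPPERY_SITE = re.compile(r"(?=(([ACGU])\2\2([AU])\3\3[ACU]))")


def find_slippery_sites(rna: str):
    """Return (position, heptamer) for each candidate -1 PRF slippery sequence."""
    normalized = rna.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in SLIPPERY_SITE.finditer(normalized)]


def unshifted_to_shifted_ratio(efficiency: float) -> float:
    """Ratio of in-frame product to frameshifted product at a given efficiency."""
    return (1.0 - efficiency) / efficiency


# Hypothetical fragment containing the UUUUUUA heptamer cited for HIV-1.
print(find_slippery_sites("ACGUUUUUUAGGC"))  # [(3, 'UUUUUUA')]

# A 5% frameshift efficiency gives 19 Gag for every Gag-Pol, close to 20:1.
print(round(unshifted_to_shifted_ratio(0.05), 2))  # 19.0
```

A match is only a candidate: without a suitably spaced pseudoknot or stem-loop downstream, a slippery heptamer alone would not be expected to drive efficient frameshifting.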
Biological significance
Programmed ribosomal frameshifting (PRF) contributes significantly to gene economy in viruses by enabling the translation of multiple functional proteins from a single mRNA, thus maximizing coding capacity within constrained genome sizes. This mechanism is particularly vital for RNA viruses, where genome compactness is essential for efficient replication and packaging. For instance, in coronaviruses like SARS-CoV-2, a -1 PRF event at the junction of ORF1a and ORF1b allows ribosomes to shift frames and produce the ORF1ab polyprotein, which includes essential replicase components, thereby avoiding the need for separate promoters or initiation sites.[68][77]
Beyond structural efficiency, PRF exerts precise regulatory control over protein stoichiometry and adapts to environmental cues. In retroviruses such as HIV-1, -1 PRF at the gag-pol overlap site produces the Gag-Pol fusion protein at a low frequency (approximately 1-5%), maintaining an optimal ratio of Gag to Gag-Pol that balances viral assembly with enzymatic function while preventing excess polymerase toxicity. PRF efficiency can also be modulated by cellular stress or antiviral responses; trans-acting factors like RNA-binding proteins or small molecules alter slippage rates, fine-tuning expression during infection or host defense activation.[78]
From an evolutionary perspective, PRF elements exhibit remarkable conservation across viral taxa and select cellular systems, underscoring their adaptive value. This recoding strategy is widespread among positive-strand RNA viruses, including retroviruses, coronaviruses, and flaviviruses, facilitating genome optimization over millions of years. In bacteria, the +1 PRF in the prfB gene, which encodes release factor 2, traces back to the last common ancestor, with slippery sequences and stimulatory structures preserved to regulate termination efficiency. Although infrequent in eukaryotic genomes, PRF appears in specialized contexts, such as yeast prion-like domains that enhance -1 frameshifting for stress adaptation, and in bacterial genes where antibiotic-induced PRF, as seen with macrolides, drives expression of resistance determinants.[79][80][81]
Recent investigations (2023–2025) have illuminated PRF's implications in human pathophysiology, particularly neurodegeneration, where perturbed frameshifting disrupts protein homeostasis and generates aberrant isoforms. In repeat-expansion disorders like amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD), altered PRF at expanded nucleotide tracts leads to shifted reading frames, producing toxic dipeptide repeats that exacerbate neuronal damage through imbalanced ratios of wild-type to mutant proteins. These findings highlight PRF's dual role in normal regulation and disease, with frameshifting efficiency influenced by RNA structures and cellular stressors in affected neurons.[82]
Pathological Associations
Inherited genetic disorders
Frameshift mutations in the HEXA gene, which encodes the alpha subunit of beta-hexosaminidase A, are a primary cause of Tay-Sachs disease, a fatal lysosomal storage disorder characterized by progressive neurodegeneration due to accumulation of GM2 gangliosides. The most common such mutation among Ashkenazi Jewish individuals is a 4-base pair insertion (1278insTATC) in exon 11, leading to a premature stop codon and complete loss of enzyme activity.[59] This frameshift accounts for approximately 75-80% of disease-causing alleles in this population, where the carrier frequency is about 1 in 27, contributing to an incidence of 1 in 3,600 births.[83] The resulting enzyme deficiency disrupts lipid degradation in neurons, leading to severe developmental regression, seizures, and death by early childhood.[84]In cystic fibrosis, frameshift mutations in the CFTR gene disrupt the cystic fibrosis transmembrane conductance regulator protein, impairing chloride ion transport and causing thick mucus buildup in organs like the lungs and pancreas. These mutations, classified as class I defects, represent about 5% of all CFTR variants and typically result in no functional protein due to premature termination.[85] An illustrative example is the 3905insT mutation (c.3773_3774insT) in exon 19, a 1-base pair insertion that causes a frameshift and premature stop codon, yielding a nonfunctional protein.[86] This leads to classic symptoms including recurrent infections, pancreatic insufficiency, and reduced life expectancy, with frameshifts contributing to severe disease phenotypes in affected individuals.[87]Frameshift mutations in the NOD2 gene (also known as CARD15) are strongly associated with Crohn's disease, an inflammatory bowel disorder involving chronic intestinal inflammation due to dysregulated immune responses to gut microbiota. 
The 3020insC insertion in exon 11, one of three major NOD2 variants, causes a truncated protein that impairs recognition of bacterial peptidoglycans, leading to excessive inflammation and reduced autophagy.80138-X/fulltext) This frameshift is found in 10-30% of Crohn's patients, particularly those with ileal involvement, increasing disease risk 3- to 40-fold depending on zygosity and conferring susceptibility to structuring complications.[88] The resulting immune dysregulation promotes Th1-mediated granulomatous inflammation, exacerbating tissue damage in the gastrointestinal tract.00169-X)Smith-Magenis syndrome, a neurodevelopmental disorder, arises from frameshift mutations in the RAI1 gene on chromosome 17p11.2, causing haploinsufficiency of the retinoic acid-inducible 1 protein, which regulates gene expression in neural and metabolic pathways. These mutations, often small insertions or deletions leading to premature truncation, account for about 10% of non-deletion cases and result in intellectual disability, sleep disturbances, and behavioral issues like self-injurious behavior.[89] Affected individuals exhibit delayed speech, motor skills deficits, and autism-like features due to disrupted circadian rhythm and synaptic function.[90] A hotspot for such frameshifts exists in the polyglutamine repeat region of RAI1, amplifying the dosage-sensitive effects on brain development.[91]Frameshift mutations in sarcomeric genes such as MYH7 (encoding beta-myosin heavy chain) and TNNT2 (encoding cardiac troponin T) contribute to hypertrophic cardiomyopathy, a monogenic condition marked by abnormal thickening of the heart muscle and risk of arrhythmias or heart failure. 
In MYH7, frameshifts like c.5769delG produce truncated proteins that disrupt myosin assembly and force generation in cardiomyocytes, leading to asymmetric septal hypertrophy.[92] Similarly, frameshifts in TNNT2, such as those causing early termination, alter calcium sensitivity and myofilament function, promoting hypercontractility and fibrosis.[93] These variants, though less common than missense mutations, are linked to early-onset disease and sudden cardiac death in families, emphasizing the role of protein truncation in sarcomere instability.[94]
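The disease examples in this section share one mechanism: an insertion or deletion whose length is not a multiple of three changes every downstream codon and often exposes an early stop. The following is a minimal Python sketch of that effect. The `REFERENCE` sequence is an invented toy open reading frame, not HEXA or any other real gene, and the helper names `translate` and `insert_bases` are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def insert_bases(dna: str, position: int, bases: str) -> str:
    """Return a copy of dna with bases inserted before the given 0-based index."""
    return dna[:position] + bases + dna[position:]


# Hypothetical toy ORF (13 codons including the stop); not a real gene.
REFERENCE = "ATGGCTCGTATCTATGCCCCTGACAGTTTCCAGGGATAA"

frameshifted = insert_bases(REFERENCE, 15, "TATC")  # 4 bases: shifts the frame
in_frame = insert_bases(REFERENCE, 15, "TAT")       # 3 bases: adds one codon

print(translate(REFERENCE))     # MARIYAPDSFQG
print(translate(frameshifted))  # MARIYYRP
print(translate(in_frame))      # MARIYYAPDSFQG
```

In this toy case the 4-base insertion rewrites the residues after the insertion point and ends translation at a premature TGA, while the 3-base insertion only adds a single residue and leaves the downstream sequence intact.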
Role in oncogenesis
Frameshift mutations play a critical role in oncogenesis by introducing somatic alterations that disrupt tumor suppressor genes and drive tumorigenesis, particularly in cancers with defective DNA repair mechanisms. These mutations often result in truncated or aberrant proteins that lose functional domains, thereby promoting uncontrolled cell proliferation and genomic instability. In colorectal cancer, frameshift mutations in the APC gene are a hallmark initiating event, frequently occurring at hotspots like codon 1309, leading to the inactivation of this key regulator of the Wnt signaling pathway and facilitating adenoma formation.[95] Similarly, in breast and ovarian cancers, frameshift indels in BRCA1 are prevalent, with over 1,800 distinct mutations reported in databases like BIC, many of which cause premature termination and impair homologous recombination repair, elevating cancer risk.[96]
A major mechanism amplifying frameshift mutations in cancer is microsatellite instability (MSI) arising from mismatch repair (MMR) deficiency, which impairs the correction of replication errors in repetitive DNA sequences. MSI-high (MSI-H) tumors exhibit elevated rates of frameshift mutations, particularly in coding microsatellites of tumor suppressors and oncogenes, contributing to approximately 15% of all colorectal cancers.[97] This deficiency leads to a hypermutated phenotype, with frameshifts enriched in MSI-H tumors across various cancer types, significantly increasing the tumor mutational burden (TMB) and fostering an environment conducive to tumor evolution and immune evasion.[65] For instance, in MMR-deficient colorectal cancers, recurrent frameshifts in genes like TGFBR2 and ACVR2A disrupt TGF-β signaling, promoting metastatic progression.[98]
Beyond inactivation, frameshift mutations generate novel immunogenic neoantigens by producing out-of-frame peptides that can be presented on MHC molecules, potentially eliciting antitumor immune responses.
These neoantigens are particularly abundant in MSI-H tumors due to their high frameshift burden, offering opportunities for immunotherapy. Recent studies have identified shared frameshift-derived neoantigens in renal cell carcinoma (RCC), where personalized neoantigen vaccines targeting these mutations induced durable T-cell responses and prevented recurrence in high-risk patients, as demonstrated in a 2025 clinical trial.[99] Such findings highlight the dual oncogenic and therapeutic potential of frameshifts, with their frequency correlating inversely to peptide immunogenicity in some contexts, guiding vaccine design strategies.[100]
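To make the link between coding microsatellites and out-of-frame neoantigens concrete, the sketch below deletes one base from a homopolymer run and extracts the novel peptide tail that results. It is an illustration under stated assumptions: `REFERENCE` is an invented toy sequence containing a nine-adenine run, not TGFBR2, ACVR2A, or any real gene, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_one_from_run(dna: str, base: str = "A", min_run: int = 8) -> str:
    """Delete one base from the first homopolymer run of at least min_run bases."""
    start = dna.find(base * min_run)
    if start < 0:
        raise ValueError("no homopolymer run of the requested length")
    return dna[:start] + dna[start + 1:]


def novel_tail(reference_protein: str, mutant_protein: str) -> str:
    """Return the mutant residues that follow the shared N-terminal prefix."""
    shared = 0
    for ref_aa, mut_aa in zip(reference_protein, mutant_protein):
        if ref_aa != mut_aa:
            break
        shared += 1
    return mutant_protein[shared:]


# Hypothetical toy ORF with a nine-adenine coding microsatellite; not a real gene.
REFERENCE = "ATGGAAAAAAAAAGCCTGGTGCTTATGACTGGTTAA"
mutant = delete_one_from_run(REFERENCE)

print(translate(REFERENCE))  # MEKKSLVLMTG
print(translate(mutant))     # MEKKAWCL
print(novel_tail(translate(REFERENCE), translate(mutant)))  # AWCL
```

Because any single-base deletion inside the same run yields the same mutant sequence, independent tumors that slip at the same microsatellite would produce the same novel tail, which is the basis for the shared neoantigens discussed above.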
Implications in infectious diseases
Frameshift mutations play a significant role in host-pathogen interactions during HIV infection. The CCR5Δ32 variant, a 32-base-pair deletion in the CCR5 gene, introduces a premature stop codon via frameshift, resulting in a truncated, non-functional receptor that prevents HIV entry into CD4+ T cells and confers near-complete resistance to R5-tropic HIV-1 strains in homozygous individuals.[101] This mutation, prevalent in about 10-15% of European populations, highlights how frameshifts can evolve as protective adaptations against viral invasion.[102]
In HIV itself, programmed -1 ribosomal frameshifting is essential for expressing the Pol polyprotein from the gag-pol overlapping reading frame, maintaining a precise Gag-to-Gag-Pol ratio of approximately 20:1 for viral assembly and replication.[70] However, spontaneous or induced frameshift mutations that disrupt this signal, such as alterations in the slippery sequence or stimulatory stem-loop, reduce frameshifting efficiency by 30-60%, leading to imbalanced protein production and diminished viral replication fitness in cell culture and animal models.[103] These fitness costs underscore the evolutionary pressure to preserve programmed frameshifting in HIV quasispecies.[104]
Bacterial pathogens also exploit frameshift mutations for adaptation, particularly in acquiring antibiotic resistance.
For instance, frameshifts in the rpoB gene, encoding the RNA polymerase β subunit, can suppress deleterious effects of rifampicin resistance mutations, enabling high-level resistance without severe fitness penalties through intrinsic translational suppression.[105] Similarly, frameshift indels in regulatory genes like mmpR5 in Mycobacterium species disrupt repressor function, upregulating efflux pumps and conferring resistance to bedaquiline, a key drug for multidrug-resistant tuberculosis.[106] Such mutations illustrate how frameshifts facilitate rapid evolutionary responses to antimicrobial selective pressures in bacterial populations.
In RNA viruses like influenza A, mutational frameshifts contribute to quasispecies diversity, enabling adaptation to host immune pressures and environmental changes. High error-prone replication generates frequent indels, and compensatory frameshifts, in which a second indel restores the reading frame, allow preservation of functional protein domains while introducing sequence variation, as observed in hundreds of influenza genes across global isolates.[107] This mechanism drives intra-host evolution and antigenic drift, enhancing viral persistence and evasion of humoral immunity.[108]
Host antiviral defenses further implicate frameshifts in infectious disease dynamics through APOBEC3-mediated hypermutation of viral genomes. APOBEC3G, a cytidine deaminase, induces extensive G-to-A mutations in HIV-1 reverse transcripts, preferentially targeting GG dinucleotides to generate premature stop codons (e.g., TGG to TAG), which truncate viral proteins in a manner akin to frameshift-induced disruption, often rendering progeny virions non-infectious.[109] This hypermutation strategy, evaded by HIV Vif but potent in its absence, exemplifies how host-induced genetic alterations limit viral propagation across diverse pathogens.[110]
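The HIV figures quoted earlier in this section (a Gag-to-Gag-Pol ratio of about 20:1, and a 30-60% loss of frameshifting efficiency when the signal is disrupted) can be related with simple arithmetic. The sketch below assumes, as a simplification not stated in the text, that every ribosome translating the gag-pol transcript yields exactly one product, either Gag or Gag-Pol, so that the frameshifting efficiency equals the Gag-Pol fraction. The function names are illustrative.

```python
def frameshift_efficiency(gag_per_gagpol: float) -> float:
    """Fraction of ribosomes that shift frame, given Gag molecules per Gag-Pol."""
    return 1.0 / (gag_per_gagpol + 1.0)


def gag_per_gagpol(efficiency: float) -> float:
    """Gag molecules made per Gag-Pol molecule at a given frameshift efficiency."""
    return (1.0 - efficiency) / efficiency


def ratio_after_reduction(baseline_ratio: float, reduction: float) -> float:
    """Gag-to-Gag-Pol ratio after efficiency falls by the fractional reduction."""
    reduced = frameshift_efficiency(baseline_ratio) * (1.0 - reduction)
    return gag_per_gagpol(reduced)


baseline = frameshift_efficiency(20.0)
print(f"baseline efficiency: {baseline:.1%}")
for reduction in (0.30, 0.60):
    ratio = ratio_after_reduction(20.0, reduction)
    print(f"{reduction:.0%} lower efficiency -> about {ratio:.1f}:1")
```

Under this simplification, a 20:1 ratio corresponds to an efficiency of roughly 5%, and a 30-60% drop in efficiency pushes the ratio to roughly 29:1 to 52:1, which illustrates how a modest change in the signal can noticeably unbalance structural and enzymatic proteins.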
Therapeutic Strategies
Corrective gene editing
Corrective gene editing employs precision genome engineering technologies to directly repair or reverse frameshift mutations, restoring the proper reading frame and protein function in affected genes. Among these, CRISPR-Cas9 systems facilitate indel correction through homology-directed repair (HDR), where a donor template guides the precise insertion or deletion of nucleotides to counteract the frameshift caused by non-multiples-of-three indels. This approach has demonstrated up to 30-fold enhancement in HDR efficiency for repairing frameshift mutations in cellular assays, enabling restoration of gene function without reliance on error-prone non-homologous end joining (NHEJ).[111] For cystic fibrosis transmembrane conductance regulator (CFTR) mutations, preclinical studies from 2023 to 2025 explore CRISPR-Cas9 variants, including base editing strategies that introduce precise nucleotide changes without double-strand breaks (DSBs) to mitigate frameshift effects, such as by correcting associated splicing defects or revertant sequences in patient-derived airway cells (as of November 2025).[112][113]
Prime editing represents a more versatile DSB-free method for frameshift reversal, utilizing a fusion of Cas9 nickase, reverse transcriptase, and a prime editing guide RNA (pegRNA) to directly install small insertions or deletions at the mutation site. This technique achieves correction efficiencies of 20-50% in human cell models for various frameshift-inducing indels, surpassing traditional HDR rates in non-dividing cells and minimizing unintended genomic alterations.[114] In applications targeting CFTR mutations, prime editing has restored function in primary airway epithelial organoids from cystic fibrosis patients, though primarily demonstrated for point mutations like L227R and N1303K, with optimized systems yielding up to 25% editing rates (as of 2024).[115]
Representative examples illustrate the therapeutic potential of these tools.
In Tay-Sachs disease models, CRISPR-based editing of the HEXA gene corrects frameshift mutations by base editing or HDR in neuronal cells, partially restoring β-hexosaminidase A enzyme activity and reducing ganglioside accumulation.[116] Similarly, studies on von Hippel-Lindau (VHL) frameshifts in renal cell carcinoma employ CRISPR to model and investigate neoantigen generation, where editing introduces or corrects frameshift indels to enhance tumor-specific immune responses, informing personalized vaccine strategies against driver mutations prevalent in over 90% of clear cell renal cancers.[100][117]
Despite these advances, challenges persist, including off-target effects from unintended Cas9 cleavage and inefficient in vivo delivery to target tissues like the lungs or brain. Recent innovations, such as enhanced Cas9 specificity variants and lipid nanoparticle carriers, have reduced off-target rates to below 1% in preclinical work reviewed in 2025, while tissue-specific tropism engineering addresses delivery hurdles.[118][119] As of 2025, base editing variants have shown promise in preclinical models for precise frameshift correction without DSBs, though clinical translation remains limited by delivery, and no FDA-approved gene therapies exist for frameshift disorders.[120] These developments underscore the shift toward safer, more efficient corrective strategies for frameshift-associated disorders like cystic fibrosis and inherited neuropathies.
Pharmacological and other interventions
Pharmacological interventions for frameshift mutations primarily target the downstream consequences of these genetic alterations, such as premature termination codons (PTCs) that arise from reading frame disruptions, leading to truncated or nonfunctional proteins. Read-through drugs, particularly aminoglycosides like gentamicin, promote ribosomal read-through of PTCs primarily induced by nonsense mutations, allowing translation to continue and produce full-length proteins; however, for frameshifts, efficacy is limited as read-through does not restore the original reading frame, resulting in aberrant proteins. In cystic fibrosis (CF), class I mutations in the CFTR gene (primarily nonsense, but including some frameshifts leading to PTCs and absent CFTR protein) have been investigated with gentamicin, showing partial restoration in patient-derived cells, though clinical benefits are modest due to toxicity and limited applicability to frameshifts. Similarly, in Duchenne muscular dystrophy (DMD), nonsense mutations in the dystrophin gene are targeted by gentamicin and other aminoglycosides such as G418 and paromomycin, demonstrating suppression of stops and increased dystrophin in preclinical models; for frameshift mutations (out-of-frame deletions), exon skipping is the preferred approach over read-through. These agents bind to the ribosomal decoding site, stabilizing near-cognate tRNAs at PTCs, with efficacy enhanced by a uracil upstream of the stop codon.
Chaperone therapies aim to stabilize misfolded proteins resulting from certain mutations, but for frameshifts causing PTCs, little protein is produced for correction unless partial read-through occurs. In CF, chemical chaperones such as glycerol and 4-phenylbutyrate have been investigated to correct folding defects in mutant CFTR proteins, enhancing trafficking to the cell membrane and chloride channel activity in cellular models.
Pharmacological chaperones like lumacaftor (VX-809), approved for CF, primarily target class II mutations such as the common ΔF508 in-frame deletion and have shown investigational potential in combination with read-through agents for class I nonsense mutations by stabilizing rescued CFTR, though specificity for frameshift-derived proteins remains limited. Clinical studies indicate modest improvements in lung function when chaperones are used adjunctively, though primarily for non-frameshift variants.
Symptom management strategies for frameshift mutation-associated disorders focus on mitigating disease effects rather than correcting the genetic lesion. In Tay-Sachs disease, caused by frameshift mutations in the HEXA gene leading to hexosaminidase A deficiency and GM2 ganglioside accumulation, enzyme replacement therapy (ERT) delivers recombinant hexosaminidase A to lysosomes, reducing substrate buildup in preclinical and early clinical settings. For instance, intrathecal ERT has improved neurological outcomes in animal models of late-onset Tay-Sachs by bypassing the blood-brain barrier. In oncology, frameshift mutations in microsatellite instability-high (MSI-H) cancers generate immunogenic neoantigens due to altered peptide sequences; immunotherapy with immune checkpoint inhibitors like pembrolizumab exploits these neoantigens, with tumor frameshift burden predicting response rates up to 50% in MSI-H colorectal cancers. Shared frameshift-derived neoantigens, such as those from recurrent indels in TGFBR2 and ACVR2A, enable off-the-shelf vaccines that elicit CD8+ T cell responses in multiple patients.
Emerging interventions include antisense oligonucleotides (ASOs) designed to skip exons harboring frameshift mutations, restoring the reading frame and producing partially functional proteins.
In DMD, ASOs like eteplirsen target dystrophin exons disrupted by frameshifts, inducing skipping to yield in-frame transcripts and increase dystrophin levels by 5-10% in muscle biopsies of treated patients. Cocktails of ASOs have shown promise for multi-exon skips in frameshift cases, with phase III trials demonstrating improved ambulation. For antiviral applications, modulators of programmed ribosomal frameshifting (PRF) inhibit viral replication by disrupting -1 PRF signals in coronaviruses. Recent 2024-2025 studies identified small-molecule inhibitors from high-throughput screens that reduce SARS-CoV-2 PRF efficiency by over 70%, impeding replication in cell cultures without host toxicity, paving the way for broad-spectrum antivirals targeting frameshift-dependent viral polyproteins. A 2025 bioRxiv preprint detailed novel frameshift inhibitors derived from natural products, effective against multiple coronaviruses at micromolar concentrations.[121]
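The logic behind the exon-skipping approach described above reduces to a divisibility check: a transcript stays in frame when the total number of bases removed, by the original deletion plus any skipped exons, is a multiple of three. The sketch below illustrates that check. The exon labels and lengths in `EXON_LENGTHS` are hypothetical values chosen for illustration, not measured dystrophin exon sizes, and the function names are assumptions; a real design would also require the skipped exon to be adjacent to the deletion and would need to consider which protein domains are lost.

```python
# Hypothetical exon lengths in base pairs; not taken from any real gene annotation.
EXON_LENGTHS = {"E1": 120, "E2": 109, "E3": 233, "E4": 118, "E5": 150}


def is_in_frame(exon_lengths: dict, removed: set) -> bool:
    """True if the removed exons together span a multiple of three bases."""
    return sum(exon_lengths[exon] for exon in removed) % 3 == 0


def rescuing_skips(exon_lengths: dict, deleted: set, candidates: list) -> list:
    """Return the candidate exons whose additional skipping restores the frame."""
    return [
        exon
        for exon in candidates
        if is_in_frame(exon_lengths, deleted | {exon})
    ]


deleted = {"E2"}  # a genomic deletion of one exon that disrupts the frame
print(is_in_frame(EXON_LENGTHS, deleted))                   # False
print(rescuing_skips(EXON_LENGTHS, deleted, ["E1", "E3"]))  # ['E3']
```

In this toy case the 109-base deletion leaves the transcript out of frame, and skipping the neighboring 233-base exon removes 342 bases in total, a multiple of three, so the downstream codons are read correctly again at the cost of an internally shortened protein.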