
Genetic variation

Genetic variation refers to the naturally occurring differences in DNA sequence among individuals of the same species, ranging from single nucleotide changes to larger structural alterations. This diversity is fundamental to biology because it underlies phenotypic differences, adaptability, and evolutionary processes across populations.

The primary sources of genetic variation are mutation, which introduces new alleles through changes in DNA sequence; recombination during meiosis, which shuffles existing alleles into novel combinations; and gene flow, the movement of alleles between populations via migration. Processes such as genetic drift and mating patterns, including inbreeding and outbreeding, also influence how variation is distributed and maintained within populations. Mutations are the ultimate origin of all genetic novelty; they occur at rates that vary by organism and genomic region, and are filtered by natural selection and drift.

Genetic variation takes several forms. Single nucleotide polymorphisms (SNPs) are the most common: single base-pair differences that occur approximately once every 1,000 base pairs in the human genome. Other forms include insertions and deletions (indels), which alter sequence length; copy number variations (CNVs), which duplicate or remove larger DNA segments; and structural variants such as inversions and translocations. These types contribute differently to phenotypic traits, with SNPs often affecting gene regulation and indels potentially disrupting protein function.

Genetic variation matters because it is the substrate for evolution, enabling populations to adapt to environmental change and resist disease. Without it, natural selection could not operate effectively, because there would be no heritable differences in survival or reproduction among individuals. In conservation biology, maintaining genome-wide variation is critical for population viability, particularly in species facing habitat loss or climate shifts. Understanding genetic variation also informs fields such as medicine, where it helps explain disease susceptibility and responses to treatment.

Fundamentals

Definition

Genetic variation refers to the differences in DNA sequences, the potential for gene expression, and chromosome structures among individuals within a species. These variations arise from changes at the molecular level and contribute to the genetic diversity observed in populations.

A key distinction is between heritable and non-heritable variation. Germline variations occur in egg or sperm cells and are passed across generations, whereas somatic variations, which arise in non-reproductive body cells, are not inherited by descendants. Only heritable variation contributes to the transmission of genetic differences in populations.

The scope of genetic variation includes both neutral variations, which have no significant effect on fitness, and adaptive variations, which influence survival and reproduction in specific environments. Single nucleotide polymorphisms (SNPs), where a single base pair differs among individuals, exemplify the basic units of such variation. Together these elements form the raw material for evolutionary processes such as natural selection.

Genetic variation pertains to the genotype (the underlying genetic makeup) rather than the phenotype, which is the set of observable traits resulting from gene-environment interactions. Genetic variation provides the foundation for phenotypic diversity, but the two are not equivalent, because environmental factors and gene regulation modulate how genetic differences manifest.

Importance

Genetic variation provides the essential raw material for evolutionary processes, allowing natural selection to favor beneficial traits that enhance survival and reproduction in changing environments. Without such variation, selection would lack the heritable differences needed to drive adaptation and, ultimately, speciation. Greater genetic diversity within populations increases the potential for evolutionary change by offering a wider array of alleles for selection to act upon.

In natural ecosystems, genetic variation underpins biodiversity and fosters resilience by enabling species to respond to environmental perturbations, such as climate shifts or habitat alterations, through adaptive evolution. Diverse gene pools buffer populations against stressors, reducing the likelihood of widespread decline and supporting overall ecosystem stability.

Genetic variation also forms the basis for personalized medicine, where differences in individual genomes influence disease susceptibility and treatment responses; for instance, mutations in the CFTR gene cause cystic fibrosis, and knowing the specific mutation guides targeted therapies such as CFTR modulators. In agriculture, selective breeding uses this variation to develop crops with improved yields, pest resistance, and nutritional quality, advancing food security. Conservation genetics similarly relies on preserving genetic diversity in endangered species to mitigate inbreeding depression and improve long-term viability against extinction risks.

Forms

Sequence-level variations

Sequence-level variations are changes in the DNA sequence at the level of individual nucleotides or small segments, typically spanning a few bases. They include substitutions, insertions, and deletions that can occur within genes or non-coding regions, influencing traits, disease susceptibility, and evolutionary processes. Unlike larger structural alterations, these changes are localized and often produce subtle but significant genomic diversity.

Single nucleotide polymorphisms (SNPs) are the most common form of sequence-level variation, in which a single nucleotide at a specific position in the genome differs between individuals. In the human genome, SNPs occur approximately once every 300 to 1,000 base pairs, giving an estimated 4 to 5 million SNPs per individual genome. These polymorphisms are typically biallelic, with one allele more frequent (the major allele) and the other less so (the minor allele, by convention present at a frequency of at least 1% in the population), and they account for about 90% of human genetic variation. SNPs can lie in coding or non-coding regions, potentially altering protein sequences or regulatory elements without necessarily causing disease.

Insertions and deletions (indels) add or remove small numbers of nucleotides, often up to 50 bases, at a specific locus. If the number of nucleotides affected is not a multiple of three, an indel in a coding region disrupts the reading frame: this frameshift mutation shifts the grouping of codons during translation and typically yields a non-functional protein. In-frame indels, where the length change preserves the reading frame, may still alter protein structure or function by inserting or deleting amino acids. Indels are less frequent than SNPs but contribute significantly to genetic diversity, particularly in coding regions where their effects can be pronounced.

Microsatellites and minisatellites are tandem repeat sequences consisting of short motifs repeated multiple times, classified by motif length: microsatellites (1-6 base pairs) and minisatellites (10-60 base pairs). These repeats mutate at high rates, often $10^{-3}$ to $10^{-4}$ per locus per generation, which is orders of magnitude higher than typical point mutation rates (on the order of $10^{-8}$ to $10^{-9}$ per nucleotide), because of mechanisms such as replication slippage. Their variability in repeat number makes them valuable for applications such as forensic DNA profiling, where distinct allelic patterns enable individual identification.

Codon usage bias is the non-random preference for certain synonymous codons (those encoding the same amino acid) across genes and organisms, influenced by factors such as translation efficiency and mRNA stability. Synonymous mutations, which do not change the amino acid sequence, can still affect protein function by altering translation speed, mRNA folding, or splicing efficiency, potentially changing protein levels and folding. In contrast, non-synonymous mutations substitute one amino acid for another, directly modifying protein structure and function, often with more immediate consequences for the phenotype. These distinctions show how sequence changes at the codon level fine-tune biological processes beyond amino acid identity alone.

A classic example of the impact of sequence-level variation is sickle cell anemia, caused by a single base substitution in the β-globin gene (HBB) on chromosome 11, where thymine replaces adenine in the sixth codon (GAG to GTG), changing glutamic acid to valine in the protein. This non-synonymous change leads to abnormal hemoglobin polymerization under low-oxygen conditions, producing rigid, sickle-shaped red blood cells and the associated clinical symptoms. In heterozygous form the mutation provides resistance to malaria, illustrating the role of sequence-level variation in adaptation.
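The distinctions drawn in this section (synonymous versus non-synonymous substitutions, and frameshift versus in-frame indels) can be illustrated with a minimal Python sketch. It assumes the standard nuclear genetic code and DNA codons on the coding strand; the function names `translate_codon`, `classify_substitution`, and `classify_indel` are hypothetical helpers introduced only for this illustration, and the classification is deliberately simplified (for example, loss of a stop codon is reported as a missense change).

```python
# Standard genetic code, laid out for codon positions in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with T
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)


def translate_codon(codon: str) -> str:
    """Return the one-letter amino acid for a DNA codon ('*' is a stop)."""
    first, second, third = (BASES.index(base) for base in codon.upper())
    return AMINO_ACIDS[16 * first + 4 * second + third]


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Label a single-codon change as synonymous, nonsense, or missense."""
    ref_aa = translate_codon(ref_codon)
    alt_aa = translate_codon(alt_codon)
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"      # premature stop codon
    return "missense"          # simplification: includes stop-loss changes


def classify_indel(length_change: int) -> str:
    """Indels whose length is a multiple of three keep the reading frame."""
    return "in-frame" if length_change % 3 == 0 else "frameshift"


if __name__ == "__main__":
    # Sickle cell substitution in the sixth codon of beta-globin.
    print(classify_substitution("GAG", "GTG"))   # glutamic acid -> valine
    print(classify_substitution("GAG", "GAA"))   # still glutamic acid
    print(classify_indel(-1), classify_indel(3))
```

Under these assumptions the sickle cell change GAG to GTG is reported as a missense substitution, while GAG to GAA leaves the amino acid unchanged.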

Structural variations

Structural variations (SVs) are large-scale alterations in genomic architecture, typically involving segments of DNA spanning thousands of base pairs or more, which differ from smaller nucleotide-level changes in their potential to reshape chromosome structure and function. They include copy number variations, inversions, translocations, and aneuploidy, each capable of influencing multiple genes simultaneously. In humans, SVs contribute substantially to genomic diversity and are implicated in evolutionary adaptation as well as disease susceptibility. Unlike sequence-level variations, SVs often exert broad effects on gene expression.

Copy number variations (CNVs) are duplications or deletions of DNA segments, typically ranging from about 1 kb to several megabases, that change the copy number of genes or regulatory elements. For instance, CNVs at the 16p11.2 locus are associated with autism spectrum disorders through changes in gene dosage. Deletions can lead to haploinsufficiency and duplications to overexpression, thereby modulating protein levels and cellular phenotypes.

Inversions reverse the orientation of a DNA segment within a chromosome, often spanning large regions and potentially disrupting long-range regulatory interactions or gene orientation. A notable example is the inversion at the 17q21.31 locus, which is polymorphic in human populations and may influence expression of the microtubule-associated protein tau, contributing to neurological traits. Such rearrangements can suppress recombination or alter chromatin looping, with implications for gene regulation.

Translocations relocate DNA segments between non-homologous chromosomes and may be balanced (no net loss or gain of material) or unbalanced. In Burkitt lymphoma, the t(8;14) translocation juxtaposes the MYC oncogene with immunoglobulin heavy-chain enhancers, driving aberrant proliferation. These events frequently create fusion genes or reposition regulatory elements, promoting oncogenesis or developmental disorders.

Aneuploidy is an abnormal number of chromosomes, such as monosomy or trisomy, and therefore affects the dosage of hundreds to thousands of genes at once. Trisomy 21, or Down syndrome, exemplifies this: the extra copy of chromosome 21 increases the dosage of its genes and perturbs expression elsewhere in the genome, leading to characteristic physical and cognitive features. Aneuploidies often arise from meiotic errors and have profound dosage effects.

Overall, structural variations affect gene dosage and expression more dramatically than point mutations, frequently altering chromatin architecture and contributing both to adaptive evolution and to pathologies such as cancer and congenital syndromes.

Measurement

Molecular techniques

Molecular techniques are laboratory methods for identifying and characterizing genetic variation at the DNA level, allowing researchers to detect differences in nucleotide sequence, insertions, deletions, and larger structural changes across genomes. These approaches have evolved from targeted, low-throughput assays to high-throughput platforms that support large-scale population studies and clinical diagnostics. By amplifying, sequencing, or hybridizing specific DNA fragments, they provide the foundational data for understanding genetic diversity without relying on phenotypic observations.

Sanger sequencing, developed in the 1970s, remains a cornerstone for precise analysis of targeted genomic regions, offering high accuracy for confirming single nucleotide variants (SNVs) and small indels. This chain-termination method generates readable sequences up to about 1,000 base pairs long, making it well suited to validating variants identified in broader screens, though it is labor-intensive and limited to smaller scales than modern alternatives. Its high per-base accuracy, commonly cited as about 99.99%, makes it the gold standard for clinical confirmation of variants in targeted genes.

Next-generation sequencing (NGS) technologies, such as those from Illumina, have transformed the detection of genetic variation by enabling whole-genome sequencing at large scale and low cost. Illumina's sequencing-by-synthesis approach analyzes millions of DNA fragments in parallel, allowing simultaneous identification of SNVs, copy number variations, and structural variants across entire genomes in a single run. Costs had dropped to under $1,000 per genome by the mid-2020s, facilitating population-level studies of genetic diversity. Seminal applications include the 1000 Genomes Project, which cataloged millions of variants using early NGS platforms.

PCR-based methods provide accessible tools for detecting genetic variation through amplification of specific DNA fragments, and are particularly useful in resource-limited settings. Restriction fragment length polymorphism (RFLP) analysis digests DNA with restriction enzymes and separates the fragments by gel electrophoresis, revealing polymorphisms where sequence differences create or destroy cut sites; historically it enabled early mapping of genetic markers in populations. Amplified fragment length polymorphism (AFLP), an extension of RFLP, combines restriction digestion with selective amplification to generate hundreds of polymorphic bands per reaction, offering a multilocus approach for assessing genetic diversity without prior sequence knowledge. These techniques, though largely supplanted by NGS for throughput, remain valuable for fingerprinting and linkage analysis in non-model organisms.

Microarrays, particularly single nucleotide polymorphism (SNP) chips, enable high-throughput genotyping of predefined variants by hybridizing labeled DNA samples to probes immobilized on a solid surface. Platforms such as Illumina's Infinium arrays interrogate over 1.9 million SNPs simultaneously, providing cost-effective snapshots of common genetic variation for association studies and ancestry inference. Signal intensities distinguish homozygous from heterozygous genotypes, and this scalability has made arrays pivotal in genome-wide association studies (GWAS) identifying variants linked to traits and diseases.

CRISPR-based detection tools that emerged in the 2020s use the programmable nuclease activity of Cas enzymes for rapid, specific identification of genetic variants, often combined with isothermal amplification for point-of-care applications. Systems such as SHERLOCK (using Cas13) and DETECTR (using Cas12) detect SNVs and insertions through collateral cleavage of reporter molecules upon target recognition, achieving sensitivity approaching single-molecule levels without complex equipment. These methods report variants through fluorescence or lateral flow readouts, supporting real-time surveillance of mutations in pathogens. Recent advances include multiplexed CRISPR diagnostics for simultaneous variant screening.

By 2025, long-read sequencing technologies, such as Pacific Biosciences (PacBio) HiFi sequencing and Oxford Nanopore Technologies (ONT), had improved the resolution of structural variations that short-read methods often miss. PacBio HiFi produces accurate reads exceeding 15 kb that span repetitive regions and complex rearrangements, while ONT enables ultra-long reads over 100 kb on portable devices. Both support high-fidelity assemblies for detecting large insertions, deletions, and inversions, with reported accuracy above 99.9%, and studies of diverse human cohorts have revealed previously hidden structural variants that contribute to disease. These technologies complement short-read NGS by providing phased haplotypes and fuller structural variant catalogs.
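The RFLP principle described in this section, where a sequence difference that destroys a restriction site changes the fragment pattern, can be sketched in a few lines of Python. This is an in-silico illustration only: the two short allele sequences are made up for the example, `digest` is a hypothetical helper, and the EcoRI recognition site GAATTC (cut after the first G) is used as the enzyme.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the position within the site where the strand is cut
    (1 for EcoRI, which cuts between G and AATTC).
    """
    sequence = sequence.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


# Two made-up alleles that differ by one base inside the recognition site.
allele_with_site = "AAAA" + "GAATTC" + "TTTTTT"
allele_without_site = "AAAA" + "GAGTTC" + "TTTTTT"

if __name__ == "__main__":
    print(digest(allele_with_site))      # two fragments
    print(digest(allele_without_site))   # one uncut fragment
```

On a gel, the first allele would appear as two shorter bands and the second as a single longer band, which is the polymorphism that RFLP analysis reads out.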

Population genetics metrics

Population genetics metrics are quantitative tools for evaluating the amount and structure of genetic variation within and between populations, enabling inferences about evolutionary processes such as drift, selection, and gene flow. They are derived from allele frequencies, genotype counts, or sequence data and are foundational for testing hypotheses in evolutionary biology. Key measures include heterozygosity, nucleotide diversity, statistics based on the allele frequency spectrum, F-statistics, and estimates of effective population size.

Heterozygosity quantifies the proportion of individuals carrying two different alleles at a given locus, serving as a direct indicator of genetic diversity. Observed heterozygosity ($H_o$) is the actual frequency of heterozygotes in a sample: the number of heterozygous individuals divided by the total number sampled at that locus. Expected heterozygosity ($H_e$) assumes Hardy-Weinberg equilibrium and is the proportion of heterozygotes predicted from allele frequencies; for a biallelic locus with alleles at frequencies $p$ and $q = 1 - p$, $H_e = 2pq$. Departures between $H_o$ and $H_e$ can signal non-random mating or other forces, with $H_o < H_e$ often indicating inbreeding.

Nucleotide diversity ($\pi$) measures the average number of nucleotide differences per site between two DNA sequences chosen at random from a population, capturing sequence-level variation. Introduced by Nei and Li, it is computed as $\pi = \sum_{i \neq j} x_i x_j \pi_{ij}$, where the sum runs over all ordered pairs of distinct sequence types, $x_i$ and $x_j$ are the frequencies of the $i$th and $j$th sequences, and $\pi_{ij}$ is the proportion of sites at which they differ. This metric is particularly useful for comparing polymorphism levels across species or genomic regions, with higher $\pi$ indicating greater diversity; in humans $\pi$ is approximately 0.001, which is low compared with many other species.

The allele frequency spectrum describes the distribution of allele frequencies in a population, providing insight into demographic history and selection. A key statistic derived from it is Tajima's D, which tests for deviations from neutral evolution by comparing two estimates of the mutation parameter $\theta$: the pairwise-difference estimate $\pi$ and Watterson's estimator $\theta_S$, which is based on the number of segregating sites. The statistic is $D = (\pi - \theta_S) / \sqrt{\mathrm{Var}(\pi - \theta_S)}$. Negative values indicate an excess of rare alleles (for example, after population expansion or under purifying selection), while positive values indicate an excess of intermediate-frequency alleles (for example, under balancing selection or after a bottleneck). This test has been widely applied to detect selective sweeps in genomic data.

F-statistics, developed by Wright, partition genetic variance to assess population structure and inbreeding. $F_{IS}$ measures inbreeding within subpopulations as $F_{IS} = (H_e - H_o) / H_e$, where positive values indicate a deficit of heterozygotes relative to expectation, often due to non-random mating. $F_{ST}$ quantifies differentiation among subpopulations as $F_{ST} = (H_T - H_S) / H_T$, with $H_T$ the expected heterozygosity of the total population and $H_S$ the average expected heterozygosity within subpopulations. Values range from 0 (no differentiation) to 1 (complete fixation of different alleles); under Wright's conventional guidelines, values of about 0.05 to 0.15 indicate moderate differentiation and values above 0.15 indicate substantial differentiation. These statistics are essential for understanding gene flow and subdivision.

Effective population size ($N_e$) is the size of an idealized population that would experience the same amount of genetic drift as the actual population, and it governs how well variation is maintained. In diploid populations, the scaled mutation rate is $\theta = 4 N_e \mu$, where $\mu$ is the mutation rate per site per generation; this links observable diversity to underlying demography, so $N_e$ can be estimated as $N_e = \theta / (4\mu)$ once $\theta$ is inferred from data such as segregating sites or $\pi$. Lower $N_e$ amplifies drift and reduces diversity, and $N_e$ is often much smaller than census size because of factors such as variance in reproductive success.
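The metrics defined in this section can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than a reference implementation: loci are biallelic, subpopulations are weighted equally in `f_st`, sequences are aligned strings of equal length with no missing data, and the variance term in `tajimas_d` uses the standard constants from Tajima's 1989 formulation. All function names and the toy alignment are hypothetical.

```python
from itertools import combinations
from math import sqrt


def expected_heterozygosity(p: float) -> float:
    """H_e = 2pq for a biallelic locus with allele frequency p."""
    return 2 * p * (1 - p)


def f_is(h_obs: float, h_exp: float) -> float:
    """F_IS = (H_e - H_o) / H_e."""
    return (h_exp - h_obs) / h_exp


def f_st(subpop_freqs: list[float]) -> float:
    """F_ST = (H_T - H_S) / H_T, weighting subpopulations equally."""
    h_s = sum(expected_heterozygosity(p) for p in subpop_freqs) / len(subpop_freqs)
    p_bar = sum(subpop_freqs) / len(subpop_freqs)
    h_t = expected_heterozygosity(p_bar)
    return (h_t - h_s) / h_t


def tajimas_d(sequences: list[str]) -> tuple[float, int, float, float]:
    """Return (mean pairwise differences, segregating sites, Watterson's theta, D).

    Requires at least one segregating site; dividing the first and third
    values by the alignment length gives per-site estimates.
    """
    n = len(sequences)
    pairs = list(combinations(sequences, 2))
    k = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)
    s = sum(len(set(column)) > 1 for column in zip(*sequences))
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    theta_w = s / a1
    d = (k - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))
    return k, s, theta_w, d


if __name__ == "__main__":
    toy_alignment = ["AAAA", "AAAT", "AATT", "ATTT"]
    print(tajimas_d(toy_alignment))
    print(f_st([0.2, 0.8]), f_is(h_obs=0.25, h_exp=0.5))
```

Given an estimate of $\theta$ per site and an assumed mutation rate $\mu$, the relation $N_e = \theta / (4\mu)$ from the text then gives a rough effective population size.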

Patterns in Populations

Within populations

Genetic variation within populations manifests primarily through polymorphisms, defined as the occurrence of two or more alleles at a given locus with the frequency of the rarer allele exceeding 1%. In outbred populations characterized by random mating among diverse individuals, the proportion of polymorphic loci is generally high, often averaging 25-35% in allozyme surveys across animal and plant species. This polymorphism supports the genetic diversity essential for adaptability; for example, heterozygosity in rat populations reaches 0.07 per locus, far exceeding levels in controlled outbred strains (0.006-0.012).

The Hardy-Weinberg equilibrium serves as a foundational null model for predicting allele and genotype frequencies in randomly mating populations with no evolutionary forces acting on them. For a diallelic locus with allele frequencies $p$ and $q$ (where $p + q = 1$), the expected genotype frequencies are

$$p^2 + 2pq + q^2 = 1,$$

where $p^2$ is the frequency of homozygotes for the first allele, $2pq$ the frequency of heterozygotes, and $q^2$ the frequency of homozygotes for the second allele. This equilibrium, independently formulated by G. H. Hardy and Wilhelm Weinberg in 1908, implies constant allele frequencies across generations under ideal conditions and provides a benchmark for detecting non-random mating or other influences.

Deviations from random mating, such as inbreeding, elevate homozygosity and can trigger inbreeding depression, a reduction in fitness caused by the unmasking of deleterious recessive alleles in homozygous form. It manifests as lowered survival, fertility, and overall vigor, with effects observable across developmental stages in wild populations. Population bottlenecks further diminish within-population variation by randomly fixing some alleles and eliminating others; for instance, the cheetah (Acinonyx jubatus) underwent a severe bottleneck approximately 10,000-12,000 years ago, leaving over 90% of surveyed loci monomorphic and allowing skin grafts to be accepted between unrelated individuals because of minimal immune gene diversity.

The human leukocyte antigen (HLA) locus exemplifies extreme within-population polymorphism, harboring over 4,000 alleles across its class I and II genes and representing the most variable region in the human genome. This diversity, which exceeds that of typical loci by orders of magnitude, supports heterogeneous immune responses to pathogens and thereby confers population-wide resilience. Expected heterozygosity often exceeds 0.8 at HLA loci in outbred human groups, underscoring their contribution to within-population diversity.
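As a worked illustration of the Hardy-Weinberg null model in this section, the following Python sketch estimates the allele frequency from genotype counts, computes the expected counts $p^2 N$, $2pqN$, and $q^2 N$, and summarizes the departure with a chi-square statistic and the inbreeding coefficient $F = 1 - H_o / H_e$. The function name and the genotype counts are hypothetical; a real analysis would also convert the statistic to a p-value (one degree of freedom for a biallelic locus) or use an exact test for small samples.

```python
def hardy_weinberg_summary(n_AA: int, n_Aa: int, n_aa: int) -> dict:
    """Compare observed genotype counts with Hardy-Weinberg expectations."""
    total = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * total)      # frequency of allele A
    q = 1 - p
    expected = (p * p * total, 2 * p * q * total, q * q * total)
    observed = (n_AA, n_Aa, n_aa)
    # Chi-square statistic; classes with zero expectation are skipped.
    chi_square = sum(
        (obs - exp) ** 2 / exp for obs, exp in zip(observed, expected) if exp > 0
    )
    h_obs = n_Aa / total
    h_exp = 2 * p * q
    inbreeding_f = 1 - h_obs / h_exp if h_exp > 0 else 0.0
    return {
        "p": p,
        "expected": expected,
        "chi_square": chi_square,
        "F": inbreeding_f,
    }


if __name__ == "__main__":
    print(hardy_weinberg_summary(25, 50, 25))   # matches expectations
    print(hardy_weinberg_summary(50, 0, 50))    # complete heterozygote deficit
```

A sample with no heterozygotes at all gives $F = 1$, the extreme case of the homozygote excess that inbreeding produces.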

Between populations

Genetic differentiation between populations refers to the genetic differences observed across distinct groups, which can appear either as clinal variation (gradual changes in allele frequencies along geographic gradients) or as discrete variation, where populations form distinct clusters with sharp boundaries. Clinal patterns arise from continuous gene flow and isolation by distance, producing smooth transitions in genetic composition, as seen in traits such as skin pigmentation that vary with latitude without clear breaks. Discrete structure emerges in scenarios such as ring species, where neighboring populations interbreed but the populations at the ends of the ring do not, as in the Ensatina salamanders of California. Modern statistical frameworks such as the conStruct method infer both patterns simultaneously by modeling ancestry proportions across spatial layers, distinguishing continuous decay in relatedness from discrete barriers.

Admixture occurs in hybrid zones where historically separated populations interbreed, producing intermediate genetic profiles that can be detected using ancestry informative markers (AIMs), genetic variants with substantial allele frequency differences between ancestral groups. These markers enable admixture mapping, which identifies genomic regions of introgression or selection by correlating local ancestry with phenotypic traits, as demonstrated in studies of tree hybrid zones where divergent ecological adaptations have been mapped. In non-human contexts, AIMs have revealed introgression in North American canid populations along a coyote-wolf hybrid zone, highlighting 60 regions of differential introgression shaped by selection. Such zones often exhibit a tension between homogenizing gene flow and reinforcing barriers, which preserves population distinctions.

Phylogeography reconstructs historical migrations and population expansions by analyzing uniparental markers such as mitochondrial DNA (mtDNA) haplogroups, which trace maternal lineages because they do not recombine. For instance, mtDNA haplogroup U5, originating around 35,000 years ago, spread across Europe after the Last Glacial Maximum, reflecting post-glacial recolonization from southern refugia such as Iberia. Similarly, in Southeast Asia, haplogroup M7 subclades indicate gene flow between Vietnamese ethnic groups and neighboring populations, with some expansions dated to as recently as 100–600 years ago. These patterns show how bottlenecks, expansions, and barriers such as mountains have shaped inter-population variation over time.

Genetic divergence between populations can result from neutral processes, such as genetic drift in isolated groups, or from adaptive evolution driven by local selection pressures. Neutral divergence accumulates randomly and produces differentiation without functional consequences, whereas adaptive divergence involves alleles that confer fitness advantages in specific environments. A prominent example is lactase persistence, the ability to digest lactose into adulthood, which evolved independently in European and African pastoralist populations through mutations in the LCT gene enhancer, such as the -13910T variant that arose in Europeans around 7,500 years ago following the domestication of dairy animals. This allele spread rapidly under strong positive selection in dairy-reliant societies, in contrast to neutral mtDNA variants, which show no such targeted enrichment.

Quantifying between-population variation in humans shows that approximately 85% of total genetic variation occurs within populations, with only about 15% attributable to differences between groups, based on analyses of protein loci and blood groups. This apportionment (85.4% within populations, 8.3% between populations within races, and 6.3% between races) highlights the predominance of within-group diversity. Critics argue that it understates the usefulness of between-group structure for classification, because correlations across multiple loci allow reliable population assignment despite low differentiation at any single locus (a critique known as Lewontin's fallacy). Modern genomic studies using single nucleotide polymorphisms (SNPs) broadly confirm these proportions, with some estimates placing over 90% of variation within populations. Set against this high within-population diversity, between-group differences are subtle but structured.

Maintenance

Evolutionary forces

Evolutionary forces actively shape genetic variation through non-random processes that favor certain alleles or genotypes, thereby altering their frequencies in populations over time. Natural selection operates in several forms: directional selection, which shifts the population toward one extreme of a trait distribution by favoring individuals with traits that enhance survival or reproduction in a changing environment; stabilizing selection, which maintains the population mean by reducing variation around an optimal value; and disruptive selection, which favors individuals at both extremes of a trait distribution and can lead to bimodal phenotypes.

Balancing selection, a key mechanism for preserving polymorphism, includes heterozygote advantage, where heterozygous individuals have higher fitness than either homozygote. The classic example is the sickle cell allele (HbS) in humans, which confers resistance to malaria in heterozygotes while causing sickle cell disease in homozygotes. Frequency-dependent selection also maintains variation: the fitness of a genotype depends on its frequency in the population, and negative frequency dependence in particular promotes polymorphism by benefiting rarer types, often through ecological interactions such as predator avoidance or resource competition.

Sexual selection, a specialized form of natural selection driven by mate choice or competition for mates, can amplify genetic variation in traits unrelated to survival. The elaborate tail feathers of peacocks (Pavo cristatus) are an example: females prefer males with more ornate displays, leading to heritable exaggeration of these sexually dimorphic features despite potential viability costs.

A classic example of natural selection in action is industrial melanism in the peppered moth (Biston betularia). The frequency of the dark melanic form increased dramatically in polluted industrial areas of 19th-century England because it was better camouflaged against soot-darkened trees, giving it a survival advantage against bird predation; the shift reversed as air quality improved after industrialization.

Gene-environment interactions influence how evolutionary forces act. Phenotypic plasticity allows the same genotype to produce different phenotypes in different environments, which can mask underlying genetic variation and modulate the strength of selection by enabling adaptive responses without immediate genetic change.
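The way heterozygote advantage maintains a polymorphism can be shown with the standard one-locus selection recursion. The Python sketch below is illustrative only: the relative fitness values are assumptions chosen for the example (they are not measured values for the HbS allele), and the function names are hypothetical. With selection coefficients $s = w_{AS} - w_{AA}$ and $t = w_{AS} - w_{SS}$, the recursion converges to the equilibrium frequency $\hat{q} = s / (s + t)$ for the S allele.

```python
def next_allele_frequency(q: float, w_AA: float, w_AS: float, w_SS: float) -> float:
    """One generation of viability selection on the S allele (frequency q)."""
    p = 1 - q
    mean_fitness = p * p * w_AA + 2 * p * q * w_AS + q * q * w_SS
    return (p * q * w_AS + q * q * w_SS) / mean_fitness


def equilibrium_frequency(w_AA: float, w_AS: float, w_SS: float) -> float:
    """Stable equilibrium of the S allele when the heterozygote is fittest."""
    s = w_AS - w_AA    # disadvantage of the AA homozygote
    t = w_AS - w_SS    # disadvantage of the SS homozygote
    return s / (s + t)


def simulate(q0: float, generations: int, w_AA: float, w_AS: float, w_SS: float) -> float:
    """Iterate the recursion and return the final allele frequency."""
    q = q0
    for _ in range(generations):
        q = next_allele_frequency(q, w_AA, w_AS, w_SS)
    return q


if __name__ == "__main__":
    # Assumed relative fitnesses, chosen only for illustration.
    fitness = {"w_AA": 0.9, "w_AS": 1.0, "w_SS": 0.2}
    print(simulate(q0=0.01, generations=2000, **fitness))
    print(equilibrium_frequency(**fitness))
```

Starting from a rare allele or a common one, the frequency approaches the same intermediate value, so neither allele is lost: this is the sense in which balancing selection preserves variation.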

Neutral processes

Neutral processes in genetic variation refer to mechanisms that influence frequencies without regard to their adaptive significance, primarily through random sampling effects in finite populations. These processes can lead to the fixation or loss of s over time, eroding unless counterbalanced by other factors. Unlike directional forces such as selection, neutral processes operate randomly and are more pronounced in smaller populations where chance events have a greater impact on the . Genetic drift describes the random fluctuations in frequencies from one to the next due to in finite populations. In the Wright-Fisher model, which underpins much of , drift arises because each 's is a random sample of the previous one, leading to variance in frequencies that increases with time. The rate of drift is inversely proportional to , making it a dominant force in small populations where alleles can rapidly reach fixation ( of 1) or be lost ( of 0). For instance, in a population of N_e = 100, the probability of fixation for a neutral allele starting at p is simply p, and the expected time to fixation or loss scales with $4N_e . Population bottlenecks and founder effects exemplify extreme manifestations of , where a sudden reduction in population size drastically curtails genetic variation. A occurs when a large experiences a sharp decline, such as due to environmental , leaving only a subset of alleles to propagate; this reduces heterozygosity and can increase the frequency of deleterious alleles through random sampling. similarly arise when a small group colonizes a new area, carrying only a fraction of the original variation, leading to distinct genetic profiles in the derived . A notable example is the high prevalence of complete (total ) on Atoll in , where a 1775 reduced the to about 20 survivors, one of whom carried the recessive mutation; today, roughly 10% of Pingelapese exhibit the condition due to this founder effect and subsequent drift in isolation. 
Coalescent theory provides a retrospective framework for understanding how neutral processes shape genetic variation by tracing lineages backward in time to their most recent common ancestor. Developed by John Kingman, this approach models the genealogy of a sample of genes as a stochastic process in which pairs of lineages merge at rates inversely proportional to effective population size, approximating the effects of drift in large populations. Under the standard Kingman coalescent for a diploid Wright-Fisher population, the expected time to the most recent common ancestor of $n$ sampled gene copies is $4N_e(1 - 1/n)$ generations, enabling inferences about historical population sizes and demographic events from modern genetic data. This theory has become foundational for analyzing patterns of variation, such as linkage disequilibrium decay, without forward simulation of entire populations.

Migration-drift balance occurs when gene flow between populations introduces new alleles, counteracting the effects of drift within demes. In subdivided populations, migration at rate $m$ per generation maintains variation by replenishing alleles lost to local drift, reaching an equilibrium in which overall genetic differentiation, measured by $F_{ST}$, approximates $1/(1 + 4N_e m)$ for neutral loci under the infinite-island model. This balance prevents complete fixation or loss across the metapopulation, preserving polymorphism levels that reflect both local drift intensity and interpopulation connectivity; empirical studies confirm that even low migration rates can substantially reduce drift-induced divergence over time.

The neutral theory of molecular evolution, proposed by Motoo Kimura, posits that the majority of genetic variation at the molecular level arises from mutation and is governed by neutral mutations that evolve primarily through genetic drift rather than natural selection. In this framework, most substitutions fixed in populations are selectively neutral, and their rate of fixation equals the neutral mutation rate $\mu$, yielding a molecular clock in which divergence between species reflects the time since separation. Kimura's 1968 formulation addressed the apparent paradox of rapid molecular evolution by emphasizing the role of drift, and it predicts that levels of polymorphism within species scale with $4N_e\mu$. This theory has profoundly influenced the interpretation of genomic data, highlighting that much observed variation lacks adaptive significance and is maintained stochastically.
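The closed-form results quoted in this section (the expected time to the most recent common ancestor, the island-model $F_{ST}$, and the neutral substitution rate) can be checked with a few lines of Python. This is a minimal sketch: the function names and the numerical inputs are hypothetical, chosen only for illustration, and the substitution-rate function assumes an idealized population whose census size equals $N_e$.

```python
def expected_tmrca(n_e, n):
    """Expected generations back to the MRCA of n gene copies (diploid size n_e).

    While k lineages remain, the expected wait until the next coalescence
    is 4 * n_e / (k * (k - 1)) generations; summing over k = 2..n
    telescopes to 4 * n_e * (1 - 1 / n).
    """
    return sum(4 * n_e / (k * (k - 1)) for k in range(2, n + 1))


def fst_island(n_e, m):
    """Equilibrium F_ST under the infinite-island approximation 1 / (1 + 4 N_e m)."""
    return 1.0 / (1.0 + 4.0 * n_e * m)


def neutral_substitution_rate(n_e, mu):
    """Substitutions per generation at a neutral site.

    2 * n_e * mu new mutations arise each generation and each fixes with
    probability 1 / (2 * n_e), so the population size cancels.
    """
    new_mutations = 2 * n_e * mu
    fixation_probability = 1 / (2 * n_e)
    return new_mutations * fixation_probability


if __name__ == "__main__":
    print(expected_tmrca(n_e=100, n=10))         # compare with 4 * 100 * (1 - 1/10)
    print(fst_island(n_e=100, m=0.01))           # here 4 * N_e * m = 4
    print(neutral_substitution_rate(100, 1e-8))  # approximately mu, whatever n_e is
```

The last function makes the molecular-clock argument concrete: because the population size cancels, the neutral substitution rate depends only on $\mu$.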

Special Cases

RNA viruses

RNA viruses exhibit exceptionally high levels of genetic variation, primarily because their replication machinery is error-prone. Unlike the DNA polymerases of cellular organisms and some DNA viruses, the RNA-dependent RNA polymerases (RdRps) of most RNA viruses lack 3'–5' exonuclease proofreading activity, leading to mutation rates typically ranging from $10^{-5}$ to $10^{-4}$ substitutions per nucleotide site per replication cycle. This low fidelity results in approximately one mutation per genome per replication round for many RNA viruses, given their compact genomes of 3–30 kb. Coronaviruses are an exception: they possess an accessory proofreading exonuclease (nsp14), which reduces their mutation rates to around $10^{-6}$ per site, still orders of magnitude higher than those of DNA-based organisms.

These elevated mutation rates foster quasispecies dynamics, in which RNA virus populations exist not as uniform clones but as dynamic swarms of closely related genetic variants centered around a master sequence. This concept, originally proposed by Manfred Eigen, describes how high mutation rates and selection pressures maintain a diverse mutant spectrum within a host, enabling rapid adaptation while the fittest variants predominate. Quasispecies structure arises from continuous mutation and competition among variants, with the population's consensus sequence often representing the weighted average of the swarm rather than a single dominant genotype. In practice, intrahost diversity in some RNA viruses can exceed 1% divergence, far surpassing that in DNA viruses.

Genetic variation in RNA viruses is further amplified by recombination and, in segmented genomes, reassortment. Recombination involves template switching during replication, generating chimeric genomes that combine sequences from co-infecting strains; it is common in non-segmented viruses such as HIV. Reassortment, prevalent in viruses with segmented genomes such as influenza A, allows entire segments to be exchanged between strains, often during dual infections in intermediate hosts such as pigs. This mechanism has driven major pandemics, including the 1957 H2N2 and 2009 H1N1 outbreaks, by rapidly generating novel antigenic combinations that evade population immunity.

The rapid accumulation of variation has profound implications for viral evolution, enabling swift adaptation to host immune responses and antiviral therapies. In HIV, high mutation and recombination rates facilitate the emergence of drug-resistant variants, with quasispecies diversity allowing the virus to persist despite combination antiretroviral therapy. Similarly, by 2025 SARS-CoV-2 had evolved numerous variants of concern, including some with mutations conferring immune escape and enhanced transmissibility, underscoring the challenges in sustaining long-term immunity. These dynamics highlight RNA viruses' capacity for antigenic drift and shift, complicating vaccine design and contributing to recurrent epidemics.

Within-host genetic diversity is also influenced by host factors, notably hypermutation mediated by APOBEC3 enzymes. In HIV infections, APOBEC3G and APOBEC3F deaminate cytidine residues in viral cDNA, introducing G-to-A hypermutations that can inactivate up to 50% of proviruses in some cases, though the virus counters this via the Vif protein. This process generates a subset of defective viral genomes within the quasispecies, modulating overall population fitness and occasionally driving adaptive evolution by creating variant diversity. Such host-virus interactions exemplify how intrinsic defenses shape RNA virus variation at the individual level.
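The arithmetic behind the "roughly one mutation per genome per replication" figure quoted at the start of this section can be sketched in Python. The function names are hypothetical, the 10 kb and 30 kb genome lengths are illustrative values within the 3–30 kb range given above, and the calculation assumes that sites mutate independently at a uniform rate, which is a simplification.

```python
def expected_mutations_per_genome(rate_per_site, genome_length):
    """Mean number of new mutations in one copied genome."""
    return rate_per_site * genome_length


def prob_error_free_copy(rate_per_site, genome_length):
    """Probability that every site is copied correctly (independent sites)."""
    return (1.0 - rate_per_site) ** genome_length


if __name__ == "__main__":
    # Typical RNA virus from the text: about 1e-4 per site, assumed 10 kb genome.
    print(expected_mutations_per_genome(1e-4, 10_000))
    print(prob_error_free_copy(1e-4, 10_000))

    # Proofreading coronavirus from the text: about 1e-6 per site, assumed 30 kb genome.
    print(expected_mutations_per_genome(1e-6, 30_000))
    print(prob_error_free_copy(1e-6, 30_000))
```

Under these assumptions, most copies of a typical RNA virus genome carry at least one new mutation, which is consistent with the mutant swarms described by the quasispecies model, whereas most copies of a proofreading coronavirus genome are error-free.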

Somatic variation

Somatic mutations are genetic alterations that occur in non-germline cells after conception and are not transmitted to offspring; they accumulate throughout an individual's lifetime in various tissues. These mutations arise from errors in DNA replication, environmental exposures such as ultraviolet radiation, or endogenous processes, leading to changes in DNA sequence that can affect cellular function but remain confined to the affected cell lineages. For instance, UV-induced mutations in skin cells, often C>T transitions at dipyrimidine sites, contribute to the development of non-melanoma skin cancers by disrupting genes involved in cell-cycle control.

Somatic mosaicism refers to the presence of two or more genetically distinct cell populations within an individual, resulting from mutations that arise after fertilization, often early in embryonic development, and propagate to descendant cells. Such mosaicism can lead to tissue-specific variation, influencing traits or disease susceptibility at the individual level without affecting the germline. In healthy tissues, low-level mosaicism is common and generally tolerated, but higher levels may contribute to developmental disorders or age-related pathologies by altering cellular function in affected lineages.

In cancer, somatic mutations drive tumorigenesis through clonal evolution, in which advantageous "driver" mutations confer proliferative benefits, while neutral or deleterious "passenger" mutations hitchhike along without direct impact. The tumor suppressor gene TP53, mutated in over 50% of cancers, exemplifies a key driver: its loss impairs the DNA damage response and promotes genomic instability, enabling the expansion of malignant clones. This process underscores how somatic variation fuels intratumor heterogeneity, with subclones competing under selective pressures such as therapy, ultimately shaping cancer progression and treatment resistance.
Somatic genetic variation also plays a critical role in the adaptive immune system, particularly through V(D)J recombination, a programmed process in developing lymphocytes that rearranges variable (V), diversity (D), and joining (J) gene segments to generate diverse antigen receptors. This site-specific recombination, mediated by the RAG1 and RAG2 enzymes, is accompanied by nucleotide additions and deletions at the junctions, producing up to $10^{12}$ unique T-cell receptors and immunoglobulins per individual, a repertoire essential for recognizing diverse pathogens. Unlike random mutations, V(D)J recombination is tightly regulated to ensure genomic stability outside immune loci.

While somatic changes can include epigenetic modifications that alter gene expression without sequence alterations, genetic somatic variation specifically pertains to DNA sequence variants, distinguishing it both from heritable germline variation and from non-sequence-based mechanisms.

History

Pre-Darwinian concepts

Early concepts of genetic variation were rooted in ancient philosophical frameworks that emphasized a fixed order in nature rather than dynamic heritable change. Aristotle's scala naturae, or "ladder of nature," proposed a static hierarchy of living beings, from simplest to most complex, in which each organism occupied an eternally fixed position without the possibility of ascent or descent through generations. This view implied limited variation, as species were seen as immutable essences defined by their essential characteristics, with any observed differences attributed to environmental influences rather than heritable traits.

In the 18th century, Carl Linnaeus formalized biological classification in his Systema Naturae (1758), treating species as distinct, fixed entities created by divine order, with variation viewed merely as minor deviation within these immutable boundaries. Linnaeus's essentialist approach reinforced the idea that species essences remained unchanged over time, allowing only superficial intraspecific differences that did not alter the core type.

Jean-Baptiste Lamarck introduced a transformative perspective in Philosophie Zoologique (1809), proposing that organisms could inherit acquired characteristics developed through use or disuse of organs in response to environmental needs, such as the lengthening of a giraffe's neck over generations. However, Lamarck's theory lacked any identified mechanism for transmission, relying instead on fluid inner forces directing change. Alongside this, the prevailing hypothesis of blending inheritance held that offspring traits were an average of parental characteristics, leading to a progressive dilution of variation across generations as extremes blended into uniformity.

Prior to Gregor Mendel's work, farmers and animal breeders practiced selective breeding empirically, choosing individuals with desirable traits for reproduction to enhance qualities like yield or size, without understanding the underlying hereditary principles. These pre-Mendelian techniques demonstrated observable heritable improvements but were guided by trial-and-error observation rather than a theoretical framework for variation's persistence or transmission.

Darwinian concepts

Charles Darwin's theory of evolution by natural selection, introduced in On the Origin of Species (1859), positioned variation as the essential raw material for adaptive change, emphasizing its role in enabling organisms to diverge and adapt to their environments over generations. Darwin distinguished between two primary forms of variation: slight, continuous differences among individuals within a population, and more abrupt "sports," or sudden deviations, that could produce novel traits. He argued that natural selection acts cumulatively on these variations, preserving favorable ones and eliminating disadvantageous ones, thereby driving the gradual modification of species. This process, he contended, mirrors the diversification seen in domesticated animals and plants, where breeders exploit existing variations to create distinct breeds.

To illustrate the power of selection on heritable variation, Darwin drew an analogy between artificial selection by humans and natural selection in the wild, using his extensive observations of pigeon breeding as a key example. In The Variation of Animals and Plants under Domestication (1868), he detailed how fanciers had produced dramatically diverse pigeon varieties (such as tumblers, pouters, and carriers) from a common rock dove ancestor (Columba livia) through selective breeding of heritable traits like feather patterns, body shape, and behavior. These artificial interventions, he explained, demonstrate how targeted preservation of variations can yield profound changes in just a few generations, suggesting that nature's selective pressures operate similarly but without human intent, favoring traits that enhance survival and reproduction in specific habitats. Darwin stressed that such variations must be heritable to sustain evolutionary progress, though he did not specify their underlying mechanism.
Darwin grappled with significant challenges in explaining inheritance, particularly the prevailing idea of blending inheritance, in which offspring traits mix like paints, potentially leading to the rapid dilution and loss of variation across generations. In On the Origin of Species, he acknowledged this issue, noting that under blending, intermediate forms would dominate, eroding the distinct variations necessary for selection to produce lasting adaptations unless new variations continually arose to replenish diversity. This concern highlighted a gap in his framework, as he assumed blending as the default mode of heredity without a clear explanation for how variation persists.

To address these inheritance problems, Darwin proposed the theory of pangenesis in The Variation of Animals and Plants under Domestication (1868), positing that every cell in the body produces tiny particles called gemmules, which carry information about traits and circulate through the organism to the reproductive organs. These gemmules, he suggested, could aggregate and be transmitted to offspring, allowing for the blending of parental characteristics while also permitting the inheritance of acquired traits modified by environmental influences during an individual's lifetime. Pangenesis aimed to explain phenomena like regeneration and reversion to ancestral forms, but it ultimately failed to resolve the core issue of how variation is maintained under blending. Throughout his work, Darwin consistently emphasized variation's functional importance in facilitating adaptation through selection, rather than delving deeply into its ultimate origins, which he viewed as secondary to the evolutionary process.

Post-Darwinian and modern developments

The rediscovery of Gregor Mendel's principles in 1900 by the botanists Hugo de Vries, Carl Correns, and Erich von Tschermak marked a pivotal shift toward understanding inheritance as occurring through discrete units rather than blending, resolving long-standing paradoxes in evolutionary theory. De Vries published his findings on hybrid ratios in 1900, independently deriving Mendel's laws without initial knowledge of the original 1866 paper, while Correns and von Tschermak confirmed similar patterns in their experiments on peas and other species. This breakthrough established genetics as a field, emphasizing particulate inheritance that preserved variation for natural selection.

In the 1930s and 1940s, the Modern Synthesis unified Mendelian genetics with Darwinian evolution through mathematical population genetics, led by Ronald Fisher, J.B.S. Haldane, and Sewall Wright. Fisher's 1930 book The Genetical Theory of Natural Selection demonstrated how Mendelian segregation enables gradual adaptive change under selection, quantifying the efficiency of selection in large populations. Haldane's 1932 work modeled the genetic costs of selection and the mutation rates driving evolution, while Wright's path analysis and shifting balance theory in the early 1930s explored how genetic drift in subdivided populations facilitates adaptation. These contributions formed the basis of quantitative genetics, explaining continuous traits like height through polygenic variation.

The molecular era began with the 1953 Watson-Crick model of DNA as a double helix, revealing how nucleotide sequences encode and transmit genetic variation through base pairing and replication. This structure enabled direct study of mutations and polymorphisms at the molecular level. The Human Genome Project's completion in 2003 produced a reference sequence covering 99% of the euchromatic genome, uncovering that over 98% is non-coding and highlighting regulatory and structural variants as key sources of human diversity. Building on this, the 1000 Genomes Project's 2015 phase cataloged 88 million variants across 2,504 individuals from 26 populations, showing that most variants are rare (frequency <0.5%) and often population-specific, with African genomes exhibiting the highest diversity.
Since the 2010s, tools like CRISPR-Cas9, characterized in 2012 as a dual-RNA-guided endonuclease for precise DNA cleavage, have revolutionized the manipulation of genetic variation by enabling targeted edits in living organisms. By 2025, advances in single-cell sequencing had reduced amplification biases enough to detect low-frequency somatic mosaicism, revealing clonal dynamics in development, aging, and diseases like cancer. Pangenome approaches, exemplified by graph-based representations of structural variants in over 1,000 diverse human genomes, better capture insertions, deletions, and inversions that linear references miss, enhancing variant interpretation and population diversity studies.

References

  1. [1]
    The Genetic Variation in a Population Is Caused by Multiple Factors
    Genetic variation describes naturally occurring genetic differences among individuals of the same species. This variation permits flexibility and survival ...
  2. [2]
    Understanding Human Genetic Variation - NCBI - NIH
    Genetics is the scientific study of inherited variation. Human genetics, then, is the scientific study of inherited human variation.How Much Genetic Variation... · What Is the Significance of...
  3. [3]
    Studying Mutation and Its Role in the Evolution of Bacteria - PMC - NIH
    Genetic variation is ultimately all generated by mutation. It is therefore clear that mutation is a major evolutionary force that must be studied and understood ...
  4. [4]
    Genetic Variations and Precision Medicine - PMC - NIH
    Apr 1, 2019 · In this review article, we introduce genetic variations, including their data types, relevant databases, and some currently available analysis methods and ...
  5. [5]
    The crucial role of genome-wide genetic variation in conservation
    Nov 12, 2021 · Decades of theoretical (1) and empirical (2, 3) research suggest that conserving genome-wide genetic variation improves population viability.
  6. [6]
    Human Genomic Variation
    Feb 1, 2023 · Genomic variation reflects the differences in a person's DNA compared to other peoples' DNA. There are multiple types of variants in human ...
  7. [7]
    Sources of gene expression variation in a globally diverse human ...
    Jul 17, 2024 · Genetic variation that affects gene expression and splicing accounts for a large proportion of phenotypic differences within and between species ...
  8. [8]
    Human structural variation: mechanisms of chromosome ... - NIH
    Chromosome structural variation (SV) is a normal part of variation in the human genome, but some classes of SV can cause neurodevelopmental disorders.
  9. [9]
    What is a gene variant and how do variants occur? - MedlinePlus
    Mar 25, 2021 · A gene variant is a permanent change in the DNA sequence that makes up a gene. This type of genetic change used to be known as a gene mutation.
  10. [10]
    Relationships between adaptive and neutral genetic diversity and ...
    Adaptive variants influence the phenotype and fitness of the organisms that carry them; neutral variants, on the other hand, are selectively neutral (full ...
  11. [11]
    Genetic variation - Understanding Evolution
    Without genetic variation, some key mechanisms of evolutionary change like natural selection and genetic drift cannot operate.
  12. [12]
    The Genotype/Phenotype Distinction
    Jun 6, 2017 · The phenotype is the physical and behavioral traits of the organism, for example, size and shape, metabolic activities, and patterns of movement.Setting the Scene: Different... · The Goals and Open... · Control and Reintegration
  13. [13]
    What are single nucleotide polymorphisms (SNPs)? - MedlinePlus
    Mar 22, 2022 · They occur almost once in every 1,000 nucleotides on average, which means there are roughly 4 to 5 million SNPs in a person's genome. These ...
  14. [14]
    Large-Scale Validation of Single Nucleotide Polymorphisms in Gene ...
    Single nucleotide polymorphisms (SNPs) are the most abundant genetic variations in the human genome. They occur, on average, once every 300 base pairs of ...
  15. [15]
    Non-STR DNA Markers: SNPs, Y-STRs, LCN and mtDNA | Single ...
    Jul 20, 2023 · The most common form of genetic variation in the human genome (approximately 90%) is a class of genetic marker known as a single nucleotide polymorphism (SNP).
  16. [16]
    Methods for Discovering and Scoring Single Nucleotide ...
    Mar 7, 2012 · They are termed single nucleotide polymorphisms (SNPs) when the variant sequence type has a frequency of at least 1 percent in the population.
  17. [17]
    The origin, evolution, and functional impact of short insertion ...
    Short insertion and deletion polymorphisms (indels, here defined as a gain or loss of up to 50 nucleotides at a single locus) are increasingly being recognized ...
  18. [18]
    What kinds of gene variants are possible?: MedlinePlus Genetics
    Nov 4, 2021 · A deletion-insertion (delins) variant may also be known as an insertion-deletion (indel) variant. ... A reading frame consists of groups of three ...
  19. [19]
    Small insertions and deletions (INDELs) in human genomes - PMC
    In this review, we focus on progress that has been made with detecting small insertions and deletions (INDELs) in human genomes.
  20. [20]
    A Brief Review of Short Tandem Repeat Mutation - PMC - NIH
    Unique DNA sequences in a genome exhibit a very low mutation rate (approximately 10−9 nt per generation), whereas the mutation rates in STR sequences are ...
  21. [21]
    Sequence-based estimation of minisatellite and microsatellite repeat ...
    Variable tandem repeats are frequently used for genetic mapping, genotyping, and forensics studies. Moreover, variation in some repeats underlies rapidly ...
  22. [22]
    Lect. 8. Intro. to microsatellites
    Mutation process: Microsatellites are useful genetic markers because they tend to be highly polymorphic. It is not uncommon to have human microsatellites with ...
  23. [23]
    Effects of Synonymous Mutations beyond Codon Bias - NIH
    Introduction. Synonymous mutations do not alter the encoded amino acids but can still impact fitness via their effects on gene expression and protein structure.
  24. [24]
    the causes and consequences of codon bias - PMC - PubMed Central
    The central dogma of molecular biology suggests that synonymous mutations – those that do not alter the encoded amino acid – will have no effect on the ...Figure 1. Codon Bias Within... · Patterns Of Codon Usage · Effects Of Codon Adaptation...
  25. [25]
    Molecular Mechanisms and the Significance of Synonymous Mutations
    Jan 20, 2024 · Synonymous mutations result from the degeneracy of the genetic code. Most amino acids are encoded by two or more codons.
  26. [26]
    Sickle Cell Disease—Genetics, Pathophysiology, Clinical ...
    May 7, 2019 · Sickle cell disease (SCD) is a monogenetic disorder due to a single base-pair point mutation in the β-globin gene resulting in the substitution ...
  27. [27]
    Frameshift Mutation - National Human Genome Research Institute
    A frameshift mutation in a gene refers to the insertion or deletion of nucleotide bases in numbers that are not multiples of three.Missing: indels | Show results with:indels
  28. [28]
    Mutation, Repair and Recombination - Genomes - NCBI Bookshelf
    Any protein binding site is susceptible to point, insertion or deletion mutations ... The mutation must be one that confers an abnormal activity on a protein.
  29. [29]
    Biochemistry, Mutation - StatPearls - NCBI Bookshelf - NIH
    ... genetic mutations can be divided into four categories: mutations ... DNA double-strand break repair and their potential to induce chromosomal aberrations.
  30. [30]
  31. [31]
    Research progress on the role and mechanism of DNA damage ...
    DNA damage is caused by alkylating agents, aqueous deamination, free radicals, and ROS generated by various photochemical processes including ultraviolet (UV) ...
  32. [32]
    Mechanisms of DNA damage, repair and mutagenesis - PMC
    Base deamination is a major source of spontaneous mutagenesis in human cells, where cytosine (C), adenine (A), guanine (G), and 5-methyl cytosine (5mC) in DNA ...Missing: tautomerization | Show results with:tautomerization
  33. [33]
    Properties and rates of germline mutations in humans - PMC - NIH
    Studies from targeted sequencing of exomes or other regions have reported higher mutation rates (1.31–2.17×10−8 mutations per base pair per generation)[13–16]; ...
  34. [34]
    Why are RNA virus mutation rates so damn high? - PubMed Central
    Aug 13, 2018 · RNA viruses have high mutation rates—up to a million times higher than their hosts—and these high rates are correlated with enhanced virulence ...
  35. [35]
    Viral Mutation Rates - PMC - NIH
    Another general observation is that RNA viruses have a higher mutation rate than DNA viruses (29, 45). However, based on observations that some ssDNA viruses ...
  36. [36]
    Genetics, Somatic Mutation - StatPearls - NCBI Bookshelf - NIH
    Apr 17, 2023 · These mutations do not involve the germline and consequently do not pass on to offspring. Somatic mutations are a normal part of aging and ...Introduction · Cellular · Mechanism · Clinical SignificanceMissing: heritable | Show results with:heritable
  37. [37]
  38. [38]
    DNA Repair - The Cell - NCBI Bookshelf - NIH
    In error-prone repair, a gap opposite a site of DNA damage is filled by newly synthesized DNA. Since the new DNA is synthesized from a damaged template strand, ...
  39. [39]
    DNA-damage repair; the good, the bad, and the ugly - PMC
    This is in contrast to mutations of other excision repair pathways, such as NER and MMR. ... mutation in mice deficient in Mlh1, Pms1 and Pms2 DNA mismatch repair ...
  40. [40]
    DNA Damage/Repair Management in Cancers - PMC - NIH
    Apr 23, 2020 · Failure of DNA repair mechanisms leads to the formation of mutations. ... mutation in the repairing genes [153]. Recently, the development ...6. Dna Repair Pathways · 6.2. Base Excision Repair... · 6.3. Nucleotide Excision...<|control11|><|separator|>
  41. [41]
    Genetics, Meiosis - StatPearls - NCBI Bookshelf - NIH
    Genomic diversity and genetic variation is produced through the process of meiosis due to chromosomal recombination and independent assortment. Each ...
  42. [42]
    Crossover recombination between homologous chromosomes in ...
    Oct 25, 2024 · Crossing over between homologous chromosomes in meiosis is essential in most eukaryotes to produce gametes with the correct ploidy.
  43. [43]
    Chromosome architecture and homologous recombination in meiosis
    In this review, we summarize insights into the importance of chromosome architecture in the regulation of meiotic recombination.Missing: paper | Show results with:paper
  44. [44]
    Principle of Independent Assortment - Nature
    The principle of independent assortment describes how different genes independently separate from one another during the formation of reproductive cells.
  45. [45]
    Population Genetics and Genome Evolution of Selfing Species
    Jan 16, 2017 · Compared to outcrossing, selfing reduces heterozygosity, effective recombination and migration rate and increases genetic drift.
  46. [46]
    Partial Selfing Can Reduce Genetic Loads While Maintaining ... - NIH
    Evolution under predominant outcrossing or partial selfing maintains standing genetic variation. When compared with the ancestral population, androdioecious and ...
  47. [47]
    [PDF] 2.3 Linkage, recombination, and LD.
    Sep 23, 2023 · This is known as linkage dise- quilibrium (LD). Meanwhile, at larger distances, recombination breaks down LD by shuffling genotypes. Here we ...
  48. [48]
    Linkage disequilibrium — understanding the evolutionary past and ...
    the nonrandom association of alleles at different loci — is a sensitive indicator of the population genetic forces that structure a ...Missing: breakdown | Show results with:breakdown
  49. [49]
    Unraveling the genetic basis of hybrid vigor - PNAS
    Aug 29, 2006 · Hybrid vigor, or heterosis, is the increase in stature, biomass, and fertility that characterizes the progeny of crosses between diverse parents.Missing: recombination | Show results with:recombination
  50. [50]
    Heterosis in plants - ScienceDirect.com
    Sep 24, 2018 · The observation that cross-pollinated hybrids are more vigorous than their parents is nowadays commonly referred to as heterosis.
  51. [51]
    Next-Generation Sequencing Technology: Current Trends and ... - NIH
    NGS allows for the rapid sequencing of millions of DNA fragments simultaneously, providing comprehensive insights into genome structure, genetic variations, ...
  52. [52]
    Improving the accuracy of genetic sequencing - Nature
    At present, all 'next-generation sequencing' methods have higher error rates than the first-generation Sanger sequencing approach.
  53. [53]
    Confirming putative variants at ≤ 5% allele frequency using ... - Nature
    Jun 2, 2021 · However, because Sanger sequencing has an LoD of 5–20% VAF, it cannot be used directly to confirm NGS variant findings with VAF < 5%.
  54. [54]
    Next-Generation Sequencing (NGS) | Explore the technology - Illumina
    Next-generation sequencing (NGS) is a massively parallel sequencing technology that offers ultra-high throughput, scalability, and speed.Illumina Sequencing Methods · History of Illumina Sequencing · NGS for Beginners
  55. [55]
    (PDF) The Impact of Next-Generation Sequencing on Genomics
    Aug 9, 2025 · This article reviews basic concepts, general applications, and the potential impact of next-generation sequencing (NGS) technologies on ...Abstract And Figures · References (135) · Repetitive Dna And...<|separator|>
  56. [56]
    Review on the development of genotyping methods for assessing ...
    Jan 23, 2013 · RFLP was the first DNA-based marker for constructing genetic linkage maps; it is also one of the most widely used markers in AnGR assessments ...
  57. [57]
    Amplified-Fragment Length Polymorphism Analysis: the State of an Art
    In plants, AFLP analysis is a multilocus PCR technology that generates as many as 150 locus-specific bands, a high percentage of which can be polymorphic.
  58. [58]
    SNP & SNV Genotyping | NGS & array techniques - Illumina
    Explore genotyping techniques and solutions to find single nucleotide polymorphisms and variants (SNPs and SNVs) using microarrays and sequencing.
  59. [59]
    High-throughput variation detection and genotyping using microarrays
    We have developed an automated statistical method (ABACUS) to analyze microarray hybridization data and applied this method to Affymetrix Variation Detection ...
  60. [60]
    Recent developments and future directions in point-of-care ... - NIH
    Jan 9, 2025 · This review examines the current state of CRISPR-based diagnostics and their potential applications across a wide range of diseases.
  61. [61]
    CRISPR‐driven diagnostics: Molecular mechanisms, clinical efficacy ...
    Sep 26, 2025 · This article reviews the recent research progress in the CRISPR/Cas system for detecting nucleic acids, with an emphasis on CRISPR/Cas9, CRISPR/ ...Missing: 2020s | Show results with:2020s
  62. [62]
    Long-Read Sequencing and Structural Variant Detection - NIH
    Jul 17, 2025 · PacBio HiFi and ONT platforms offer beneficial techniques for discovering structural variants in rare genetic conditions. PacBio's exceptional ...
  63. [63]
    Structural variation in 1,019 diverse humans based on long ... - Nature
    Jul 23, 2025 · Here we conducted long-read sequencing in 1,019 humans to construct an intermediate-coverage resource covering 26 populations from the 1000 ...
  64. [64]
    G. H. Hardy (1908) and Hardy–Weinberg Equilibrium - PMC - NIH
    This is an account of GH Hardy's role in establishing the existence of what is now known as “Hardy–Weinberg equilbrium,”Missing: URL | Show results with:URL
  65. [65]
    Mathematical model for studying genetic variation in terms of ... - NIH
    To express the degree of polymorphism in a population at the nucleotide level, a measure called "nucleotide diversity" is proposed. Full text. PDF · Previous ...
  66. [66]
    Estimating F-statistics: A historical view - PMC - PubMed Central
    Sewall Wright introduced a set of “F-statistics” to describe population structure in 1951 and he emphasized that these quantities were ratios of variances.Missing: flow | Show results with:flow
  67. [67]
    [PDF] Levels of Genetic Variation in Trees - USDA Forest Service
    In the earlier review (Hamrick and others 1979), the inclusion of studies based solely on polymorphic loci tended to increase estimates of the mean levels of ...
  68. [68]
    enzyme polymorphism in feral, outbred and - Nature
    The degrees of heterozygosity observed in the outbred and inbred strains were lower, from 0006 to 0012. Contrary to expectation, the inbred strains.
  69. [69]
    Inbreeding depression across the lifespan in a wild mammal ... - PNAS
    Inbreeding depression is the decrease in fitness with increased genome-wide homozygosity that occurs in the offspring of related parents.
  70. [70]
    The Cheetah Is Depauperate in Genetic Variation - Science
    The extreme monomorphism may be a consequence of a demographic contraction of the cheetah (a population bottleneck) in association with a reduced rate of ...
  71. [71]
    HLA DNA Sequence Variation among Human Populations
    The Human Leukocyte Antigen (HLA) loci are among the most polymorphic genes currently described in the human genome, with more than 4,000 observed alleles ...
  72. [72]
    Inferring Continuous and Discrete Population Genetic Structure ...
    Here, we present a statistical framework for the simultaneous inference of continuous and discrete patterns of population structure.
  73. [73]
    Admixture mapping of quantitative traits in Populus hybrid zones
    Jul 17, 2013 · Admixture mapping can be used to identify the number and effect sizes of genes that contribute to the divergence of ecologically important ...
  74. [74]
    Admixture mapping identifies introgressed genomic regions in North ...
    Using a set of 3102 ancestry informative markers, we identified 60 differentially introgressed regions in 44 canines across this admixture zone.
  75. [75]
    Evolution and dispersal of mitochondrial DNA haplogroup U5 in ...
    May 7, 2022 · Haplogroup U5 has been present in northern Europe since the Mesolithic, and spread in both eastern and western directions, undergoing significant ...
  76. [76]
    Phylogeographic Differentiation of Mitochondrial DNA in Han Chinese
    The migration of Han people to provinces such as Xinjiang and Yunnan occurred relatively recently, having started mainly ∼100–600 years ago, and was caused by ...
  77. [77]
    Got lactase? - Understanding Evolution
    Lactase breaks down lactose. Babies have lactase, but most adults are lactose intolerant because the lactase gene is switched off after weaning.
  78. [78]
    [PDF] The Apportionment of Human Diversity - Vanderbilt University
    Two analyses for man, one on enzymes by Harris (1970) and one on blood groups by Lewontin (1967), give respective estimates of 30% and 36% for polymorphic loci ...
  79. [79]
    [PDF] Human genetic diversity: Lewontin's fallacy - Wasabi
    about 85% of the total genetical variation is due to individual differences within populations and only 15% to differences between populations or ethnic groups.
  80. [80]
    Dispatch Population genetics: A new apportionment of human diversity
    For example, Lewontin [3] found that, across 15 protein loci, 85% of allele frequency diversity was found within local populations, with only 7% being ...
  81. [81]
    Heterozygote Advantage Is a Common Outcome of Adaptation in ...
    Mutations that are overdominant in fitness are expected to lead to balancing selection and to the persistence of genetic and fitness variation in diploid ...
  82. [82]
    Sickle Cell Anaemia and Malaria - PMC - NIH
    Oct 3, 2012 · This confirms the notion that the main advantage of AS heterozygotes in areas with heavy malaria endemicity consists in their increased ...
  83. [83]
    Models of Frequency-Dependent Selection with Mutation from ...
    Negative frequency dependence (selection in favor of rare alleles) is often invoked to explain polymorphism, since if it is beneficial to be rare, it is also ...
  84. [84]
    An Introduction to Sexual Selection | Accumulating Glitches - Nature
    Jul 13, 2015 · Peahens prefer peacocks with large and colourful tails, so those peacocks get to mate more frequently and have more offspring. The male ...
  85. [85]
    The peppered moth and industrial melanism: evolution of a ... - Nature
    Dec 5, 2012 · The peppered moth was the most diagrammatic example of the phenomenon of industrial melanism that came to be recognised in industrial and smoke- ...
  86. [86]
    Phenotypic Plasticity and Genotype by Environment Interaction for ...
    Phenotypic plasticity itself, however, can vary depending on genotype. Genotype by environment interaction (GEI) occurs when there is variation among genotypes ...
  87. [87]
    Prediction and estimation of effective population size | Heredity
    Jun 29, 2016 · The classical developments of effective population size theory are based on the rate of change in gene frequency variance (genetic drift) or the ...
  88. [88]
    Homozygosity mapping of the Achromatopsia locus in the Pingelapese
    Inherited as an autosomal recessive trait, achromatopsia is rare in the general population (1:20,000-1:50,000). Among the Pingelapese people of the Eastern ...
  89. [89]
    Coalescent Theory: An Introduction | Systematic Biology
    Mar 24, 2009 · The theory was initially developed by Kingman (1982) in 3 papers published in probability theory journals, which outline the foundation of ...
  90. [90]
    MIGRATION AND GENETIC DRIFT IN HUMAN POPULATIONS
    To summarize the effect of migration on genetic population structure, we introduce a new parameter, the effective migration rate.
  91. [91]
    Increased RNA virus population diversity improves adaptability
    Mar 25, 2021 · The replication machinery of most RNA viruses lacks proofreading mechanisms. As a result, RNA virus populations harbor a large amount of ...
  92. [92]
    The mutational landscape of SARS-CoV-2 provides new insight into ...
    Jul 11, 2025 · Our analyses indicate that the SARS-CoV-2 genome mutates at a rate of ∼1.5 × 10⁻⁶ per base per viral passage and that the spectrum is dominated by C → U ...
  93. [93]
    Quasispecies Theory and the Behavior of RNA Viruses
    Jul 22, 2010 · According to Eigen's original formulations, a quasispecies can remain at equilibrium despite a high mutation rate [38], [39]. Small increases in ...
  94. [94]
    Quasispecies theory and emerging viruses: challenges and ... - Nature
    Nov 14, 2024 · The quasispecies theory, conceived by Manfred Eigen and Peter Schuster more than fifty years ago was developed to investigate the dynamics of ...
  95. [95]
    Quasispecies Dynamics of RNA Viruses - PMC - PubMed Central
    The quasispecies dynamics of RNA viruses are closely related to viral pathogenesis and disease, and antiviral treatment strategies.
  96. [96]
    RNA Virus Reassortment: An Evolutionary Mechanism for Host ...
    Jul 9, 2015 · Reassortment is an evolutionary mechanism of segmented RNA viruses that plays an important but ill-defined role in virus emergence and interspecies ...
  97. [97]
    Influenza A virus reassortment in mammals gives rise to genetically ...
    Nov 11, 2022 · Influenza A virus (IAV) genetic exchange through reassortment has the potential to accelerate viral evolution and has played a critical role ...
  98. [98]
    Antiretroviral APOBEC3 cytidine deaminases alter HIV-1 provirus ...
    Jan 10, 2023 · Very high levels of deamination, called hypermutation, are observed early in the infection that thoroughly inactivate the virus. However, HIV ...
  99. [99]
    Interactions of host APOBEC3 restriction factors with HIV-1 in vivo
    These factors include the APOBEC3 family of DNA cytidine deaminases, which restrict the infectivity of HIV-1 by hypermutating viral cDNA and inhibiting reverse ...
  100. [100]
    APOBEC3 family proteins as drivers of virus evolution - Frontiers
    APOBEC3G induces a hypermutation gradient: purifying selection at multiple steps during HIV-1 replication results in levels of G-to-A mutations that are ...
  101. [101]
    A body-wide view of somatic mutations | Nature Reviews Genetics
    Sep 14, 2021 · Four new studies in Nature report multi-tissue analyses of somatic mutations from human donors, with insights into cell lineage commitment ...
  102. [102]
    Somatic mutation rates scale with lifespan across mammals - Nature
    Apr 13, 2022 · Previous analyses in humans have shown that most somatic mutations in colorectal crypts accumulate neutrally, without clear evidence of ...
  103. [103]
    Advances in single-cell DNA sequencing enable insights into ...
    Apr 25, 2025 · DNA sequencing from bulk or clonal human tissues has shown that genetic mosaicism is common and contributes to both cancer and non-cancerous disorders.
  104. [104]
    Somatic genetic variation in healthy tissue and non-cancer diseases
    Oct 27, 2022 · In this review, we aim to update our knowledge of somatic variation detection and its relation to healthy tissue and non-cancer diseases.
  105. [105]
    Genome-wide mapping of somatic mutation rates uncovers drivers ...
    Jun 20, 2022 · Neutral (passenger) mutations that do not provide a proliferative advantage to a cell dominate the mutational landscape of tumors. Only a ...
  106. [106]
    Cancer gene mutation frequencies for the U.S. population - Nature
    Oct 13, 2021 · TP53 is the most commonly mutated gene (35%), and KMT2C, KMT2D, and ARID1A are among the ten most commonly mutated driver genes, highlighting ...
  107. [107]
    [PDF] The Scala Naturae
    Aristotle saw the scale as eternally fixed with no organism able to move to another level over time. Therefore, the scala naturae engendered a world view ...
  108. [108]
    The Great Chain of Being: Aristotle's Scala Naturae - Palaeos
    Aristotle divided animals into two types: those with blood, and those without blood (or at least without red blood), corresponding to our distinction between ...
  109. [109]
    For Linnaeus, classification followed from the new idea that species ...
    Jun 2, 2007 · An essentialist view of species required the assumption that species were fixed, not changing over time.
  110. [110]
    [PDF] Opinion on the evolution of the Linnaean animal species concept ...
    Jun 18, 2025 · The fixed species concept implied that the essential characters of a species do not change over time.
  111. [111]
    Lamarck, Evolution, and the Inheritance of Acquired Characters - PMC
    As we have seen in Lamarck's thought experiment with the infants' eyes and in the “Second Law” of his Philosophie Zoologique, he believed that for characters ...
  112. [112]
    Lamarckism - an overview | ScienceDirect Topics
    Some time before Darwin, J. B. Lamarck proposed that the inheritance of characters acquired during an organism's lifetime could accumulate to give adaptive ...
  113. [113]
    Blending Inheritance - an overview | ScienceDirect Topics
    Blending inheritance can be defined as a process in which offspring exhibit a smooth blend of characteristics inherited from their parents, resulting in ...
  114. [114]
    Principles and biological concepts of heredity before Mendel - PMC
    Oct 21, 2021 · It is very common to find an introduction about heredity in genetic textbooks covering Mendel without mentions of preceding breeding experiments ...
  115. [115]
    The Pre-Mendelian Era and Mendelism | Genetics
    In this article we will discuss about the pre-mendelian era birth of mendelism. Pre-Mendelian Era: Man's curiosity to know about transmission of hereditary ...
  116. [116]
    On the Origin of Species - Project Gutenberg
    Species once lost do not reappear. Groups of species follow the same general rules in their appearance and disappearance as do single species.
  117. [117]
    1900: Rediscovery of Mendel's Work
    Apr 22, 2013 · Three botanists - Hugo DeVries, Carl Correns and Erich von Tschermak - independently rediscovered Mendel's work in the same year.
  118. [118]
    Hugo de Vries and the rediscovery of Mendel's laws
    Aug 22, 2006 · Hugo de Vries claimed that he had discovered Mendel's laws before he found Mendel's paper. ... By 1900, both of these ratios had become 3:1.
  119. [119]
    [PDF] OU_ 162958 - Gwern
    The present book, with all the limitations of a first attempt, is at least an attempt to consider the theory of Natural Selection on its own merits. When the ...
  120. [120]
    [PDF] The Causes of Evolution - JBS Haldane
    This book is based on a series of lectures delivered in January 1931 at the Prifysgol Cymru, Aberystwyth, and entitled “A Re-examination of Darwinism.”
  121. [121]
    [PDF] The Method of Path Coefficients - Gwern.net
    The method of path coefficients was suggested a number of years ago (Wright 1918, more fully 1920, 1921), as a flexible means of relating the correlation ...
  122. [122]
    A Structure for Deoxyribose Nucleic Acid - Nature
    The determination in 1953 of the structure of deoxyribonucleic acid (DNA), with its two entwined helices and paired organic bases, was a tour de force in ...
  123. [123]
    International Consortium Completes Human Genome Project
    (Nature, April 24, 2003, online publication). Many of the challenges in the vision are aimed at utilizing genome research to combat disease and improve human ...
  124. [124]
    A global reference for human genetic variation | Nature
    Sep 30, 2015 · The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome ...
  125. [125]
    A Programmable Dual-RNA–Guided DNA Endonuclease ... - Science
    Jun 28, 2012 · Our study reveals a family of endonucleases that use dual-RNAs for site-specific DNA cleavage and highlights the potential to exploit the system for RNA- ...
  126. [126]
    Advances in single-cell DNA sequencing enable insights ... - PubMed
    Apr 25, 2025 · DNA sequencing from bulk or clonal human tissues has shown that genetic mosaicism is common and contributes to both cancer and non-cancerous disorders.