Single-nucleotide polymorphism

A single-nucleotide polymorphism (SNP), pronounced "snip," is a variation at a single site in the DNA sequence where one nucleotide differs between individuals or between paired chromosomes within an individual. These variations can occur in both coding and non-coding regions of the genome and represent the most common form of genetic variation, accounting for the majority of differences in DNA sequences among humans. SNPs arise from single base-pair substitutions, which may be transitions (purine to purine or pyrimidine to pyrimidine) or transversions (purine to pyrimidine or vice versa), and they are inherited in a Mendelian fashion.

SNPs are highly abundant in the human genome: estimates indicate over 10 million common SNPs (minor allele frequency greater than 1%), occurring approximately every 100–300 base pairs throughout the 3 billion base pairs of DNA. This abundance contributes to the genetic diversity that underlies individual differences in traits, susceptibility to diseases, and responses to environmental factors and medications. While most SNPs are neutral and do not alter protein function, those located in regulatory or coding regions can influence gene expression, protein structure, or splicing, potentially leading to phenotypic effects.

Due to their stability, abundance, and ease of genotyping, SNPs serve as powerful markers in genetic research, including linkage analysis. They are central to genome-wide association studies (GWAS), which have identified thousands of SNP-trait associations for complex diseases such as cancer and cardiovascular disorders, enabling insights into disease mechanisms and risk prediction. In pharmacogenomics, SNPs help tailor drug therapies by predicting individual variability in drug metabolism and efficacy, advancing personalized medicine. Ongoing large-scale sequencing efforts continue to catalog SNPs, enhancing their utility in evolutionary studies and ancestry tracing.

Fundamentals

Definition

A single-nucleotide polymorphism (SNP) is a substitution of a single nucleotide at a specific position in the DNA sequence, where two or more alternative alleles occur at appreciable frequencies within a population. Unlike rare mutations, SNPs are conventionally defined by a minor allele frequency (MAF) greater than 1%, which distinguishes them as common genetic variations that are inherited from parents and stably transmitted across generations. They represent the most prevalent form of sequence variation in the human genome, outnumbering other types of polymorphisms such as insertions or deletions.

In the human genome, SNPs occur approximately every 100 to 300 base pairs, with common variants (MAF >1%) numbering around 10 million based on early large-scale catalogs from projects such as the SNP Consortium. More recent efforts, such as the 1000 Genomes Project Phase 3 (2015), have cataloged approximately 15 million common SNPs (MAF ≥1%). These variations arise naturally through evolutionary processes and are present in both coding and non-coding regions, contributing to genetic diversity without typically causing disease unless in specific contexts.

The concept of SNPs emerged in the 1970s with initial observations of single-base differences in DNA sequences, but their systematic identification and cataloging gained momentum in the 1990s through advancements in sequencing technology during the Human Genome Project (1990–2003). Efforts like the International SNP Map Working Group in 2001 further accelerated discovery, enabling genome-wide studies of human variation.

Molecular Basis

Single-nucleotide polymorphisms (SNPs) primarily originate from point mutations during DNA replication, where errors introduced by DNA polymerase, such as base misincorporation, escape proofreading and repair mechanisms. Failures in DNA repair pathways, including mismatch repair and base excision repair, can also perpetuate these errors, allowing them to become heritable. Environmental mutagens, like ionizing radiation or certain chemicals, further contribute by damaging DNA bases and inducing substitutions that, if unrepaired, lead to SNPs. In populations, these variants persist and spread through neutral genetic drift, particularly when they confer no significant selective advantage or disadvantage.

SNPs are inherited following Mendelian principles. Most are biallelic, though rare multiallelic cases occur, and each diploid individual carries two alleles at a given locus, one inherited from each parent. For instance, a locus might feature adenine (A) or guanine (G) as alternative bases, resulting in homozygous (AA or GG) or heterozygous (AG) genotypes whose alleles segregate during meiosis. This biallelic nature gives predictable transmission patterns across generations, akin to other codominant markers.

At the molecular level, SNPs manifest as single base-pair substitutions, categorized as transitions or transversions. Transitions exchange one purine for the other (A ↔ G) or one pyrimidine for the other (C ↔ T), such as a C-to-T change; because the exchanged bases are chemically similar, transitions are the more frequent class. Transversions swap a purine for a pyrimidine or vice versa, like C-to-A, and occur less often due to greater structural differences. In humans, approximately two-thirds of SNPs are transitions.

SNPs are detected molecularly through techniques that identify base differences at specific loci, such as DNA sequencing, which directly determines the sequence to reveal variants. Alternatively, hybridization methods employ allele-specific probes that bind preferentially to matching sequences, allowing alleles to be distinguished by signal intensity or specificity. These approaches confirm the presence of a SNP at a locus without addressing population-level patterns.
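The transition/transversion distinction described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `classify_substitution` is hypothetical, and alleles are assumed to be given as single uppercase or lowercase DNA letters.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("alleles must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a substitution")
    # A transition stays within one chemical class (purine or pyrimidine).
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for ref, alt in [("C", "T"), ("A", "G"), ("C", "A")]:
        print(f"{ref}>{alt}: {classify_substitution(ref, alt)}")
```

Applied to the examples in the text, C-to-T is reported as a transition and C-to-A as a transversion.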

Classification

Types

Single-nucleotide polymorphisms (SNPs) are primarily classified based on their location within the genome and their potential to alter gene products. SNPs occurring in coding regions of genes, known as coding or exonic SNPs, directly affect the protein-coding sequence and are subdivided into synonymous and non-synonymous variants. Synonymous SNPs result in no change to the amino acid sequence of the protein due to the degeneracy of the genetic code, whereas non-synonymous SNPs lead to an amino acid substitution (missense) or introduce a premature stop codon (nonsense), potentially altering protein function.

In contrast, non-coding SNPs are located outside of protein-coding exons and include those in introns, intergenic regions between genes, and regulatory elements such as promoters and enhancers. These variants do not directly change the protein sequence but may influence gene expression, splicing, or other regulatory processes. While most SNPs are biallelic (two possible alleles at the site), rare multiallelic SNPs with more than two alleles occur; they are classified similarly but require special consideration in genetic analysis.

SNPs are further categorized by their prevalence in populations. Common SNPs are those with a minor allele frequency (MAF) greater than 1%, indicating they are widespread; rare SNPs have an MAF of 1% or less and have often arisen more recently. Importantly, SNPs specifically refer to substitutions involving a single nucleotide, distinguishing them from insertions or deletions (indels), which involve the addition or removal of one or more nucleotides and are considered separate classes of genetic variants.

Special cases include SNPs in mitochondrial DNA (mtDNA), a small, circular genome that is inherited maternally and subject to higher mutation rates than nuclear DNA; mtDNA SNPs are often used in ancestry studies. Additionally, certain structural variants, such as small copy number changes, can sometimes mimic SNPs in low-resolution sequencing data if not carefully annotated.
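As a hedged illustration of the frequency-based categories above, the sketch below computes a minor allele frequency from genotype counts at a biallelic site and applies the 1% cutoff. The function names and the example counts are hypothetical; real pipelines would also handle missing genotypes and multiallelic sites.

```python
def minor_allele_frequency(n_hom_ref: int, n_het: int, n_hom_alt: int) -> float:
    """MAF at a biallelic site from counts of the three diploid genotypes."""
    n_individuals = n_hom_ref + n_het + n_hom_alt
    if n_individuals == 0:
        raise ValueError("at least one genotyped individual is required")
    total_alleles = 2 * n_individuals  # each diploid individual carries two alleles
    alt_freq = (n_het + 2 * n_hom_alt) / total_alleles
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1.0 - alt_freq)


def frequency_class(maf: float, threshold: float = 0.01) -> str:
    """'common' if MAF exceeds the threshold (1% by default), otherwise 'rare'."""
    return "common" if maf > threshold else "rare"


if __name__ == "__main__":
    # Hypothetical sample: 810 ref/ref, 180 ref/alt, 10 alt/alt individuals.
    maf = minor_allele_frequency(810, 180, 10)
    print(f"MAF = {maf:.3f} ({frequency_class(maf)})")
```

With the hypothetical counts shown, the alternate allele appears on 200 of 2,000 chromosomes, so the site has an MAF of 0.10 and falls in the common class.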

Distribution and Frequency

Single-nucleotide polymorphisms (SNPs) are highly prevalent within the diploid human genome, where a typical individual harbors approximately 4.1–5.0 million single-nucleotide variants relative to the reference genome (GRCh38). This equates to a personal density of roughly one variant every 600–800 base pairs. Genome-wide, the density of common polymorphism across human populations is approximately one SNP every 300 base pairs, corresponding to over 10 million variant sites in total.

Across global human populations, the catalog of discovered single-nucleotide variants has expanded dramatically, with over 786 million identified in the Genome Aggregation Database (gnomAD) version 4.1 as of 2024, encompassing data from 807,162 individuals. These variants reflect the cumulative genetic diversity accumulated over human history.

SNP distribution and frequency vary significantly among populations, with African groups displaying the highest levels of nucleotide diversity, up to 50% greater than in non-African populations, consistent with the African origin of modern humans. This elevated diversity is accompanied by shorter linkage disequilibrium (LD) blocks: LD typically decays within 5-10 kilobases in African populations, compared to longer LD extents (20-50 kilobases) in European and Asian populations due to historical bottlenecks and migrations.

The observed frequencies of SNPs are shaped primarily by the human germline mutation rate, estimated at about 1.2 × 10^{-8} mutations per nucleotide per generation, which introduces new variants stochastically. Additionally, selective pressures, including purifying selection against deleterious alleles and positive selection favoring adaptive variants, further modulate allele frequencies and distribution patterns across populations.
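The density and mutation-rate figures above can be checked with simple arithmetic. The sketch below assumes a haploid genome length of 3 billion base pairs (the round figure used in this article, not an exact assembly size) and counts new mutations over both genome copies; the function names are illustrative.

```python
GENOME_SIZE_BP = 3.0e9  # assumption: round haploid genome length used in the article


def bp_per_variant(n_variants: float, genome_size_bp: float = GENOME_SIZE_BP) -> float:
    """Average spacing between variants, in base pairs."""
    return genome_size_bp / n_variants


def expected_de_novo_mutations(
    mutation_rate: float = 1.2e-8,  # per nucleotide per generation, as quoted above
    genome_size_bp: float = GENOME_SIZE_BP,
    ploidy: int = 2,  # one maternal and one paternal genome copy
) -> float:
    """Expected number of new single-nucleotide mutations per individual."""
    return mutation_rate * genome_size_bp * ploidy


if __name__ == "__main__":
    for n in (4.1e6, 5.0e6):
        print(f"{n:.1e} variants -> one every {bp_per_variant(n):.0f} bp")
    print(f"expected new mutations per generation: {expected_de_novo_mutations():.0f}")
```

Under these assumptions, 4.1–5.0 million variants correspond to one variant roughly every 600–730 base pairs, in line with the range quoted above, and the stated mutation rate implies on the order of 70 new single-nucleotide mutations per individual per generation.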

Functional Implications

Effects on Gene Function

Single-nucleotide polymorphisms (SNPs) can profoundly influence gene function by altering the DNA sequence in ways that affect protein sequence, gene regulation, or mRNA splicing. In coding regions, SNPs may result in synonymous changes that do not alter the amino acid sequence and are typically neutral, missense variants that substitute one amino acid for another, or nonsense mutations that introduce premature stop codons leading to truncated proteins. Missense SNPs, for instance, can disrupt protein structure and function; a well-known example is the Glu6Val substitution in the HBB gene, which causes sickle cell anemia by producing abnormal hemoglobin that polymerizes under low oxygen conditions. Nonsense SNPs, by contrast, often trigger nonsense-mediated mRNA decay or yield non-functional protein fragments, contributing to loss-of-function phenotypes in various genetic disorders.

In non-coding regions, SNPs exert effects by modifying regulatory elements such as promoters, enhancers, or splice sites, thereby influencing transcription levels or mRNA processing. SNPs in promoter or enhancer regions can alter transcription factor binding affinity, leading to changes in gene expression; for example, the C-13910T SNP in an enhancer upstream of the LCT gene enables persistent lactase production in adults, conferring lactase persistence in populations with dairy-based diets. SNPs at splice junctions or exonic splicing enhancers can disrupt pre-mRNA splicing, resulting in exon skipping, intron retention, or aberrant isoform production; a significant proportion of exonic disease-causing variants, estimated at 15–50% in recent studies, alter splicing patterns, often exacerbating pathogenic outcomes.

To assess the potential pathogenicity of SNPs, particularly missense variants, computational tools evaluate their likely impact on protein function based on evolutionary conservation, physicochemical properties, and structural predictions. SIFT (Sorting Intolerant From Tolerant) predicts deleterious effects by analyzing sequence conservation across homologous proteins, classifying variants as tolerated or damaging. Similarly, PolyPhen scores assess structural and functional consequences, aiding in distinguishing benign from harmful changes. These methods provide probabilistic insights but require experimental validation. Recent advances, including AI-driven structural predictions and high-throughput validation (as of 2025), have refined assessments of SNP impacts on splicing and protein function.

The vast majority of SNPs are neutral or quasi-neutral, exerting no significant effect on gene function and persisting due to genetic drift rather than selection. Only a small fraction are deleterious and subject to negative selection that purges harmful variants, while rare beneficial SNPs may undergo positive selection, driving adaptive evolution in specific contexts. Estimates suggest that up to 70% of low-frequency missense SNPs are mildly deleterious, with stronger effects in coding regions, where about 40% of sites face purifying selection. This distribution underscores why most genetic variation is functionally silent, with deleterious SNPs disproportionately contributing to disease susceptibility.
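The synonymous, missense, and nonsense categories described in this section follow directly from the standard genetic code. The sketch below is a minimal illustration under simplifying assumptions: it takes two in-frame DNA codons on the coding strand that differ at exactly one position, uses the standard nuclear code only, and ignores splicing and regulatory effects. The function name `coding_consequence` is hypothetical.

```python
from itertools import product

# Standard genetic code in TCAG order (DNA codons on the coding strand).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def coding_consequence(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-base change within a codon by its effect on the protein."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if ref_codon not in CODON_TABLE or alt_codon not in CODON_TABLE:
        raise ValueError("codons must be three DNA letters (A, C, G, T)")
    n_diff = sum(a != b for a, b in zip(ref_codon, alt_codon))
    if n_diff != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"  # creates a premature stop codon
    if ref_aa == "*":
        return "stop-lost"  # removes an existing stop codon
    return "missense"


if __name__ == "__main__":
    # GAG>GTG is the Glu6Val change in HBB discussed above.
    for ref, alt in [("GAG", "GTG"), ("GAA", "GAG"), ("TAC", "TAA")]:
        print(f"{ref}>{alt}: {coding_consequence(ref, alt)}")
```

The HBB example GAG>GTG (glutamate to valine) is reported as missense, while GAA>GAG leaves glutamate unchanged and is synonymous.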

Role in Phenotypic Variation and Evolution

Single-nucleotide polymorphisms (SNPs) mark quantitative trait loci (QTLs) that contribute to phenotypic variation in traits such as height and disease susceptibility. For instance, genome-wide association studies (GWAS) have identified thousands of SNPs associated with height, collectively accounting for approximately 40-50% of the phenotypic variance in this trait. Similarly, SNPs in risk loci explain a significant portion of heritability for complex diseases, with multiple independent effects at these loci enhancing predictive models for disease outcomes. These variations arise from SNPs altering gene expression or protein function, leading to subtle differences in traits across populations.

Many phenotypic traits exhibit polygenic inheritance, where the combined effects of numerous SNPs with small individual impacts explain a substantial proportion of trait variance, reaching up to ~40–50% for morphological traits like height, though often lower (10–30%) for behavioral traits. This polygenicity underscores how SNPs interact additively and non-additively to shape complex phenotypes, as demonstrated in reviews of GWAS data where SNP-based estimates highlight the distributed genetic architecture of complex traits. Such effects emphasize the challenge of dissecting causal variants amid widespread polygenic influences.

In evolutionary contexts, SNPs provide raw material for adaptation through natural selection, with notable examples in human skin pigmentation. Variants in genes like SLC24A5 and SLC45A2 have undergone positive selection, contributing to lighter skin tones in populations that migrated to low-UV environments and facilitating vitamin D synthesis. These SNPs illustrate how allele frequencies shift under selective pressures, in contrast to neutral drift at non-adaptive sites, and contribute to population-level divergence over millennia.

SNPs also play a vital role in conservation genetics, enabling assessments of genetic diversity and inbreeding risks. Genome-wide SNP panels have been used to evaluate genetic diversity in threatened species, informing breeding programs that aim to maintain adaptive potential and limit the effects of population bottlenecks. By quantifying SNP-based heterozygosity and population structure, these markers support management strategies that preserve evolutionary resilience amid habitat loss.

Applications in Research

Association Studies

Association studies in human genetics leverage single-nucleotide polymorphisms (SNPs) to identify links between genetic variants and specific traits or diseases by examining statistical associations in population samples. These approaches have evolved from targeted hypothesis-driven investigations to comprehensive genome-scale analyses, enabling the discovery of SNPs that contribute to complex phenotypes.

Candidate gene studies represent an early, hypothesis-driven method in which researchers select SNPs within genes suspected to influence a trait based on prior biological knowledge, such as known pathways or animal models. This targeted approach allows for deeper functional interpretation of associations, but it covers little of the genome and depends on accurate prior hypotheses, which limits what it can discover. Because few tests are performed, it typically requires smaller sample sizes than genome-wide scans.

In contrast, genome-wide association studies (GWAS) systematically scan millions of SNPs across the genome to detect associations without prior hypotheses, deriving power from large cohorts, now often exceeding 100,000 individuals, to achieve genome-wide significance. Originating in the mid-2000s with the advent of high-density SNP arrays, GWAS have identified thousands of trait-associated loci by comparing allele frequencies between cases and controls or across quantitative trait distributions.

Key challenges in these studies include multiple testing: testing millions of SNPs inflates the risk of false positives, necessitating stringent corrections like the Bonferroni method to maintain a genome-wide significance threshold of approximately 5 × 10^{-8}. Population stratification, arising from differences in SNP frequencies across ancestral subpopulations, can also produce spurious associations if not addressed through principal component analysis or mixed models.

Post-2010 advances have integrated GWAS with next-generation sequencing to incorporate rare variants (MAF <1%), which were previously undetectable by SNP arrays, enhancing resolution for low-frequency effects through methods like burden tests and SKAT. This sequencing augmentation, enabled by falling costs, has expanded association detection to non-coding and structural variants while maintaining a focus on common SNPs for polygenic risk modeling.
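To make the multiple-testing threshold above concrete, the sketch below derives it with a Bonferroni correction and runs a basic allelic chi-square test on a 2×2 table of allele counts. The figure of one million independent tests is a conventional assumption rather than something stated in this article, the counts in the example are invented, and a real analysis would also adjust for population stratification and other covariates.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def allelic_chi_square(case_alt: int, case_ref: int, ctrl_alt: int, ctrl_ref: int):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value). No continuity correction is applied.
    """
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    threshold = bonferroni_threshold(0.05, 1_000_000)  # assumed 1e6 independent tests
    chi2, p = allelic_chi_square(30, 70, 10, 90)  # invented allele counts
    print(f"threshold = {threshold:.1e}, chi2 = {chi2:.2f}, p = {p:.2e}")
    print("genome-wide significant" if p < threshold else "not genome-wide significant")
```

In this toy table the association is nominally significant but does not come close to the 5 × 10^{-8} threshold, which illustrates why GWAS depend on very large cohorts.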

Homozygosity and Linkage Mapping

Homozygosity mapping leverages single-nucleotide polymorphisms (SNPs) to identify regions of the genome that are identical by descent, particularly in consanguineous families where affected individuals are more likely to inherit two copies of the same ancestral allele at a recessive disease locus. This approach detects long runs of homozygosity (ROH), which are extended stretches of consecutive homozygous SNPs spanning several megabases, indicating shared ancestry and potential localization of disease-causing variants. Originally proposed for mapping recessive traits using restriction fragment length polymorphisms, the method has been enhanced by high-density SNP genotyping arrays that enable genome-wide scans with thousands to millions of markers, increasing resolution and power in inbred pedigrees.

In linkage mapping, SNPs serve as dense markers to exploit linkage disequilibrium (LD), the non-random association of alleles at nearby loci due to reduced recombination within haplotype blocks, which are contiguous genomic segments inherited together. LD decay is typically measured as the distance over which LD drops to half its initial value, and it varies across populations; for instance, LD decays more rapidly (within ~5-10 kb) in African ancestry groups owing to larger effective population sizes and older demographic histories, compared with slower decay over tens of kilobases in European or Asian populations. This variation influences the utility of SNPs in constructing haplotypes for fine-mapping disease loci in family-based studies, where co-segregation of markers with the trait is assessed.

Applications of SNP-based homozygosity and linkage mapping are particularly valuable for rare autosomal recessive diseases in isolated or consanguineous communities, such as founder populations in the Andes or Finland, where elevated inbreeding increases ROH frequency and simplifies variant prioritization. By intersecting ROH across affected individuals, causal variants can be pinpointed within narrowed intervals, facilitating targeted sequencing; for example, in an Andean isolate, imputation-enhanced IBD analysis using SNP data identified rare disease alleles in regions of extended homozygosity shared by patients. Unlike population-based association studies that scan for common variants, this family-oriented strategy excels in detecting low-frequency recessive mutations through inheritance patterns.

Tools for these analyses include parametric and non-parametric linkage methods adapted for SNP array data. Parametric approaches model the inheritance pattern explicitly, incorporating parameters like penetrance and allele frequency to compute likelihood ratios (LOD scores) for linkage, assuming a known genetic model such as autosomal recessive with full penetrance. Non-parametric methods, in contrast, are model-free and rely on observed allele sharing among relatives, such as excess identical-by-descent segments in affected sib pairs, making them robust to misspecified models and suitable for complex traits. Dedicated software processes SNP genotypes to perform multipoint analyses, integrating ROH detection with linkage statistics for efficient locus mapping in pedigrees.
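The idea of a run of homozygosity can be illustrated with a simple scan over ordered genotypes on one chromosome. This is a deliberately naive sketch: the function name, the genotype encoding (0, 1, or 2 copies of the alternate allele), and the default thresholds are assumptions chosen for illustration, and production tools additionally tolerate occasional heterozygous or missing calls inside a run.

```python
def find_runs_of_homozygosity(positions, genotypes, min_snps=50, min_length_bp=1_000_000):
    """Return (start_bp, end_bp, n_snps) for each run of consecutive homozygous SNPs.

    positions: sorted base-pair coordinates on one chromosome.
    genotypes: alternate-allele counts per SNP (0 or 2 = homozygous, 1 = heterozygous).
    Any value other than 0 or 2 (including None for a missing call) ends the current run.
    """
    if len(positions) != len(genotypes):
        raise ValueError("positions and genotypes must have the same length")
    runs = []
    run_start = None
    # A trailing heterozygous sentinel closes a run that extends to the last SNP.
    for i, genotype in enumerate(list(genotypes) + [1]):
        if genotype in (0, 2):
            if run_start is None:
                run_start = i
        elif run_start is not None:
            n_snps = i - run_start
            length_bp = positions[i - 1] - positions[run_start]
            if n_snps >= min_snps and length_bp >= min_length_bp:
                runs.append((positions[run_start], positions[i - 1], n_snps))
            run_start = None
    return runs


if __name__ == "__main__":
    # Toy data with small thresholds so that a run is visible.
    pos = [100, 200, 300, 400, 500, 600, 700, 800]
    gts = [0, 2, 0, 1, 2, 2, 1, 0]
    print(find_runs_of_homozygosity(pos, gts, min_snps=3, min_length_bp=150))
```

Intersecting the runs returned for several affected relatives would then narrow the candidate interval, as described above.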

Applications in Medicine and Beyond

Pharmacogenomics

Single-nucleotide polymorphisms (SNPs) are central to pharmacogenomics, enabling the prediction of individual responses to medications by identifying genetic variants that influence drug metabolism, efficacy, and toxicity. These variations, particularly in genes involved in pharmacokinetics and pharmacodynamics, allow for personalized dosing and selection of therapies, reducing the risk of adverse drug reactions (ADRs) and improving therapeutic outcomes. For instance, SNPs can alter the activity of cytochrome P450 enzymes, leading to differences in drug clearance rates among patients.

A prominent example is the role of variants in the CYP2D6 gene, which encodes a key enzyme in the metabolism of approximately 25% of commonly prescribed drugs, including opioids like codeine. Individuals carrying CYP2D6 variants that define poor metabolizer phenotypes (e.g., the *4 and *5 alleles) exhibit reduced conversion of codeine to its active metabolite morphine, resulting in inadequate pain relief, while ultrarapid metabolizers (e.g., carriers of gene duplications) face heightened risks of opioid toxicity, including respiratory depression. Clinical guidelines from the Clinical Pharmacogenetics Implementation Consortium (CPIC) recommend avoiding codeine in poor and ultrarapid metabolizers based on CYP2D6 genotype, with evidence from prospective studies showing decreased ADR incidence when implemented.

Similarly, SNPs in the VKORC1 gene, such as the -1639G>A variant (rs9923231), significantly impact warfarin dosing by modulating vitamin K epoxide reductase activity; patients with the AA genotype require approximately 30% lower doses to achieve therapeutic anticoagulation, preventing bleeding or thrombotic events, as validated in large cohort studies and incorporated into FDA-approved dosing algorithms.

The U.S. Food and Drug Administration (FDA) has integrated pharmacogenomic insights into drug labeling for approximately 300 therapeutic products as of 2024, with SNPs serving as biomarkers for dosing adjustments or contraindications across multiple therapeutic areas. Genotyping for these SNPs is increasingly used in clinical practice to predict ADRs; for example, preemptive testing panels that include HLA-B variants have demonstrated up to 30% reductions in severe reactions in implementation trials. Cost-benefit analyses further support this approach, indicating that pharmacogenomic-guided prescribing in clinical trials and routine care yields net savings by averting hospitalizations.

Looking ahead, the integration of SNPs into polygenic risk scores (PRS) promises to refine therapy selection by combining single-variant effects with cumulative genetic influences on drug response. Pharmacogenomic PRS models, which aggregate SNPs across multiple loci, have shown improved predictive accuracy for outcomes such as treatment efficacy or remission rates in validation studies, potentially enabling broader clinical adoption through electronic health record-linked testing. This evolution could expand beyond monogenic traits, though challenges in PRS validation across diverse populations remain.
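A polygenic score of the kind mentioned above is, in its simplest additive form, a weighted sum of allele dosages. The sketch below shows only that arithmetic; the SNP labels and weights are invented placeholders, not published effect sizes, and real pharmacogenomic models involve variant selection, ancestry-aware calibration, and clinical validation that are out of scope here.

```python
def polygenic_score(dosages: dict, weights: dict) -> float:
    """Additive score: sum over SNPs of (effect weight x effect-allele dosage).

    dosages: SNP label -> number of effect alleles carried (0 to 2; imputed
             dosages may be fractional).
    weights: SNP label -> per-allele effect size (e.g. a regression coefficient).
    SNPs with a weight but no genotype are skipped.
    """
    score = 0.0
    for snp, beta in weights.items():
        if snp not in dosages:
            continue
        dosage = dosages[snp]
        if not 0 <= dosage <= 2:
            raise ValueError(f"dosage for {snp} must be between 0 and 2")
        score += beta * dosage
    return score


if __name__ == "__main__":
    # Hypothetical labels and weights, for illustration only.
    weights = {"snp_a": 0.10, "snp_b": -0.05, "snp_c": 0.30}
    patient = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
    print(f"score = {polygenic_score(patient, weights):.2f}")
```

Skipping ungenotyped SNPs keeps the sketch short; in practice missing genotypes are usually imputed, since silently dropping them biases scores between individuals.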

Forensics and Population Genetics

Single-nucleotide polymorphisms (SNPs) have become integral to forensic DNA profiling, particularly through specialized panels designed for human identification. Panels comprising 50 or more SNPs, such as the 52-SNPforID panel, enable the generation of genetic profiles from challenging samples, offering a robust alternative to traditional short tandem repeat (STR) analysis. These SNP-based methods excel in cases involving degraded DNA, because SNPs can be typed from shorter amplicons (typically 50-100 base pairs) than STRs (which require 200-400 base pairs), allowing amplification from fragmented evidence like old bones or fire-damaged remains. For instance, in a study of forensic casework samples, SNP genotyping yielded full profiles in 36 cases whereas STR analysis succeeded in only 17, highlighting the superior sensitivity of SNPs for low-quantity or compromised DNA. Additionally, SNPs provide high discriminatory power when combined in multiplex assays, with match probabilities rivaling or exceeding those of STRs in diverse populations.

In ancestry inference, SNPs serve as ancestry-informative markers (AIMs), which are variants exhibiting substantial allele frequency differences across global populations, facilitating the estimation of biogeographical origins and admixture proportions. Admixture mapping leverages panels of AIMs, such as sets of 128 or fewer markers, to trace ancestral contributions in admixed individuals, enabling the reconstruction of genetic ancestry with high accuracy (over 90%) for continental-level assignments. For population differentiation, the fixation index (FST) quantifies SNP-based divergence, with values ranging from 0 (no differentiation) to 1 (complete differentiation); for example, inter-continental FST averages around 0.15 for human SNPs, reflecting moderate global structure. These AIMs and FST metrics draw on the uneven distribution of SNPs across populations, such as higher frequencies of certain variants in some continental groups than in others, to infer recent admixture events.

Within population genetics, SNPs illuminate historical human migration patterns, notably supporting the Out-of-Africa model through analyses of diversity gradients. Genome-wide SNP data reveal a serial founder effect during migrations from Africa around 50,000-70,000 years ago, with genetic diversity decreasing with distance from the origin, as evidenced by clinal patterns in over 1 million SNPs across global cohorts. This model posits a bottleneck in the migrating population, reducing effective size to approximately 1,000-10,000 individuals, which SNP-based coalescent simulations support through elevated FST values and reduced heterozygosity outside Africa.

Ethical concerns in SNP applications, particularly privacy risks in direct-to-consumer (DTC) genetic testing, arise from the commercial handling of sensitive ancestry and identification data. DTC services often genotype thousands of SNPs for ancestry reports, but inadequate data protection can lead to unauthorized sharing or breaches, as genetic profiles are inherently identifiable and immutable. For example, the 2025 bankruptcy of 23andMe raised significant concerns about the security and potential misuse of millions of users' genetic data, including risks of sale to third parties or access without consent. Consumers may also unknowingly consent to data use in research or databases; some platforms have partnered with third parties without explicit user notification. Regulatory gaps exacerbate these risks, underscoring the need for transparent policies on consent, data ownership, and re-identification prevention in SNP-driven DTC testing.
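The fixation index discussed in this section can be computed for a single biallelic SNP from subpopulation allele frequencies using Wright's definition, FST = (HT − HS) / HT, where HT is the expected heterozygosity of the pooled population and HS the mean expected heterozygosity within subpopulations. The sketch below assumes equally weighted subpopulations and known allele frequencies; estimators used on real data additionally correct for sample size, and the function names are illustrative.

```python
def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1 - p) at a biallelic site under random mating."""
    return 2.0 * p * (1.0 - p)


def fst_biallelic(subpop_freqs) -> float:
    """Wright's FST for one biallelic SNP, with equally weighted subpopulations."""
    freqs = list(subpop_freqs)
    if not freqs:
        raise ValueError("at least one subpopulation frequency is required")
    if any(not 0.0 <= p <= 1.0 for p in freqs):
        raise ValueError("allele frequencies must lie between 0 and 1")
    mean_p = sum(freqs) / len(freqs)
    h_total = expected_heterozygosity(mean_p)
    h_within = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    if h_total == 0.0:
        return 0.0  # the site is monomorphic overall, so there is nothing to partition
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    for freqs in ([0.5, 0.5], [0.2, 0.8], [0.0, 1.0]):
        print(freqs, round(fst_biallelic(freqs), 3))
```

Identical frequencies give an FST of 0, fixed differences give 1, and intermediate divergence such as 0.2 versus 0.8 gives 0.36, matching the interpretation of the 0-to-1 scale above.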

Examples and Case Studies

Notable SNPs in Humans

Single-nucleotide polymorphisms (SNPs) have been instrumental in elucidating genetic contributions to diseases and traits, with several standing out due to their large effects, population-specific frequencies, or roles in genome-wide association studies (GWAS).

One prominent disease-associated SNP is rs334 in the HBB gene, which encodes the β-globin subunit of hemoglobin. This SNP, a GAG to GTG substitution (Glu6Val), causes sickle cell anemia in homozygotes by promoting abnormal hemoglobin polymerization under low-oxygen conditions, leading to red blood cell sickling and vaso-occlusive crises. The heterozygous state confers malaria resistance, explaining its persistence in malaria-endemic regions.

Another key example is the pair of SNPs rs429358 and rs7412 in the APOE gene, which define the three major isoforms (ε2, ε3, ε4) of apolipoprotein E, a lipid transport protein. The ε4 allele, defined by the rs429358 C allele together with the rs7412 C allele, increases Alzheimer's disease risk by 3-15-fold depending on copy number, likely through impaired amyloid-β clearance. Conversely, the ε2 allele (rs429358 T and rs7412 T) is protective, reducing risk by up to 40%, while the most common allele, ε3, carries rs429358 T and rs7412 C. These variants account for an estimated 15-25% of Alzheimer's disease in populations of European descent.

For non-disease traits, rs4988235 in the MCM6 gene, located upstream of the LCT gene encoding lactase, exemplifies recent adaptation in humans. The T allele enables lactase persistence into adulthood, allowing dairy digestion in populations with a history of dairying, such as Northern Europeans, where its frequency exceeds 70%. This allele arose around 7,500 years ago and spread via positive selection, in contrast to the ancestral C allele, which is associated with declining lactase production after weaning.

Skin pigmentation variation is strongly influenced by rs1426654 in SLC24A5, a gene involved in melanosome function. The derived A allele, prevalent in Europeans (>98% frequency), reduces melanin production through an alanine-to-threonine substitution (Ala111Thr), contributing 25-38% of the pigmentation difference between Europeans and Africans. This SNP likely spread via selection for lighter skin in low-UV environments to enhance vitamin D synthesis.

In polygenic contexts, GWAS have identified hundreds of SNPs contributing to traits like body mass index (BMI). Notable examples include rs9939609 in FTO, where the risk allele increases BMI by 0.4 kg/m² on average and obesity risk by 20-30%, possibly through hypothalamic regulation of appetite, and rs17782313 near MC4R, which elevates BMI similarly via disrupted melanocortin signaling. These SNPs, among over 900 BMI-associated variants, explain about 20% of trait variance collectively.

Recent studies in the 2020s have highlighted SNPs modulating infectious disease susceptibility, such as rs12329760 in TMPRSS2, a protease that facilitates SARS-CoV-2 entry into cells. The minor T allele (MAF ~0.36 in East Asians), more frequent in East Asians than in other populations, has been reported to reduce TMPRSS2 expression and SARS-CoV-2 infectivity, conferring protection against infection and moderate symptoms, as evidenced by lower infection rates in carriers. This variant underscores how common SNPs can influence pandemic dynamics.
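As an illustration of how two SNPs jointly define the APOE isoforms discussed earlier in this section, the sketch below infers an APOE genotype from unphased calls at rs429358 and rs7412. It uses the standard haplotype definitions (ε2 = T/T, ε3 = T/C, ε4 = C/C at rs429358/rs7412) and makes two assumptions that should be checked against real data: alleles are reported on the forward strand, and the very rare ε1 haplotype (C at rs429358 with T at rs7412) is absent, so double heterozygotes are called ε2/ε4. The function name and the ASCII labels `e2`, `e3`, `e4` are illustrative.

```python
def apoe_genotype(rs429358: str, rs7412: str) -> str:
    """Infer the APOE genotype (e.g. 'e3/e4') from two unphased diploid SNP calls.

    rs429358, rs7412: two-letter genotypes such as 'TT', 'CT' or 'CC'.
    """
    g1, g2 = rs429358.upper(), rs7412.upper()
    for name, genotype in (("rs429358", g1), ("rs7412", g2)):
        if len(genotype) != 2 or any(base not in "CT" for base in genotype):
            raise ValueError(f"{name} must be a two-letter genotype of C and T")
    n_e4 = g1.count("C")  # each C at rs429358 marks one e4 allele
    n_e2 = g2.count("T")  # each T at rs7412 marks one e2 allele
    n_e3 = 2 - n_e4 - n_e2  # the remaining alleles are e3
    if n_e3 < 0:
        raise ValueError("combination implies the rare e1 haplotype; not handled here")
    alleles = ["e2"] * n_e2 + ["e3"] * n_e3 + ["e4"] * n_e4
    return "/".join(alleles)


if __name__ == "__main__":
    for calls in [("TT", "CC"), ("CT", "CC"), ("TT", "CT"), ("CT", "CT")]:
        print(calls, "->", apoe_genotype(*calls))
```

Counting marker alleles avoids the need for phased haplotypes, at the cost of the ε1 assumption noted above.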

SNPs in Non-Human Organisms

Single-nucleotide polymorphisms (SNPs) play a crucial role in non-human organisms, influencing traits relevant to agriculture, veterinary medicine, and microbial evolution. Across species, SNP density varies significantly; for instance, in Drosophila melanogaster, nucleotide diversity is approximately ten-fold higher than in humans, with an average of about one SNP every 167 base pairs. In some plant species, SNP densities range from 6 to 22 per kilobase, enabling detailed genomic studies for breeding programs. Microbial species can show even higher variation overall, though SNP densities within strains are typically lower, around 0.005% to 0.1%, which facilitates tracking of evolutionary changes.

In agriculture, SNPs are extensively used for crop improvement through marker-assisted selection (MAS). In maize (Zea mays), genome-wide association studies have identified SNPs linked to quantitative trait loci (QTLs) for yield components, such as the QTL qERN2a, which is associated with ear row number and improves grain yield potential. These markers allow breeders to select favorable alleles early in development, accelerating the creation of high-yielding varieties without extensive field testing. Similarly, SNPs associated with stover yield support dual-purpose maize breeding for grain and stover.

In animals, SNPs contribute to understanding domestication and disease susceptibility. During dog (Canis familiaris) domestication, artificial selection fixed SNPs in genes like ASIP, which regulates pigment-type switching, producing diverse coat color patterns such as black-and-tan pigmentation that trace back to ancient modular promoters. In veterinary applications, SNPs aid in identifying disease associations; for example, genome-wide association studies in retrievers have linked SNPs on chromosomes 1 and 11 to disease risk, informing breeding strategies to reduce incidence. Such associations extend to other traits, like growth and fatness in pigs, where missense SNPs in metabolic genes correlate with meat quality.

In microbes, SNPs enable tracking of evolution, particularly antibiotic resistance. In bacteria such as Escherichia coli, SNPs in genes such as gyrA and in genes encoding penicillin-binding proteins (pbp1A, penA) confer resistance to quinolones and beta-lactams, respectively, with evolutionary paths revealed through whole-genome sequencing of resistant isolates. For Pseudomonas aeruginosa, mixed-strain populations accelerate resistance via SNPs in regulatory genes, allowing rapid adaptation in clinical settings. These SNP-based analyses help monitor the spread of resistance.

Resources and Analysis

Databases and Nomenclature

Single-nucleotide polymorphisms (SNPs) are identified and cataloged using standardized nomenclature systems to ensure consistency across research and clinical applications. The primary identifier for SNPs in many databases is the Reference SNP (rs) ID, assigned by the Database of Single Nucleotide Polymorphisms (dbSNP) at the National Center for Biotechnology Information (NCBI), which clusters submissions referring to the same variant locus under a single identifier. For precise description of variant changes, the Human Genome Variation Society (HGVS) nomenclature is widely adopted, providing genomic (g.), coding DNA (c.), and protein (p.) descriptors; for example, a SNP might be denoted as NM_000546.6:c.88C>T to indicate a cytosine-to-thymine substitution at position 88 in the TP53 transcript. This system is authorized and maintained by the Human Genome Organisation (HUGO) through its Variant Nomenclature Committee, with guidelines emphasizing unambiguous, position-based descriptions relative to reference sequences such as those of the GRCh38 assembly.
Major databases serve as central repositories for SNP data, facilitating global access and integration. dbSNP, hosted by NCBI, remains the foundational resource, aggregating submissions from thousands of contributors worldwide; its latest release, Build 157 in March 2025, contains approximately 1.2 billion RefSNP cluster IDs, reflecting ongoing curation of human and other organism variants. Ensembl, originally a joint project of the European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI) and the Wellcome Trust Sanger Institute, integrates SNP data by importing dbSNP rsIDs and providing browser-based visualization, with support for over 4800 eukaryotic genomes in its 2025 release. The 1000 Genomes Project, now maintained by the International Genome Sample Resource (IGSR), contributes high-quality SNP catalogs from diverse populations, including data from 2,504 individuals across 26 populations; its variants are routinely submitted to dbSNP and are accessible via Ensembl for cross-referencing.
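To make the structure of an HGVS substitution descriptor concrete, here is a minimal Python sketch that splits a string such as NM_000546.6:c.88C>T into its parts and reports whether the change is a transition or a transversion. It is an illustration only: the regular expression and the names `SimpleSubstitution`, `parse_substitution`, and `is_transition` are assumptions made for this example, and the pattern deliberately covers only simple substitutions at a plain positive position. Intronic offsets, UTR positions, protein-level descriptors, and other HGVS forms are not handled.

```python
import re
from typing import NamedTuple, Optional


class SimpleSubstitution(NamedTuple):
    reference: str        # reference sequence accession with version, e.g. NM_000546.6
    coordinate_type: str  # "g" (genomic), "c" (coding DNA), "n" (non-coding), "m" (mitochondrial)
    position: int
    ref_base: str
    alt_base: str


# Hypothetical pattern for this sketch: <accession>:<type>.<position><ref>><alt>
_SUBSTITUTION = re.compile(
    r"^(?P<reference>[A-Za-z0-9_.]+):(?P<kind>[gcnm])\."
    r"(?P<position>\d+)(?P<refbase>[ACGT])>(?P<altbase>[ACGT])$"
)

PURINES = {"A", "G"}


def parse_substitution(description: str) -> Optional[SimpleSubstitution]:
    """Return the parsed parts, or None if the string is not a simple substitution."""
    match = _SUBSTITUTION.match(description.strip())
    if match is None or match["refbase"] == match["altbase"]:
        return None
    return SimpleSubstitution(
        match["reference"], match["kind"], int(match["position"]),
        match["refbase"], match["altbase"],
    )


def is_transition(sub: SimpleSubstitution) -> bool:
    """Transitions stay within purines (A, G) or within pyrimidines (C, T)."""
    return (sub.ref_base in PURINES) == (sub.alt_base in PURINES)


if __name__ == "__main__":
    parsed = parse_substitution("NM_000546.6:c.88C>T")
    print(parsed)
    print("transition" if parsed and is_transition(parsed) else "transversion")
```

For real work, a maintained HGVS parser and validator is a safer choice than a hand-written pattern, since the full nomenclature has many more cases than this sketch recognizes.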
These databases store comprehensive SNP information beyond mere identification, including allele frequencies derived from population studies, functional annotations such as predicted impacts on genes or proteins (e.g., missense, synonymous), and clinical significance flags where applicable. For instance, dbSNP incorporates frequency data from sources like the 1000 Genomes Project and the Genome Aggregation Database (gnomAD), enabling queries on minor allele frequencies across ancestries. Annotations often include mappings to genomic features, evolutionary conservation scores, and links to associated phenotypes from curated submissions.
Submissions to dbSNP follow a structured process: researchers obtain a submission handle from NCBI, prepare data in formats like tab-delimited text or VCF files detailing variant positions, alleles, and supporting evidence, then email or upload the files to snp-sub@ncbi.nlm.nih.gov for processing and assignment of rsIDs, with resubmissions tracked via reports. Ensembl and IGSR accept similar formats but focus on integration rather than primary submission, pulling from dbSNP and public releases.
Maintaining consistency across these resources presents ongoing challenges, particularly in harmonizing identifiers and data representations amid frequent updates. Versioning issues arise as reference genomes evolve (e.g., from GRCh37 to GRCh38), requiring SNP positions to be remapped and clusters to be split or merged when submissions conflict or new evidence emerges, which can lead to discrepancies in frequencies or annotations between dbSNP, Ensembl, and other archives. Efforts to address these issues include cross-database linking via rsIDs and shared standards such as the Variant Call Format (VCF), but inconsistencies persist due to differing curation priorities and submission quality, necessitating regular synchronization by consortia like the Global Alliance for Genomics and Health.
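Since VCF is the common exchange format mentioned above, a short sketch of reading one VCF data line may help. The Python below parses the eight fixed tab-separated columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) and checks whether the record is a single-base substitution. The class and function names are made up for this example, and the records in the demo (including the identifier rs123 and the AF value) are placeholders, not real database entries. Header lines, genotype columns, and INFO type handling are left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class VcfRecord:
    chrom: str
    pos: int              # 1-based position on the reference sequence
    ids: List[str]        # e.g. rsIDs; empty when the ID column is "."
    ref: str
    alts: List[str]
    info: Dict[str, str]  # raw INFO key/value pairs; flags map to ""

    def is_snp(self) -> bool:
        """True when REF and every ALT allele are single A/C/G/T bases."""
        single_ref = len(self.ref) == 1 and self.ref in "ACGT"
        single_alts = all(len(a) == 1 and a in "ACGT" for a in self.alts)
        return single_ref and bool(self.alts) and single_alts


def parse_vcf_line(line: str) -> VcfRecord:
    """Parse the eight fixed columns of one VCF data line (not a header line)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a VCF data line needs at least 8 tab-separated columns")
    chrom, pos, vid, ref, alt, _qual, _filter, info = fields[:8]
    info_map: Dict[str, str] = {}
    if info != ".":
        for item in info.split(";"):
            key, _, value = item.partition("=")
            info_map[key] = value
    return VcfRecord(
        chrom=chrom,
        pos=int(pos),
        ids=[] if vid == "." else vid.split(";"),
        ref=ref.upper(),
        alts=[] if alt == "." else [a.upper() for a in alt.split(",")],
        info=info_map,
    )


if __name__ == "__main__":
    # Placeholder records for illustration only.
    snp = parse_vcf_line("1\t1000\trs123\tC\tT\t.\tPASS\tAF=0.25")
    deletion = parse_vcf_line("1\t2000\t.\tCA\tC\t.\tPASS\t.")
    print(snp.ids, snp.is_snp(), snp.info.get("AF"))
    print(deletion.ids, deletion.is_snp())
```

Note that the same rsID can sit at different coordinates in GRCh37 and GRCh38 files, so the assembly named in the VCF header matters when comparing records across sources.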

Methods for Detection and Prediction

Single-nucleotide polymorphisms (SNPs) are detected through a variety of high-throughput methods that identify variations at the single-base level with high accuracy. SNP arrays, such as those developed by Illumina and Affymetrix, genotype hundreds of thousands to millions of predefined SNPs per sample simultaneously by hybridizing fragmented DNA to probes on a microarray chip, achieving call rates above 99% in population-scale studies. These arrays are cost-effective for targeted genotyping in large cohorts but are limited to known SNP positions and may miss rare variants. Next-generation sequencing (NGS), including platforms like Illumina's short-read sequencers, provides a more comprehensive approach by sequencing entire genomes or targeted regions, allowing de novo discovery of SNPs through alignment to reference genomes and variant calling algorithms that model sequencing errors, with typical SNP detection accuracies above 99% after quality filtering.
Predicting the functional effects of SNPs involves computational tools that annotate variants based on their genomic context and potential impacts on gene function, protein structure, or regulation. The Ensembl Variant Effect Predictor (VEP) is a widely used tool that classifies SNPs according to their overlap with genes, transcripts, and regulatory elements, predicting consequences such as missense changes, splice site disruptions, or synonymous changes, and integrating data from multiple databases for prioritization in clinical and research settings.
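The core idea behind coding-consequence labels such as synonymous and missense can be shown in a few lines. The Python sketch below translates the affected codon before and after a substitution using the standard genetic code and compares the two amino acids. It is not VEP's implementation and covers none of its transcript, splice, or regulatory logic. The function name, the returned labels, and the nine-base toy coding sequence are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def classify_coding_snp(cds: str, position: int, alt: str) -> str:
    """Classify a substitution at 1-based `position` of a coding sequence.

    Returns one of: "synonymous", "missense", "stop_gained", "stop_lost".
    """
    cds, alt = cds.upper(), alt.upper()
    index = position - 1
    usable_length = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    if not 0 <= index < usable_length:
        raise ValueError("position is outside the complete codons of the sequence")
    if alt not in _BASES or alt == cds[index]:
        raise ValueError("alt must be a base that differs from the reference base")

    offset = index % 3
    start = index - offset
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]

    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "missense"


if __name__ == "__main__":
    toy_cds = "ATGGAGTAA"  # made-up sequence: Met, Glu, stop
    print(classify_coding_snp(toy_cds, 5, "T"))  # GAG -> GTG
    print(classify_coding_snp(toy_cds, 6, "A"))  # GAG -> GAA
    print(classify_coding_snp(toy_cds, 4, "T"))  # GAG -> TAG
```

In the toy sequence, changing GAG to GTG swaps glutamate for valine (missense), GAG to GAA leaves glutamate unchanged (synonymous), and GAG to TAG introduces a premature stop codon. A change to the start codon would simply be reported as missense here, which is one of several cases a real annotator treats separately.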
Recent advances in machine learning have enhanced prediction accuracy; for instance, AlphaMissense, a deep learning model trained on protein sequences and structures, assesses the pathogenicity of missense SNPs genome-wide, scoring all approximately 71 million possible missense variants in the human proteome and classifying most of them as likely benign or likely pathogenic, with performance surpassing traditional tools like SIFT and PolyPhen-2 on benchmark datasets. These tools facilitate rapid annotation but require validation against experimental data to account for context-specific effects.
Analysis pipelines for SNPs typically incorporate imputation and quality control (QC) steps to maximize data utility, particularly in genome-wide association studies (GWAS). Imputation leverages linkage disequilibrium (LD), the non-random association of alleles at nearby loci, to infer ungenotyped SNPs using reference panels like the 1000 Genomes Project, enabling the filling of missing data and increasing effective sample size by up to 20-30% in diverse populations. QC procedures, including checks for missingness, deviations from Hardy-Weinberg equilibrium, and minor allele frequency thresholds, filter out low-quality variants and samples to reduce false positives, with standard pipelines removing up to 20% of SNPs based on call rates below 95% or excessive heterozygosity.
Emerging technologies are addressing limitations in SNP detection and validation, particularly for complex genomic regions. Long-read sequencing platforms, such as PacBio and Oxford Nanopore, generate reads spanning thousands of base pairs to resolve SNPs in repetitive or structurally variant areas where short-read NGS struggles, improving phasing accuracy for haplotype reconstruction and rare variant discovery in challenging loci like segmental duplications.
Additionally, CRISPR-based validation methods, in which targeted editing is followed by sequencing, confirm predicted SNP effects by introducing specific variants into cellular models and assessing phenotypic outcomes, such as altered protein function, thereby bridging computational predictions with functional evidence.
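The per-SNP QC checks described in this section (missingness, minor allele frequency, Hardy-Weinberg equilibrium) can be sketched in plain Python as follows. Genotypes are assumed to be coded as the number of alternate alleles (0, 1, or 2), with None for a missing call. The 95% call-rate cutoff comes from the text. The MAF cutoff of 0.01 and the chi-square cutoff of 3.84 (p = 0.05 at one degree of freedom) are assumptions chosen for illustration, and real GWAS pipelines generally apply much stricter Hardy-Weinberg thresholds and exact tests.

```python
from typing import List, Optional, Sequence

Genotype = Optional[int]  # 0, 1, or 2 alternate alleles; None means no call


def _called(genotypes: Sequence[Genotype]) -> List[int]:
    return [g for g in genotypes if g is not None]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype."""
    return len(_called(genotypes)) / len(genotypes) if genotypes else 0.0


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Frequency of the rarer allele among called genotypes."""
    called = _called(genotypes)
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def hwe_chi_square(genotypes: Sequence[Genotype]) -> float:
    """Chi-square statistic comparing observed genotype counts with
    Hardy-Weinberg expectations (q^2, 2pq, p^2) at the observed allele frequency."""
    called = _called(genotypes)
    n = len(called)
    if n == 0:
        return 0.0
    p = sum(called) / (2 * n)  # alternate allele frequency
    q = 1.0 - p
    expected = {0: n * q * q, 1: 2 * n * p * q, 2: n * p * p}
    statistic = 0.0
    for genotype, exp_count in expected.items():
        if exp_count > 0:
            statistic += (called.count(genotype) - exp_count) ** 2 / exp_count
    return statistic


def passes_qc(
    genotypes: Sequence[Genotype],
    min_call_rate: float = 0.95,   # from the text
    min_maf: float = 0.01,         # assumed cutoff
    max_hwe_chi2: float = 3.84,    # assumed cutoff (p = 0.05, 1 d.f.)
) -> bool:
    return (
        call_rate(genotypes) >= min_call_rate
        and minor_allele_frequency(genotypes) >= min_maf
        and hwe_chi_square(genotypes) <= max_hwe_chi2
    )


if __name__ == "__main__":
    balanced = [0] * 25 + [1] * 50 + [2] * 25  # matches HWE at p = 0.5
    all_het = [1] * 100                        # far too many heterozygotes
    print(passes_qc(balanced), passes_qc(all_het))
```

A SNP where every sample is called heterozygous is a typical sign of a genotyping artifact, which is why it fails the Hardy-Weinberg check here even though its call rate and allele frequency look fine.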

References

  1. [1]
    single nucleotide polymorphism / SNP | Learn Science at Scitable
    A single nucleotide polymorphism, or SNP (pronounced "snip"), is a variation at a single position in a DNA sequence among individuals.
  2. [2]
    Single-Nucleotide Polymorphism - an overview | ScienceDirect Topics
    A single-nucleotide polymorphism (SNP) is a variation in a single nucleotide that occurs at a specific position in the genome. These genetic variations in ...
  3. [3]
    SNPs: impact on gene function and phenotype - PubMed
    Single nucleotide polymorphism (SNP) is the simplest form of DNA variation among individuals. These simple changes can be of transition or transversion type ...
  4. [4]
    Exploring the Impact of Single-Nucleotide Polymorphisms on ... - NIH
    There are at least 10 million SNPs within the genome, occurring approximately every 100–300 base pairs and with an allele frequency greater than 1%, making ...5′ Leader Sequence... · Snps Affect Start Codon... · Snps And Elongation Rates
  5. [5]
    Single Nucleotide Polymorphism - an overview | ScienceDirect Topics
    SNPs (Single Nucleotide Polymorphisms) are defined as single code changes in a single base pair of DNA and are a major factor in genetic variation.
  6. [6]
    Human-genome single nucleotide polymorphisms affecting ...
    Most SNPs associated with certain traits or pathologies are mapped to regulatory regions of the genome and affect gene expression by changing transcription ...
  7. [7]
    Single nucleotide polymorphisms (SNPs) are inherited from parents ...
    They are used for the identification of inherited cancer susceptibility genes and those that may interact with environmental factors.
  8. [8]
    Genome-wide association studies: the good, the bad and the ugly
    Genome-wide association studies (GWASs) have identified hundreds of common genetic variants (usually single nucleotide polymorphisms [SNPs]) associated with ...
  9. [9]
    The Use of SNPs in Pharmacogenomics Studies - PMC - NIH
    There is enormous diversity in SNP frequency between genes, reflecting different selective pressures on each gene as well as different mutation and ...
  10. [10]
    The Discovery of Single-Nucleotide Polymorphisms—and ... - NIH
    It has important consequences for the mutation distribution over the genealogy of the sample. Mutations that occur during the most recent coalescent interval, t ...Introduction · Results · Analysis Of Snp Allele...
  11. [11]
    Single nucleotide polymorphism arrays: a decade of biological ...
    Jul 1, 2009 · Single nucleotide polymorphisms (SNPs)—genome positions at which there are two distinct nucleotide residues (alleles) that each appears in a ...
  12. [12]
    Large-Scale Validation of Single Nucleotide Polymorphisms in Gene ...
    Single nucleotide polymorphisms (SNPs) are the most abundant genetic variations in the human genome. They occur, on average, once every 300 base pairs of ...
  13. [13]
    Integrating common and rare genetic variation in diverse human ...
    The Human Genome Project, the SNP Consortium and the International HapMap Project collectively identified ~10 million common DNA variants, primarily SNPs, in a ...
  14. [14]
    DNA Replication Fidelity and Cancer - PMC - NIH
    Normal cells avoid deleterious mutations by replicating their genomes with extraordinary accuracy. Here we review the pathways governing DNA replication ...
  15. [15]
    Genetic epidemiology of single-nucleotide polymorphisms - PMC - NIH
    Most genetic determinants of disease are single-nucleotide polymorphisms (SNPs) that are likely to be selected as markers for positional cloning.
  16. [16]
    Single Nucleotide Polymorphisms (SNPs)
    A single nucleotide polymorphism (abbreviated SNP, pronounced snip) is a genomic variant at a single base position in the DNA.
  17. [17]
    Single Nucleotide Variation Analysis in 65 Candidate Genes ... - NIH
    ... (SNPs), and >4 million SNPs have already been identified in the human genome (Sherry et al. 2001). It is generally agreed that SNPs form the basis for ...
  18. [18]
    A strategy for detection of known and unknown SNP using a ...
    We describe a simplified strategy for fluorimetric detection of known and unknown SNP by proportional hybridization to oligonucleotide arrays.
  19. [19]
    What are single nucleotide polymorphisms (SNPs)? - MedlinePlus
    Mar 22, 2022 · SNPs help predict an individual's response to certain drugs, susceptibility to environmental factors such as toxins, and risk of developing ...
  20. [20]
    Genomics explainer: types of genetic variants
    Classification based on type of alteration​​ Sometimes SNVs are known as single nucleotide polymorphisms (SNPs), although SNV and SNPs are not interchangeable. ...
  21. [21]
    Rare Variants, Common Markers: Synthetic Association and Beyond
    Rare variants were defined as SNPs with minor allele frequency (MAF) < 0.01, and common variants were defined as SNPs with MAF > 0.05. ... (synonymous) rare ...Missing: classification non-
  22. [22]
    Human Genomic Variation
    Feb 1, 2023 · SNVs are the most common type of genomic variation. A subtype of SNVs is called a single-nucleotide polymorphism (SNP; pronounced “snip”).
  23. [23]
    Small insertions and deletions (INDELs) in human genomes
    INDELs are small insertions and deletions in human genomes, often 2-16 bp long, and are the second most abundant genetic variation after SNPs.
  24. [24]
    Single nucleotide polymorphisms (SNPs) in mitochondrial genes ...
    Jul 9, 2022 · In contrast, certain mtDNA single nucleotide polymorphisms (SNPs) may be beneficial to mitochondrial electron transport chain function and the ...
  25. [25]
    mtDNA haplogroup and single nucleotide polymorphisms structure ...
    Apr 3, 2014 · Human mtDNA is characterized by variants, which in turn define haplogroups and polymorphisms. Mitochondria haplogroups are defined on the basis ...
  26. [26]
    Sequencing Your Genome: What Does It Mean? - PMC - NIH
    Of the approximately 4 million DSVs in each genome, about 3.5 million involve only a single nucleotide and hence are called single nucleotide variants (SNVs) ...
  27. [27]
    A HapMap harvest of insights into the genetics of common disease
    May 1, 2008 · Roughly 10 million such sites, on average about one site per 300 bases, are estimated to exist in the human population such that both alleles ...Snps And Linkage... · Figure 3. Tag Snps Can... · Building A Haplotype Map Of...
  28. [28]
    gnomAD v3.0
    Oct 16, 2019 · We are thrilled to announce the release of gnomAD v3, a catalog containing 602M SNVs and 105M indels based on the whole-genome sequencing of 71 ...
  29. [29]
    AFRICAN GENETIC DIVERSITY: Implications for Human ...
    African populations are characterized by greater levels of genetic diversity, extensive population substructure, and less linkage disequilibrium (LD) among loci ...
  30. [30]
    Linkage disequilibrium maps for European and African populations ...
    Oct 17, 2019 · Here we report LD maps generated from WGS data for a large population of European ancestry, as well as populations of Baganda, Ethiopian and Zulu ancestry.
  31. [31]
    Estimating the genome-wide mutation rate from thousands of ...
    Our overall estimate of the average genome-wide mutation rate per 108 base pairs per generation for single-nucleotide variants is 1.24 (95% CI 1.18–1.33).Subjects And Methods · Results · Analysis Of Topmed Data
  32. [32]
    Evolution of the mutation rate - PMC - PubMed Central
    Thus, with a human germline mutation rate of ~10−8 base substitutions/site/generation, a site in a somatic nucleus will be mutated with a probability of 10−7 ...Random Genetic Drift As The... · Figure 2 · Somatic Mutation
  33. [33]
    Sickle Cell Disease - GeneReviews® - NCBI Bookshelf - NIH
    Sep 15, 2003 · Glu6Val allele (e.g., homozygous p.Glu6Val; p.Glu6Val and a second HBB pathogenic variant) on molecular genetic testing. Newborn screening ...
  34. [34]
    Biochemistry, Mutation - StatPearls - NCBI Bookshelf - NIH
    Nonsense mutations occur when a single nucleotide change results in a ... sickle cell anemia. Science. 1985 Dec 20;230(4732):1350-4. [PubMed: 2999980].<|separator|>
  35. [35]
    Convergent adaptation of human lactase persistence in Africa and ...
    A SNP in the gene encoding lactase (LCT) (C/T-13910) is associated with the ability to digest milk as adults (lactase persistence) in Europeans.
  36. [36]
    Pathogenic variants that alter protein code often disrupt splicing - PMC
    Aug 3, 2019 · We found ~10% (513/4,964) of exonic disease alleles disrupt splicing in vivo and in vitro. In contrast, only 3% (7/228) of common SNPs altered ...
  37. [37]
    SIFT web server: predicting effects of amino acid substitutions ... - NIH
    Algorithm description. SIFT uses sequence homology to compute the likelihood that an amino acid substitution will have an adverse effect on protein function.
  38. [38]
    Comparison and integration of deleteriousness prediction methods ...
    Accurate deleteriousness prediction for nonsynonymous variants is crucial for distinguishing pathogenic mutations from background polymorphisms in whole exome ...
  39. [39]
    Most Rare Missense Alleles Are Deleterious in Humans - NIH
    A missense mutation can be lethal or can cause severe Mendelian disease; alternatively, it can be mildly deleterious, effectively neutral, or beneficial.
  40. [40]
    Evolutionary evidence of the effect of rare variants on disease etiology
    Those investigators estimated that about 40% of sites in protein-coding regions are deleterious and subject to negative selection. When combining SNP data with ...
  41. [41]
    A saturated map of common genetic variants associated ... - Nature
    Oct 12, 2022 · Common single-nucleotide polymorphisms (SNPs) are predicted to collectively explain 40–50% of phenotypic variation in human height, ...
  42. [42]
    Presence of Multiple Independent Effects in Risk Loci of Common ...
    Abstract. Many genetic loci and SNPs associated with many common complex human diseases and traits are now identified. The total genetic variance explained ...Missing: review paper
  43. [43]
    Polygenic scores: prediction versus explanation | Molecular Psychiatry
    Oct 22, 2021 · Polygenic scores will never predict complex traits with perfect precision because heritabilities are about 50% for most behavioural traits [22] ...
  44. [44]
    Embracing polygenicity: a review of methods and tools for ...
    Estimation of SNP-heritability has been of particular importance for disease traits, especially those of low lifetime risk (<1% is typical of most common ...
  45. [45]
    Human Skin Pigmentation as an Adaptation to UV Radiation - NCBI
    Human skin pigmentation is the product of two clines produced by natural selection to adjust levels of constitutive pigmentation to levels of UV radiation (UVR) ...UV RADIATION AS A... · UV RADIATION AND THE... · CONCLUSIONS
  46. [46]
    The evolution of human skin pigmentation: A changing medley of ...
    This review examines putative, yet likely critical evolutionary pressures contributing to human skin pigmentation and subsequently, depigmentation phenotypes.
  47. [47]
    Comparative genomics and genome-wide SNPs of endangered ...
    Nov 13, 2023 · This study provides valuable insights about Eld's deer populations and appropriate breeder selection in efforts to repopulate this endangered species while ...
  48. [48]
    Conservation genetics as a management tool: The five best ... - PNAS
    Dec 20, 2021 · Conservation genetics remains a rapidly developing discipline highly relevant to the management of the increasing number of threatened populations and species.
  49. [49]
    Genome-wide association studies | Nature Reviews Methods Primers
    Aug 26, 2021 · This can include in silico fine-mapping, SNP to gene mapping, gene to function mapping, pathway analysis, genetic correlation analysis, ...
  50. [50]
    Hypothesis-Driven Candidate Gene Association Studies
    Sep 17, 2009 · Within the targeted candidate genes, this approach may confer inferential advantages in comparison with untargeted screening strategies such as ...
  51. [51]
    Commentary: What is the case for candidate gene approaches in the ...
    Feb 14, 2017 · The typical goal of a candidate study is quite distinct from a GWAS: the focus is on a biological path or process of interest, rather than an ...Missing: challenges | Show results with:challenges
  52. [52]
    A Short History of the Genome-Wide Association Study - NIH
    Its mission was to identify up to 150,000 SNPs throughout the human genome within two years, to make the information available to the public, and to develop ...Hapmap · Gold Rush · Real Value Of GwassMissing: 1970s | Show results with:1970s
  53. [53]
    10 Years of GWAS Discovery: Biology, Function, and Translation
    Here we review the remarkable range of discoveries that genome-wide association studies (GWASs) have facilitated in population and complex-trait genetics.
  54. [54]
    A tutorial on conducting genome‐wide association studies
    This tutorial aims to provide a guideline for conducting genetic analyses. Methods We discuss and explain key concepts and illustrate how to conduct GWAS.
  55. [55]
    Population Stratification in Genetic Association Studies - PMC - NIH
    Accounting for PS in candidate gene studies is challenging due to the lack of genome-wide coverage of genetic factors from which ancestry may be inferred. A ...
  56. [56]
    Recent advances and challenges of rare variant association ... - NIH
    We also review recent advances and challenges in rare variant analysis for familial sequencing data and for more complex phenotypes such as survival data.
  57. [57]
    After a decade of genome-wide association studies, a new phase of ...
    Aug 14, 2017 · Sequencing studies of rare variants have highlighted the biological pathways involved. Harnessing the power of numbers, a recent GWAS ...
  58. [58]
    Homozygosity Mapping: A Way to Map Human Recessive Traits with ...
    Homozygosity Mapping: A Way to Map Human Recessive Traits with the DNA of Inbred Children. Eric S. Lander and David BotsteinAuthors Info & Affiliations.
  59. [59]
    A Systematic Approach to Mapping Recessive Disease Genes in ...
    Jan 23, 2009 · The identification of recessive disease-causing genes by homozygosity mapping is often restricted by lack of suitable consanguineous families.
  60. [60]
    Linkage disequilibrium — understanding the evolutionary past and ...
    Linkage disequilibrium (LD) is the nonrandom association of alleles at different loci, or two or more loci.
  61. [61]
    Refining rare disease variant discovery in an isolated Andean ...
    Aug 15, 2025 · Rare genetic diseases pose significant diagnostic challenges, especially in geographically isolated populations where consanguinity, founder ...Ibd Detection And... · Roh And Ibd Patterns... · Combined Roh And Ibd...
  62. [62]
    Parametric and nonparametric linkage analysis: a unified multipoint ...
    In this paper, we describe how to extract complete multipoint inheritance information from general pedigrees of moderate size.
  63. [63]
    Parametric and Nonparametric Linkage Analysis - Wiley Online Library
    Jun 14, 2018 · Parametric or model-based linkage analysis assumes that models describing both the trait and genetic marker loci are known without error,.
  64. [64]
    Codeine Therapy and CYP2D6 Genotype - NCBI - NIH
    Sep 20, 2012 · The CYP2D6 enzyme is responsible for the metabolism of many commonly prescribed drugs, including antidepressants, antipsychotics, analgesics, ...Introduction · Drug: Codeine · Gene: CYP2D6 · The CYP2D6 Gene...
  65. [65]
    [PDF] Clinical Pharmacogenetics Implementation Consortium (CPIC ...
    Codeine is bioactivated to morphine, a strong opioid agonist, by the hepatic cytochrome P450 2D6 (CYP2D6); hence, the efficacy and safety of codeine as an ...
  66. [66]
    Warfarin Therapy and VKORC1 and CYP Genotype - NCBI - NIH
    Mar 8, 2012 · The VKORC1 and CYP2C9 genotypes are the most important known genetic determinants of warfarin dosing. Warfarin targets VKORC1, an enzyme involved in vitamin K ...Introduction · Drug: Warfarin · Gene: VKORC1 · Therapeutic...
  67. [67]
    Effect of VKORC1 Haplotypes on Transcriptional Regulation and ...
    Jun 2, 2005 · VKORC1 haplotypes can be used to stratify patients into low-, intermediate-, and high-dose warfarin groups and may explain differences in dose requirements.
  68. [68]
    Table of Pharmacogenomic Biomarkers in Drug Labeling - FDA
    Sep 23, 2024 · The table below lists therapeutic products from Drugs@FDA with pharmacogenomic information found in the drug labeling.
  69. [69]
    Pharmacogenomic Testing: Clinical Evidence and Implementation ...
    Abstract. Pharmacogenomics can enhance patient care by enabling treatments tailored to genetic make-up and lowering risk of serious adverse events.
  70. [70]
    Cost Effectiveness of Pharmacogenetic Testing for Drugs with ... - NIH
    The objective of this study was to evaluate the evidence on cost‐effectiveness of pharmacogenetic (PGx)–guided treatment for drugs with Clinical ...Missing: SNPs | Show results with:SNPs
  71. [71]
    Pharmacogenomics polygenic risk score for drug response ... - Nature
    Sep 8, 2022 · Efficacy PGx studies have great potential to guide treatment options by integrating routine pharmacogenomic screening into clinical development ...
  72. [72]
    Pharmacogenomics polygenic risk score: Ready or not for prime time?
    Jul 30, 2024 · Pharmacogenomic Polygenic Risk Scores (PRS) have emerged as a tool to address the polygenic nature of pharmacogenetic phenotypes, increasing the potential to ...
  73. [73]
    SNP genotyping of forensic casework samples using the 52 ...
    The study used 52 SNP markers to profile degraded DNA, achieving 36 full profiles compared to 17 full STR profiles, showing SNPs can generate full profiles ...
  74. [74]
    Dense single nucleotide polymorphism testing revolutionizes scope ...
    One of the foremost benefits of SNPs is their presence in smaller DNA fragments compared with STRs, making them particularly advantageous for analyzing highly ...
  75. [75]
    Forensically Relevant SNP Classes - Taylor & Francis Online
    May 16, 2018 · Single nucleotide polymorphisms (SNPs) offer promise to support forensic DNA analyses because of an abundance of potential markers, amenability ...Abstract · Introduction · Snps For Human...<|control11|><|separator|>
  76. [76]
    Ancestry Informative Marker Sets for Determining Continental Origin ...
    A comprehensive set of 128 AIMs and subsets as small as 24 AIMs are shown to be useful tools for ascertaining the origin of subjects from particular continents.
  77. [77]
    Worldwide population differentiation at disease-associated SNPs
    Jun 4, 2008 · The Fst statistic captures the difference in allele frequency between populations at any given SNP and ranges from 0 (no differentiation) to 1 ...
  78. [78]
    Extensive set of African ancestry-informative markers (AIMs) to study ...
    Ancestry-informative markers (AIMs) aid in the detection of population stratification and provide an alternative approach to map population-specific alleles to ...
  79. [79]
    Human population dispersal “Out of Africa” estimated from linkage ...
    A linkage disequilibrium (LD)–based approach allows changes in human population size to be traced over time and reveals a substantial reduction in N e ...
  80. [80]
    Explaining worldwide patterns of human genetic variation using a ...
    Studies of worldwide human variation have discovered three trends in summary statistics as a function of increasing geographic distance from East Africa.
  81. [81]
    Human Dispersal Out of Africa: A Lasting Debate - PMC
    Under all of these models, genetic evidence suggests that migration out of Africa was accompanied by a severe bottleneck in the initial migrating group(s), ...
  82. [82]
    Ethical Issues Associated With Direct-to-Consumer Genetic Testing
    Jun 3, 2023 · Poor consumer education prior to testing is a concerning issue. A lack of transparency relating to the accuracy of testing, which demographics ...
  83. [83]
    Ethical Issues Associated With Direct-to-Consumer Genetic Testing
    Jun 3, 2023 · This review aims to provide an overview of the services these companies purport to provide as well as highlight important ethical issues of the service.
  84. [84]
    Direct-to-consumer genetic testing: an updated systematic review of ...
    Oct 12, 2022 · Two ethical issues were only reported once each, both in the newly identified papers: 1) DTC-GT threatening the genetic counselling profession ...
  85. [85]
    Direct-to-Consumer Genetic Testing FAQ for Healthcare Professionals
    Jun 14, 2023 · As a result, consumers may have privacy and safety concerns. Additionally, depending on the DTC-GT company, an individual's DNA may be used ...
  86. [86]
    Sickle Cell Anemia and Its Phenotypes - PMC - PubMed Central
    The genetic causes of SCD include homozygosity for the rs334 mutation (HbSS) (generally known as SCA) and compound heterozygosity between rs334 and mutations ...
  87. [87]
    APOE and Alzheimer's Disease: Advances in Genetics ... - NIH
    Two single nucleotide polymorphisms (SNPs) —rs429358 and rs7412— define the three alleles of APOE, located in chromosome 19q13.2: ε2, ε3, and ε4. Relative ...
  88. [88]
    The evolutionary genetics of lactase persistence in seven ethnic ...
    Feb 11, 2019 · Series of studies revealed five regulatory variants that are located in the 14 kb upstream of LCT in various populations: − 13910*T (rs4988235) ...
  89. [89]
    The Light Skin Allele of SLC24A5 in South Asians and Europeans ...
    Nov 7, 2013 · Our data confirm significant association of rs1426654 SNP with skin pigmentation, explaining about 27% of total phenotypic variation in the ...
  90. [90]
    Six new loci associated with body mass index highlight a neuronal ...
    Abstract. Common variants at only two loci, FTO and MC4R, have been reproducibly associated with body mass index (BMI) in humans.
  91. [91]
    The genetics of obesity: from discovery to biology - Nature
    Sep 23, 2021 · a | Prevalence of obesity (body mass index (BMI) ≥30 kg m−2) in women and men ≥20 years of age, from 1975 to 2016. b | Prevalence of obesity ( ...
  92. [92]
    TMPRSS2 gene polymorphism common in East Asians confers ...
    Nov 30, 2022 · We found that rs12329760 in the TMPRSS2 gene, a missense variant common in East Asian populations, contributes to protection against SARS-CoV-2 infection.
  93. [93]
    Genome-wide variation in the human and fruitfly: a comparison
    Average levels of nucleotide diversity are ten-fold lower in humans than in the fruitfly, Drosophila melanogaster. Despite this difference, apparently as a ...
  94. [94]
    Development of a maize 55 K SNP array with improved genome ...
    Feb 16, 2017 · In plants, the SNP density ranges from 6 to 22 SNPs per 1 kb sequence (Shen et al. 2004; Clark et al. 2007; Gore et al. 2009). The number of ...
  95. [95]
    Reconstruction of Microbial Haplotypes by Integration of Statistical ...
    Feb 6, 2021 · In general, high SNP density (between 0.5% and 2.0%) for viruses (Prabhakaran et al. 2014), and a lower range for bacteria (between 0.005% and ...
  96. [96]
    Mapping of QTL for Grain Yield Components Based on a DH ...
    Apr 27, 2020 · Therefore, the linked markers of the QTL qERN2a and qERN2-Z could be used in marker-assisted selection (MAS) for ERN improvement in maize ...
  97. [97]
    Association mapping for maize stover yield and saccharification ...
    Feb 9, 2021 · We identified 13 SNPs significantly associated with increased stover yield that corresponded to 13 QTL, and 2 SNPs significantly associated with improved ...
  98. [98]
    Dog colour patterns explained by modular promoters of ancient ...
    Aug 12, 2021 · Here, we identify independent regulatory modules for ventral and hair cycle ASIP expression, and we characterize their action and evolutionary origin.Missing: disease | Show results with:disease
  99. [99]
    Genome wide association study in Swedish Labrador retrievers ...
    Mar 13, 2024 · We performed a GWAS in Labrador retrievers to identify genetic loci associated with hip dysplasia and body weight.
  100. [100]
    SNP discovery and association study for growth, fatness and meat ...
    Sep 30, 2022 · We selected 1023 missense SNPs located on annotated genes and showing different allele frequencies between pigs with markedly different growth ...
  101. [101]
    Deciphering the distance to antibiotic resistance for the ... - Nature
    Feb 16, 2017 · Other significant associations for SNPs in genes implicated in resistance to other essential antibiotics, for example PBPs (pbp1A, pbpX, penA) ...
  102. [102]
    Mixed strain pathogen populations accelerate the evolution ... - Nature
    Jul 12, 2023 · Here we show that mixed strain populations are common in the opportunistic pathogen P. aeruginosa. Crucially, resistance evolves rapidly in ...
  103. [103]
    HGNC Guidelines | HUGO Gene Nomenclature Committee
    Current guidelines for naming human genes. For a discussion of our latest guidelines please go to https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7494048/ ...
  104. [104]
    dbSNP Build 157 Release - NCBI Insights - NIH
    Mar 18, 2025 · We are pleased to announce the release of the Database of Single Nucleotide Polymorphisms (dbSNP) Build 157, which has approximately 1.2 billion Reference SNP ...
  105. [105]
    [PDF] Submission of SNPs to dbSNP - NCBI
    Sep 5, 2002 · To submit to dbSNP, get a handle, prepare a file with data, send it to snp-sub@ncbi.nlm.nih.gov, and receive a report. Resubmissions go to snp- ...
  106. [106]
    The evolution of dbSNP: 25 years of impact in genomic research
    Nov 12, 2024 · dbSNP catalogs single nucleotide polymorphisms (SNPs) and other small genetic variations including SNVs, indels, microsatellites, and small ...
  107. [107]
    A fast and accurate SNP detection algorithm for next-generation ...
    Dec 4, 2012 · We propose a fast and accurate single-nucleotide polymorphism detection program that uses a binomial distribution-based algorithm and a mutation probability.
  108. [108]
    Whole Genome SNP Genotyping
    There are various methods for Single Nucleotide Polymorphism (SNP) genotyping based on Next-Generation Sequencing. Whole exome sequencing (WES): WES is a ...
  109. [109]
    Ensembl Variant Effect Predictor (VEP)
    Ensembl VEP predicts the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on gene transcripts and protein sequence.
  110. [110]
    Accurate proteome-wide missense variant effect prediction ... - Science
    Sep 22, 2023 · AlphaMissense predicts the probability of a missense variant being pathogenic and classifies it as either likely benign, likely pathogenic, or ...
  111. [111]
    Long-read individual-molecule sequencing reveals CRISPR ...
    Aug 24, 2020 · It provides the first quantitative evidence of persistent nonrandom large structural variants and an increase in single-nucleotide variants at ...
  112. [112]
    Verification of CRISPR editing and finding transgenic inserts by ...
    The enriched DNA is compatible with (10) downstream analyses, such as long- and short-read sequencing. (For interpretation of the references to colour in this ...