
Fixation index

The fixation index, denoted FST, is a fundamental measure in population genetics that quantifies the extent of genetic differentiation among subpopulations within a species due to factors such as genetic drift, gene flow, and selection. Introduced by Sewall Wright in 1951, it represents the proportion of total genetic variation attributable to differences between subpopulations rather than within them, providing a standardized way to assess population structure across diverse taxa. Values of FST range from 0, indicating no differentiation and complete genetic homogeneity (as in a panmictic population), to 1, signifying complete differentiation where subpopulations exhibit fixed allelic differences.

Formally, FST is defined as the correlation between two alleles drawn at random from the same subpopulation relative to the total population, or equivalently as FST = 1 - (HS / HT), where HS is the average expected heterozygosity within subpopulations and HT is the total expected heterozygosity across the population. It can also be expressed as the ratio of the variance in allele frequencies among subpopulations to the total variance, σ²b / (σ²b + σ²w), where σ²b is the between-subpopulation variance and σ²w is the within-subpopulation variance. Wright's framework extends to the other hierarchical F-statistics (FIS, FIT), allowing analysis of inbreeding and overall structure, but FST remains the primary index for inter-subpopulation divergence.

Estimation typically involves molecular markers such as SNPs or microsatellites, with modern genomic data enabling locus-specific calculations to detect outliers under selection. In practice, FST is applied across taxa to infer demographic parameters, such as migration rates and effective population sizes, and in conservation genetics to evaluate population fragmentation and the risks it poses. Elevated FST values often signal barriers to gene flow or local adaptation, while low values suggest ongoing gene flow; for instance, genome scans using FST identify candidate selective sweeps by comparing differentiation across loci. Despite its utility, interpretations must account for biases from rare variants or uneven sampling, as highlighted in methodological refinements since Wright's era.

Fundamentals

Definition

The fixation index, denoted F_{ST}, quantifies the degree of genetic differentiation among subpopulations within a larger population by measuring the proportion of total genetic variation attributable to differences between subpopulations rather than within them. It ranges from 0, indicating no differentiation (complete panmixia), to 1, signifying complete isolation and fixation of different alleles in each subpopulation.

Sewall Wright originally defined F_{ST} in 1951 as the correlation between gametes drawn at random from within subpopulations relative to the total array of gametes in the overall population, initially formulated for biallelic loci in the context of inbreeding and population structure. This correlation-based approach captures how allele frequencies diverge across subpopulations due to factors like genetic drift, gene flow, and selection. Wright's framework emphasized F_{ST} as a key parameter for understanding hierarchical population structure.

The definition was later extended to multi-allelic loci through the use of heterozygosity measures, where F_{ST} = 1 - \frac{H_S}{H_T}, with H_S representing the average expected heterozygosity within subpopulations and H_T the expected heterozygosity across the total population (weighted by subpopulation sizes). This formulation, introduced by Nei in 1973, provides an equivalent and computationally convenient expression for F_{ST} that directly reflects the partitioning of genetic diversity. As part of Wright's broader set of F-statistics, F_{ST} specifically addresses between-subpopulation effects and relates to F_{IS} (the inbreeding coefficient within subpopulations relative to their own gene pools) and F_{IT} (the total inbreeding coefficient relative to the overall population), forming a cohesive system for dissecting genetic variance at multiple levels.
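The heterozygosity-based definition can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, allele frequencies are assumed to be known exactly (no sampling correction), and equal subpopulation weights are assumed when none are supplied.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_k^2) for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def fst_from_frequencies(subpop_freqs, weights=None):
    """F_ST = 1 - H_S / H_T for a single locus with any number of alleles.

    subpop_freqs: one list of allele frequencies per subpopulation (each sums to 1).
    weights: relative subpopulation sizes; equal weights are assumed if omitted.
    """
    n_subpops = len(subpop_freqs)
    if weights is None:
        weights = [1.0] * n_subpops
    total = sum(weights)
    weights = [w / total for w in weights]

    # H_S: weighted average of the within-subpopulation heterozygosities.
    h_s = sum(w * expected_heterozygosity(f) for w, f in zip(weights, subpop_freqs))

    # H_T: heterozygosity computed from the pooled (weighted mean) frequencies.
    n_alleles = len(subpop_freqs[0])
    pooled = [sum(w * f[k] for w, f in zip(weights, subpop_freqs))
              for k in range(n_alleles)]
    h_t = expected_heterozygosity(pooled)

    # A monomorphic locus carries no information; report 0 by convention here.
    return 0.0 if h_t == 0 else 1.0 - h_s / h_t


# A three-allele locus in two equally weighted subpopulations.
example = fst_from_frequencies([[0.5, 0.3, 0.2], [0.1, 0.3, 0.6]])
print(round(example, 4))  # 0.1212
```

In this example H_S = 0.58 and H_T = 0.66, so F_{ST} = 1 - 0.58/0.66 ≈ 0.12. Identical subpopulations give 0, and subpopulations fixed for different alleles give 1.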

Interpretation

The fixation index F_{ST} quantifies the extent of genetic differentiation among populations, reflecting the proportion of total genetic variation attributable to differences between subpopulations rather than within them. Biologically, it arises from evolutionary forces such as genetic drift, which causes random fluctuations in allele frequencies; limited migration, which reduces gene flow; and natural selection, which can promote divergence by favoring different alleles in different environments. The measure thus provides insight into how these processes have shaped population structure over time.

Statistically, F_{ST} ranges from 0, indicating no differentiation and complete panmixia where allele frequencies are identical across populations, to 1, signifying complete differentiation with fixation of alternative alleles in different populations. From its definition, F_{ST} = 1 - \frac{H_S}{H_T}, the value represents the reduction in heterozygosity within subpopulations relative to the total. Wright provided guidelines for interpreting F_{ST} values: less than 0.05 indicates little genetic differentiation; 0.05 to 0.15 indicates moderate differentiation; 0.15 to 0.25 indicates great differentiation; and greater than 0.25 indicates very great differentiation. These thresholds help assess the degree of population subdivision but are contextual and should be evaluated alongside other genetic metrics.

Interpretation of F_{ST} has limitations. It is sensitive to allele frequencies: rare variants can inflate estimates, leading to overestimation of differentiation at low-diversity loci. The measure also assumes neutrality; selection can elevate F_{ST} at specific loci beyond what drift and migration alone would produce, complicating inferences about population structure.
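Wright's qualitative guidelines can be written as a small lookup, shown here as a sketch. The function name is hypothetical, and the treatment of values that fall exactly on a boundary (0.05, 0.15, 0.25) is an assumption, since the guidelines do not specify it.

```python
def wright_category(fst):
    """Map an F_ST value in [0, 1] to Wright's qualitative category.

    Boundary handling is an assumption: 0.05 and 0.15 fall in the higher
    category, while 0.25 still counts as "great".
    """
    if not 0.0 <= fst <= 1.0:
        raise ValueError("F_ST must lie between 0 and 1")
    if fst < 0.05:
        return "little differentiation"
    if fst < 0.15:
        return "moderate differentiation"
    if fst <= 0.25:
        return "great differentiation"
    return "very great differentiation"


for value in (0.03, 0.12, 0.20, 0.40):
    print(value, "->", wright_category(value))
```

Sample-based estimators can return slightly negative values when true differentiation is near zero. Such estimates are often truncated to 0 before applying labels like these, which is a convention rather than part of the guidelines.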

Estimation Methods

Mathematical Formulation

The fixation index F_{ST}, introduced by Wright, quantifies the degree of genetic differentiation among subpopulations due to limited gene flow or drift. For a biallelic locus, it is formulated as the relative reduction in heterozygosity caused by population subdivision: F_{ST} = \frac{H_T - H_S}{H_T}, where H_T = 2p(1-p) is the expected heterozygosity in the total population under random mating (treating all subpopulations as a single panmictic unit with overall allele frequency p), and H_S is the average expected heterozygosity across subpopulations (each computed as 2p_i(1-p_i), with p_i the allele frequency in subpopulation i). This expression measures the proportion of total genetic diversity attributable to differences between subpopulations.

An equivalent derivation for biallelic loci expresses F_{ST} in terms of the variance in allele frequencies. Under Hardy-Weinberg equilibrium within subpopulations, averaging 2p_i(1-p_i) over subpopulations gives H_S = 2p(1-p) - 2\sigma_p^2, so that H_T - H_S = 2\sigma_p^2 and F_{ST} = \frac{\sigma_p^2}{p(1-p)}, where \sigma_p^2 is the variance of the allele frequencies p_i across subpopulations and p(1-p) is the maximum value that variance can take for a biallelic locus with mean frequency p. This variance-based form highlights F_{ST} as a standardized measure of divergence.

For multi-allelic loci, the formulation extends naturally by generalizing heterozygosity to multiple alleles. Here, H_T = 1 - \sum_k p_k^2 (the probability that two alleles drawn at random from the total population differ), and H_S is the average of 1 - \sum_k p_{ik}^2 across subpopulations i for alleles k. The index becomes F_{ST} = 1 - \frac{H_S}{H_T}, which reduces to the biallelic case when there are two alleles. For multi-locus analyses, F_{ST} is computed as the average across loci, assuming independence.

Wright's F-statistics form an interrelated set, with F_{ST} related to the total inbreeding coefficient F_{IT} (the correlation between alleles within individuals relative to the total population) and the within-subpopulation inbreeding coefficient F_{IS} (the same correlation relative to subpopulations). The core identity is 1 - F_{IT} = (1 - F_{IS})(1 - F_{ST}), reflecting the partitioning of inbreeding effects. Solving for F_{ST} yields F_{ST} = \frac{F_{IT} - F_{IS}}{1 - F_{IS}}. To derive this, rearrange the identity: F_{IT} = 1 - (1 - F_{IS})(1 - F_{ST}) = F_{IS} + F_{ST} - F_{IS} F_{ST} = F_{IS} + F_{ST}(1 - F_{IS}). Isolating F_{ST} gives the expression above. This relation holds under the correlation framework of path analysis.

These formulations assume Hardy-Weinberg equilibrium within subpopulations (justifying the use of expected heterozygosity based on allele frequencies) and often invoke the infinite alleles model with no recurrent mutation, where genetic differentiation arises solely from random genetic drift in subdivided populations.
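The agreement between the heterozygosity form and the variance form, and the rearranged F-statistics identity, can be checked numerically. The sketch below assumes equally weighted subpopulations and exactly known allele frequencies; the function names are illustrative only.

```python
from statistics import fmean, pvariance


def fst_heterozygosity(p_subpops):
    """Biallelic F_ST as (H_T - H_S) / H_T, with equal subpopulation weights."""
    p_bar = fmean(p_subpops)
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = fmean([2 * p * (1 - p) for p in p_subpops])
    return (h_t - h_s) / h_t


def fst_variance(p_subpops):
    """Biallelic F_ST as sigma_p^2 / (p (1 - p)), using the population variance."""
    p_bar = fmean(p_subpops)
    return pvariance(p_subpops) / (p_bar * (1 - p_bar))


def fst_from_fit_fis(f_it, f_is):
    """Solve 1 - F_IT = (1 - F_IS)(1 - F_ST) for F_ST."""
    return (f_it - f_is) / (1 - f_is)


freqs = [0.2, 0.8]
print(round(fst_heterozygosity(freqs), 4))  # 0.36
print(round(fst_variance(freqs), 4))        # 0.36

# With F_IS = 0.1 and F_ST = 0.2 the identity gives F_IT = 1 - 0.9 * 0.8 = 0.28.
print(round(fst_from_fit_fis(0.28, 0.1), 4))  # 0.2
```

Both forms return 0.36 for frequencies of 0.2 and 0.8 because H_T - H_S equals twice the variance in allele frequency (here 2 × 0.09 = 0.18) while H_T = 0.5.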

Statistical Estimation Procedures

One widely used estimator of the fixation index F_{ST}, denoted \theta, was proposed by Weir and Cockerham in 1984. It corrects for bias arising from finite sample sizes and, in its analysis-of-variance form, is given by \theta = \frac{MSB - MSW}{MSB + (\bar{n} - 1)MSW}, where MSB is the mean square between populations, MSW is the mean square within populations, and \bar{n} is the average sample size per population. This method-of-moments estimate performs well under the infinite alleles model and is applicable to codominant markers such as allozymes or microsatellites.

For biallelic markers such as SNPs, an alternative estimator due to Hudson and colleagues (1992) is commonly used: F_{ST} = 1 - \frac{H_w}{H_b}, where H_w is the mean number of differences between sequences sampled from the same population and H_b is the mean number of differences between sequences sampled from different populations. This form is particularly suitable for genomic data with many loci and low heterozygosity.

To obtain confidence intervals and standard errors for F_{ST} estimates, resampling techniques such as the bootstrap and jackknife are commonly employed. The bootstrap resamples the genetic data (e.g., loci or individuals) with replacement to generate replicate datasets, from which the variability in \theta can be assessed; percentile or bias-corrected accelerated (BCa) intervals are commonly used. Jackknife resampling, by contrast, systematically omits one data unit (e.g., a locus or population) at a time to compute pseudovalues, yielding standard errors at lower computational cost. Neither approach assumes normality of the sampling distribution of F_{ST}, though block-jackknife variants are preferred for linked loci to mitigate autocorrelation.

Estimates of F_{ST} can be biased downward in the presence of rare alleles or small sample sizes, as low-frequency variants inflate within-population heterozygosity relative to total heterozygosity. Bias corrections, such as those derived by Nei in 1977, adjust for these effects by incorporating sample size and thresholds, ensuring more accurate partitioning of gene diversity in subdivided populations. For hierarchical population structures involving multiple levels (e.g., individuals within demes within regions), simulation-based approaches facilitate the estimation of multilevel F-statistics by generating synthetic datasets under specified demographic models to evaluate identifiability and bias. These methods, often integrated with ANOVA frameworks, allow for robust inference on coancestry coefficients across hierarchies.
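As an illustration of estimation with a resampling-based standard error, the sketch below computes a two-population Hudson-type estimate from allele counts and a delete-one jackknife over loci. The per-site numerator and denominator follow the sample-size-corrected form commonly attributed to Bhatia and colleagues (2013); that specific form, the ratio-of-sums combination across sites, and all function names are assumptions not spelled out in the text above. For linked SNPs, the units dropped in the jackknife would be genomic blocks rather than single sites.

```python
from math import sqrt


def hudson_components(count1, n1, count2, n2):
    """Per-site numerator and denominator of the Hudson-type F_ST estimator.

    count1, count2: copies of one allele observed in populations 1 and 2.
    n1, n2: number of chromosomes sampled in each population (must exceed 1).
    """
    p1, p2 = count1 / n1, count2 / n2
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den


def hudson_fst(sites):
    """Combine sites as a ratio of sums (not an average of per-site ratios)."""
    comps = [hudson_components(*site) for site in sites]
    return sum(n for n, _ in comps) / sum(d for _, d in comps)


def jackknife_se(sites):
    """Delete-one jackknife standard error of the ratio-of-sums estimate."""
    comps = [hudson_components(*site) for site in sites]
    total_num = sum(n for n, _ in comps)
    total_den = sum(d for _, d in comps)
    k = len(comps)
    # Re-estimate F_ST with each site left out in turn.
    leave_one_out = [(total_num - n) / (total_den - d) for n, d in comps]
    mean = sum(leave_one_out) / k
    return sqrt((k - 1) / k * sum((x - mean) ** 2 for x in leave_one_out))


# Each tuple: (allele count in pop 1, chromosomes in pop 1, count in pop 2, chromosomes in pop 2).
sites = [(10, 20, 2, 20), (15, 20, 5, 20), (8, 20, 9, 20), (18, 20, 4, 20)]
print(hudson_fst(sites), jackknife_se(sites))
```

Taking the ratio of summed numerators and denominators, rather than averaging per-site ratios, keeps low-diversity sites from dominating the genome-wide estimate. The estimate can be slightly negative when the two populations have nearly identical frequencies.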

Applications in Population Genetics

FST in Human Populations

The fixation index (F_ST) in human populations is characteristically low, typically ranging from 0.10 to 0.15 globally, reflecting substantial gene flow and a shared recent ancestry among diverse groups. This modest differentiation indicates that the majority of genetic variation (approximately 85%) occurs within populations rather than between them, underscoring the limited role of geographic barriers in shaping human genomes over the past 50,000–100,000 years.

A seminal analysis by Lewontin in 1972, based on protein polymorphisms across 17 loci in seven racial categories, apportioned human diversity such that 85.4% was within populations, 8.3% among populations within races, and 6.3% between races, yielding an overall between-group component of about 14.6%. Subsequent studies using DNA markers have largely confirmed these patterns; for instance, an examination of 109 loci (including microsatellites and restriction fragment length polymorphisms) in 16 worldwide populations found 84.4% of variation within populations, with 5% between populations on the same continent and 8–11.7% between continents, corresponding to an F_ST of roughly 0.156. Modern genomic data from projects like the 1000 Genomes Project continue to support ~10–15% between-population variation, though estimates vary slightly with marker type and ascertainment bias, such as the inclusion of rare variants, which can inflate F_ST by up to 20–30%.

At the continental scale, F_ST values are higher between major groups, for example ~0.139 between West Africans and Europeans and ~0.110 between Europeans and East Asians, highlighting greater differentiation across the African-Eurasian divide compared to within-continent comparisons (often <0.05). These patterns align with the Out-of-Africa model, where non-African populations derive from a subset of African diversity.

Several demographic processes have profoundly influenced these low F_ST values in humans. The Out-of-Africa migration around 50,000–100,000 years ago involved a severe bottleneck, reducing effective population size to ~1,000–10,000 individuals and amplifying genetic drift, which elevated F_ST between Africans and non-Africans by fixing certain alleles outside Africa. However, ongoing migration and gene flow, estimated at Nm >10 migrants per generation in many models, have counteracted drift, maintaining low global differentiation by homogenizing allele frequencies across continents. Admixture events, such as back-migrations into Africa or Eurasian expansions into other regions, further reduce F_ST by introducing hybrid ancestries, as seen in populations with 5–20% non-local components that blur continental boundaries.
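To give a feel for how migrants per generation (Nm) relate to F_ST, the sketch below uses Wright's island-model approximation, F_ST ≈ 1 / (4Nm + 1). That relation is brought in here as an assumption and is not stated in the text above. It presumes many equal-sized demes at migration-drift equilibrium, conditions that real human populations do not meet, so the numbers are illustrative only.

```python
def island_model_fst(nm):
    """Equilibrium F_ST under Wright's island model for Nm migrants per generation."""
    return 1.0 / (4.0 * nm + 1.0)


def island_model_nm(fst):
    """Invert the same approximation: Nm implied by a given F_ST (0 < F_ST <= 1)."""
    return (1.0 / fst - 1.0) / 4.0


print(round(island_model_fst(10), 4))    # 0.0244
print(round(island_model_nm(0.12), 2))   # 1.83
```

Under this idealized model even a few migrants per generation are enough to hold F_ST well below the level expected under drift alone.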

FST in Non-Human Populations

In conservation genetics, the fixation index F_{ST} has been instrumental in assessing population structure and inbreeding in endangered animal species. For instance, genomic analyses of cheetah (Acinonyx jubatus) populations reveal high F_{ST} values ranging from 0.219 to 0.497 across subspecies and subpopulations, reflecting significant differentiation driven by historical bottlenecks and ongoing isolation that exacerbate inbreeding. These elevated F_{ST} levels highlight the need for targeted management strategies, such as translocations between isolated groups, to mitigate the loss of genetic diversity and enhance long-term viability.

In ecological contexts, F_{ST} gradients in marine species often indicate extensive gene flow facilitated by larval dispersal. Many marine species with pelagic larval phases exhibit low F_{ST} values, typically below 0.05, as larvae can travel considerable distances via ocean currents, homogenizing allele frequencies across populations despite geographic separation. This pattern underscores the role of dispersal in maintaining connectivity, with implications for population resilience and the design of marine protected areas.

Applications of F_{ST} extend to microbial populations, where it helps characterize population structure alongside recombination rates in bacteria. Low F_{ST} values in bacterial communities often signal high levels of horizontal gene transfer (HGT), which spreads shared genetic material across strains and counters differentiation by facilitating the rapid spread of adaptive traits like antibiotic resistance. For example, analyses of soil bacterial populations have shown that elevated recombination and HGT correlate with reduced F_{ST}, promoting panmictic-like structures even in spatially separated groups.

Comparative studies across taxa reveal systematic differences in F_{ST} linked to reproductive strategies, with self-pollinating plants generally exhibiting higher values than animals due to reduced gene flow from limited dispersal. Seminal research since the 1990s, including meta-analyses of seed plants, has demonstrated that selfing species can have F_{ST} values up to 10 times greater than outcrossers, as self-fertilization restricts gene exchange among populations. This contrast highlights how mating systems influence genetic structure, with selfers showing steeper differentiation gradients in fragmented habitats compared to the broader connectivity in animal outcrossers.

Genetic Distances Derived from FST

Autosomal Distances Using Classical Markers

In the pre-genomics era, autosomal genetic distances based on the fixation index (FST) were primarily derived from classical genetic markers, including ABO blood groups, human leukocyte antigen (HLA) loci, and electrophoretic variants of proteins such as enzymes and serum proteins. These markers, detectable through serological and electrophoretic techniques, provided data from hundreds of loci across diverse populations, enabling early estimates of population differentiation. Despite their limited resolution compared to modern methods, they captured substantial inter-population variation attributable to historical migration, drift, and selection.

A landmark study by Cavalli-Sforza, Menozzi, and Piazza (1994) synthesized data from more than 40 global populations using these classical markers, yielding an average FST of approximately 0.11. This value indicates that about 11% of the total genetic variation occurs between populations, with the remainder within them, highlighting moderate but structured differentiation consistent with human population history. The analysis incorporated over 120 such markers, emphasizing their role in revealing patterns of genetic affinity that align with broad geographic divisions.

To derive phylogenetic and clustering insights from these FST estimates, researchers applied specialized distance metrics. The chord distance, introduced by Cavalli-Sforza and Edwards (1967), maps the square roots of the allele frequencies of each population to a point on a unit hypersphere and computes the distance between two populations as the straight-line (chord) length between their points:
d_c = \sqrt{2 \left(1 - \sum_{i=1}^k \sqrt{p_i q_i}\right)}
where p_i and q_i are the allele frequencies in the two populations, and k is the number of alleles. This geometric approach proved effective for constructing trees that reflected evolutionary relationships. Alternatively, Nei's standard genetic distance (1972), adapted for FST, approximates the extent of allelic divergence as D = -\ln(1 - \bar{F}_{ST}), which is close to \bar{F}_{ST} itself when values are small, treating FST as a proxy for accumulated substitutions per locus.
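Both distance measures are easy to evaluate directly. The sketch below takes the chord length on the unit hypersphere as sqrt(2 (1 - Σ sqrt(p_i q_i))), where the sum is the cosine of the angle between the two populations' points. Some presentations rescale this by 2/π, and that scaling is omitted here. Function names are illustrative.

```python
from math import log, sqrt


def chord_distance(p, q):
    """Chord length between two populations' allele-frequency vectors."""
    cos_theta = sum(sqrt(pi * qi) for pi, qi in zip(p, q))
    cos_theta = min(1.0, cos_theta)  # guard against rounding slightly above 1
    return sqrt(2.0 * (1.0 - cos_theta))


def log_fst_distance(mean_fst):
    """D = -ln(1 - F_ST); close to F_ST itself when F_ST is small."""
    return -log(1.0 - mean_fst)


print(chord_distance([0.5, 0.5], [0.5, 0.5]))   # 0.0 for identical frequencies
print(round(chord_distance([1.0, 0.0], [0.0, 1.0]), 4))  # 1.4142, the maximum
print(round(log_fst_distance(0.11), 4))         # 0.1165
```

The chord distance is bounded above by sqrt(2), reached when the two populations share no alleles, whereas the logarithmic distance grows without bound as F_ST approaches 1.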
Analyses using these classical marker-derived distances consistently revealed continental-scale clustering in human populations, with principal component maps and neighbor-joining trees separating groups like Africans, Eurasians, and Oceanians, even before the advent of genomic sequencing. This pre-genomics evidence reinforced the moderate overall FST levels in human populations, as overviewed in studies of global differentiation.

Autosomal Distances Using SNPs

The analysis of autosomal genetic distances using single nucleotide polymorphisms (SNPs) has been pivotal in quantifying human population structure through the fixation index (F_ST), providing dense genome-wide data that surpass the resolution of earlier marker types. Early SNP datasets from the International HapMap Project, whose first phase was published in 2005, genotyped over 1 million common SNPs across four populations from three continents (Yoruba from West Africa, Utah residents of European ancestry, Han Chinese from Beijing, and Japanese from Tokyo), yielding average pairwise F_ST values of approximately 0.11 between Europeans and East Asians and 0.16 between Europeans and West Africans. These estimates, derived from autosomal SNPs, highlighted that about 12% of genetic variation occurs between continental groups, with the majority (88%) within populations. Subsequent expansions in HapMap Phase 3 incorporated additional populations, maintaining similar F_ST ranges while increasing SNP coverage to over 1.5 million markers.

The 1000 Genomes Project, culminating in its Phase 3 release, sequenced 2,504 individuals from 26 populations across five continental superpopulations (African, admixed American, East Asian, European, and South Asian), cataloging 84.7 million SNPs and indels. This dataset produced refined F_ST estimates of 0.106 between Europeans and East Asians and 0.139 between Europeans and West Africans, adjusted for rare variant biases using the Hudson estimator, underscoring a global F_ST of around 0.09–0.12 for most continental pairwise comparisons. The genome-wide density of SNPs enabled detection of fine-scale structure, such as elevated F_ST in isolated indigenous groups; some such groups show F_ST values of approximately 0.05 with South Asian populations and around 0.10 with Europeans, reflecting long-term isolation and drift.

A key advantage of SNP-based F_ST is its ability to reveal local adaptation through windowed scans, where F_ST is computed in sliding genomic windows (e.g., 50–100 kb) to identify outlier regions of elevated differentiation amid neutral background variation. These scans have pinpointed signals of selection, such as in genes related to pigmentation (e.g., SLC24A5) or lactase persistence (LCT), where local F_ST exceeds 0.3 in targeted windows between adapted populations. By averaging or maximizing F_ST across SNPs within windows, this approach outperforms single-locus estimates for detecting soft sweeps and polygenic adaptation, as demonstrated in simulations and empirical data.

Post-2015 studies integrating ancient DNA with modern datasets have illuminated temporal dynamics in F_ST, showing how migration and admixture altered differentiation over millennia. For example, ancient genomes from Europe indicate that F_ST between hunter-gatherers and early farmers was around 0.10–0.15, decreasing to modern levels (~0.05 intra-continental) due to subsequent admixture. In East Asia, Neolithic-era F_ST between northern and southern ancient populations was approximately 0.04 but declined further after the Neolithic, around 5,000–7,000 years ago, as evidenced by 26 ancient individuals, reflecting southward expansion and reduced isolation. These temporal shifts underscore how F_ST changes with demographic history, with ancient DNA providing calibration for interpreting contemporary SNP-derived distances.
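A windowed scan of this kind can be sketched as follows. The example assumes each SNP already comes with a per-site F_ST numerator and denominator (for instance from a Hudson-type or Weir-Cockerham calculation), uses non-overlapping fixed-size windows rather than sliding ones for brevity, and flags outliers with a fixed threshold. The function names and input values are hypothetical.

```python
from collections import defaultdict


def windowed_fst(sites, window_size):
    """Ratio-of-sums F_ST per non-overlapping window on one chromosome.

    sites: iterable of (position, numerator, denominator) for each SNP.
    Returns a dict mapping window start position to windowed F_ST.
    """
    sums = defaultdict(lambda: [0.0, 0.0])
    for pos, num, den in sites:
        start = (pos // window_size) * window_size
        sums[start][0] += num
        sums[start][1] += den
    # Windows whose denominators sum to zero carry no information and are skipped.
    return {start: n / d for start, (n, d) in sorted(sums.items()) if d > 0}


def outlier_windows(window_fst, threshold):
    """Start positions of windows whose F_ST exceeds the chosen threshold."""
    return [start for start, fst in window_fst.items() if fst > threshold]


# Made-up per-site components for five SNPs on one chromosome.
sites = [
    (1_000, 0.01, 0.4),
    (20_000, 0.02, 0.5),
    (60_000, 0.30, 0.5),
    (75_000, 0.20, 0.4),
    (120_000, 0.01, 0.3),
]
scan = windowed_fst(sites, window_size=50_000)
print(scan)
print(outlier_windows(scan, threshold=0.3))  # [50000]
```

In practice the threshold is usually set from the empirical genome-wide distribution, for example an upper percentile of window values, rather than a fixed number.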

Autosomal Distances Using Whole Exome Sequencing

Whole exome sequencing targets approximately 1-2% of the genome, focusing exclusively on protein-coding regions and adjacent splice sites, which enables precise estimation of F_ST-based genetic distances in functionally constrained areas. Studies utilizing large-scale exome datasets have revealed that average F_ST values in coding regions are notably lower, around 0.08 for distantly related populations such as Europeans and Africans, compared to synonymous sites serving as proxies for non-coding regions (F_ST ≈ 0.15). This reduction arises from purifying selection, which removes deleterious variants more efficiently across populations, thereby dampening differentiation signals in coding sequences.

In pharmacogenetically relevant genes, however, F_ST values can be elevated due to localized balancing or positive selection. For instance, variants in some pharmacogenes exhibit high differentiation, with F_ST up to 0.74 between African and European super-populations, reflecting adaptive differences between populations. Such patterns highlight how exome data uncover population-specific allele frequencies in clinically important loci, contrasting with broader SNP-based trends where differentiation is more uniform across the genome.

Methodologically, functional constraints in coding regions introduce biases in F_ST estimates, as purifying selection disproportionately affects rare, deleterious alleles, leading to underestimation of differentiation relative to neutral markers. This necessitates adjustments, such as filtering for synonymous variants or incorporating selection metrics like CADD scores, to interpret exome-derived distances accurately.

Post-2020 advances have integrated exome sequencing with whole-genome data to better delineate admixture patterns, enhancing resolution of ancestry proportions in diverse cohorts. For example, exome-based ancestry estimation in multi-ethnic patient groups has identified admixed profiles using tools like ADMIXTURE, revealing subtle gene flow histories that influence coding variant distributions.
These hybrid approaches mitigate exome's limited scope while leveraging its depth in coding regions for admixture-informed F_ST analyses.

Computational Tools

Standalone Programs

Arlequin is a standalone software package with a graphical user interface (GUI) designed for population genetics analyses, including the computation of F-statistics such as FST. Initially released in 1996 and significantly updated in version 3.5 in 2010 (with the latest version, 3.5.2.2, dating from 2015), it supports multi-locus datasets encompassing restriction fragment length polymorphisms (RFLPs), DNA sequences, and microsatellites, allowing users to estimate FST through methods like analysis of molecular variance (AMOVA). The software processes input files in a structured format and outputs pairwise FST values along with significance tests via permutation procedures, making it accessible for researchers without advanced programming skills.

Genepop, re-implemented in version 4.0 and subsequently updated through version 4.8.4 (as of August 2025), with a web interface dating from 2020, is a command-line standalone program for estimating FST and related measures such as rhoST, which suits stepwise mutation models. It performs exact tests for Hardy-Weinberg equilibrium, population differentiation, and genotypic disequilibrium while computing FST via unbiased estimators for multi-locus codominant data such as microsatellites. Genepop reads input files in its native format and provides options for pairwise comparisons, isolation-by-distance regressions, and tabular output for further analysis.

VCFtools, introduced in 2011, is a command-line toolkit specifically tailored for processing Variant Call Format (VCF) files and includes functionality for SNP-based FST estimation using the Weir and Cockerham method. Users supply population files listing individuals from the VCF to compute windowed or site-specific FST values between pairs of populations, enabling analysis of large-scale genomic data from next-generation sequencing. The tool writes FST summaries to text files, with options for filtering variants by quality or other criteria to refine estimates.

PLINK is a widely used command-line toolkit for whole-genome association and population genetic analyses, including FST estimation using the --fst flag for pairwise or multi-group comparisons on data in PLINK's own formats or VCF. With major updates in PLINK 2.0 (2019), it supports efficient processing of large genomic datasets and is particularly popular for human and non-human population studies as of 2025.

Despite their utility, these standalone programs have limitations in handling very large datasets, such as whole-genome VCF files exceeding millions of variants, because sequential processing can result in extended run times without built-in parallelization in core versions. For instance, VCFtools recommends subsetting by chromosome to manage memory and speed for massive inputs, while Arlequin's interface may constrain file sizes in certain formats, and Genepop's exact tests scale poorly with high locus counts. Recent updates have improved efficiency for moderate-scale analyses, but users often complement these tools with preprocessing scripts for genomic-scale FST computations.

Integrated Modules

Integrated modules are programmable libraries and packages that allow F_ST calculations to be incorporated into larger analytical workflows, facilitating automated and reproducible population genetic analyses. These tools, primarily in R and Python, allow researchers to process genetic data formats like VCF files and compute fixation indices within scripted pipelines, contrasting with standalone executables by emphasizing reproducibility and extensibility.

In R, the hierfstat package, introduced in 2005, provides functions for estimating hierarchical F-statistics, including F_ST, from haploid or diploid data across multiple levels, using algorithms based on variance partitioning. It supports significance testing via permutation and is suitable for datasets with complex structures, such as nested subpopulations. The poppr package, developed for populations with mixed sexual and clonal reproduction, extends F_ST-like measures such as G_ST (Nei's gene-diversity analogue of F_ST) to account for clonality, enabling bootstrap-supported estimates of differentiation in partially asexual organisms.

Python libraries support VCF-based F_ST computations through efficient data handling and statistical functions. The pysam library serves as a lightweight interface to HTSlib for reading and manipulating VCF files, providing the foundational input processing needed for downstream F_ST analyses in genomic pipelines. It can be combined with scikit-allel, which implements the Weir-Cockerham, Hudson, and Patterson methods for F_ST estimation from genotype and allele-count arrays derived from VCF data, enabling rapid calculation of variance components across large variant sets.

Bioconductor's SNPRelate package offers high-performance tools for genome-wide F_ST calculations on data stored in GDS format, using parallel computing to handle millions of markers and samples efficiently. The snpgdsFst function computes pairwise fixation indices via Weir-Cockerham estimators, making it well suited to large-scale studies.

These integrated modules provide key advantages, including scriptability for custom workflows and scalability in high-throughput analyses, as evidenced by their adoption in population genomic pipelines processing next-generation sequencing data. For instance, combining scikit-allel with VCF tools has streamlined F_ST scans in genomic studies, reducing manual intervention compared to programs like Arlequin.
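The input-processing step that such libraries handle can be illustrated with a standard-library sketch that turns VCF genotype columns into per-population allele counts, the raw material for any of the estimators above. This is not how pysam or scikit-allel work internally. The function name, the sample-to-population mapping, and the tiny in-memory VCF are hypothetical, and every non-reference allele is counted as a single alternate allele (a biallelic assumption).

```python
def parse_vcf_allele_counts(vcf_text, populations):
    """Count alternate alleles per population at each VCF record.

    populations: dict mapping sample name -> population label.
    Returns a list of (chrom, pos, {population: (alt_count, called_alleles)}).
    """
    samples, records = [], []
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information headers
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom, pos = fields[0], int(fields[1])
        gt_index = fields[8].split(":").index("GT")
        counts = {}
        for sample, entry in zip(samples, fields[9:]):
            pop = populations.get(sample)
            if pop is None:
                continue  # sample not assigned to any population
            genotype = entry.split(":")[gt_index].replace("|", "/")
            alt, called = counts.get(pop, (0, 0))
            for allele in genotype.split("/"):
                if allele == ".":
                    continue  # missing call
                called += 1
                alt += 1 if allele != "0" else 0
            counts[pop] = (alt, called)
        records.append((chrom, pos, counts))
    return records


header = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT",
          "s1", "s2", "s3", "s4"]
rows = [
    ["1", "100", ".", "A", "G", ".", "PASS", ".", "GT", "0/0", "0/1", "1/1", "1|1"],
    ["1", "200", ".", "C", "T", ".", "PASS", ".", "GT", "0/1", "./.", "0/0", "0/1"],
]
vcf_text = "\n".join(["##fileformat=VCFv4.2", "\t".join(header)]
                     + ["\t".join(row) for row in rows])
records = parse_vcf_allele_counts(vcf_text, {"s1": "A", "s2": "A", "s3": "B", "s4": "B"})
for record in records:
    print(record)
```

For real datasets a dedicated reader is preferable, since it handles compression, indexing, multi-allelic sites, and malformed records that this sketch ignores.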
