G*Power

G*Power is a free statistical software package designed for performing power analyses, sample size calculations, and effect size estimations across a variety of common statistical tests, including t-tests, F-tests, χ²-tests, z-tests, and some exact tests.^[1] Developed primarily for researchers in the social, behavioral, and biomedical sciences, it provides an intuitive graphical user interface that enables users to visualize power curves, conduct sensitivity analyses, and explore different analysis options without requiring advanced programming knowledge. The software is available for both Windows and macOS operating systems, with the latest versions being 3.1.9.7 for Windows and 3.1.9.6 for macOS 10.7 through 15, and it has been widely adopted due to its flexibility and ease of use in experimental design and hypothesis testing.^[1] Originally introduced in 1996 as a standalone power analysis tool by Edgar Erdfelder, Franz Faul, and Axel Buchner, G*Power evolved significantly with the release of version 3 in 2007, led by Franz Faul, Edgar Erdfelder, Albert-Georg Lang, and Axel Buchner at Heinrich Heine University Düsseldorf in Germany. This version expanded its capabilities to handle more complex analyses, such as those involving correlations and regressions, while incorporating improvements for accuracy and user-friendliness based on user feedback and methodological advancements. A subsequent update in 2009 further refined the program for specific test families, including correlation and regression analyses, ensuring it remains a robust tool for addressing underpowered studies—a common issue in psychological and biomedical research. 
Hosted and maintained by the Department of Experimental Psychology at Heinrich Heine University, G*Power is distributed under terms that prohibit commercial resale but encourage free academic and personal use.^[1] Key features of G*Power include built-in effect size calculators from various conventions (e.g., Cohen's d, f, or r), support for both a priori and post-hoc power calculations, and the generation of detailed plots to illustrate how power varies with factors like sample size and effect magnitude. It also integrates tutorial resources and a comprehensive manual to guide users through its functions, making it accessible to students, faculty, and professionals alike.^[1] The software's enduring popularity stems from its role in promoting rigorous statistical planning, as evidenced by its frequent citation in methodological guidelines and its integration into curricula for research design courses.

Development and History

Origins and Creators

G*Power was initially developed as a standalone statistical power analysis program by German psychologists Edgar Erdfelder, Franz Faul, and Axel Buchner in the early 1990s, with its foundational version originating in 1992 as a tool for Macintosh computers focused on a priori, post hoc, and compromise power analyses.^[2] The program's creation stemmed from the need to offer researchers in the social and behavioral sciences an accessible, interactive alternative to traditional power tables and charts, which were cumbersome and limited in flexibility for computing sample sizes, power values, and alpha/beta ratios across common statistical tests.^[3] At the time, existing software options lacked comprehensive support for compromise analyses and precise handling of noncentral distributions, prompting the developers to design GPOWER as a menu-driven application capable of high-precision calculations for t tests, F tests, and χ² tests on IBM-compatible PCs and Apple Macintosh systems.^[3]

The 1996 publication of GPOWER marked a significant milestone, formalizing the software as a general-purpose tool released as freeware to promote better statistical planning in behavioral research.^[3] Erdfelder was affiliated with the University of Bonn, Faul with the University of Kiel, and Buchner with the University of Trier during this period, reflecting a collaborative effort across German academic institutions to address the underutilization of power analysis due to the absence of user-friendly tools.^[3] This initial iteration emphasized graphical outputs to visualize relationships between power, sample size, and effect sizes, making complex computations more intuitive for non-experts while maintaining algorithmic accuracy through approximations of noncentral distributions.^[3]

Over time, the project evolved under ongoing contributions from the original creators, with the software later hosted by the Department of Psychology at Heinrich Heine University Düsseldorf, where Axel Buchner held a position.^[1] This institutional base facilitated subsequent enhancements, but the core origins trace back to the 1990s collaborations aimed at democratizing power analysis in empirical sciences.^[2]

Evolution of Versions

G*Power originated as a DOS-based and Macintosh-compatible program in the mid-1990s, with its initial major release documented in 1996 by developers Edgar Erdfelder, Franz Faul, and Axel Buchner, marking a shift toward a general stand-alone power analysis tool for common statistical tests in behavioral research. This early iteration, referred to as G*Power 2.0 in some contexts, supported basic platforms including DOS and Mac OS 7-9, and introduced three types of power analyses: a priori, post hoc, and compromise, while providing limited graphical output capabilities.

A significant evolution occurred with the release of G*Power 3.0 in 2007, a major overhaul that moved the software from its DOS-based roots to a fully graphical user interface (GUI) architecture, enhancing accessibility and usability. This version improved algorithmic precision for power calculations, expanded cross-platform compatibility to include Windows XP/Vista and Mac OS X 10.4, and added support for five types of analyses, including new sensitivity and criterion options, as detailed in its seminal publication. The overhaul addressed limitations in earlier versions by incorporating more robust numerical methods and graphical features for visualizing power curves.

In 2009, G*Power 3.1 extended the software's capabilities with dedicated modules for correlation and regression analyses, including multiple linear regression, logistic regression, and Poisson regression, building on the foundational improvements of version 3.0 to support a broader range of advanced statistical procedures.
Subsequent updates focused on refinement rather than wholesale redesign; for instance, version 3.1.9.2 in 2014 fixed a bug in the χ² goodness-of-fit test where the power was incorrectly set to 1-β instead of β.^[1] Further enhancements came in version 3.1.9.7, released on March 17, 2020, which improved plotting functionalities to handle ranges of input values and noncentrality parameters more dynamically.^[1] As of November 2025, no major version releases have occurred since 2020, though the software receives ongoing minor patches for compatibility with evolving operating systems, maintaining its availability and functionality through the Heinrich Heine University Düsseldorf website.^[1] This steady maintenance underscores G*Power's enduring role as a reliable, free tool, with the GUI-driven architecture established in version 3.0 continuing to define its modern operation across Windows and macOS platforms.

Core Functionality

Supported Statistical Tests

G*Power supports a wide array of statistical tests for power analysis, primarily focusing on common parametric and non-parametric procedures used in behavioral, social, and biomedical research. The software categorizes these into t-tests, F-tests, χ²-tests, z-tests, and exact tests, with each category offering variants tailored to specific research designs. All tests accommodate a priori, post-hoc, and compromise power analyses, allowing users to estimate sample sizes, compute achieved power, or balance Type I and Type II error rates based on specified effect sizes, alpha levels, and other parameters.

For t-tests, G*Power provides comprehensive support for comparing means, including one-sample tests to assess differences from a constant (H₀: μ = μ₀), independent samples tests for two groups (H₀: μ₁ = μ₂) with options to assume equal or unequal variances through effect size adjustments like Cohen's d, and paired samples tests for dependent means (H₀: μ_d = 0) using effect size d_z. Additional t-test variants include point-biserial correlations (testing ρ = 0 between a dichotomous and continuous variable) and linear regression slope tests (one group or two groups for slope equality). These implementations draw from Cohen's frameworks for effect size standardization, enabling precise power calculations via noncentral t-distributions.

The F-tests category encompasses analyses of variance and regression, supporting fixed-effects one-way ANOVA for k independent groups (H₀: all μ_i equal), factorial ANOVA for main effects and interactions in multi-factor designs, and repeated measures ANOVA variants for within-subjects, between-subjects, or mixed effects (including between-within interactions). ANCOVA is handled through specialized F-tests adjusting for covariates, while multiple regression options include omnibus tests for R² deviation from zero (fixed model) and incremental tests for R² increases with added predictors. Multivariate extensions cover MANOVA global effects, special effects, and Hotelling's T² for mean vector comparisons (one or two groups), with effect sizes like f or f². Variance equality tests between two groups are also included. These tests utilize noncentral F-distributions for power computation.

χ²-tests in G*Power facilitate goodness-of-fit assessments for single or multiple proportions (e.g., contingency tables) and independence tests in categorical data, with support for McNemar's test as a variant for paired nominal data (testing marginal homogeneity). Effect sizes are based on standardized differences, and power is derived from noncentral χ²-distributions.

Z-tests cover proportion comparisons, such as differences between two independent proportions (with pooled or unpooled variance options and continuity corrections), and correlation tests, including tetrachoric models for dichotomous variables and differences between two Pearson r's (independent or dependent, with or without common variables). Logistic and Poisson regression z-tests (Wald or likelihood ratio) assess single or multiple predictors' effects on binary or count outcomes, using odds ratios or rate ratios as effect sizes. These rely on normal approximations for large samples.

Exact tests address small-sample scenarios without asymptotic approximations, including the binomial test for single proportions (difference from constant), the sign test (the special binomial case with p = 0.5), Fisher's exact test and an unconditional exact test for two independent proportions, and McNemar's exact test for dependent proportions. Power for these is computed from the exact distributions, which avoids the inaccuracy of asymptotic approximations when n is small.

Later versions, such as G*Power 3.1, expanded support to include advanced correlation models like tetrachoric correlations and generalized linear models (logistic and Poisson regression), enhancing applicability to non-normal data without introducing equivalence tests or nonlinear regression capabilities.^[4]
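The exact-test idea described above can be illustrated with a minimal sketch for the one-sided binomial test. This is not G*Power's implementation; the function names, the upper-tail alternative, and the example values (n = 20, p0 = 0.5, p1 = 0.8) are assumptions chosen for illustration. The sketch finds the smallest critical count whose tail probability under H0 stays at or below α, then evaluates the same tail under the alternative.

```python
from math import comb


def binom_upper_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed exactly term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def exact_binomial_power(n: int, p0: float, p1: float, alpha: float = 0.05):
    """Upper-tail exact binomial test of H0: p = p0 against an alternative p1 > p0.

    Returns (critical count, actual alpha, power). Hypothetical helper,
    not part of any G*Power API.
    """
    # Smallest critical count whose tail probability under H0 does not exceed alpha.
    # For c = n + 1 the tail is empty (probability 0), so the loop always terminates.
    for c in range(n + 2):
        actual_alpha = binom_upper_tail(n, c, p0)
        if actual_alpha <= alpha:
            break
    # Power is the probability of reaching the critical count under H1.
    power = binom_upper_tail(n, c, p1)
    return c, actual_alpha, power


critical, actual_alpha, power = exact_binomial_power(n=20, p0=0.5, p1=0.8)
print(critical, round(actual_alpha, 4), round(power, 4))
```

With these illustrative inputs the critical count is 15 and the power is roughly 0.80. Because the binomial distribution is discrete, the actual α (about 0.021) falls below the nominal 0.05, which is one reason exact and asymptotic power values can differ in small samples.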

Power Analysis Methods

Statistical power in G*Power is defined as the probability (1 - β) of correctly rejecting the null hypothesis when it is false, representing the likelihood of detecting a true effect of a specified size.^[4] This calculation relies on noncentral distributions under the alternative hypothesis (H1), such as the noncentral t, F, or chi-square distributions, which shift from the central distributions assumed under the null hypothesis (H0) based on a noncentrality parameter that quantifies the effect's magnitude.^[2] Key parameters in these power analyses include the effect size (e.g., Cohen's d for t-tests or f for F-tests), the significance level α (typically 0.05), the sample size n, and degrees of freedom specific to the test.^[4] Effect sizes follow Cohen's conventions, such as small (d = 0.2, f = 0.10), medium (d = 0.5, f = 0.25), and large (d = 0.8, f = 0.40), providing standardized measures of practical significance.^[2] G*Power supports several analysis types for power computation: a priori analyses determine required sample size given a target power, effect size, and α; post-hoc analyses estimate achieved power given an observed effect size, α, and sample size; compromise analyses balance Type I (α) and Type II (β) error probabilities; criterion analyses determine the significance level α given power, effect size, and sample size; and sensitivity analyses identify the minimum detectable effect size given power, α, and sample size.^[4] The software employs algorithms based on numerical integration for exact power calculations, utilizing libraries like DCDFLIB to evaluate cumulative distribution functions of noncentral distributions and generate power curves.^[2] For more complex tests, such as repeated measures ANOVA, approximation methods are applied, including large-sample normal approximations or continuity corrections where exact solutions are computationally intensive.^[4] Power is formally expressed as:

\text{Power} = 1 - \beta = P(T \geq t_{\alpha} \mid H_1)

where T is the test statistic under H1, and t_{\alpha} is the critical value from the central distribution at level α; β denotes the Type II error rate.^[4] The noncentrality parameter drives this computation. For F-tests it is given by \lambda = f^2 N, where f is the effect size and N is the total sample size in balanced designs, so the squared effect is scaled by sample size to capture the deviation from H0. The expected value of the noncentral F distribution, E(F) = \frac{df_2}{df_2 - 2} \left(1 + \frac{\lambda}{df_1}\right) for df_2 > 2, grows with \lambda, and power is the tail probability of that distribution beyond the critical value.^[2]

For handling multiple comparisons, G*Power incorporates planned contrasts in ANOVA and MANOVA via contrast matrices, allowing users to specify adjusted α levels or error probabilities to control family-wise error rates without inflating Type II errors.^[4]

Cluster randomization is addressed indirectly by adjusting the effective sample size for intraclass correlation (ICC); users compute the design effect as 1 + (m-1)\rho (where m is the average cluster size and ρ is the ICC) and divide the nominal sample size by this factor before entering it into standard tests, so that power accounts for clustering-induced variance inflation.^[5]
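The arithmetic behind the noncentrality parameter and the design-effect adjustment can be sketched in a few lines of Python. The function names and the example values (f = 0.25, N = 128, four groups, clusters of 20 with an ICC of 0.05) are hypothetical and serve only to make the formulas concrete; the sketch does not evaluate the noncentral F tail probability itself.

```python
def noncentrality_f(f: float, total_n: int) -> float:
    """lambda = f^2 * N for a balanced fixed-effects design."""
    return f**2 * total_n


def expected_noncentral_f(df1: int, df2: int, lam: float) -> float:
    """E(F) = df2 / (df2 - 2) * (1 + lambda / df1), defined only for df2 > 2."""
    if df2 <= 2:
        raise ValueError("the mean of the noncentral F requires df2 > 2")
    return df2 / (df2 - 2) * (1 + lam / df1)


def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Design effect 1 + (m - 1) * rho for cluster-randomized data."""
    return 1 + (mean_cluster_size - 1) * icc


def effective_sample_size(nominal_n: float, mean_cluster_size: float, icc: float) -> float:
    """Nominal sample size divided by the design effect."""
    return nominal_n / design_effect(mean_cluster_size, icc)


# Illustrative values (assumptions): medium effect, 4 groups, N = 128.
lam = noncentrality_f(f=0.25, total_n=128)          # 0.0625 * 128
mean_f = expected_noncentral_f(df1=3, df2=124, lam=lam)
n_eff = effective_sample_size(nominal_n=400, mean_cluster_size=20, icc=0.05)
print(lam, round(mean_f, 3), round(n_eff, 1))
```

In this example a nominal sample of 400 in clusters of 20 behaves like roughly 205 independent observations, and it is that smaller number that would be entered into a standard test.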

User Interface and Operation

Graphical Interface Features

G*Power's graphical user interface centers on a main window that streamlines power analysis setup through intuitive selection mechanisms. Users begin by accessing a "Test family" dropdown menu to choose from categories such as F tests, t tests, or χ² tests, which then populates a secondary "Statistical test" menu with relevant options like "Means: Difference between two independent means (two groups)."^[4] A central "Type of power analysis" menu or set of radio buttons allows selection among five modes: a priori, compromise, criterion, post-hoc, and sensitivity, dynamically adjusting the input and output parameter fields accordingly.^[4] In the lower left section, dedicated input fields capture essential parameters, including effect size (e.g., Cohen's d or f²), α error probability, desired power (1-β), sample size, and test-specific details like degrees of freedom.^[4]

Output visualization is integrated directly into the interface for immediate feedback, with the upper main window displaying overlaid probability distributions of the test statistic under the null hypothesis (H0) and alternative hypothesis (H1), including the decision criterion line and shaded regions for Type I and Type II error probabilities.^[4] For exploratory analysis, a dedicated "X-Y plot for a range of values" button opens a separate window generating real-time line graphs, such as power as a function of sample size or effect size across specified ranges, complete with right-click options to copy the image, save it as a file, print, or export underlying data to a table.^[4]

The menu structure supports advanced customization and workflow efficiency, with an "Options" button revealing dialogs for test-specific criteria, such as alpha balancing in two-sided tests or accrual intervals in sequential designs.^[4] A "Protocol of power analyses" tab maintains a log of all performed calculations, enabling users to clear, save to file, print, or copy the protocol for documentation.^[4] Batch processing is accommodated via the built-in calculator's scripting support for evaluating multiple expressions sequentially.^[4]

Cross-platform consistency ensures a uniform experience across operating systems, as the Windows and macOS versions of G*Power share an identical layout and toolbar for quick-access functions like calculate and plot generation, with no reported GUI variances.^[4]^[1]

Accessibility features promote usability for non-experts, including tooltips that appear on hover over input fields to explain parameters and display standard conventions, such as Cohen's guidelines for effect sizes (e.g., small d = 0.2).^[4] Built-in error checking validates entries in real time, issuing alerts for invalid inputs like negative sample sizes or implausible correlations (e.g., |ρ - ρ0| ≤ ε), and the ESC key allows aborting lengthy computations.^[4]

Input and Output Options

G*Power allows users to specify inputs through two primary modes: distribution-based and design-based. In the distribution-based mode, users select a test family (such as F, t, χ², z, or exact tests) and enter parameters directly, including effect size (e.g., Cohen's d or f), significance level (alpha, typically 0.05), and desired power (e.g., 0.80). The design-based mode enables specification of study characteristics, such as the number of groups or type of means comparison, facilitating more tailored analyses. Manual entry is supported for all parameters, with options to fix certain values (e.g., alpha and power) while varying others (e.g., sample size) for sensitivity analyses.^[4] For advanced inputs, G*Power accommodates multilevel and complex designs, including factorial ANOVA with multiple groups (e.g., 2×3×4 designs specifying up to 24 groups) and repeated measures setups that account for the number of measurements per subject. Nonsphericity corrections, such as Greenhouse-Geisser or Huynh-Feldt epsilon adjustments, are available for repeated measures ANOVA to handle violations of sphericity assumptions. These features allow precise modeling of hierarchical or within-subjects data structures.^[4] Outputs in G*Power are presented in numerical tables displaying key results, such as achieved power values, required sample sizes (n), and critical values, alongside graphical plots (e.g., power curves versus sample size or effect size). Plots and tables can be exported in formats including BMP, PDF, and enhanced metafiles for integration into reports. Additionally, text-based protocols detail all input parameters, assumptions, and computation steps.^[4] The protocol feature automatically records the full analysis setup upon each calculation, enabling reproducibility by saving inputs, options, and outputs to a text file; for exact tests, it includes the random seed to ensure identical results upon recalculation. 
This documentation supports auditing and sharing of power analyses in research workflows.^[4] Error handling in G*Power includes warnings for scenarios such as low achieved power (e.g., below typical thresholds like 0.80) or infeasible solutions (e.g., negative sample sizes or invalid parameter combinations like non-positive definite correlation matrices), prompting users to adjust inputs for viable designs. These alerts help prevent underpowered studies during planning.^[4]

Applications and Impact

Use in Research Fields

G*Power is primarily utilized for a priori sample size planning in psychological experiments, enabling researchers to determine the necessary number of participants to detect effects of interest with adequate statistical power. For instance, in designing a two-group independent t-test to detect a medium effect size (Cohen's d = 0.5) at an alpha level of 0.05 and 80% power, G*Power calculates a required sample size of 128 participants (64 per group), ensuring the study is sufficiently powered to identify meaningful differences without excessive Type II errors.^[6]^[7] This application is central to experimental psychology, where it supports planning for t-tests, ANOVA, and correlation analyses commonly employed in cognitive and behavioral studies.^[1]

Beyond psychology, G*Power finds extensive use in clinical trials for power calculations involving chi-square tests to evaluate treatment outcomes, such as comparing categorical response rates between intervention and control groups. In the social sciences, it facilitates ANOVA-based power analyses for survey data, helping researchers assess group differences in attitudes or behaviors across multiple conditions. Similarly, in education research, G*Power aids in regression analyses to identify predictors of academic performance, allowing for sample size determination based on expected effect sizes from multiple variables.^[7]^[8]^[9]

The software's integration into research workflows is evident in its frequent citation within methods sections of published papers, reflecting its role in justifying study designs. The seminal G*Power 3 paper by Faul et al. (2007) is highly cited, underscoring its widespread adoption across disciplines.^[6] Additionally, with millions of downloads worldwide, G*Power serves an educational purpose, commonly taught in statistics courses for the behavioral sciences to instruct students on power analysis principles and practical application.^[2]^[10]
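The two-group example above can be cross-checked with a rough sketch based on the large-sample normal approximation. This is an assumption made for illustration and is not the noncentral t computation that G*Power performs; the function name is hypothetical. The approximation uses n per group ≈ 2 (z_{1-α/2} + z_{1-β})² / d².

```python
from math import ceil
from statistics import NormalDist


def n_per_group_normal_approx(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    Uses standard normal quantiles in place of the noncentral t distribution,
    so the result can be slightly smaller than an exact calculation.
    """
    z = NormalDist()                      # standard normal, mean 0 and sd 1
    z_alpha = z.inv_cdf(1 - alpha / 2)    # two-sided critical value
    z_beta = z.inv_cdf(power)             # quantile matching the target power
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)


print(n_per_group_normal_approx(0.5))
```

For d = 0.5 the approximation yields 63 per group, one fewer than the 64 per group cited above; the small gap reflects the extra uncertainty from estimating the variance, which the t-based calculation accounts for and the normal approximation ignores.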

Limitations and Alternatives

While G*Power provides robust support for classical frequentist power analyses, it has notable limitations in handling advanced statistical paradigms. The software does not support Bayesian power analysis, which is increasingly used in fields requiring probabilistic inference under uncertainty, necessitating separate tools for such computations. Similarly, it lacks integration with machine learning models for power calculations in predictive analytics contexts. G*Power primarily assumes normality in its parametric tests without built-in options for robust or non-parametric alternatives, potentially leading to inaccurate estimates in skewed data distributions. Furthermore, it offers no native simulation capabilities for complex experimental designs, such as those involving interactions beyond basic factorial structures, requiring users to resort to programming environments for Monte Carlo simulations.

Platform-specific constraints also affect usability. The Mac version (3.1.9.6) trails the Windows version (3.1.9.7) in updates and has historically encountered bugs, such as crashes and interface glitches, though many have been addressed in patches. As of 2025, no mobile or web-based versions exist, limiting accessibility for users without desktop environments. Development has seen no major feature updates since around 2020, with minor bug fixes continuing as of August 2025, leaving G*Power without extensions for emerging tests like advanced multilevel modeling, which accounts for hierarchical data structures common in social sciences and education research. This omission can hinder power planning for clustered or longitudinal designs.^[11]

Alternatives address many of these gaps, often at the cost of G*Power's hallmark simplicity. The commercial PASS software supports over 1,200 scenarios, including more advanced equivalence tests and survival analyses, making it suitable for comprehensive clinical trial planning. For scriptable, open-source options, the R package pwr enables basic power calculations for t-tests, correlations, and ANOVA via flexible functions, ideal for integration into reproducible workflows. The R package Superpower specializes in simulation-based power analysis for factorial designs, allowing empirical evaluation of complex interactions and non-sphericity corrections that G*Power cannot handle natively. While G*Power remains preferred for quick, user-friendly computations in standard scenarios, these alternatives provide greater flexibility for sophisticated or customizable analyses.