Fisher transformation

The Fisher transformation, also known as the Fisher z-transformation, is a statistical method introduced by Ronald A. Fisher in 1915 for normalizing the sampling distribution of the Pearson product-moment correlation coefficient (r), converting its bounded and skewed distribution into an approximately normal one that is no longer confined to ±1. This transformation facilitates reliable inference on population correlations (ρ) by stabilizing variance and enabling the application of standard normal theory for tasks such as hypothesis testing and confidence interval construction, particularly useful when sample sizes are moderate or correlations are near the extremes of 0 or ±1. The formula for the transformation is z_r = \frac{1}{2} \ln \left( \frac{1 + r}{1 - r} \right), equivalent to the inverse hyperbolic tangent function \tanh^{-1}(r), where \ln denotes the natural logarithm. Under the null hypothesis of no correlation (ρ = 0), \sqrt{n-3}\, z_r asymptotically follows a standard normal distribution; more generally, the expected value of z_r is approximately \frac{1}{2} \ln \left( \frac{1 + \rho}{1 - \rho} \right) with variance approximately \frac{1}{n-3} for sample size n > 3, making it especially effective for large samples where the approximation improves. The inverse transformation, r = \frac{e^{2z_r} - 1}{e^{2z_r} + 1}, allows recovery of the original correlation scale when needed. Beyond basic inference, the Fisher transformation plays a key role in advanced applications, including comparing correlations across independent samples via z-tests and meta-analyzing effect sizes from multiple studies by averaging transformed coefficients to account for varying precisions. It is implemented in statistical software such as R and SAS for robust correlation analysis, though care must be taken with small samples or near-perfect correlations, where the approximation may falter and alternative methods are sometimes required.

Mathematical Foundations

Definition

The Fisher transformation, also known as the Fisher z-transformation, applies to the Pearson correlation coefficient to map it onto an unbounded scale. For a sample correlation r (where |r| < 1), the transformation is defined as z = \artanh(r) = \frac{1}{2} \ln \left( \frac{1 + r}{1 - r} \right). This formula, introduced by Ronald Fisher, converts the bounded correlation value into a variable with an approximately normal distribution for large samples. The inverse transformation recovers the original correlation from the transformed value: r = \tanh(z) = \frac{e^{2z} - 1}{e^{2z} + 1}. The domain of the transformation is r \in (-1, 1), which maps bijectively to z \in (-\infty, \infty), thereby linearizing the nonlinear scale of the correlation coefficient and facilitating statistical analysis. Standard notation distinguishes the population correlation coefficient \rho from the sample estimate r, with the transformation typically applied to r.
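
As a minimal illustration, both directions are single calls in NumPy; the names fisher_z and inverse_fisher_z below are illustrative, not library functions:

```python
import numpy as np

def fisher_z(r):
    """Fisher z-transformation: maps r in (-1, 1) onto the real line."""
    return np.arctanh(r)  # equivalent to 0.5 * np.log((1 + r) / (1 - r))

def inverse_fisher_z(z):
    """Inverse transformation: maps z back to the correlation scale."""
    return np.tanh(z)  # equivalent to (np.exp(2*z) - 1) / (np.exp(2*z) + 1)

r = 0.8
z = fisher_z(r)
print(z)                    # ~1.0986, i.e. ln(3)
print(inverse_fisher_z(z))  # recovers 0.8
```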

Derivation

The derivation of the Fisher transformation relies on the asymptotic properties of the sample Pearson correlation coefficient r, computed from a random sample of size n drawn from a bivariate normal population with true correlation \rho. Under these conditions, the asymptotic distribution of r is given by \sqrt{n} (r - \rho) \xrightarrow{d} N\left(0, (1 - \rho^2)^2\right) \quad \text{as} \quad n \to \infty. This result follows from the central limit theorem applied to the moments of the bivariate normal variables, accounting for the dependence between the sample means and variances in the correlation formula. The distribution is skewed when \rho \neq 0, and its variance (1 - \rho^2)^2 / n depends on the unknown \rho, which hinders direct normal-based inference for moderate sample sizes. To mitigate skewness and stabilize the variance, the transformation z = \artanh(r) = \frac{1}{2} \ln \left( \frac{1 + r}{1 - r} \right) is applied, where \artanh denotes the inverse hyperbolic tangent (as defined in the preceding section). The choice of the inverse hyperbolic tangent arises from a first-order Taylor series expansion of the sampling distribution of r around \rho, or equivalently, from the delta method for functions of asymptotically normal estimators. Specifically, let g(\rho) = \artanh(\rho); then g'(\rho) = \frac{1}{1 - \rho^2}. Applying the delta method yields \sqrt{n} \left( g(r) - g(\rho) \right) \xrightarrow{d} N\left(0, [g'(\rho)]^2 (1 - \rho^2)^2 \right) = N\left(0, 1\right), so z is approximately normal with mean \artanh(\rho) and variance 1/n, now independent of \rho. This transformation removes the leading-order skewness term in the expansion of r's distribution and equalizes the variance across different values of \rho. A refined finite-sample approximation replaces the asymptotic variance 1/n with 1/(n-3), derived from higher-order terms in the series expansion of the exact distribution of r under bivariate normality; this adjustment accounts for the degrees of freedom lost in estimating the means and variances. The overall derivation assumes that the underlying data are bivariate normal and that n is large enough (at minimum n > 3, so that the variance 1/(n-3) is defined) for the central limit theorem and Taylor approximations to apply effectively, ensuring the transformed z closely follows a normal distribution for inference purposes.
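
The delta-method step can be checked symbolically; a short sketch assuming SymPy, confirming that [g'(\rho)]^2 (1 - \rho^2)^2 reduces to 1:

```python
import sympy as sp

rho = sp.symbols('rho')
g = sp.atanh(rho)           # the transformation g(rho) = artanh(rho)
g_prime = sp.diff(g, rho)   # derivative: 1 / (1 - rho**2)

# Delta-method asymptotic variance of sqrt(n) * (g(r) - g(rho)):
# [g'(rho)]^2 * (1 - rho^2)^2, which should simplify to a constant.
print(sp.simplify(g_prime**2 * (1 - rho**2)**2))  # prints 1, independent of rho
```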

Statistical Properties

Distributional Characteristics

The Fisher z-transformation, defined as z = \artanh(r) where r is the sample correlation coefficient, yields a statistic that is approximately normally distributed under the assumption of bivariate normality in the population. Specifically, for large sample sizes n, z follows approximately \mathcal{N}(\artanh(\rho), 1/(n-3)), where \rho is the population correlation coefficient. This transformation substantially reduces the skewness and kurtosis present in the sampling distribution of r, which is notably asymmetric and bounded between -1 and 1, particularly when |\rho| is not close to zero. By mapping r to an unbounded scale, the higher-order moments of z exhibit much less dependence on \rho, resulting in a more symmetric and normal-like distribution compared to r. In finite samples, particularly when n < 30, the distribution of z displays a slight positive bias in its mean estimate, though this bias is generally small and diminishes as n increases or when \rho is near zero; the normality approximation performs best with large n and moderate |\rho|. For more precise approximations in finite samples, Edgeworth series expansions have been developed to describe the exact distribution of z, incorporating corrections for skewness and kurtosis beyond the normal approximation, as detailed in early work by Gayen.
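
A Monte Carlo sketch (assuming NumPy and SciPy) illustrating the skewness reduction for a moderately large \rho:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, rho, reps = 30, 0.8, 20_000
cov = [[1.0, rho], [rho, 1.0]]

# Sample r repeatedly from a bivariate normal population with correlation rho
r_samples = np.empty(reps)
for i in range(reps):
    x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    r_samples[i] = np.corrcoef(x, y)[0, 1]
z_samples = np.arctanh(r_samples)

print(stats.skew(r_samples))  # markedly negative when rho = 0.8
print(stats.skew(z_samples))  # much closer to zero
```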

Variance Stabilization

The Fisher transformation achieves variance stabilization for the sample correlation coefficient r by applying the function z = \frac{1}{2} \ln \left( \frac{1 + r}{1 - r} \right), resulting in an approximate variance for z that is nearly constant across different values of the population correlation ρ. Specifically, the asymptotic variance is given by \operatorname{Var}(z) \approx \frac{1}{n - 3}, where n is the sample size, and this expression is independent of ρ to first order. This contrasts sharply with the variance of r, which depends strongly on ρ as \operatorname{Var}(r) \approx \frac{(1 - \rho^2)^2}{n - 1}. The stabilization arises because the transformation stretches the distribution of r near |ρ| = 1, where the variance of r is smallest, thereby balancing the variability across the range of possible correlations. The standard error of the transformed value is thus SE(z) = \frac{1}{\sqrt{n - 3}}, providing a simple and consistent measure of precision that does not require knowledge of ρ. In comparison, the standard error of the untransformed r is SE(r) = \sqrt{\frac{1 - r^2}{n - 2}}, which varies with the observed r and hence with ρ, making it less reliable for inference when ρ is unknown or extreme. This non-constant nature of SE(r) can lead to distorted confidence intervals or test statistics, particularly when |ρ| is close to 1, where SE(r) becomes very small. The Fisher transformation mitigates this by rendering the standard error approximately uniform, facilitating more robust statistical procedures. For moderate sample sizes, the first-order approximation \frac{1}{n - 3} may exhibit slight dependence on ρ, and higher-order corrections can improve accuracy by incorporating additional terms dependent on ρ and n. These refinements account for residual variability influenced by both n and ρ, with the dependence diminishing as n increases.
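
The variance-stabilization claim can be checked by simulation; a sketch assuming NumPy, comparing Var(r), Var(z), and 1/(n-3) across several values of \rho:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 50, 20_000

for rho in (0.0, 0.5, 0.9):
    cov = [[1.0, rho], [rho, 1.0]]
    r = np.array([
        np.corrcoef(*rng.multivariate_normal([0.0, 0.0], cov, size=n).T)[0, 1]
        for _ in range(reps)
    ])
    print(f"rho={rho}: Var(r)={r.var():.5f}, "
          f"Var(z)={np.arctanh(r).var():.5f}, 1/(n-3)={1/(n-3):.5f}")
# Var(r) shrinks as |rho| grows, while Var(z) stays near 1/(n-3) ~ 0.02128
```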

Applications in Statistics

Hypothesis Testing for Correlations

The Fisher transformation facilitates hypothesis testing for the population correlation coefficient ρ by normalizing the skewed sampling distribution of the sample correlation r, enabling the use of standard normal approximations for test statistics. For the common null hypothesis H₀: ρ = 0, the test statistic is defined as Z = \sqrt{n - 3} \cdot z, where z = \tanh^{-1}(r) is the Fisher-transformed value of the sample correlation r based on n observations, and \tanh^{-1} denotes the inverse hyperbolic tangent function. Under H₀ and bivariate normality, Z approximately follows a standard normal distribution N(0,1), allowing the two-sided p-value to be computed as 2(1 - \Phi(|Z|)), with Φ the cumulative distribution function of the standard normal. For H₀: ρ = 0, the exact t-test (t = r √((n-2)/(1-r²)) ~ t_{n-2}) is preferred under bivariate normality, while the z-test provides an asymptotic alternative useful for large n or when extending to H₀: ρ = ρ₀ ≠ 0. To test H₀: ρ = ρ₀ for ρ₀ ≠ 0, the statistic is modified to account for the non-zero null value: Z = \sqrt{n - 3} \left( z - \tanh^{-1}(\rho_0) \right). Under the null, Z again approximates N(0,1), providing a straightforward normal test for deviations from any specified ρ₀. This adjustment is particularly useful in comparative studies or when prior information suggests a non-zero correlation, maintaining the transformation's stabilizing properties while shifting the expected value under the null. The p-value is calculated similarly using the standard normal tails. The transformation improves test power, especially when the true |ρ| approaches 1, where the distribution of r becomes highly asymmetric and bounded. This stems from the near-constant variance of z, which enhances the efficiency of the normal approximation even when ρ is extreme. These tests assume the underlying data follow a bivariate normal distribution to ensure the asymptotic normality of z. Violations of this assumption can lead to distorted p-values and reduced power in small samples (n < 30), but the procedure demonstrates robustness to moderate non-normality in large samples (n > 50), where the central limit theorem supports the normal approximation regardless of marginal distributions. For severe non-normality, alternative methods like Spearman rank correlations or permutation tests may be preferable to maintain validity. Under non-normality, the Fisher z-test can offer advantages over the t-test in controlling Type I error and power, as shown in simulations for various distributions.
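
A minimal sketch of the z-test described above, assuming NumPy and SciPy (the function name fisher_z_test is illustrative):

```python
import numpy as np
from scipy import stats

def fisher_z_test(r, n, rho0=0.0):
    """Two-sided z-test of H0: rho = rho0 via the Fisher transformation."""
    z_stat = np.sqrt(n - 3) * (np.arctanh(r) - np.arctanh(rho0))
    p_value = 2 * stats.norm.sf(abs(z_stat))  # 2 * (1 - Phi(|Z|))
    return z_stat, p_value

# Observed r = 0.55 from n = 80 pairs, testing H0: rho = 0.3
z_stat, p = fisher_z_test(0.55, 80, rho0=0.3)
print(z_stat, p)  # Z ~ 2.71, p ~ 0.007
```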

Confidence Intervals for Correlations

The Fisher z-transformation provides a practical method for constructing confidence intervals for the population correlation coefficient ρ, leveraging the approximate normality of the transformed variable and its stabilized variance of 1/(n-3) for large samples. To form the interval, first compute z from the sample correlation r using z = artanh(r), then apply the normal approximation to obtain bounds around z, and finally back-transform these bounds to the correlation scale. The confidence interval for z at level (1-α) is given by: z \pm z_{\alpha/2} \cdot \frac{1}{\sqrt{n-3}} where z_{\alpha/2} is the (1-α/2) quantile of the standard normal distribution (e.g., 1.96 for α=0.05). The endpoints of this interval, denoted z_L and z_U, are then back-transformed to the correlation scale using the hyperbolic tangent function: r_L = \tanh(z_L), \quad r_U = \tanh(z_U) This yields an asymmetric interval (r_L, r_U) on the original scale, reflecting the bounded and skewed nature of the sampling distribution of r. Consider an example with sample correlation r = 0.5 and sample size n = 100 for a 95% confidence interval (α=0.05). First, compute z = artanh(0.5) = 0.5 \ln\left(\frac{1+0.5}{1-0.5}\right) = 0.5 \ln(3) \approx 0.5493. The standard error is 1/\sqrt{97} \approx 0.1015, so the interval for z is 0.5493 \pm 1.96 \times 0.1015 \approx (0.3503, 0.7483). Back-transforming gives r_L = \tanh(0.3503) \approx 0.337 and r_U = \tanh(0.7483) \approx 0.634, resulting in the asymmetric 95% confidence interval (0.337, 0.634) for ρ. The interval extends farther below r = 0.5 than above it, because the back-transformation compresses the scale as correlations approach ±1. For small sample sizes (n < 30), the normal approximation may underperform because the distribution of z deviates from normality, leading to inadequate coverage; in such cases, alternatives like t-distribution-based intervals with df = n-3 or nonparametric bootstrap methods (e.g., percentile or bias-corrected and accelerated) are recommended for better accuracy, especially under non-normality.
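
The same recipe in code; a sketch assuming NumPy and SciPy that reproduces the worked example (the function name correlation_ci is illustrative):

```python
import numpy as np
from scipy import stats

def correlation_ci(r, n, alpha=0.05):
    """Fisher-z confidence interval for a Pearson correlation."""
    z = np.arctanh(r)
    half_width = stats.norm.ppf(1 - alpha / 2) / np.sqrt(n - 3)
    return np.tanh(z - half_width), np.tanh(z + half_width)

print(correlation_ci(0.5, 100))  # ~ (0.337, 0.634), as in the example above
```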

Extensions and Variations

Application to Rank Correlations

The Fisher transformation is adapted to Spearman's rank correlation coefficient \rho_s, which measures the strength and direction of monotonic association between two ranked variables, by applying z = \artanh(\rho_s) to yield an approximately normally distributed statistic for large sample sizes. This approach is valuable for analyzing ordinal or non-normal continuous data transformed to ranks, as it stabilizes the variance and facilitates inference on monotonic relationships. For large n without ties, the variance of z is approximately 1/(n-3), akin to the Pearson correlation case, enabling standard normal approximations for tests and confidence intervals. A refined estimate, proposed by Fieller et al. (1957), inflates the variance by a factor of approximately 1.06, yielding a standard deviation of \sqrt{1.06 / (n-3)} \approx 1.03 / \sqrt{n-3} to better account for the rank-based sampling distribution. When ties are present in the data, ranks are typically assigned as the average of tied positions, which modifies the computation of \rho_s through tie corrections such as \sum t_i (t_i^2 - 1)/12 for each variable, where t_i is the number of observations sharing a tied rank. For large n without ties, the procedure mirrors the Pearson application, but small sample sizes or substantial ties require Fieller's correction or related adjustments to mitigate bias in the variance estimate and improve coverage. These modifications ensure more reliable inference, particularly when the standard formula may underestimate variability.

For illustration, suppose \rho_s = 0.7 based on n = 20 paired ranks without ties. The transformed value is z = \artanh(0.7) \approx 0.867. The approximate standard error is \sqrt{1/(20-3)} \approx 0.243, or \sqrt{1.06 / 17} \approx 0.250 with the Fieller adjustment; under large-sample or permutation distribution assumptions, this supports a 95% confidence interval for the population parameter \zeta = \artanh(\rho_s) of roughly 0.867 \pm 1.96 \times 0.243 (i.e., 0.389 to 1.345), back-transformed to a range for \rho_s of about 0.37 to 0.87. This example highlights how the transformation aids interpretation, though exact permutation-based validation is advisable for discrete rank distributions.

Despite these adaptations, the Fisher transformation applied to \rho_s is generally less accurate than for Pearson's r owing to the discrete nature of ranks, which can distort the normality assumption, especially with small n, many ties, or non-uniform distributions. In such scenarios, the approximation may lead to inflated Type I error rates or poor coverage; permutation tests, which resample the pairings to derive empirical null distributions, are recommended as a robust, distribution-free alternative for testing and interval estimation on rank correlations.
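
A sketch of this calculation assuming NumPy and SciPy (spearman_ci is an illustrative name, and the 1.06 factor is the Fieller adjustment described above):

```python
import numpy as np
from scipy import stats

def spearman_ci(rho_s, n, alpha=0.05, fieller=True):
    """Approximate CI for Spearman's rho via Fisher z; the factor 1.06 is
    the Fieller et al. variance adjustment discussed above."""
    z = np.arctanh(rho_s)
    se = np.sqrt((1.06 if fieller else 1.0) / (n - 3))
    half_width = stats.norm.ppf(1 - alpha / 2) * se
    return np.tanh(z - half_width), np.tanh(z + half_width)

print(spearman_ci(0.7, 20, fieller=False))  # ~ (0.37, 0.87), matching the text
print(spearman_ci(0.7, 20))                 # slightly wider with the adjustment
```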
Alternative Transformations

The angular transformation, defined as \arcsin(\sqrt{p}) where p is a proportion between 0 and 1, serves to stabilize the variance of binomial data by approximately normalizing the distribution and making the variance independent of the mean proportion. Introduced by Fisher in the context of genetic proportions, this transformation is particularly useful for analyzing percentage data in biological and agricultural experiments, where it facilitates the application of standard tests like ANOVA by reducing heteroscedasticity. Like the z-transformation for correlations, the angular transformation achieves variance stabilization for bounded variables, but it targets binomial variances rather than sampling variability in correlation estimates.

The logit transformation, given by \log\left(\frac{p}{1-p}\right), maps proportions p to an unbounded log-odds scale, which tends toward normality for moderate sample sizes and helps normalize data constrained to (0,1). Developed by Joseph Berkson for bio-assay applications, it is commonly employed in logistic regression models to model binary outcomes and interpret odds ratios, providing a linear scale for predictors while addressing the asymmetry of raw proportions. In comparison to the z-transformation, the logit similarly maps a bounded quantity to an unbounded scale to enable approximate normal inference, though it is tailored for probabilistic interpretations in generalized linear models rather than correlation analysis.

For meta-analysis of correlations, alternatives to the Fisher z-transformation include Bonett's method, which computes fixed-effects intervals directly from the raw correlations using sample-size-based weights, avoiding the z-transformation to simplify computation and reduce bias in heterogeneous settings. Hedges' approaches, often integrated into random-effects frameworks, adjust for between-study variability but typically retain the Fisher z for initial standardization, differing from Bonett's direct method by emphasizing moderator analyses and bias corrections in effect-size synthesis. These methods contrast with the Fisher z by focusing on weighted averages across studies rather than individual variance stabilization, making them suitable for aggregating evidence from multiple independent estimates.

The choice of transformation depends on the data type and analytical goal: the Fisher z-transformation is ideal for single bivariate correlations due to its precise variance stabilization, while the angular and logit transformations are preferred for proportion-based data in experimental designs, and Bonett's or Hedges' methods for meta-analytic contexts involving multiple correlations.

Historical Context

Ronald Fisher's Original Contribution

Ronald A. Fisher first introduced the transformation that bears his name in his 1915 paper, "Frequency Distribution of the Values of the Correlation Coefficient in Samples from an Indefinitely Large Population," published in Biometrika. Motivated by limitations in existing methods for handling the skewed sampling distribution of the Pearson correlation coefficient r in biometric analyses, particularly for small samples where normal approximations were unreliable, Fisher sought to derive the exact distribution of r under bivariate normality and provide practical tools for significance testing. His attention to this problem was prompted by H. E. Soper's 1913 article on probable errors in correlations from small samples, which highlighted the need for better distributional theory in early 20th-century biometrics. In the paper, Fisher derived the exact sampling distribution for r and proposed the transformation z = \frac{1}{2} \ln \left( \frac{1 + r}{1 - r} \right), demonstrating that z follows an approximately normal distribution for moderate sample sizes. To aid practitioners, he included extensive tables of critical values for z, enabling straightforward assessments of whether observed correlations deviated significantly from zero in large populations. This approach marked a significant advance in correlation analysis, shifting focus from the bounded and asymmetric distribution of r to the unbounded and symmetric properties of z. Fisher expanded upon these ideas in his 1921 paper, "On the 'Probable Error' of a Coefficient of Correlation Deduced from a Small Sample," published in Metron. Building on the 1915 work, he provided more detailed derivations of the exact distribution of r and refined the properties of the z-transformation, including an asymptotic variance of approximately 1/(n - 3) that is independent of the true correlation ρ. This invariance property greatly simplified the construction of confidence intervals and hypothesis tests, making the transformation a cornerstone for inference on correlations. These contributions formed part of Fisher's efforts in statistical distribution theory while teaching mathematics and physics at public schools, reflecting his engagement with the biometric tradition established by Francis Galton and Karl Pearson. Occurring before his landmark 1922 paper on maximum likelihood and 1925 work on analysis of variance, they laid foundational groundwork for modern parametric inference in statistics and beyond.

Subsequent Developments

In the 1950s, refinements to the Fisher transformation focused on improving approximations for finite sample sizes and non-normal distributions. A key contribution came from A.K. Gayen, who derived expressions for the higher moments of the transformed statistic and applied Edgeworth series expansions to approximate its finite-sample distribution more accurately than the asymptotic normal approximation alone. These developments addressed limitations in Fisher's original variance stabilization by providing better tail probabilities and moment corrections for small samples. In the intervening decades, statisticians further developed methods for comparing transformed correlations across independent samples. During the 1980s and 1990s, the Fisher transformation gained prominence in meta-analysis for combining correlation estimates across studies. Hedges and Vevea outlined fixed- and random-effects models that leverage the transformed z-scores, weighting them by their inverse variances to synthesize overall effect sizes while accounting for heterogeneity. This approach, which normalizes the sampling distribution of correlations, became a standard for integrating evidence from multiple independent samples, enhancing precision in fields such as psychology and the social sciences. Post-2021 computational advances have emphasized simulation-based methods to extend the transformation's robustness to non-normal data. Bootstrap techniques, such as percentile and bias-corrected intervals applied to the z-transformed correlations, have shown improved coverage probabilities under departures from bivariate normality compared to traditional methods. These are readily implemented in statistical software like R's DescTools package, which includes functions for the z-transformation and correlation confidence intervals, and SAS's PROC CORR with the FISHER option for automated testing and estimation. While applications to high-dimensional correlations have seen recent advancements, such as generalizations to multiple correlations, ongoing research explores robust variants to mitigate outlier sensitivity, building on early robustness assessments to develop contamination-resistant tests.
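
As a sketch of the bootstrap approach mentioned above (not the DescTools or SAS implementation), a percentile bootstrap on the z scale might look like the following, assuming NumPy; the function name bootstrap_fisher_ci is illustrative:

```python
import numpy as np

def bootstrap_fisher_ci(x, y, reps=5_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a correlation, resampling (x, y) pairs
    and working on the Fisher-z scale before back-transforming."""
    rng = np.random.default_rng(seed)
    n = len(x)
    z_boot = np.empty(reps)
    for i in range(reps):
        idx = rng.integers(0, n, size=n)  # resample pairs with replacement
        z_boot[i] = np.arctanh(np.corrcoef(x[idx], y[idx])[0, 1])
    lo, hi = np.quantile(z_boot, [alpha / 2, 1 - alpha / 2])
    return np.tanh(lo), np.tanh(hi)

# Example with synthetic correlated data
rng = np.random.default_rng(42)
x = rng.standard_normal(40)
y = 0.6 * x + 0.8 * rng.standard_normal(40)
print(bootstrap_fisher_ci(x, y))
```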
