Distance correlation
Distance correlation is a statistical measure of dependence between random vectors in arbitrary dimensions, introduced by Gábor J. Székely, Maria L. Rizzo, and Nail K. Bakirov in 2007,[1] that captures both linear and nonlinear associations using Euclidean distances between data points. It is defined as the distance covariance divided by the square root of the product of the distance variances, yielding a value between 0 and 1, where 0 occurs if and only if the vectors are independent and 1 indicates an exact linear relationship.[1] The measure addresses limitations of traditional correlation coefficients such as Pearson's, which fail to detect nonlinear dependencies, and yields a test for independence applicable to non-normal distributions and high-dimensional data. The underlying distance covariance is constructed from expected products of centered Euclidean distances and is zero only when the vectors are independent, a property that holds for the population version and enables consistent testing of independence with the sample version. The empirical distance correlation supports hypothesis tests for dependence via its asymptotic distribution and permutation or bootstrap methods, with superior power over classical tests in scenarios involving nonmonotonic or otherwise complex relationships. Its computation involves pairwise distance matrices, making it applicable to multivariate analysis without assuming specific distributional forms.

Since its introduction, distance correlation has seen extensions and applications across diverse fields such as statistics, machine learning, genomics, and finance. Developments include improved estimation techniques for computational efficiency (as of 2025)[2] and adaptations for spatial data to test geographical dependence in compositional distributions.[3]

Introduction
Overview and motivation
Distance correlation is a statistical measure of dependence between two random vectors, \mathbf{X} in \mathbb{R}^p and \mathbf{Y} in \mathbb{R}^q, capable of detecting both linear and nonlinear relationships in multivariate settings.[1] Unlike traditional measures, it provides a coefficient that ranges from 0 (indicating independence) to 1 (indicating perfect dependence), making it a versatile tool for assessing statistical associations without assuming specific distributional forms.[1] The primary motivation for distance correlation arises from the limitations of classical correlation coefficients, such as Pearson's product-moment correlation, which only capture linear dependencies and can yield zero even when variables are strongly related through nonlinear or non-monotonic transformations.[1] For instance, Pearson's correlation fails to detect dependence in cases like Y = X^2 for X uniformly distributed on [-1, 1], where the relationship is quadratic and the coefficient is exactly zero, despite clear functional dependence; in contrast, distance correlation quantifies this nonlinear association effectively, highlighting its utility in real-world data exhibiting complex patterns.[4] This makes distance correlation particularly valuable in fields like genomics, finance, and machine learning, where nonlinear interactions are common.[1]

At its core, distance correlation builds on the intuition of comparing Euclidean distances between pairs of observations in the respective spaces of \mathbf{X} and \mathbf{Y}, centered and normalized to form a dependence metric that is zero if and only if the vectors are independent.[1] This distance-based approach, grounded in distance covariance as its foundational element, enables robust detection of all forms of dependence, including those missed by parametric tests requiring normality or low dimensionality.[1]

History and development
Distance correlation was introduced in 2007 by Gábor J. Székely, Maria L. Rizzo, and Nail K. Bakirov as a measure of dependence between random vectors, building on the concept of distance covariance derived from Euclidean distances between observations.[1] The initial development drew motivation from energy statistics, which Székely had explored earlier, and analogies to Brownian motion, where distances relate to potential energy in multivariate settings.[5] This framework provided a way to detect both linear and nonlinear dependencies, addressing limitations of classical correlation measures like Pearson's coefficient.[1]

A key milestone came in 2009 with the publication of "Brownian distance covariance" by Székely and Rizzo in the Annals of Applied Statistics, which formalized the theoretical properties of distance covariance and correlation through connections to Brownian motion processes.[6] This work established distance correlation as a universal dependence measure that equals zero if and only if the vectors are independent, regardless of dimension.[6]

During the 2010s, extensions addressed challenges in high-dimensional data; for instance, a 2013 paper proposed a t-test based on distance correlation for independence testing in high dimensions, adapting the statistic to handle cases where dimensionality exceeds sample size.[7]

Post-2020 developments have focused on practical implementation and broader integration. The energy package in R, maintained by Rizzo since its initial release, continues to support distance correlation computations and related tests, with updates enhancing efficiency for large datasets.[8] Similarly, the dcor Python package, introduced in 2017 and refined through 2023, provides accessible tools for distance correlation and energy statistics, including support for partial correlations.[9] Székely and Rizzo published the book The Energy of Data and Distance Correlation in 2023, providing a comprehensive overview of the topic.[10] Recent work as of 2025 includes improved estimation methods and robust extensions for applications like independent component analysis.[2]

Definitions
Distance covariance
Distance covariance is a measure of dependence between random vectors X \in \mathbb{R}^p and Y \in \mathbb{R}^q that generalizes classical covariance to arbitrary dimensions and distributions. The population squared distance covariance V^2(X, Y) is defined via characteristic functions as V^2(X, Y) = \frac{1}{c_p c_q} \int_{\mathbb{R}^p} \int_{\mathbb{R}^q} \left| \phi_{X,Y}(s, t) - \phi_X(s) \phi_Y(t) \right|^2 \frac{ds \, dt}{\|s\|^{p+1} \|t\|^{q+1}}, where \phi_{X,Y}(s, t) = \mathbb{E}[\exp(i s^\top X + i t^\top Y)] is the joint characteristic function, \phi_X(s) and \phi_Y(t) are the marginal characteristic functions, and the normalizing constants are c_d = \pi^{(d+1)/2} / \Gamma((d+1)/2) for dimension d. This integral weights the squared difference between the joint and product-of-marginals characteristic functions by the inverse of a product of power functions of the arguments, ensuring the measure is nonnegative and zero if and only if X and Y are independent.

An equivalent probabilistic representation expresses V^2(X, Y) in terms of expected distances between independent copies. Let (X, Y), (X', Y'), and (X'', Y'') be independent and identically distributed copies of the pair (X, Y). Then V^2(X, Y) = \mathbb{E}\left[ \|X - X'\| \|Y - Y'\| \right] + \mathbb{E}\left[ \|X - X'\| \right] \mathbb{E}\left[ \|Y - Y'\| \right] - 2 \mathbb{E}\left[ \|X - X'\| \|Y - Y''\| \right], which highlights its structure as a centered inner product of distance kernels. This form underscores the analogy to classical covariance, with distances playing the role of deviations from means, and it facilitates derivations of properties such as nonnegativity.

For a sample of n i.i.d. observations (X_1, Y_1), \dots, (X_n, Y_n), the sample squared distance covariance V_n^2(X, Y) is computed from the Euclidean distance matrices. Define a_{kl} = \|X_k - X_l\|_p for k, l = 1, \dots, n, and similarly b_{kl} = \|Y_k - Y_l\|_q. The centered matrices are obtained by double centering: A_{kl} = a_{kl} - \bar{a}_{k \cdot} - \bar{a}_{\cdot l} + \bar{a}_{\cdot \cdot}, \quad B_{kl} = b_{kl} - \bar{b}_{k \cdot} - \bar{b}_{\cdot l} + \bar{b}_{\cdot \cdot}, where \bar{a}_{k \cdot} = n^{-1} \sum_{l=1}^n a_{kl} is the row mean, \bar{a}_{\cdot l} = n^{-1} \sum_{k=1}^n a_{kl} the column mean, and \bar{a}_{\cdot \cdot} = n^{-2} \sum_{k,l=1}^n a_{kl} the grand mean (similarly for b). The estimator is then V_n^2(X, Y) = \frac{1}{n^2} \sum_{k=1}^n \sum_{l=1}^n A_{kl} B_{kl}. This estimator is biased for V^2(X, Y) but consistent as n \to \infty under the moment conditions \mathbb{E}[\|X\|_p] < \infty and \mathbb{E}[\|Y\|_q] < \infty.

To remove the bias, an unbiased estimator is obtained as a U-statistic by adjusting the centering for finite-sample dependencies and discarding the diagonal. Define the U-centered distances for i \neq j: \tilde{A}_{ij} = a_{ij} - \frac{1}{n-2} \sum_{\substack{l=1 \\ l \neq i}}^n a_{il} - \frac{1}{n-2} \sum_{\substack{k=1 \\ k \neq j}}^n a_{kj} + \frac{1}{(n-1)(n-2)} \sum_{\substack{k,l=1 \\ k \neq l}}^n a_{kl}, and similarly for \tilde{B}_{ij} (with \tilde{A}_{ii} = \tilde{B}_{ii} = 0). The unbiased squared distance covariance is V_n^{*2}(X, Y) = \frac{1}{n(n-3)} \sum_{\substack{i,j=1 \\ i \neq j}}^n \tilde{A}_{ij} \tilde{B}_{ij}, \quad n > 3.
This U-statistic form ensures \mathbb{E}[V_n^{*2}(X, Y)] = V^2(X, Y) by excluding diagonal terms and adjusting the centering to account for the finite sample, with the denominator chosen to normalize the expectation over distinct pairs.[11] Its consistency follows from U-statistic theory under the same moment conditions.[11] Distance covariance serves as the primitive for defining the normalized distance correlation coefficient.
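The following is a minimal NumPy sketch of the two estimators defined above, transcribing the double-centering and U-centering formulas directly; the function and helper names are illustrative and not taken from any particular package.

```python
import numpy as np


def _pairwise_distances(z):
    """Euclidean distance matrix for n observations (1-D input is treated as n scalars)."""
    z = np.asarray(z, dtype=float).reshape(len(z), -1)
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def dcov2_biased(x, y):
    """Biased (V-statistic) squared sample distance covariance V_n^2(X, Y)."""
    a, b = _pairwise_distances(x), _pairwise_distances(y)
    # double centering: subtract row and column means, add back the grand mean
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    return (A * B).mean()


def dcov2_unbiased(x, y):
    """Unbiased (U-statistic) squared sample distance covariance, requires n > 3."""
    a, b = _pairwise_distances(x), _pairwise_distances(y)
    n = a.shape[0]

    def u_center(d):
        # U-centering: row/column sums divided by (n - 2), grand sum by (n - 1)(n - 2)
        row = d.sum(axis=1, keepdims=True) / (n - 2)
        col = d.sum(axis=0, keepdims=True) / (n - 2)
        grand = d.sum() / ((n - 1) * (n - 2))
        centered = d - row - col + grand
        np.fill_diagonal(centered, 0.0)  # diagonal terms are excluded
        return centered

    A, B = u_center(a), u_center(b)
    return (A * B).sum() / (n * (n - 3))
```

Both functions converge to the population value V^2(X, Y) as the sample size grows; the U-centered version additionally has expectation exactly equal to it for n > 3.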
Distance variance and standard deviation
The distance variance of a random vector X in \mathbb{R}^p with finite first moment is defined as the special case of the distance covariance applied to the pair (X, X), denoted V^2(X) = V^2(X, X). In population form, it is given by V^2(X) = \frac{1}{c_p^2} \int_{\mathbb{R}^{2p}} \left| \phi_X(s + t) - \phi_X(s) \phi_X(t) \right|^2 \frac{ds \, dt}{\|s\|^{p+1} \|t\|^{p+1}}, where \phi_X is the characteristic function of X, \| \cdot \| denotes the Euclidean norm, and c_p = \frac{\pi^{(p+1)/2}}{\Gamma((p+1)/2)} is a normalizing constant independent of the distribution of X. This integral representation arises from the characteristic function form of distance covariance and equals \mathbb{E}[\|X - X'\|^2] + (\mathbb{E}[\|X - X'\|])^2 - 2\mathbb{E}[\|X - X'\| \|X - X''\|], where X' and X'' are independent copies of X.

For a sample X_1, \dots, X_n of i.i.d. copies of X, the sample distance variance is given by V_n^2(X) = \frac{1}{n^2} \sum_{k=1}^n \sum_{l=1}^n A_{kl}^2, where A = (A_{kl}) is the centered distance matrix with entries A_{kl} = \|X_k - X_l\| - \bar{a}_{k \cdot} - \bar{a}_{\cdot l} + \bar{a}_{\cdot \cdot}, \bar{a}_{k \cdot} is the row mean of the Euclidean distance matrix, \bar{a}_{\cdot l} the column mean, and \bar{a}_{\cdot \cdot} the grand mean. This estimator is nonnegative, equals zero if and only if all sample points are identical, and converges in probability to V^2(X) under the finite first moment assumption.

The distance standard deviation is the nonnegative square root V(X) = \sqrt{V^2(X)}, which serves as a scale-equivariant measure of dispersion: V(a + bX) = |b| V(X) for scalars a, b \in \mathbb{R}. Unlike the classical standard deviation, it requires only finite first moments and satisfies V(X) > 0 unless X is almost surely constant, providing a bounded measure of spread that is at most the classical standard deviation or Gini's mean difference when second moments exist. The underlying Euclidean distances embed the random vector X into a Hilbert space of measurable functions, where the squared distance variance corresponds to the Hilbert-Schmidt norm of the difference between joint and product embeddings, linking it to reproducing kernel Hilbert space interpretations.
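A short NumPy sketch of the sample distance variance and distance standard deviation, written against the double-centering recipe above; the function name and the numerical check of scale equivariance are illustrative.

```python
import numpy as np


def distance_variance(x):
    """Return (V_n^2(X), V_n(X)): squared distance variance and distance standard deviation."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    a = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))
    # double-centered distance matrix
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    v2 = (A ** 2).mean()  # V_n^2(X) = (1/n^2) * sum of A_kl^2
    return v2, np.sqrt(v2)


# scale equivariance of the distance standard deviation: V(a + b X) = |b| V(X)
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
_, s1 = distance_variance(x)
_, s2 = distance_variance(5.0 + 2.0 * x)
print(s1, s2)  # s2 should equal 2 * s1 up to floating-point error
```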
Distance correlation coefficient
The distance correlation coefficient provides a normalized measure of dependence between two random vectors X \in \mathbb{R}^p and Y \in \mathbb{R}^q with finite first moments, defined as R(X, Y) = \frac{V(X, Y)}{\sqrt{V(X) V(Y)}} whenever the denominator is positive, and R(X, Y) = 0 otherwise, where V(X, Y) denotes the distance covariance between X and Y, while V(X) = V(X, X) and V(Y) = V(Y, Y) are the respective distance variances. This coefficient satisfies 0 \leq R(X, Y) \leq 1, with R(X, Y) = 0 if and only if X and Y are independent.

For a sample of n paired observations (X_1, Y_1), \dots, (X_n, Y_n), the sample distance correlation coefficient is given by R_n(X, Y) = \frac{V_n(X, Y)}{\sqrt{V_n(X) V_n(Y)}}, defined analogously using the sample distance covariance V_n(X, Y) and sample distance variances V_n(X), V_n(Y), with R_n(X, Y) = 0 if the denominator vanishes. This estimator is consistent and asymptotically unbiased as n \to \infty. The squared sample coefficient can be computed directly from the double-centered Euclidean distance matrices A = (A_{ij}) and B = (B_{ij}), obtained from a_{kl} = \|X_k - X_l\|_p and b_{kl} = \|Y_k - Y_l\|_q for k, l = 1, \dots, n by subtracting row and column means and adding back the grand mean: R_n^2(X, Y) = \frac{\sum_{i,j=1}^n A_{ij} B_{ij}}{\sqrt{\sum_{i,j=1}^n A_{ij}^2 \sum_{i,j=1}^n B_{ij}^2}}. This formulation arises because the normalization factors in the sample covariances cancel in the ratio.

The distance correlation coefficient is degenerate when V(X) = 0 or V(Y) = 0, which occurs if X or Y is a constant random vector (a degenerate distribution concentrated at a single point). In such cases the coefficient is set to 0 by convention, as dependence is undefined.

Properties
Properties of distance correlation
The distance correlation coefficient R(\mathbf{X}, \mathbf{Y}), defined for random vectors \mathbf{X} \in \mathbb{R}^p and \mathbf{Y} \in \mathbb{R}^q with finite first moments, satisfies 0 \leq R(\mathbf{X}, \mathbf{Y}) \leq 1. This bounded range positions it as a standardized measure of dependence, analogous to the Pearson correlation coefficient but capable of detecting both linear and nonlinear associations. Notably, R(\mathbf{X}, \mathbf{Y}) = 0 if and only if \mathbf{X} and \mathbf{Y} are independent, providing a characterization of statistical independence.

The coefficient achieves its upper bound of 1 precisely when the vectors are linearly related, specifically if there exist a constant vector \mathbf{a}, a nonzero scalar b, and an orthogonal matrix \mathbf{C} such that \mathbf{Y} = \mathbf{a} + b \mathbf{X} \mathbf{C} almost surely. This condition highlights that perfect dependence under distance correlation requires a strict linear (similarity) structure, distinguishing it from measures that attain maximum values for broader classes of functional relationships.

Distance correlation is invariant under separate translations (\mathbf{X} \to \mathbf{X} + \mathbf{a}, \mathbf{Y} \to \mathbf{Y} + \mathbf{b}), positive scalings (\mathbf{X} \to c \mathbf{X}, \mathbf{Y} \to d \mathbf{Y} with c, d > 0), and orthogonal transformations (\mathbf{X} \to \mathbf{X} \mathbf{C}_1, \mathbf{Y} \to \mathbf{Y} \mathbf{C}_2 with orthogonal \mathbf{C}_1, \mathbf{C}_2) of the coordinates. It is also unaffected by permutations of the observations, as the underlying pairwise distance matrices remain unchanged in distribution. These invariances ensure robustness to such transformations while preserving sensitivity to dependence structures.

The framework naturally extends beyond pairs of random vectors; for instance, the partial distance correlation can quantify the dependence between \mathbf{X} and \mathbf{Y} after removing the effects of additional vectors \mathbf{Z}, maintaining the core properties of the bivariate case. In the special case of a jointly bivariate normal distribution, the distance correlation is a deterministic function of the Pearson correlation \rho and satisfies R(X, Y) \leq |\rho|, with equality only when \rho \in \{-1, 0, 1\}; the maximal correlation of a bivariate normal pair equals |\rho|.
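A short numerical check of the invariances listed above, assuming the third-party dcor Python package mentioned elsewhere in the article; the data, constants, and transformation are arbitrary illustrations.

```python
import numpy as np
import dcor  # third-party package referenced in the article (pip install dcor)

rng = np.random.default_rng(1)
x = rng.normal(size=(300, 3))
y = x @ rng.normal(size=(3, 2)) + 0.5 * rng.normal(size=(300, 2))

r = dcor.distance_correlation(x, y)

# invariance under translation, positive scaling, and orthogonal rotation of X and Y
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # a random orthogonal matrix
r_transformed = dcor.distance_correlation(2.0 * x @ Q + 1.0, 3.0 * y - 4.0)

print(r, r_transformed)  # the two values should agree up to floating-point error
```

Because the transformations change every pairwise distance only by a constant positive factor, the sample coefficient is exactly invariant, not merely approximately so.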
Properties of distance covariance
Distance covariance, denoted V(X, Y), possesses several fundamental properties that underscore its role as a measure of dependence between random vectors X and Y in Euclidean spaces. One key property is its non-negativity: V(X, Y) \geq 0, with equality holding if and only if X and Y are independent, assuming finite first moments.[1] This follows directly from its definition as the square root of a weighted L^2 distance between the joint characteristic function of (X, Y) and the product of the marginal characteristic functions.[1]

Distance covariance scales with the square root of a rescaling of either argument: for any scalar c \in \mathbb{R}, V(cX, Y) = \sqrt{|c|}\, V(X, Y), or equivalently V^2(cX, Y) = |c|\, V^2(X, Y), and similarly for rescaling Y. It is also invariant under location shifts, V(X + a, Y) = V(X, Y) for a constant vector a, and under orthogonal transformations, V(CX, DY) = V(X, Y) for orthogonal matrices C, D. These properties extend the analogy to classical covariance while accommodating the geometry of Euclidean distances.[1]

Furthermore, distance covariance defines a semi-inner product on the space of probability distributions with finite first moments, embedding them into an L^2 space via their characteristic functions.[1] This structure induces a metric on the set of such distributions, where the distance between two measures is given by the distance covariance between corresponding random vectors.[1] When X = Y, distance covariance reduces to distance variance, its specialization to a single random vector.[1]

Unlike correlation coefficients, distance covariance is scale-dependent, meaning its value changes under rescalings of the variables.[1] This dependence arises because the underlying distance measures are sensitive to the magnitudes of the vectors involved.[1]
Properties of distance variance
The distance variance V(X), defined for a random vector X in \mathbb{R}^p with finite first moment, is a non-negative measure of variability that equals zero if and only if X is degenerate, i.e., constant almost surely. This property characterizes the absence of spread in the distribution of X.

The distance variance satisfies a homogeneity condition with respect to scaling: for any scalar c \in \mathbb{R}, V(cX) = |c| V(X). More generally, it is invariant under location shifts and orthogonal transformations, as V(a + U X) = V(X) for a constant vector a and orthonormal matrix U. These behaviors mirror the scale equivariance of the classical standard deviation while extending to multivariate settings.

A key expression links the distance variance to pairwise distances between independent copies X, X', and X'' of the random vector: V^2(X) = \mathbb{E}[\|X - X'\|^2] + \left( \mathbb{E}[\|X - X'\|] \right)^2 - 2\, \mathbb{E}[\|X - X'\| \, \|X - X''\|], which can be read as the variance of \|X - X'\| minus twice the covariance of \|X - X'\| and \|X - X''\|. Thus V(X) quantifies the dispersion of the interpoint distances around their mean \mathbb{E}[\|X - X'\|], providing a norm-like measure of the distribution's spread.

In the theoretical framework, random vectors are embedded into a Hilbert space via their centered characteristic functions, where the distance covariance acts as an inner product. Under this embedding, the squared distance variance V^2(X) corresponds to the squared norm of the embedded element, ensuring positive semi-definiteness and enabling the interpretation of distance correlation as a cosine similarity. This structure underpins the method's ability to detect dependencies through geometric properties. The distance variance also serves as the normalizing factor in the denominator of the distance correlation coefficient.
Computation and estimation
Sample estimators
The sample distance covariance is estimated from a finite sample of n paired observations (X_k, Y_k)_{k=1}^n by first constructing the Euclidean distance matrices \mathbf{A} = (a_{ij}) and \mathbf{B} = (b_{ij}), where a_{ij} = \|X_i - X_j\| and b_{ij} = \|Y_i - Y_j\|. These matrices are then double-centered to obtain the adjusted values A_{ij} = a_{ij} - \bar{a}_{i\cdot} - \bar{a}_{\cdot j} + \bar{a}_{\cdot \cdot} and similarly for B_{ij}, with the overlines denoting row, column, or grand means. The squared sample distance covariance is then V_n^2(X, Y) = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n A_{ij} B_{ij}.[1] This estimator is biased, with \mathbb{E}[V_n^2(X, Y)] > 0 under independence.

An unbiased estimator of the squared population distance covariance is obtained as a U-statistic from the U-centered distance matrices \tilde{A} and \tilde{B} introduced in the definitions section: V_n^{*2}(X, Y) = \frac{1}{n(n-3)} \sum_{\substack{i,j=1 \\ i \neq j}}^n \tilde{A}_{ij} \tilde{B}_{ij}, valid for n > 3 and ensuring \mathbb{E}[V_n^{*2}(X, Y)] = V^2(X, Y). An analogous unbiased estimator applies to the sample distance variance, and the bias-corrected distance correlation uses these quantities in the ratio. For large n, the bias corrections become negligible, and efficient algorithms can compute the estimators in O(n \log n) time for univariate data.[1][12]

When the data contain ties (identical observations), the corresponding off-diagonal entries in the distance matrices are zero, which is handled directly by the double-centering process without additional adjustment, though it may reduce the effective variability in the estimate. For missing data, the distance matrix must be estimated prior to double-centering, for example using imputation techniques that preserve pairwise distances, such as kernel-based or nearest-neighbor methods for incomplete observations.
Example computation
As a concrete illustration, consider the motivating example from the overview: X uniformly distributed on [-1, 1] and Y = X^2. The population Pearson correlation is exactly zero, while the population distance correlation is positive, and the sample versions behave accordingly; the sketch below carries out the sample computation using the double-centering formulas above. Reference implementations are also available in the R package energy and the Python package dcor.
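The following is a self-contained NumPy sketch of this example; the function name and sample size are illustrative, and the printed values vary with the random seed.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation R_n from double-centered distance matrices."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    a = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))
    b = np.sqrt(((y[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1))
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    dcov2 = (A * B).mean()                       # V_n^2(X, Y)
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return np.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0


rng = np.random.default_rng(42)
x = rng.uniform(-1.0, 1.0, size=500)
y = x ** 2                                       # purely nonlinear dependence

print("Pearson correlation:", np.corrcoef(x, y)[0, 1])   # close to zero
print("distance correlation:", distance_correlation(x, y))  # clearly positive
```

The contrast between the two printed values illustrates why distance correlation is preferred when nonlinear dependence may be present.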
Algorithms and computational complexity
The computation of the sample distance correlation typically begins with the naive algorithm, which constructs full Euclidean distance matrices for the two sets of n observations. For data in dimensions p and q, this requires calculating pairwise distances, incurring a time complexity of O(n²p) for the first matrix and O(n²q) for the second, followed by O(n²) operations for centering the matrices and computing the Frobenius inner product to obtain the distance covariance.[13] The overall time complexity is thus O(n²(p + q)), while space complexity is O(n²) due to storing the dense matrices, making it impractical for very large n even in low dimensions.[14]

To address scalability, fast exact algorithms have been developed for the univariate case (p = q = 1), achieving O(n log n) time through sorting-based methods that avoid explicit matrix construction. One such approach uses two sorting steps on modified distance arrays to compute the centered sums efficiently.[15] For general dimensions, no exact O(n log n) method is known; exact computation remains O(n²(p + q)), though approximate methods based on random projections or subsampling can reduce the effective cost. These fast algorithms enable computation on univariate datasets with n in the millions. Approximate random-projection estimators project the data onto a collection of K random one-dimensional directions, apply the fast univariate algorithm to each projection, and average the results, reducing the cost to roughly O(K n log n).[16]

In practice, software implementations facilitate these computations: the R package energy provides the standard estimators with optimized C code for the naive method, supporting samples of roughly n ≈ 10⁴ efficiently, while the Python library dcor includes both naive and fast univariate options, with NumPy integration for vectorized operations. Parallelization tips include distributing the pairwise distance calculations across cores, for example using scikit-learn's pairwise_distances with n_jobs=-1, which can reduce wall-clock time for the matrix-construction step by roughly the number of available processors, though the subsequent centering remains sequential unless custom implementations are used.[17]
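As an illustration of the naive O(n²) pipeline and the parallelization tip above, the sketch below builds the distance matrices with scikit-learn's pairwise_distances (using n_jobs for multicore distance computation) and performs the double centering in NumPy; the helper name, data, and dimensions are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics import pairwise_distances


def dcor_naive(x, y, n_jobs=-1):
    """Naive O(n^2 (p + q)) sample distance correlation.

    The distance matrices are computed in parallel across cores;
    centering and inner products remain single-threaded NumPy.
    """
    a = pairwise_distances(x, metric="euclidean", n_jobs=n_jobs)
    b = pairwise_distances(y, metric="euclidean", n_jobs=n_jobs)
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return np.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0


rng = np.random.default_rng(0)
x = rng.normal(size=(2000, 10))
y = np.tanh(x[:, :3]) + 0.1 * rng.normal(size=(2000, 3))  # dependent on x
print(dcor_naive(x, y))
```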
In high-dimensional settings where p ≫ n or q ≫ n, the O(n²p + n²q) time dominates due to the per-pair dimension summation in distance calculations, exacerbating challenges like memory bottlenecks for the matrices and potential numerical instability from accumulated floating-point errors in high p.[18] Mitigation often involves subsampling pairs or using sparse approximations, but exact computation remains costly beyond p ≈ 100 for moderate n.
Independence testing
Characterization of independence
A fundamental property of the distance correlation coefficient R(X, Y) between random vectors X \in \mathbb{R}^p and Y \in \mathbb{R}^q with finite first moments is that it equals zero if and only if X and Y are independent. This holds under the assumption that the distance variances V(X) and V(Y) are positive, ensuring the coefficient is well-defined and avoiding degenerate cases where one or both variables are constant almost surely.

The proof relies on characteristic functions. Specifically, the squared distance covariance V^2(X, Y) can be expressed as an integral involving the difference between the joint characteristic function \phi_{X,Y}(t, s) and the product of the marginal characteristic functions \phi_X(t) \phi_Y(s): V^2(X, Y) = \int_{\mathbb{R}^{p+q}} \frac{|\phi_{X,Y}(t, s) - \phi_X(t) \phi_Y(s)|^2}{c_p c_q \|t\|^{1+p} \|s\|^{1+q}} \, dt \, ds, where c_d = \pi^{(1+d)/2} / \Gamma((1+d)/2) for dimension d. Thus, V^2(X, Y) = 0 forces \phi_{X,Y}(t, s) = \phi_X(t) \phi_Y(s) almost everywhere with respect to the weight measure, which characterizes independence for distributions with finite first moments. Conversely, if X and Y are independent, the joint characteristic function factorizes, the integrand vanishes identically, and V^2(X, Y) = 0.

In degenerate cases, such as when the sample distance variance \hat{V}_n(X) = 0 (i.e., all observations of X are identical), the sample distance correlation \hat{R}_n(X, Y) is set to 0, even if X and Y are dependent in the population. This highlights that the sample estimator may fail to detect dependence in finite samples with degenerate configurations, though the population characterization holds under the stated conditions.

This characterization extends to conditional independence via the conditional distance correlation, a measure that equals zero almost surely if and only if X and Y are conditionally independent given a third random vector Z. The conditional version is defined analogously, using distances centered with respect to Z, and inherits the theoretical properties of the unconditional case for testing conditional dependence.
Asymptotic distributions and tests
Under the null hypothesis of independence between random vectors X \in \mathbb{R}^p and Y \in \mathbb{R}^q with finite first moments, the scaled statistic n R_n^2 converges in distribution to the weighted L^2 norm \|\zeta\|^2 of a centered Gaussian random field \zeta(t, s) on \mathbb{R}^{p+q}, with covariance kernel \operatorname{Cov}(\zeta(t,s), \zeta(t',s')) = \left( \phi_X(t - t') - \phi_X(t) \phi_X(t') \right) \left( \phi_Y(s - s') - \phi_Y(s) \phi_Y(s') \right) involving the marginal characteristic functions \phi_X and \phi_Y, normalized so that the limit has mean 1.[19] Equivalently, the limit is a weighted sum of independent chi-squared variables and is connected, through the Brownian representation of distance covariance, to functionals of Brownian processes; it does not reduce to a simple (finite) chi-squared distribution.[20] The complexity of this distribution precludes direct tabulation of critical values, motivating approximations via Monte Carlo simulation of the Gaussian field or empirical methods for inference.[19]

A distribution-free approach to testing independence exploits the exchangeability of paired samples under the null via a permutation test on the distance matrices. Specifically, fix the distances among the X_i and among the Y_i, randomly permute the assignment of Y labels to X observations B times (typically B \geq 999), compute R_n^{(b)} for each permuted sample, and obtain the p-value as the proportion of permuted statistics at least as large as the observed R_n, i.e., p = \frac{1 + \sum_{b=1}^B \mathbf{1}\{R_n^{(b)} \geq R_n\}}{B+1}.[19] This test controls the type I error exactly for finite samples and remains computationally feasible for moderate n, with extensions to bias-corrected variants for improved small-sample performance.[18]

In high-dimensional regimes where p, q \to \infty alongside n, the standard R_n exhibits upward bias under independence due to overestimation of distance variances, but a bias-corrected estimator R_n^* that adjusts the distance covariances for high-dimensional bias restores asymptotic normality: \sqrt{n} R_n^* \xrightarrow{d} N(0,1), assuming sub-exponential tails and p q = o(n^{3/2}).[21] This central limit theorem highlights a "blessing of dimensionality," as the variance stabilizes to 1 regardless of growing dimensions, enabling consistent testing at rates O_p\left( \sqrt{\frac{\log (p \vee q)}{n}} \right) for the null deviation when moments are controlled.[21] Recent extensions confirm bootstrap validity for p-values in such settings.[22] More recent work (as of 2023–2025) includes generalized distance correlation tests with asymptotic normality for complex dependencies and self-normalized variants for high-dimensional independence without stringent moment assumptions.[23][24]

Empirical power analyses indicate that distance correlation tests outperform Pearson correlation for nonlinear alternatives, achieving detection rates up to 2-3 times higher in simulations involving quadratic, elliptic, or wiggly dependencies (e.g., power >0.8 at n=50 for moderate effects where Pearson power <0.3).[19] This superiority stems from sensitivity to all dependence forms, with high-dimensional variants maintaining elevated power against sparse signals as p/n \to c > 0.[21]
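A minimal sketch of the permutation test described above, assuming the third-party dcor package for the statistic; for clarity the loop recomputes the statistic from scratch rather than reusing precomputed distance matrices, so it costs O(B n^2).

```python
import numpy as np
import dcor  # third-party package (pip install dcor)


def dcor_permutation_test(x, y, num_permutations=999, seed=0):
    """Permutation p-value for H0: X and Y independent, using the statistic R_n."""
    rng = np.random.default_rng(seed)
    observed = dcor.distance_correlation(x, y)
    exceed = 0
    for _ in range(num_permutations):
        perm = rng.permutation(len(y))  # randomly re-pair the Y values with the X values
        if dcor.distance_correlation(x, y[perm]) >= observed:
            exceed += 1
    p_value = (1 + exceed) / (num_permutations + 1)
    return observed, p_value


rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = np.sin(2 * np.pi * x) + 0.3 * rng.normal(size=300)  # nonlinear dependence
print(dcor_permutation_test(x, y))
```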
Generalizations and extensions
Multivariate and high-dimensional cases
Distance correlation, originally defined for a pair of random vectors, extends naturally to multivariate settings where each vector may reside in a high-dimensional space \mathbb{R}^p or \mathbb{R}^q. In this framework, it measures dependence between two random vectors \mathbf{X} \in \mathbb{R}^p and \mathbf{Y} \in \mathbb{R}^q without assuming linearity or specific distributional forms, leveraging Euclidean distances to capture nonlinear associations across multiple dimensions.[7] For k > 2 random vectors \mathbf{X}_1, \dots, \mathbf{X}_k, the concept generalizes to distance multivariance, which quantifies joint dependence among the collection by combining pairwise distance covariances or products of the centered distance matrices, allowing departures from mutual independence to be detected. This extension preserves properties such as non-negativity and a zero value under independence, enabling tests of joint dependence among several random vectors.[25]

In high-dimensional regimes where p or q grows large, computing distance correlation faces the curse of dimensionality: Euclidean distances between points tend to concentrate, making pairwise dissimilarities less informative and inflating the variance of the estimators. This challenge arises because the volume of high-dimensional space grows exponentially, leading to sparse data and diminished signal in the distance matrices, which can degrade the power to detect dependence. To mitigate this, regularization techniques such as random projections reduce dimensionality by mapping vectors to lower-dimensional subspaces while approximately preserving distances, allowing robust computation of projected distance correlations that maintain asymptotic validity, as illustrated in the sketch below.[16]

Recent theoretical advances address high-dimensional inference directly. For instance, under conditions where the dimensions p and q grow with the sample size n (e.g., p + q \to \infty as n \to \infty, with finite moments), a bias-corrected distance correlation statistic exhibits asymptotic normality, converging to a standard normal distribution for independence testing. This result highlights a "blessing of dimensionality," where higher dimensions improve the accuracy of normal approximations and test power, contrasting with the curse affecting other metrics.[22]

An illustrative application occurs in genomics, where distance correlation tests independence among high-dimensional gene expression profiles with p > 1000 features across hundreds of samples. In gene co-expression network analysis, it identifies nonlinear dependencies between thousands of genes (e.g., in microarray data with 3,611 genes and 329 samples), outperforming Pearson correlation by capturing complex modules enriched for biological pathways such as immune response.[26]
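A sketch of the random-projection idea for high-dimensional inputs: project onto a k-dimensional subspace with a Gaussian matrix, which approximately preserves pairwise distances, and compute distance correlation on the projected data. The projection dimension, the synthetic data, and the use of the dcor package are illustrative choices, not prescriptions from the cited work.

```python
import numpy as np
import dcor  # third-party package (pip install dcor)


def projected_distance_correlation(x, y, k=20, seed=0):
    """Distance correlation after Gaussian random projection of each vector to R^k."""
    rng = np.random.default_rng(seed)
    # Johnson-Lindenstrauss style projections, scaled by 1/sqrt(k)
    px = x @ rng.normal(size=(x.shape[1], k)) / np.sqrt(k)
    py = y @ rng.normal(size=(y.shape[1], k)) / np.sqrt(k)
    return dcor.distance_correlation(px, py)


rng = np.random.default_rng(2)
n, p, q = 200, 2000, 1500                   # dimensions much larger than the sample size
z = rng.normal(size=(n, 5))                 # shared low-dimensional signal
x = z @ rng.normal(size=(5, p)) + rng.normal(size=(n, p))
y = z @ rng.normal(size=(5, q)) + rng.normal(size=(n, q))
print(projected_distance_correlation(x, y))
```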
Kernel and functional variants
Kernel distance correlation extends the standard distance correlation by replacing the Euclidean distance with distances induced by a kernel function, enabling the capture of nonlinear dependencies in the data. This adaptation leverages reproducing kernel Hilbert spaces (RKHS) to embed the data, where the kernel defines a metric that can handle complex structures such as non-Euclidean geometries or high-dimensional nonlinear relationships. For instance, the radial basis function (RBF) kernel, defined as k(x, x') = \exp\left(-\frac{\|x - x'\|^2}{2\sigma^2}\right), induces a distance d(x, x') = \sqrt{k(x, x) + k(x', x') - 2k(x, x')}, which maps data into an infinite-dimensional feature space suitable for nonlinear dependence measurement.[27][28]

A seminal result establishes the equivalence between distance covariance and the Hilbert-Schmidt Independence Criterion (HSIC) when using appropriate characteristic kernels, which ensure that the embedding is injective for distributions with finite second moments. Specifically, the population HSIC, defined as \mathrm{HSIC}(X, Y) = \langle C_{XY}, C_{XY} \rangle_{\mathcal{H} \otimes \mathcal{H}}, where C_{XY} is the cross-covariance operator, coincides with the squared distance covariance under a suitable product kernel k((x,y), (x',y')) = k_X(x,x') k_Y(y,y'). This equivalence allows distance correlation to be interpreted as a normalized HSIC, facilitating independence testing in kernel-embedded spaces with consistent power against nonlinear alternatives. The sample estimator follows a U-statistic form, maintaining computational tractability while inheriting the robustness of kernel methods to distributional assumptions.[27]

In the functional data setting, distance correlation is adapted by employing metrics on infinite-dimensional spaces, such as the L^2 norm on Hilbert spaces of functions, to measure dependence between curves or density functions. For functional random elements X(t) and Y(s) observed over domains \mathcal{T} and \mathcal{S}, the distance covariance is computed using \|X_i - X_j\|_{L^2}^2 = \int_{\mathcal{T}} (X_i(t) - X_j(t))^2 dt, enabling detection of serial or cross-dependencies in trajectories such as time-varying signals. Recent theoretical advances provide functional limit theorems for sequential distance correlation processes under absolute regularity conditions, supporting tests for practically significant dependence (e.g., correlation exceeding a threshold \Delta > 0) in stationary functional data with mixing properties. This framework applies to scenarios such as biomedical curves or environmental densities, where traditional finite-dimensional assumptions fail.[29]

Post-2020 extensions include applications of distance correlation to time series analysis in machine learning. For example, in characterizing recurrent neural network performance for forecasting, distance correlation quantifies nonlinear associations between input time series features and hidden states to evaluate model effectiveness across varying lag structures and noise levels.[30][31]
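A sketch of the kernel variant described above: the Euclidean distance is replaced by the RBF-kernel-induced distance d(x, x') = \sqrt{k(x, x) + k(x', x') - 2k(x, x')}, and the usual double-centering statistic is then computed on those kernel-induced distance matrices. The bandwidth choices and helper names are assumptions for illustration.

```python
import numpy as np


def rbf_induced_distances(z, sigma=1.0):
    """Distance matrix induced by the RBF kernel k(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    z = np.asarray(z, dtype=float).reshape(len(z), -1)
    sq = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    k = np.exp(-sq / (2.0 * sigma ** 2))
    # d(x, x')^2 = k(x, x) + k(x', x') - 2 k(x, x') = 2 - 2 k(x, x') for the RBF kernel
    return np.sqrt(np.maximum(2.0 - 2.0 * k, 0.0))


def kernel_distance_correlation(x, y, sigma_x=1.0, sigma_y=1.0):
    """Distance correlation computed on kernel-induced rather than Euclidean distances."""
    a = rbf_induced_distances(x, sigma_x)
    b = rbf_induced_distances(y, sigma_y)
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return np.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0


rng = np.random.default_rng(3)
x = rng.normal(size=400)
y = np.cos(3 * x) + 0.2 * rng.normal(size=400)
print(kernel_distance_correlation(x, y))
```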
Alternative formulations
Brownian covariance
The Brownian covariance offers an alternative probabilistic formulation of distance covariance, interpreting it through the lens of stochastic processes. Specifically, for random vectors X \in \mathbb{R}^p and Y \in \mathbb{R}^q with finite second moments, the distance covariance V(X,Y) equals the Brownian distance covariance W(X,Y), introduced by Székely and Rizzo in 2009.[6]

The Brownian distance covariance is defined as follows. Let (X, Y) and (X', Y') be independent and identically distributed copies of the pair, and let W and W' be independent Brownian random processes (with covariance function \mathbb{E}[W(s)W(t)] = \|s\| + \|t\| - \|s - t\|, which equals 2 \min(s, t) for nonnegative scalar arguments), independent of the data. Writing X_W = W(X) - \mathbb{E}[W(X) \mid W] for the process value centered conditionally on W, and analogously for the other terms, the squared Brownian covariance is W^2(X,Y) = \mathbb{E}\left[ X_W X'_W\, Y_{W'} Y'_{W'} \right]. For multivariate vectors this is equivalently expressed in terms of expectations of distance products: W^2(X,Y) = \mathbb{E}\|X - X'\| \|Y - Y'\| + \mathbb{E}\|X - X'\| \mathbb{E}\|Y - Y'\| - \mathbb{E}\|X - X'\| \|Y - Y''\| - \mathbb{E}\|X - X''\| \|Y - Y'\|, where (X, Y), (X', Y'), (X'', Y'') are independent copies of the pair.[6]

The equivalence between this Brownian formulation and the original characteristic function-based definition of distance covariance, given by V^2(X,Y) = \frac{1}{c_p c_q} \int_{\mathbb{R}^{p+q}} \bigl| f_{X,Y}(t,s) - f_X(t) f_Y(s) \bigr|^2 \|t\|^{-(1+p)} \|s\|^{-(1+q)} \, dt \, ds, is established via Fourier transform properties.[6] The proof integrates the squared difference of characteristic functions against the weights derived from the Brownian covariance kernel, showing that the two expressions coincide for vectors with finite moments.

This Brownian perspective provides an intuitive interpretation of distance covariance as a generalized form of classical Pearson covariance, capturing all types of dependence, linear and nonlinear, through the expected discrepancies in the associated stochastic processes. It facilitates extensions to arbitrary dimensions without assuming equal dimensionality between X and Y, offering a natural probabilistic framework for dependence measurement. Historically, the Brownian covariance formulation builds on Székely's foundational work in energy statistics, which introduced distance-based measures of dependence as weighted L^2 norms on characteristic functions, providing a stochastic-process root for these metrics.[6]
Energy distance connection
The energy distance between the distributions of two independent random vectors X and Y taking values in a common space \mathbb{R}^d is defined as \mathcal{D}^2(X, Y) = 2 \mathbb{E} \|X - Y\| - \mathbb{E} \|X - X'\| - \mathbb{E} \|Y - Y'\|, where X' and Y' are independent copies of X and Y, and \| \cdot \| denotes the Euclidean norm.[32] This quantity is nonnegative and equals zero if and only if X and Y have the same distribution, making it a metric on the space of probability measures with finite first moments.[32]

Distance correlation connects directly to energy distance through the distance covariance, which measures dependence between paired random vectors. Specifically, the squared distance covariance V^2(X, Y) can be viewed as a generalized energy distance between the joint distribution of (X, Y) and the product of the marginal distributions of X and Y.[1] The squared distance correlation is then obtained by normalizing this quantity as R^2(X, Y) = \frac{V^2(X, Y)}{\sqrt{V^2(X) V^2(Y)}}, yielding a value in [0, 1] that is zero if and only if X and Y are independent.[1] The normalizing distance variances are themselves the self-comparisons V^2(X) = V^2(X, X) and V^2(Y) = V^2(Y, Y), aligning the dependence measure with the metric properties of energy distance.[1]

Energy distance finds application in goodness-of-fit testing by comparing an empirical distribution to a theoretical one; for instance, the test statistic based on \mathcal{D}^2 between a sample and a hypothesized distribution rejects equality when large, leveraging the metric's characterization of distributional identity.[32] The energy distance framework also generalizes beyond the Euclidean norm, for example via \alpha-distance correlation, in which \| \cdot \| is replaced by \| \cdot \|^\alpha for 0 < \alpha \leq 2, accommodating distributions with finite \alpha-moments and enabling analysis under \alpha-mixing conditions for dependent data.[1]
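A minimal sketch of the two-sample estimator corresponding to the population energy distance above; it averages pairwise distances within and between the samples, and the function name and synthetic data are illustrative.

```python
import numpy as np


def energy_distance_sq(x, y):
    """Sample estimate of the squared energy distance D^2 between two samples."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    dxy = np.sqrt(((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1))  # between samples
    dxx = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))  # within sample X
    dyy = np.sqrt(((y[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1))  # within sample Y
    return 2.0 * dxy.mean() - dxx.mean() - dyy.mean()


rng = np.random.default_rng(4)
same = energy_distance_sq(rng.normal(size=500), rng.normal(size=500))
diff = energy_distance_sq(rng.normal(size=500), rng.normal(loc=1.0, size=500))
print(same, diff)  # the second value should be noticeably larger
```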
Robustness and variants
Sensitivity to outliers
Standard distance correlation is highly sensitive to outliers, as even a single contaminated observation can drastically alter the estimated dependence measure. For the typical Euclidean distance formulation (with exponent \alpha = 1), the influence function is bounded, indicating qualitatively bounded gross-error sensitivity, but the quantitative impact scales as O(1/n) due to sample size effects, making the estimator vulnerable to small fractions of contamination.[33]

The breakdown point of distance correlation is zero asymptotically, with the finite-sample breakdown value being 1/n, meaning a single outlier suffices to inflate the sample distance correlation R_n arbitrarily, even to its maximum value of 1 in cases of true independence. This occurs because an outlier at a large distance u causes the distance variance to grow as O(u^4/n^2), leading to divergence as u \to \infty.[33] Leyder, Raymaekers, and Rousseeuw (2024) provide a rigorous proof that standard distance correlation lacks breakdown-point robustness, contrasting it with median-based statistics like the median absolute deviation, which maintain a positive breakdown point of 0.5 regardless of sample size. Their analysis in the supplementary material derives the exact breakdown behavior, showing that the estimator fails under minimal contamination, unlike robust location or scale measures.[33]

Simulations demonstrate these vulnerabilities in practice: for bivariate data generated from a nonlinear dependence model (e.g., Y = X^2 + \epsilon), adding just one outlier reduces the true distance correlation from around 0.7 to near zero, effectively masking the underlying dependence and causing independence tests to fail with high probability. In contaminated samples with 5% outliers, test rejection rates for dependent data drop by over 50% compared to clean samples of size n=100, while for independent data, outliers spuriously elevate rejection rates to exceed 20%.[33] This sensitivity underscores the limitations of standard distance correlation in noisy real-world data, where robust alternatives, such as transformation-based variants, offer greater stability without such breakdown risks.[33]
Robust distance correlation
Robust distance correlation addresses the sensitivity of the classical measure to outliers by employing transformations of the data or distances to improve breakdown behavior while preserving the ability to characterize independence. One approach, detailed in a 2025 study, introduces robust versions through data transformations such as replacing raw observations with their ranks or normal scores before computing interpoint distances, or applying a novel biloop transformation that bounds and redescends the influence of extreme values. These modifications ensure the distance covariance remains zero if and only if the variables are independent, preserving the core property of the original measure.[34]

Computationally, the transformation-based robust distance correlations retain the O(n²) complexity of the standard algorithm due to pairwise distance evaluations, optionally augmented by robust centering steps such as trimmed means in the double centering to further mitigate outlier effects. For instance, in R, a rank-transformed version can be computed by applying the transformation to the data and then using the energy package's dcor() or dcov() functions; a Python analogue is sketched below. Empirical evaluations demonstrate the superior performance of these variants in contaminated settings, such as genetic data analysis, where classical measures fail but robust ones retain power.[35]
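The sketch below shows the rank-transformation route in Python, assuming scipy.stats.rankdata and the third-party dcor package; it is a simple transformation-based variant for illustration, not the biloop transformation of the cited paper, and the contaminated-data comparison is only a qualitative check.

```python
import numpy as np
from scipy.stats import rankdata
import dcor  # third-party package (pip install dcor)


def rank_distance_correlation(x, y):
    """Distance correlation of componentwise ranks (a robustified variant)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    rx = np.apply_along_axis(rankdata, 0, x)  # rank each coordinate separately
    ry = np.apply_along_axis(rankdata, 0, y)
    return dcor.distance_correlation(rx, ry)


rng = np.random.default_rng(5)
x = rng.normal(size=200)
y = x ** 2 + 0.1 * rng.normal(size=200)
x_contaminated = x.copy()
x_contaminated[0] = 1e6                      # a single extreme outlier

print(dcor.distance_correlation(x, y), dcor.distance_correlation(x_contaminated, y))
print(rank_distance_correlation(x, y), rank_distance_correlation(x_contaminated, y))
# the rank-based version should be much less affected by the single outlier
```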