
Distance correlation

Distance correlation is a statistical measure of dependence between random vectors in arbitrary dimensions, introduced by Gábor J. Székely, Maria L. Rizzo, and Nail K. Bakirov in 2007, that captures both linear and nonlinear associations using distances between data points. It is defined so that its square equals the ratio of the squared distance covariance to the geometric mean of the distance variances, yielding a value between 0 and 1, where 0 indicates independence and 1 indicates perfect dependence, attained in an exact linear relationship. This metric addresses limitations of traditional measures like Pearson's coefficient, which fail to detect nonlinear dependencies, by providing a robust test of independence applicable to non-normal distributions and high-dimensional data. The underlying distance covariance is constructed from products of doubly centered pairwise distances, ensuring it is zero only when the vectors are independent, a property that holds for the population version and enables consistent testing of independence using the sample version. Empirical distance correlation enables testing for dependence via asymptotic and resampling (permutation or bootstrap) methods, with superior power over classical tests in scenarios involving nonmonotonic or complex relationships. Its computation involves pairwise distance matrices, making it versatile for multivariate analysis without assuming specific distributional forms. Since its introduction, distance correlation has seen extensions and applications across diverse fields such as statistics, machine learning, bioinformatics, and finance. Developments include improved estimation techniques for computational efficiency (as of 2025) and adaptations for spatial data to test geographical dependence in compositional distributions.

Introduction

Overview and motivation

Distance correlation is a statistical measure of dependence between two random vectors, \mathbf{X} in \mathbb{R}^p and \mathbf{Y} in \mathbb{R}^q, capable of detecting both linear and nonlinear relationships in multivariate settings. Unlike traditional measures, it provides a coefficient that ranges from 0 (indicating independence) to 1 (indicating perfect dependence), making it a versatile tool for assessing statistical associations without assuming specific distributional forms. The primary motivation for distance correlation arises from the limitations of classical correlation coefficients, such as Pearson's product-moment correlation, which only capture linear dependencies and can yield zero even when variables are strongly related through nonlinear or non-monotonic transformations. For instance, Pearson's correlation fails to detect dependence in cases like Y = X^2 for X uniformly distributed on [-1, 1], where the relationship is quadratic and the coefficient is exactly zero, despite clear functional dependence; in contrast, distance correlation quantifies this nonlinear association effectively, highlighting its utility in real-world data exhibiting complex patterns. This makes distance correlation particularly valuable in fields like genomics, machine learning, and finance, where nonlinear interactions are common. At its core, distance correlation builds on the intuition of comparing distances between pairs of observations in the respective spaces of \mathbf{X} and \mathbf{Y}, centered and normalized to form a dependence measure that is zero if and only if the vectors are independent. This distance-based approach, grounded in distance covariance as its foundational element, enables robust detection of all forms of dependence, including those missed by tests requiring normality or low dimensionality.
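To make the motivating example concrete, the following minimal sketch (assuming NumPy and SciPy are available; the helper function is illustrative, not a library routine) contrasts Pearson's correlation with a naive sample distance correlation on Y = X^2 for X uniform on [-1, 1].

    import numpy as np
    from scipy.stats import pearsonr

    def distance_correlation_1d(x, y):
        """Naive O(n^2) sample distance correlation for 1-D arrays."""
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        a = np.abs(x[:, None] - x[None, :])                  # pairwise distances for x
        b = np.abs(y[:, None] - y[None, :])                  # pairwise distances for y
        A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()    # double centering
        B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
        dcov2 = (A * B).mean()                               # squared sample distance covariance
        denom = np.sqrt((A * A).mean() * (B * B).mean())
        return np.sqrt(dcov2 / denom) if denom > 0 else 0.0

    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 500)
    y = x ** 2
    print("Pearson r :", round(pearsonr(x, y)[0], 3))              # near 0
    print("distance R:", round(distance_correlation_1d(x, y), 3))  # clearly positive

The Pearson coefficient is close to zero despite the deterministic relationship, whereas the sample distance correlation is clearly positive.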

History and development

Distance correlation was introduced in 2007 by Gábor J. Székely, Maria L. Rizzo, and Nail K. Bakirov as a measure of dependence between random vectors, building on the concept of distance covariance derived from distances between observations. The initial development drew motivation from energy statistics, which Székely had explored earlier and which are based on analogies to Newton's gravitational potential energy, where distances between observations play the role of potentials in multivariate settings. This framework provided a way to detect both linear and nonlinear dependencies, addressing limitations of classical measures like Pearson's correlation. A key milestone came in 2009 with the publication of "Brownian distance covariance" by Székely and Rizzo in the Annals of Applied Statistics, which formalized the theoretical properties of distance covariance and correlation through connections to Brownian stochastic processes. This work established distance correlation as a universal dependence measure that equals zero if and only if the vectors are independent, regardless of dimension. During the 2010s, extensions addressed challenges in high-dimensional data; for instance, a 2013 paper proposed a t-test based on distance correlation for independence testing in high dimensions, adapting the statistic to handle cases where dimensionality exceeds sample size. Post-2020 developments have focused on practical implementation and broader integration. The energy package in R, maintained by Rizzo since its initial release, continues to support distance correlation computations and related tests, with updates enhancing efficiency for large datasets. Similarly, the dcor Python package, introduced in 2017 and refined through subsequent releases, provides accessible tools for distance correlation and energy statistics, including support for partial distance correlations. Székely and Rizzo published the book The Energy of Data and Distance Correlation in 2023, providing a comprehensive overview of the topic. Recent work as of 2025 includes improved estimation methods and robust extensions for applications such as genetic data analysis.

Definitions

Distance covariance

Distance covariance is a measure of dependence between random vectors X \in \mathbb{R}^p and Y \in \mathbb{R}^q that generalizes classical covariance to arbitrary dimensions and distributions. The squared distance covariance V^2(X, Y) is defined via characteristic functions as V^2(X, Y) = \frac{1}{c_p c_q} \int_{\mathbb{R}^p} \int_{\mathbb{R}^q} \left| \phi_{X,Y}(s, t) - \phi_X(s) \phi_Y(t) \right|^2 \frac{ds \, dt}{\|s\|^{p+1} \|t\|^{q+1}}, where \phi_{X,Y}(s, t) = \mathbb{E}[\exp(i s^\top X + i t^\top Y)] is the joint characteristic function, \phi_X(s) and \phi_Y(t) are the marginal characteristic functions, and the normalizing constants are c_d = \pi^{(d+1)/2} / \Gamma((d+1)/2) for dimension d. This integral representation weights the squared difference between the joint and product-of-marginals characteristic functions by the inverse of the product of power functions of the arguments, ensuring the measure is nonnegative and zero if and only if X and Y are independent.

An equivalent probabilistic interpretation expresses V^2(X, Y) in terms of expected values of distances between independent copies of the vectors. Let (X', Y') and (X'', Y'') be independent copies of (X, Y), independent of each other and of (X, Y). Then V^2(X, Y) = \mathbb{E}\left[ \|X - X'\| \|Y - Y'\| \right] + \mathbb{E}\left[ \|X - X'\| \right] \mathbb{E}\left[ \|Y - Y'\| \right] - 2 \mathbb{E}\left[ \|X - X'\| \|Y - Y''\| \right], which highlights its structure as a centered inner product in the space of distance kernels. This form underscores the analogy to classical covariance, where distances replace deviations from means, and it facilitates derivations of properties like nonnegativity.

For a sample of n i.i.d. observations (X_1, Y_1), \dots, (X_n, Y_n), the sample squared distance covariance V_n^2(X, Y) is computed from the pairwise distance matrices. Define a_{kl} = \|X_k - X_l\|_p for k, l = 1, \dots, n, and similarly b_{kl} = \|Y_k - Y_l\|_q. The centered matrices are obtained by double centering: A_{kl} = a_{kl} - \bar{a}_{k \cdot} - \bar{a}_{\cdot l} + \bar{a}_{\cdot \cdot}, \quad B_{kl} = b_{kl} - \bar{b}_{k \cdot} - \bar{b}_{\cdot l} + \bar{b}_{\cdot \cdot}, where \bar{a}_{k \cdot} = n^{-1} \sum_{l=1}^n a_{kl} is the row mean, \bar{a}_{\cdot l} = n^{-1} \sum_{k=1}^n a_{kl} the column mean, and \bar{a}_{\cdot \cdot} = n^{-2} \sum_{k,l=1}^n a_{kl} the grand mean (similarly for b). The estimator is then V_n^2(X, Y) = \frac{1}{n^2} \sum_{k=1}^n \sum_{l=1}^n A_{kl} B_{kl}. This estimator is biased for V^2(X, Y) but consistent as n \to \infty under the moment conditions \mathbb{E}[\|X\|_p] < \infty and \mathbb{E}[\|Y\|_q] < \infty.

To address the bias, an unbiased estimator was derived as a U-statistic by considering only off-diagonal centered terms adjusted for sample dependencies. Define the U-centered distances for i \neq j: \tilde{A}_{ij} = a_{ij} - \frac{1}{n-2} \sum_{\substack{l=1 \\ l \neq i}}^n a_{il} - \frac{1}{n-2} \sum_{\substack{k=1 \\ k \neq j}}^n a_{kj} + \frac{1}{(n-1)(n-2)} \sum_{\substack{k,l=1 \\ k \neq l}}^n a_{kl}, and similarly for \tilde{B}_{ij} (with \tilde{A}_{ii} = \tilde{B}_{ii} = 0). The unbiased squared distance covariance is V_n^{*2}(X, Y) = \frac{1}{n(n-3)} \sum_{\substack{i,j=1 \\ i \neq j}}^n \tilde{A}_{ij} \tilde{B}_{ij}, \quad n > 3. This U-statistic form ensures \mathbb{E}[V_n^{*2}(X, Y)] = V^2(X, Y) by excluding diagonal terms and adjusting the centering to account for the finite sample, with the denominator chosen to normalize the expectation over distinct pairs. Its consistency follows from U-statistic theory under the same moment conditions.
Distance covariance serves as the primitive for defining the normalized distance correlation coefficient.
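The following sketch (assuming NumPy and SciPy; the function names are illustrative) computes the biased sample squared distance covariance V_n^2(X, Y) directly from double-centered Euclidean distance matrices, following the definitions above.

    import numpy as np
    from scipy.spatial.distance import cdist

    def double_center(d):
        """Double-center a pairwise distance matrix."""
        return d - d.mean(axis=0) - d.mean(axis=1, keepdims=True) + d.mean()

    def sample_dcov_sq(X, Y):
        """Biased estimator V_n^2(X, Y) for samples X (n x p) and Y (n x q)."""
        A = double_center(cdist(X, X))   # centered distances among the X_i
        B = double_center(cdist(Y, Y))   # centered distances among the Y_i
        return (A * B).mean()            # (1/n^2) * sum over k, l of A_kl B_kl

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))
    Y = np.hstack([np.sin(X[:, :1]), rng.normal(size=(200, 1))])  # partly dependent on X
    print("V_n^2(X, Y) =", sample_dcov_sq(X, Y))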

Distance variance and standard deviation

The distance variance of a random vector X in \mathbb{R}^p with finite first moment is defined as the special case of the distance covariance applied to the pair (X, X), denoted V^2(X) = V^2(X, X). In population form, it is given by V^2(X) = \frac{1}{c_p^2} \int_{\mathbb{R}^{2p}} \left| \phi_X(s + t) - \phi_X(s) \phi_X(t) \right|^2 \frac{ds \, dt}{\|s\|^{p+1} \|t\|^{p+1}}, where \phi_X is the characteristic function of X, \| \cdot \| denotes the Euclidean norm, and c_p = \frac{\pi^{(p+1)/2}}{\Gamma((p+1)/2)} is a constant independent of the distribution of X. This integral representation arises from the characteristic function approach to distance covariance and equals \mathbb{E}[\|X - X'\|^2] + (\mathbb{E}[\|X - X'\|])^2 - 2\mathbb{E}[\|X - X'\| \|X - X''\|], where X' and X'' are independent copies of X. For a sample X_1, \dots, X_n of i.i.d. copies of X, the sample distance variance is given by V_n^2(X) = \frac{1}{n^2} \sum_{k=1}^n \sum_{l=1}^n A_{kl}^2, where A = (A_{kl}) is the double-centered distance matrix with entries A_{kl} = \|X_k - X_l\| - \bar{a}_{k \cdot} - \bar{a}_{\cdot l} + \bar{a}_{\cdot \cdot}, \bar{a}_{k \cdot} is the row mean of the Euclidean distance matrix, \bar{a}_{\cdot l} the column mean, and \bar{a}_{\cdot \cdot} the grand mean. This statistic is nonnegative and equals zero if and only if all sample points are identical, converging in probability to V^2(X) under the finite first moment condition. The distance standard deviation is the nonnegative square root V(X) = \sqrt{V^2(X)}, which serves as a scale-equivariant measure of dispersion: V(a + bX) = |b| V(X) for scalars a, b \in \mathbb{R}. Unlike the classical standard deviation, it requires only finite first moments and satisfies V(X) > 0 unless X is almost surely constant, providing a bounded measure of spread that is at most the classical standard deviation or Gini's mean difference when second moments exist. The underlying distances embed the random vector X into a Hilbert space of measurable functions, where the distance variance corresponds to the Hilbert-Schmidt norm of the difference between joint and product embeddings, linking it to reproducing kernel Hilbert space interpretations.
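A minimal sketch (assuming NumPy; the helper is illustrative) computes the sample distance standard deviation V_n(X) from the double-centered distance matrix and checks the scale equivariance V(a + bX) = |b| V(X) numerically.

    import numpy as np

    def sample_distance_sd(X):
        """Sample distance standard deviation V_n(X) = sqrt(V_n^2(X)) for an (n x p) array."""
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)   # pairwise distances
        A = d - d.mean(0) - d.mean(1)[:, None] + d.mean()            # double centering
        return np.sqrt((A * A).mean())                               # square root of V_n^2(X)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 2))
    print(sample_distance_sd(X))              # baseline value
    print(sample_distance_sd(5.0 * X + 1.0))  # approximately 5 times the baseline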

Distance correlation coefficient

The distance correlation coefficient provides a normalized measure of dependence between two random vectors X \in \mathbb{R}^p and Y \in \mathbb{R}^q with finite first moments, defined as R(X, Y) = \frac{V(X, Y)}{\sqrt{V(X) V(Y)}} whenever the denominator is positive, and R(X, Y) = 0 otherwise, where V(X, Y) denotes the distance covariance between X and Y, while V(X) = V(X, X) and V(Y) = V(Y, Y) are the respective distance standard deviations. This coefficient satisfies 0 \leq R(X, Y) \leq 1, with R(X, Y) = 0 if and only if X and Y are independent. For a sample of n paired observations (X_1, Y_1), \dots, (X_n, Y_n), the sample distance correlation is given by R_n(X, Y) = \frac{V_n(X, Y)}{\sqrt{V_n(X) V_n(Y)}}, defined analogously using the sample distance covariance V_n(X, Y) and the analogous sample quantities V_n(X) = V_n(X, X) and V_n(Y) = V_n(Y, Y), with R_n(X, Y) = 0 if the denominator vanishes. This estimator is consistent and asymptotically unbiased as n \to \infty. The squared sample coefficient can be computed directly from the double-centered matrices A = (A_{kl}) and B = (B_{kl}), obtained from the pairwise distances a_{kl} = \|X_k - X_l\|_p and b_{kl} = \|Y_k - Y_l\|_q for k, l = 1, \dots, n by subtracting row and column means and adding back the grand mean: R_n^2(X, Y) = \frac{\sum_{i,j=1}^n A_{ij} B_{ij}}{\sqrt{\sum_{i,j=1}^n A_{ij}^2 \sum_{i,j=1}^n B_{ij}^2}}, with R_n(X, Y) its nonnegative square root. This formulation arises because the normalization factors in the sample covariances cancel in the ratio. The distance correlation coefficient is degenerate in cases where V(X) = 0 or V(Y) = 0, which occurs if X or Y is a constant random vector (i.e., concentrated at a single point). In such scenarios, the coefficient is set to 0 by convention, as dependence is undefined.
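The squared-ratio formula above translates directly into code; the sketch below (assuming NumPy and SciPy; the cross-check against the open-source dcor package is an assumption about that package's API) computes R_n from the double-centered matrices.

    import numpy as np
    from scipy.spatial.distance import cdist

    def sample_dcor(X, Y):
        a, b = cdist(X, X), cdist(Y, Y)
        A = a - a.mean(0) - a.mean(1, keepdims=True) + a.mean()
        B = b - b.mean(0) - b.mean(1, keepdims=True) + b.mean()
        num = (A * B).sum()
        den = np.sqrt((A * A).sum() * (B * B).sum())
        return np.sqrt(num / den) if den > 0 else 0.0   # convention: 0 when degenerate

    rng = np.random.default_rng(2)
    X = rng.normal(size=(300, 2))
    Y = X @ rng.normal(size=(2, 2)) + 0.1 * rng.normal(size=(300, 2))
    print("R_n =", sample_dcor(X, Y))
    # Optional cross-check, assuming the dcor package is installed:
    # import dcor; print(dcor.distance_correlation(X, Y))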

Properties

Properties of distance correlation

The distance correlation coefficient R(\mathbf{X}, \mathbf{Y}), defined for random vectors \mathbf{X} \in \mathbb{R}^p and \mathbf{Y} \in \mathbb{R}^q with finite first moments, satisfies 0 \leq R(\mathbf{X}, \mathbf{Y}) \leq 1. This bounded range positions it as a standardized measure of dependence, analogous to the Pearson correlation coefficient but capable of detecting both linear and nonlinear associations. Notably, R(\mathbf{X}, \mathbf{Y}) = 0 if and only if \mathbf{X} and \mathbf{Y} are independent, providing a complete characterization of statistical independence. The coefficient achieves its upper bound of 1 precisely when the vectors are linearly related, specifically if there exist a constant vector \mathbf{a}, a nonzero scalar b, and an orthogonal matrix \mathbf{C} such that \mathbf{Y} = \mathbf{a} + b \mathbf{X} \mathbf{C} almost surely. This condition highlights that perfect dependence under distance correlation requires a strict linear structure, distinguishing it from measures that attain maximum values for broader classes of functional relationships. Distance correlation is invariant under separate translations (\mathbf{X} \to \mathbf{X} + \mathbf{a}, \mathbf{Y} \to \mathbf{Y} + \mathbf{b}), positive scalings (\mathbf{X} \to c \mathbf{X}, \mathbf{Y} \to d \mathbf{Y} with c, d > 0), and orthogonal transformations (\mathbf{X} \to \mathbf{X} \mathbf{C}_1, \mathbf{Y} \to \mathbf{Y} \mathbf{C}_2 with orthogonal \mathbf{C}_1, \mathbf{C}_2) of the coordinates. It is also unaffected by permutations of the observations, as the underlying pairwise distance matrices are simply permuted accordingly. These invariances ensure robustness to such transformations while preserving sensitivity to dependence structures. The framework naturally extends to multiple random vectors beyond pairs; for instance, the partial distance correlation can quantify the dependence between \mathbf{X} and \mathbf{Y} after removing the effect of additional vectors \mathbf{Z}, maintaining the core properties of the bivariate case. In the special case of jointly bivariate normal distributions, the distance correlation is a strictly increasing deterministic function of the absolute value of the Pearson correlation |\rho|, satisfying R(X, Y) \leq |\rho| with equality when |\rho| is 0 or 1; for bivariate normal variables, |\rho| also coincides with the maximal correlation.
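The invariance properties can be checked numerically; the sketch below assumes the open-source dcor package is installed and exposes dcor.distance_correlation (a NumPy double-centering implementation, as in the earlier sketches, works equally well).

    import numpy as np
    import dcor
    from scipy.stats import ortho_group

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    Y = np.tanh(X) + 0.2 * rng.normal(size=(200, 3))
    C1 = ortho_group.rvs(3, random_state=1)   # random orthogonal matrices
    C2 = ortho_group.rvs(3, random_state=2)

    r_original = dcor.distance_correlation(X, Y)
    # translate/scale/rotate each vector separately; the coefficient is unchanged
    r_transformed = dcor.distance_correlation(2.0 * X @ C1 + 1.0, 0.5 * Y @ C2 - 3.0)
    print(r_original, r_transformed)          # identical up to floating-point error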

Properties of distance covariance

Distance covariance, denoted V(X, Y), possesses several fundamental properties that underscore its role as a measure of dependence between random vectors X and Y in Euclidean spaces. One key property is its non-negativity: V(X, Y) \geq 0, with equality holding if and only if X and Y are independent, assuming finite first moments. This follows directly from its definition as the square root of a weighted L^2 distance between the joint characteristic function of (X, Y) and the product of the marginal characteristic functions. Distance covariance scales in each argument as V(cX, Y) = \sqrt{|c|}\, V(X, Y) for any scalar c \in \mathbb{R}, and similarly V(X, cY) = \sqrt{|c|}\, V(X, Y), so that V(cX, cY) = |c| V(X, Y). It is also invariant under location shifts, V(X + a, Y) = V(X, Y) for a constant vector a, and under orthogonal transformations, V(CX, DY) = V(X, Y) for orthogonal matrices C, D. These properties extend the analogy to classical covariance while accommodating the geometry of Euclidean distances. Furthermore, distance covariance defines a semi-inner product on the space of probability distributions with finite first moments, embedding them into an L^2 space via their characteristic functions. This structure induces a metric on the set of such distributions, where the distance between two measures is given by the energy distance between the corresponding random vectors. When X = Y, distance covariance reduces to the distance variance, the single-vector special case. Unlike correlation coefficients, distance covariance is not scale-invariant, meaning its value changes under linear transformations of the variables, such as rescaling. This dependence arises because the underlying distance measures are sensitive to the magnitudes of the vectors involved.

Properties of distance variance

The distance variance V^2(X), defined for a random vector X in \mathbb{R}^p with finite first moment, is a non-negative measure of variability that equals zero if and only if X is degenerate, i.e., constant almost surely. This property characterizes the absence of spread in the distribution of X. Its nonnegative square root, the distance standard deviation V(X), satisfies a homogeneity condition with respect to scaling: for any scalar c \in \mathbb{R}, V(cX) = |c| V(X). More generally, it is invariant under location shifts and orthogonal transformations, as V(a + U X) = V(X) for a constant vector a and orthogonal matrix U. These behaviors mirror the scale equivariance of the classical standard deviation while extending to multivariate settings. A key expression links the distance variance to pairwise distances between independent copies X' and X'' of the random vector: V^2(X) = \mathbb{E}[\|X - X'\|^2] + \left( \mathbb{E}[\|X - X'\|] \right)^2 - 2\,\mathbb{E}[\|X - X'\| \|X - X''\|], which resembles, but does not equal, the variance of the interpoint distance \|X - X'\|. Thus, V(X) quantifies the dispersion of the interpoint distances around their expected value \mathbb{E}[\|X - X'\|], providing a norm-like measure of the distribution's spread. In the theoretical framework, random vectors are embedded into a Hilbert space via their centered characteristic functions, where the distance covariance acts as an inner product. Under this embedding, the distance variance V^2(X) corresponds to the squared norm of the embedded element, ensuring positive semi-definiteness and enabling the interpretation of distance correlation as the analogue of a correlation (a cosine of an angle) in that space. This structure underpins the method's ability to detect dependencies through geometric properties. It also serves as the normalizing factor in the denominator of the distance correlation coefficient.

Computation and estimation

Sample estimators

The sample distance covariance is estimated from a finite sample of n paired observations (X_k, Y_k)_{k=1}^n by first constructing the Euclidean distance matrices (a_{ij}) and (b_{ij}), where a_{ij} = \|X_i - X_j\| and b_{ij} = \|Y_i - Y_j\|. These matrices are then double-centered to obtain the adjusted values A_{ij} = a_{ij} - \bar{a}_{i\cdot} - \bar{a}_{\cdot j} + \bar{a}_{\cdot \cdot} and similarly for B_{ij}, with the overlines denoting row, column, or grand means. The squared sample distance covariance is then V_n^2(X, Y) = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n A_{ij} B_{ij}. This estimator V_n^2(X, Y) is biased, with E[V_n^2(X, Y)] > 0 under independence. An unbiased estimator of the squared population distance covariance is the U-statistic built from the U-centered matrices described above, V_n^{*2}(X, Y) = \frac{1}{n(n-3)} \sum_{i \neq j} \tilde{A}_{ij} \tilde{B}_{ij}, valid for n > 3 and ensuring E[V_n^{*2}(X, Y)] = V^2(X, Y). Similar unbiased estimators apply to the sample distance variance, and the bias-corrected distance correlation uses these U-statistics in the ratio. For large n, bias corrections become negligible, and efficient algorithms can compute the estimators in O(n \log n) time for univariate data. When the data contain ties (identical observations), the corresponding off-diagonal entries in the distance matrices are zero, which is directly incorporated into the double-centering process without additional adjustment, though it may reduce the effective variability in the estimate. For missing data, the affected pairwise distances must be estimated prior to double-centering using imputation techniques that preserve pairwise distances, such as kernel-based or nearest-neighbor methods for incomplete observations.
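A sketch of the U-centered, unbiased estimator described above (assuming NumPy and SciPy; function names are illustrative). Because the raw Euclidean distance matrices have zero diagonals, full row, column, and grand sums can be used in the U-centering.

    import numpy as np
    from scipy.spatial.distance import cdist

    def u_center(d):
        """U-centering of a pairwise distance matrix; diagonal set to zero."""
        n = d.shape[0]
        row = d.sum(axis=1, keepdims=True)
        col = d.sum(axis=0, keepdims=True)
        u = d - row / (n - 2) - col / (n - 2) + d.sum() / ((n - 1) * (n - 2))
        np.fill_diagonal(u, 0.0)
        return u

    def u_dcov_sq(X, Y):
        """Unbiased estimator of V^2(X, Y); requires n > 3."""
        n = X.shape[0]
        A, B = u_center(cdist(X, X)), u_center(cdist(Y, Y))
        return (A * B).sum() / (n * (n - 3))

    rng = np.random.default_rng(5)
    X, Y = rng.normal(size=(100, 2)), rng.normal(size=(100, 2))   # independent here
    print(u_dcov_sq(X, Y))   # close to zero, possibly slightly negative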

Example computation

The example computation originally presented here contained errors and has been removed. For a correct worked example, refer to the original paper or to software implementations such as the R package energy.
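As a substitute, the following minimal illustrative computation (a sketch, not the original example) evaluates the biased sample distance correlation for four univariate observations with y = x^2.

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = x ** 2                                    # deterministic nonlinear relation

    a = np.abs(x[:, None] - x[None, :])           # 4 x 4 distance matrices
    b = np.abs(y[:, None] - y[None, :])
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()

    dcov2 = (A * B).mean()
    dcor = np.sqrt(dcov2 / np.sqrt((A * A).mean() * (B * B).mean()))
    print(f"V_n^2 = {dcov2:.4f}, R_n = {dcor:.4f}")   # R_n is large, reflecting the relation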

Algorithms and computational complexity

The computation of the sample distance correlation typically begins with the naive algorithm, which constructs full pairwise distance matrices for the two sets of n observations. For data in dimensions p and q, this requires calculating all pairwise distances, incurring a time complexity of O(n²p) for the first matrix and O(n²q) for the second, followed by O(n²) operations for centering the matrices and accumulating the elementwise products to obtain the distance covariance. The overall time complexity is thus O(n²(p + q)), while the space complexity is O(n²) due to storing the dense matrices, making it impractical for large n even in low dimensions.

To address this bottleneck, fast exact algorithms have been developed for the univariate case (p = q = 1), achieving O(n log n) time through sorting-based methods that avoid explicit construction of the distance matrices. One such approach uses two sorting passes over modified arrays to compute the centered distance sums efficiently. For general dimensions, no exact O(n log n) method is known; computations remain O(n²(p + q)), though approximate methods using random projections or subsampling can reduce the effective complexity. These algorithms enable computation on datasets with n up to millions in low dimensions. Approximate methods further improve efficiency in high dimensions or large n, such as random projections that reduce dimensionality to a lower-dimensional space before applying the naive algorithm, yielding O(n²k + nk log n) complexity where k is the projection dimension, often chosen as O(log n) for sketching.

In practice, software implementations facilitate these computations: the R package energy provides the standard estimators with optimized C code for the naive method, supporting up to n ≈ 10⁴ efficiently, while the Python library dcor includes both naive and fast univariate options, with NumPy integration for vectorized operations. Parallelization tips include distributing pairwise distance calculations across cores using libraries like scikit-learn's pairwise_distances with n_jobs=-1, which can reduce wall-clock time by a factor of the available processors for the matrix construction step, though the subsequent centering remains sequential unless custom implementations are used.

In high-dimensional settings where p ≫ n or q ≫ n, the O(n²p + n²q) time dominates due to the per-pair dimension summation in distance calculations, exacerbating challenges like memory bottlenecks for the matrices and potential numerical instability from accumulated floating-point errors in high p. Mitigation often involves subsampling pairs or using sparse approximations, but exact computation remains costly beyond p ≈ 100 for moderate n.
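Following the parallelization suggestion above, the sketch below (assuming NumPy and scikit-learn are available) distributes the distance-matrix construction across cores with pairwise_distances(n_jobs=-1), while the centering and summation remain single-threaded NumPy operations.

    import numpy as np
    from sklearn.metrics import pairwise_distances

    def dcor_parallel(X, Y, n_jobs=-1):
        a = pairwise_distances(X, metric="euclidean", n_jobs=n_jobs)   # parallel step
        b = pairwise_distances(Y, metric="euclidean", n_jobs=n_jobs)
        A = a - a.mean(0) - a.mean(1, keepdims=True) + a.mean()        # sequential centering
        B = b - b.mean(0) - b.mean(1, keepdims=True) + b.mean()
        return np.sqrt((A * B).sum() / np.sqrt((A * A).sum() * (B * B).sum()))

    rng = np.random.default_rng(6)
    X = rng.normal(size=(2000, 50))
    Y = X[:, :5] ** 2 + rng.normal(size=(2000, 5))
    print(dcor_parallel(X, Y))

For univariate data, the O(n log n) algorithms referenced above avoid the distance matrices entirely; whether a particular library exposes such an option should be verified in its documentation.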

Independence testing

Characterization of independence

A fundamental property of the distance correlation coefficient R(X, Y) between random vectors X \in \mathbb{R}^p and Y \in \mathbb{R}^q with finite first moments is that it equals zero if and only if X and Y are independent. This holds under the assumption that the distance variances V(X) and V(Y) are positive, ensuring the coefficient is well-defined and avoiding degenerate cases where one or both variables are constant almost surely. The proof relies on characteristic functions. Specifically, the squared distance covariance V^2(X, Y) can be expressed as an integral involving the difference between the joint characteristic function \phi_{X,Y}(t, s) and the product of the marginal characteristic functions \phi_X(t) \phi_Y(s): V^2(X, Y) = \int_{\mathbb{R}^{p+q}} \frac{|\phi_{X,Y}(t, s) - \phi_X(t) \phi_Y(s)|^2}{c_p c_q |t|^{1+p} |s|^{1+q}} \, dt \, ds, where c_d = \pi^{(1+d)/2} / \Gamma((1+d)/2) for dimension d. Thus, V^2(X, Y) = 0 implies \phi_{X,Y}(t, s) = \phi_X(t) \phi_Y(s) almost everywhere with respect to the weight measure, and hence everywhere by continuity, which characterizes independence for distributions with finite first moments. Conversely, if X and Y are independent the integrand vanishes identically, so V^2(X, Y) = 0. In degenerate cases, such as when the sample distance variance \hat{V}_n(X) = 0 (i.e., all observations of X are identical), the sample distance correlation \hat{R}_n(X, Y) = 0, even if X and Y are dependent in the population. This highlights that the sample statistic may fail to detect dependence in finite samples with degenerate configurations, though the population characterization holds under the stated conditions. This characterization extends to conditional independence via conditional distance correlation, a measure that equals zero if and only if X and Y are conditionally independent given a third random vector Z. The conditional version is defined analogously, using distances centered with respect to Z, and inherits the theoretical properties of the unconditional case for testing conditional independence.

Asymptotic distributions and tests

Under the null hypothesis of independence between random vectors X \in \mathbb{R}^p and Y \in \mathbb{R}^q with finite first moments, a suitably normalized version of n R_n^2 converges in distribution to the squared weighted L^2-norm of a centered Gaussian process \zeta(t, s), equivalently a quadratic form \sum_j \lambda_j Z_j^2 in independent standard normal variables Z_j, where the process has covariance kernel \operatorname{Cov}(\zeta(t,s), \zeta(t',s')) = \left( \phi_X(t - t') - \phi_X(t) \phi_X(t') \right) \left( \phi_Y(s - s') - \phi_Y(s) \phi_Y(s') \right) involving the marginal characteristic functions \phi_X and \phi_Y, and the normalization is chosen so that the limit has mean 1. This limiting distribution involves the structure of independent Brownian sheets through the Brownian representation of distance covariance and is non-degenerate, excluding simple chi-squared forms. The complexity of this distribution precludes direct tabulation of critical values, motivating approximations via simulation of the Gaussian process or empirical resampling methods for inference.

A distribution-free approach to testing exploits the exchangeability of paired samples under the null via a permutation test on the distance matrices. Specifically, fix the distances among the X_i and among the Y_i, randomly permute the assignment of Y observations to X observations B times (typically B \geq 999), compute R_n^{(b)} for each permuted sample, and obtain the p-value as the proportion of permuted statistics at least as large as the observed R_n, i.e., p = \frac{1 + \sum_{b=1}^B \mathbf{1}\{R_n^{(b)} \geq R_n\}}{B+1}. This test controls the type I error rate in finite samples and remains computationally feasible for moderate n, with extensions to bias-corrected variants for improved small-sample performance.

In high-dimensional regimes where p, q \to \infty alongside n, the standard R_n exhibits upward bias under the null due to overestimation of the distance variances, but a bias-corrected statistic R_n^* that adjusts the distance covariances for high-dimensional effects restores asymptotic normality: \sqrt{n} R_n^* \xrightarrow{d} N(0,1), assuming sub-exponential tails and p q = o(n^{3/2}). This highlights a "blessing of dimensionality," as the variance stabilizes regardless of growing dimensions, enabling consistent testing at rates O_p\left( \sqrt{\frac{\log (p \vee q)}{n}} \right) for the null deviation when the relevant moments are controlled. Recent extensions confirm bootstrap validity for p-values in such settings. More recent work (as of 2023–2025) includes generalized kernel distance correlation tests with asymptotic power analyses for complex dependencies and self-normalized variants for high-dimensional settings without stringent moment assumptions.

Empirical power analyses reveal that distance correlation tests outperform Pearson correlation for nonlinear alternatives, achieving detection rates up to 2–3 times higher in simulations involving quadratic, elliptic, or oscillatory dependencies (e.g., power > 0.8 at n = 50 for moderate effects where Pearson power < 0.3). This superiority stems from sensitivity to all dependence forms, with high-dimensional variants maintaining elevated power against sparse signals as p/n \to c > 0.
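A sketch of the permutation test described above (assuming NumPy and SciPy; function names are illustrative). Permuting the Y observations is implemented by permuting the rows and columns of the Y distance matrix.

    import numpy as np
    from scipy.spatial.distance import cdist

    def center(d):
        return d - d.mean(0) - d.mean(1, keepdims=True) + d.mean()

    def dcor_stat(A, B):
        return np.sqrt((A * B).sum() / np.sqrt((A * A).sum() * (B * B).sum()))

    def permutation_test(X, Y, num_perm=999, seed=0):
        rng = np.random.default_rng(seed)
        A, b_raw = center(cdist(X, X)), cdist(Y, Y)
        observed = dcor_stat(A, center(b_raw))
        count = 0
        for _ in range(num_perm):
            idx = rng.permutation(len(Y))
            count += dcor_stat(A, center(b_raw[np.ix_(idx, idx)])) >= observed
        return observed, (1 + count) / (num_perm + 1)     # p = (1 + #{R^(b) >= R}) / (B + 1)

    rng = np.random.default_rng(7)
    X = rng.normal(size=(100, 1))
    Y = np.cos(3 * X) + 0.3 * rng.normal(size=(100, 1))   # nonlinear dependence
    print(permutation_test(X, Y, num_perm=499))           # small p-value expected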

Generalizations and extensions

Multivariate and high-dimensional cases

Distance correlation, originally defined for pairs of random vectors, extends naturally to multivariate settings where each vector may reside in high-dimensional spaces \mathbb{R}^p and \mathbb{R}^q. In this framework, it measures dependence between two random vectors \mathbf{X} \in \mathbb{R}^p and \mathbf{Y} \in \mathbb{R}^q without assuming linearity or specific distributional forms, leveraging Euclidean distances to capture nonlinear associations across multiple dimensions. For k > 2 random vectors \mathbf{X}_1, \dots, \mathbf{X}_k, the concept generalizes to distance multivariance, which quantifies overall dependence among the set by combining pairwise distance covariances or tensor-like products of centered distance matrices to detect departures from joint independence. This extension preserves properties like non-negativity and zero value under independence, enabling tests for joint dependence among several random vectors. In high-dimensional regimes where p or q grows large, computing distance correlation faces the curse of dimensionality: Euclidean distances between points tend to concentrate, making pairwise dissimilarities less informative and inflating variance in estimators. This challenge arises because the volume of high-dimensional space grows exponentially, leading to sparse data and diminished signal in distance matrices, which can degrade dependence detection power. To mitigate this, regularization techniques such as random projections reduce dimensionality by mapping vectors to lower-dimensional subspaces while approximately preserving distances, allowing robust computation of projected distance correlations that maintain asymptotic validity. Recent theoretical advances address high-dimensional inference directly. For instance, under conditions where dimensions p and q grow with sample size n (e.g., p + q \to \infty as n \to \infty, with finite moments), a bias-corrected distance correlation exhibits asymptotic normality, converging to a standard normal distribution after standardization and enabling independence testing. This result highlights a "blessing of dimensionality," where higher dimensions improve the accuracy of normal approximations and test calibration, contrasting with the curse affecting other metrics. An illustrative application occurs in genomics, where distance correlation tests independence among high-dimensional expression profiles with p > 1000 features across hundreds of samples. In gene co-expression network analysis, it identifies nonlinear dependencies between thousands of genes (e.g., in data with 3,611 genes and 329 samples), outperforming Pearson correlation by capturing complex co-expression modules enriched for biological pathways.
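The random-projection idea can be sketched as follows (assuming NumPy and SciPy; this is illustrative only, since published projection-based tests average over many projections and calibrate the resulting statistic).

    import numpy as np
    from scipy.spatial.distance import cdist

    def dcor(X, Y):
        a, b = cdist(X, X), cdist(Y, Y)
        A = a - a.mean(0) - a.mean(1, keepdims=True) + a.mean()
        B = b - b.mean(0) - b.mean(1, keepdims=True) + b.mean()
        return np.sqrt((A * B).sum() / np.sqrt((A * A).sum() * (B * B).sum()))

    rng = np.random.default_rng(8)
    n, p, q, k = 300, 2000, 1500, 20
    X = rng.normal(size=(n, p))
    Y = np.hstack([X[:, :10] ** 2, rng.normal(size=(n, q - 10))])  # sparse dependence on X

    PX = rng.normal(size=(p, k)) / np.sqrt(k)    # Gaussian random projections
    PY = rng.normal(size=(q, k)) / np.sqrt(k)
    print("projected distance correlation:", dcor(X @ PX, Y @ PY))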

Kernel and functional variants

Kernel distance correlation extends the standard distance correlation by replacing the Euclidean distance with distances induced by a kernel function, enabling the capture of nonlinear dependencies in the data. This adaptation leverages reproducing kernel Hilbert spaces (RKHS) to embed the data, where the kernel defines a metric that can handle complex structures such as non-Euclidean geometries or high-dimensional nonlinear relationships. For instance, the radial basis function (RBF) kernel, defined as k(x, x') = \exp\left(-\frac{\|x - x'\|^2}{2\sigma^2}\right), induces a distance d(x, x') = \sqrt{k(x, x) + k(x', x') - 2k(x, x')}, which maps data into an infinite-dimensional feature space suitable for nonlinear dependence measurement. A seminal result establishes the equivalence between distance covariance and the Hilbert-Schmidt Independence Criterion (HSIC): the population HSIC, defined as \mathrm{HSIC}(X, Y) = \langle C_{XY}, C_{XY} \rangle_{\mathcal{H} \otimes \mathcal{H}} with C_{XY} the cross-covariance operator, coincides with the squared distance covariance when the factor kernels in the product kernel k((x,y), (x',y')) = k_X(x,x') k_Y(y,y') are the distance-induced kernels; characteristic kernels such as the RBF kernel, whose embeddings are injective, yield analogous dependence measures. This equivalence allows distance correlation to be interpreted as a normalized HSIC, facilitating independence testing in kernel-embedded spaces with consistent power against nonlinear alternatives. The sample estimator follows a U-statistic form, maintaining computational tractability while inheriting the robustness of kernel methods to distributional assumptions. In the functional data setting, distance correlation is adapted by employing metrics in infinite-dimensional spaces, such as the L^2 norm on Hilbert spaces of functions, to measure dependence between curves or density functions. For functional random elements X(t) and Y(s) observed over domains \mathcal{T} and \mathcal{S}, the distance covariance is computed using \|X_i - X_j\|_{L^2}^2 = \int_{\mathcal{T}} (X_i(t) - X_j(t))^2 dt, enabling detection of serial or cross-dependencies in trajectories like time-varying signals. Recent theoretical advances provide functional limit theorems for sequential distance correlation processes under absolute regularity conditions, supporting tests for practically significant dependence (e.g., correlation exceeding a threshold \Delta > 0) in stationary functional data with mixing properties. This framework applies to scenarios like biomedical curves or environmental densities, where traditional finite-dimensional assumptions fail. Post-2020 extensions include applications of distance correlation to recurrent neural network analysis in machine learning. For example, in characterizing recurrent network performance for time series forecasting, distance correlation quantifies nonlinear associations between input features and hidden states to evaluate model effectiveness across varying lag structures and noise levels.
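A sketch of the kernel variant (assuming NumPy and SciPy; the bandwidths are illustrative choices): the RBF-induced distance d(x, x') = sqrt(k(x,x) + k(x',x') - 2k(x,x')) replaces the Euclidean distance, and the usual double-centering machinery is applied.

    import numpy as np
    from scipy.spatial.distance import cdist

    def rbf_induced_distance(X, sigma=1.0):
        sq = cdist(X, X, metric="sqeuclidean")
        k = np.exp(-sq / (2.0 * sigma ** 2))        # RBF kernel matrix, k(x, x) = 1
        return np.sqrt(np.maximum(2.0 - 2.0 * k, 0.0))

    def kernel_distance_correlation(X, Y, sigma_x=1.0, sigma_y=1.0):
        a = rbf_induced_distance(X, sigma_x)
        b = rbf_induced_distance(Y, sigma_y)
        A = a - a.mean(0) - a.mean(1, keepdims=True) + a.mean()
        B = b - b.mean(0) - b.mean(1, keepdims=True) + b.mean()
        return np.sqrt((A * B).sum() / np.sqrt((A * A).sum() * (B * B).sum()))

    rng = np.random.default_rng(0)
    X = rng.uniform(-np.pi, np.pi, size=(300, 1))
    Y = np.sin(X) + 0.1 * rng.normal(size=(300, 1))
    print(kernel_distance_correlation(X, Y))        # detects the nonlinear relation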

Alternative formulations

Brownian covariance

The Brownian covariance offers an alternative probabilistic formulation of distance covariance, interpreting it through the lens of stochastic processes. Specifically, for random vectors X \in \mathbb{R}^p and Y \in \mathbb{R}^q with finite second moments, the distance covariance V(X,Y) equals the Brownian distance covariance W(X,Y), introduced by Székely and Rizzo in 2009. The Brownian distance covariance is defined by evaluating independent Brownian random processes at X and Y, centering each evaluation conditionally on the corresponding process, and taking the expected product of the centered terms over independent copies of (X, Y); on [0, \infty) the underlying Brownian motion has covariance function \mathbb{E}[W(s)W(t)] = 2 \min(s, t). For random vectors, this is equivalently expressed in terms of expectations of distance products: W^2(X,Y) = \mathbb{E}\|X - X'\| \|Y - Y'\| + \mathbb{E}\|X - X'\| \mathbb{E}\|Y - Y'\| - \mathbb{E}\|X - X'\| \|Y - Y''\| - \mathbb{E}\|X - X''\| \|Y - Y'\|, where (X', Y') and (X'', Y'') are independent copies of (X, Y). The equivalence between this Brownian formulation and the original characteristic-function-based definition of distance covariance, given by V^2(X,Y) = \frac{1}{c_p c_q} \int_{\mathbb{R}^{p+q}} \bigl| f_{X,Y}(t,s) - f_X(t) f_Y(s) \bigr|^2 \|t\|^{-(1+p)} \|s\|^{-(1+q)} \, dt \, ds, is established via characteristic-function arguments. The proof relies on integrating the squared difference of the characteristic functions against the weights derived from the Brownian kernel, showing that the two expressions coincide for vectors with finite second moments. This Brownian perspective provides an intuitive interpretation of distance covariance as a generalized form of classical Pearson product-moment covariance, capturing all types of dependence, linear and nonlinear, through the expected discrepancies in associated processes. It facilitates extensions to arbitrary dimensions without assuming equal dimensionality between X and Y, offering a natural probabilistic framework for dependence measurement. Historically, the Brownian covariance formulation builds on Székely's foundational work in energy statistics, which introduced distance-based measures of dependence as weighted L^2 norms on characteristic functions, providing a stochastic-process root for these metrics.

Energy distance connection

The energy distance between the distributions of two random vectors X and Y taking values in \mathbb{R}^p is defined as \mathcal{D}^2(X, Y) = 2 \mathbb{E} \|X - Y\| - \mathbb{E} \|X - X'\| - \mathbb{E} \|Y - Y'\|, where X' and Y' are independent copies of X and Y, and \| \cdot \| denotes the Euclidean norm. This quantity is nonnegative and equals zero if and only if X and Y have the same distribution, making its square root a metric on the space of probability measures with finite first moments. Distance correlation connects directly to energy distance through the distance covariance, which measures dependence between paired random vectors. Specifically, the squared distance covariance V^2(X, Y) can be interpreted as an energy distance between the joint distribution of (X, Y) and the product of the marginal distributions of X and Y. The squared distance correlation is then obtained by normalizing this quantity as R^2(X, Y) = \frac{V^2(X, Y)}{\sqrt{V^2(X) V^2(Y)}}, yielding a value in [0, 1] that is zero if and only if X and Y are independent. This scaling aligns the dependence measure with the metric properties of energy distance. Energy distance finds application in goodness-of-fit testing by comparing an empirical distribution to a theoretical one; for instance, the test statistic based on \mathcal{D}^2 between samples and a hypothesized distribution rejects equality when it is large, leveraging the metric's characterization of distributional identity. The energy distance framework generalizes to other exponents and metric structures via \alpha-distance correlation, where the Euclidean norm is replaced by \| \cdot \|^\alpha for 0 < \alpha \leq 2, accommodating distributions with finite \alpha-moments and enabling analysis under \alpha-mixing conditions for dependent data.
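A sketch of the sample energy distance (assuming NumPy and SciPy), with the population expectations replaced by plug-in sample averages; the zero diagonals of the within-sample distance matrices introduce a small finite-sample bias.

    import numpy as np
    from scipy.spatial.distance import cdist

    def energy_distance_sq(X, Y):
        """Plug-in estimate of D^2 = 2 E||X - Y|| - E||X - X'|| - E||Y - Y'||."""
        return 2.0 * cdist(X, Y).mean() - cdist(X, X).mean() - cdist(Y, Y).mean()

    rng = np.random.default_rng(0)
    X = rng.normal(0.0, 1.0, size=(400, 2))
    Y_same = rng.normal(0.0, 1.0, size=(400, 2))
    Y_shifted = rng.normal(1.0, 1.0, size=(400, 2))
    print(energy_distance_sq(X, Y_same))      # close to zero
    print(energy_distance_sq(X, Y_shifted))   # clearly positive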

Robustness and variants

Sensitivity to outliers

Standard distance correlation is highly sensitive to outliers, as even a single contaminated observation can drastically alter the estimated dependence measure. For the typical formulation (with exponent \alpha = 1), the influence function is bounded, indicating qualitatively bounded gross-error sensitivity, but the quantitative impact scales as O(1/n) due to sample size effects, making the estimator vulnerable to small fractions of contamination. The breakdown point of distance correlation is zero asymptotically, with the finite-sample value being 1/n, meaning a single outlier suffices to inflate the sample distance correlation R_n arbitrarily, even to its maximum value of 1 in cases of true independence. This occurs because an outlier at a large distance u causes the sample distance variance to grow as O(u^4/n^2), leading to degenerate behavior as u \to \infty. Leyder, Raymaekers, and Rousseeuw (2024) provide a rigorous proof that standard distance correlation lacks breakdown-point robustness, contrasting it with median-based statistics, which maintain a positive breakdown point of 0.5 regardless of sample size. Their analysis in the supplementary material derives the exact behavior, showing that the estimator fails under minimal contamination, unlike robust alternatives. Simulations demonstrate these vulnerabilities in practice: for bivariate data generated from a nonlinear dependence model (e.g., Y = X^2 + \epsilon), adding just one outlier reduces the sample distance correlation from around 0.7 to near zero, effectively masking the underlying dependence and causing independence tests to fail with high probability. In contaminated samples with 5% outliers, test rejection rates for dependent data drop by over 50% compared to clean samples of size n = 100, while for independent data, outliers spuriously elevate rejection rates to exceed 20%. This sensitivity underscores the limitations of distance correlation in noisy real-world data, where robust alternatives, such as transformation-based variants, offer greater reliability without such risks.
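A small simulation (a sketch, assuming NumPy) illustrates the masking effect described above: one gross outlier in X alone typically drives the sample distance correlation far below its clean value.

    import numpy as np

    def dcor_1d(x, y):
        a = np.abs(x[:, None] - x[None, :])
        b = np.abs(y[:, None] - y[None, :])
        A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()
        B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
        return np.sqrt((A * B).mean() / np.sqrt((A * A).mean() * (B * B).mean()))

    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 100)
    y = x ** 2 + 0.05 * rng.normal(size=100)      # strong nonlinear dependence

    x_contaminated = x.copy()
    x_contaminated[0] = 1e6                       # a single gross outlier in X

    print("clean        :", round(dcor_1d(x, y), 3))
    print("contaminated :", round(dcor_1d(x_contaminated, y), 3))   # much smaller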

Robust distance correlation

Robust distance correlation addresses the sensitivity of the classical measure to outliers by employing transformations of the data or distances to enhance breakdown points while maintaining the ability to detect dependence. One approach, detailed in a 2025 study, introduces robust versions through transformations such as replacing raw observations with their ranks or normal scores before computing interpoint distances, or applying a novel biloop transformation that bounds and redescends the influence of extreme values. These modifications ensure the resulting distance correlation remains zero if and only if the variables are independent, preserving the core property of the original measure. Computationally, both transformation-based robust distance correlations retain the O(n²) complexity of the standard algorithm due to pairwise distance evaluations, augmented by robust centering steps like trimmed means for the double centering to further mitigate outlier effects. For instance, in R, rank-transformed versions can be computed by applying the transformation to the data and then using the energy package's dcor() or dcov() functions. Empirical evaluations demonstrate these variants' superior performance in contaminated settings, such as genetic data analysis, where classical measures fail but robust ones retain power.
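A hedged sketch of the rank-transformation variant mentioned above (assuming NumPy and SciPy; this illustrates the rank transform only, not the biloop transformation): ranks are substituted for the raw data before the distance correlation is computed.

    import numpy as np
    from scipy.stats import rankdata

    def dcor_1d(x, y):
        a = np.abs(x[:, None] - x[None, :])
        b = np.abs(y[:, None] - y[None, :])
        A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()
        B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
        return np.sqrt((A * B).mean() / np.sqrt((A * A).mean() * (B * B).mean()))

    def rank_dcor(x, y):
        return dcor_1d(rankdata(x), rankdata(y))   # rank transform before distances

    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 100)
    y = x ** 2 + 0.05 * rng.normal(size=100)
    x_out = x.copy()
    x_out[0] = 1e6                                 # gross outlier in X

    print("classical, contaminated :", round(dcor_1d(x_out, y), 3))
    print("rank-based, contaminated:", round(rank_dcor(x_out, y), 3))   # far more stable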

Applications

Dependence detection in statistics

Distance correlation serves as a robust tool for independence testing in multivariate data, particularly excelling in detecting nonlinear dependencies that traditional measures like Pearson's correlation may overlook. The distance covariance test, derived from the distance correlation coefficient, provides a nonparametric approach to assess whether two random vectors are independent, with the null hypothesis of independence rejected when the sample distance correlation significantly exceeds zero under a permutation or bootstrap framework. Monte Carlo simulations demonstrate that this test exhibits superior power against nonlinear alternatives compared to classical tests such as the chi-squared or multivariate Cramér-von Mises tests, especially in bivariate and low-dimensional settings. In comparisons with kernel-based methods like the Hilbert-Schmidt Independence Criterion (HSIC), distance correlation often shows competitive or superior power for certain nonlinear associations in multivariate scenarios, particularly when sample sizes are moderate. In feature screening for regression tasks, distance correlation is employed to rank predictor variables by computing the distance correlation between each predictor and the response variable, prioritizing those with higher values to identify relevant dependencies in high-dimensional datasets. This approach is particularly advantageous in ultrahigh-dimensional settings, where it facilitates screening by capturing both linear and nonlinear relationships, outperforming marginal screening methods based on Pearson correlation in terms of sure screening properties. For instance, in models with thousands of features, distance correlation-based ranking reduces dimensionality while preserving predictive power, as validated in simulation studies and real-world applications. For time series analysis, lagged distance correlation extends the measure to detect serial dependence by computing the distance correlation between a series and its delayed version, enabling the identification of dependence structures that may be nonlinear (see the sketch below). This lagged formulation, applied to univariate or multivariate processes, quantifies temporal dependencies more flexibly than linear autocorrelation functions, with empirical auto-distance correlation functions providing insights into short- and long-range serial dependence. A recent application integrates lagged distance correlation with recurrent neural networks (RNNs) to characterize time series properties for improved forecasting, linking serial dependence patterns to RNN component effectiveness in capturing nonlinear dynamics. In bioinformatics, distance correlation aids in constructing gene co-expression networks by measuring dependencies between gene expression profiles, revealing nonlinear associations that Pearson or Spearman correlations might miss. Applied to high-throughput RNA sequencing data, it generates weighted networks where edges reflect distance correlation strengths, enhancing the detection of functional gene modules in complex biological systems. A 2022 study on human and mouse datasets demonstrated that distance correlation-based networks better capture biologically relevant co-expression patterns compared to traditional methods, improving downstream analyses like pathway enrichment.
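The lagged (auto-)distance correlation mentioned above can be sketched as follows (assuming NumPy; the helper is illustrative): the statistic at lag k is simply the distance correlation between the series and its k-step-delayed copy.

    import numpy as np

    def dcor_1d(x, y):
        a = np.abs(x[:, None] - x[None, :])
        b = np.abs(y[:, None] - y[None, :])
        A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()
        B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
        return np.sqrt((A * B).mean() / np.sqrt((A * A).mean() * (B * B).mean()))

    def auto_distance_correlation(x, lag):
        return dcor_1d(x[lag:], x[:-lag])

    rng = np.random.default_rng(0)
    n = 500
    x = np.zeros(n)
    for t in range(1, n):                          # nonlinear dependence on the past
        x[t] = np.tanh(1.5 * x[t - 1]) + rng.normal(scale=0.5)

    print([round(auto_distance_correlation(x, k), 3) for k in (1, 2, 3, 5)])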

Applications in machine learning and other fields

In machine learning, distance correlation serves as a powerful tool for nonlinear feature selection by measuring dependencies between variables without assuming linearity, enabling the identification of relevant features in high-dimensional datasets such as those from collider simulations for boosted top-quark tagging. For instance, it has been integrated into regression models to filter features based on their nonlinear associations with the target variable, improving predictive performance on benchmark datasets. In anomaly detection, distance correlation facilitates the selection of features that capture subtle nonlinear patterns in data streams, such as in intrusion detection systems where it is combined with other statistical tests and heuristic algorithms to enhance detection accuracy for rare events. Additionally, IBM SPSS Statistics version 31, released in 2025, incorporates distance correlation as a built-in procedure to detect both linear and nonlinear dependencies in multivariate data, supporting applications in exploratory analysis and model building within statistical workflows. In finance, distance correlation has been applied to construct market graphs that reveal nonlinear dependencies among stock returns, providing a more robust representation of inter-stock relationships compared to traditional correlation-based graphs. For example, a 2023 analysis of S&P 500 stocks used distance correlation to build thresholded graphs, demonstrating its ability to capture complex market dynamics and to identify hidden couplings during volatile periods. Distance correlation also aids in global sensitivity analysis for models with dependent inputs, where it quantifies the nonlinear influence of input parameters on outputs in settings such as engineering simulations, allowing for more accurate uncertainty propagation when inputs exhibit correlations. A 2019 method formalized this approach, showing through numerical examples that distance correlation-based indices outperform variance-based Sobol indices in handling input dependencies, with applications extended to 2025 studies on complex systems. Beyond these areas, distance correlation finds use in high-energy physics, such as selecting features for jet tagging in particle collisions, where it effectively handles nonlinear relationships in event data to boost tagging accuracy. In climate science, it detects nonlinear couplings in atmospheric data, enabling the construction of dependence networks that model teleconnections between atmospheric variables across regions, as demonstrated in analyses of global datasets revealing non-monotonic dependencies.

Other nonlinear dependence measures

Several measures have been developed to quantify nonlinear dependencies between variables, offering alternatives to distance correlation by leveraging different mathematical frameworks such as information theory, kernel methods, and divergence metrics. These approaches aim to detect a broad range of associations, including non-monotonic and complex relationships, while varying in computational demands and applicability to specific data types. The Maximal Information Coefficient (MIC) is an information-theoretic measure designed to identify pairwise associations of varying strengths and forms in large datasets. It operates by partitioning the data into bins to approximate mutual information, then maximizing this value over possible grid configurations to capture diverse functional relationships, such as linear, nonlinear, or periodic patterns. MIC is normalized to range between 0 and 1, where 0 indicates independence and 1 denotes perfect association, and it is particularly noted for its equitability, meaning it assigns similar scores to relationships with equivalent noise levels regardless of form. Introduced by Reshef et al. in 2011, MIC has been widely adopted for exploratory data analysis in genomics and other high-dimensional fields due to its ability to detect novel associations without assuming a specific model. The Hilbert-Schmidt Independence Criterion (HSIC) provides a kernel-based approach to measuring statistical dependence, quantifying the discrepancy between the joint distribution of two variables and the product of their marginals. It uses the Hilbert-Schmidt norm of the cross-covariance operator in a reproducing kernel Hilbert space, allowing flexibility through the choice of kernels (e.g., Gaussian) to capture nonlinear interactions. For characteristic kernels, HSIC equals zero if and only if the variables are independent, and its empirical estimator enables consistent testing of independence hypotheses. Proposed by Gretton et al. in 2005, HSIC shares conceptual similarities with kernel embeddings of distance correlation but extends to high-dimensional and structured data, finding applications in feature selection and independence testing. Mutual information (MI) is a foundational information-theoretic quantity that measures the shared information between two random variables, capturing all forms of dependence, linear or nonlinear, without parametric assumptions. It is defined as the Kullback-Leibler divergence between the joint distribution and the product of marginals, with MI = 0 implying independence. For practical estimation in continuous data, nonparametric methods using k-nearest neighbors (k-NN) distances have proven effective, as they adapt to the local geometry of the data and reduce bias compared to histogram-based approaches. The k-NN estimator by Kraskov et al. (2004) computes entropies from nearest-neighbor distances, making it data-efficient for moderate sample sizes and applicable in neuroscience for quantifying neural dependencies. For categorical data, distance-based measures like the Hellinger distance offer a robust way to assess dependence by comparing the joint distribution to the independence assumption. The Hellinger distance between two probability distributions is proportional to the L2 norm of the difference of their square-root densities, providing a bounded metric (0 to 1) that is sensitive to discrepancies in both marginal and joint probabilities. When applied to the joint versus product distributions, it yields a dependence measure that is zero under independence and increases with association strength, suitable for discrete variables due to its affinity to chi-squared statistics.
This approach, explored in dependence frameworks by Wu (2010), facilitates tests for independence in contingency tables and extends to mixed data types in statistical modeling.

Comparisons with classical correlations

Distance correlation provides a more comprehensive measure of dependence than Pearson's correlation coefficient, which solely quantifies linear relationships between variables. For instance, in cases of nonlinear dependence, such as when one variable is a deterministic function of the other (e.g., Y = X^2 with X symmetric around zero), Pearson's coefficient yields zero while distance correlation detects the association with a positive value. However, distance correlation is computationally more intensive, requiring O(n^2) operations to compute pairwise distance matrices for n observations, in contrast to the O(n) complexity of Pearson's coefficient. In comparison to rank-based measures like Spearman's rho and Kendall's tau, which assess monotonic associations, distance correlation captures both monotonic and non-monotonic dependencies. Spearman's rho and Kendall's tau, derived from ranked data, perform well for strictly increasing or decreasing relationships but fail to detect non-monotonic patterns, such as those in a quadratic relationship over an unrestricted domain (e.g., a parabolic curve exhibiting both increases and decreases). Distance correlation, by contrast, identifies such general dependencies through its basis in pairwise distances. A comparative study evaluating the power of various dependence measures, including Pearson's, Spearman's, and Kendall's correlations, Hoeffding's D, and distance correlation, demonstrates the universality of distance correlation in simulation-based power analyses. Across simulated scenarios with nonlinear and non-monotonic associations, distance correlation exhibited superior power to detect dependencies compared to the classical measures, which showed reduced sensitivity outside linear or monotonic regimes. This underscores its effectiveness for broad dependence detection, though classical methods remain preferable for confirming specific linear or monotonic structures due to their simplicity and interpretability. Distance correlation is particularly suited for exploratory analyses where nonlinear dependencies are suspected, offering an exact characterization of independence that classical correlations lack: it equals zero if and only if the variables are independent. In practice, it serves as a preliminary tool to identify potential associations before applying targeted classical methods for further validation.

References

  1. Measuring and testing dependence by correlation of distances.
  2. Distance Correlation for Vectors: A SAS Macro (PDF, Lex Jansen).
  3. Energy statistics: A class of statistics based on distances (review).
  4. Székely, G. J. and Rizzo, M. L. (2009). Brownian distance covariance. The Annals of Applied Statistics 3(4), 1236–1265 (Project Euclid).
  5. The distance correlation t-test of independence in high dimension.
  6. CRAN: Package energy (August 24, 2024).
  7. dcor: Distance correlation and energy statistics in Python.
  8. The Energy of Data and Distance Correlation (Taylor & Francis Online, August 22, 2023).
  9. [9]
  10. Measuring and testing dependence by correlation of distances (arXiv, PDF).
  11. [11]
  12. Rejoinder: Brownian distance covariance (Project Euclid).
  13. Fast Computing for Distance Covariance (arXiv:1410.1503, October 6, 2014).
  14. A fast algorithm for computing distance correlation (arXiv:1810.11332, October 26, 2018).
  15. [15]
  16. A Statistically and Numerically Efficient Independence Test Based on Random Projection and Distance Correlation.
  17. Parallel Calculation of Distance Correlation (dcor) from DataFrame (Stack Overflow, April 19, 2018).
  18. The distance correlation t-test of independence in high dimension (PDF, February 27, 2013).
  19. Brownian distance covariance (arXiv, PDF, October 6, 2010).
  20. [20]
  21. Asymptotic distributions of high-dimensional … (distance correlation under the null of independence).
  22. [22]
  23. Distance correlation application to gene co-expression network analysis (February 21, 2022).
  24. Equivalence of distance-based and RKHS-based statistics in hypothesis testing (arXiv, PDF).
  25. Generalized kernel distance correlation test of independence (arXiv:2106.07725 [math.ST], August 1, 2024).
  26. Adaptation of distance correlation for functional data in infinite dimensions.
  27. A distance correlation-based approach to characterize the effectiveness of recurrent neural networks for time series forecasting. Neurocomputing (February 18, 2025).
  28. A Distance Correlation-Based Approach to Characterize the Effectiveness of Recurrent Neural Networks for Time Series Forecasting (arXiv:2307.15830, July 28, 2023).
  29. Energy distance. WIREs Computational Statistics (Rizzo, 2016).
  30. Robust Distance Covariance (arXiv:2403.03722, March 6, 2024).
  31. Robust Distance Covariance (Wiley, doi:10.1111/insr.70005).
  32. [32]
  33. Large-scale kernel methods for independence testing (January 24, 2017).
  34. Feature Screening via Distance Correlation Learning (PMC/NIH).
  35. Applications of distance correlation to time series (Project Euclid).
  36. Feature Selection with Distance Correlation (arXiv:2212.00046, November 30, 2022).
  37. Distance Correlation-Based Feature Selection in Random Forest (NIH).
  38. A feature selection-driven machine learning framework for anomaly-based attack detection (April 28, 2025).
  39. Discover Hidden Relationships with Distance Correlation in IBM … (June 11, 2025).
  40. Distance Correlation Market Graph: The Case of S&P500 Stocks (September 7, 2023).
  41. Feature selection with distance correlation. Physical Review D (March 6, 2024).
  42. Exploring Non-Linear Dependencies in Atmospheric Data (MDPI).
  43. Detecting Novel Associations in Large Data Sets. Science (December 16, 2011).
  44. Measuring Statistical Dependence with Hilbert-Schmidt Norms (PDF).
  45. Equitability, mutual information, and the maximal information coefficient (PNAS).
  46. A Kernel Statistical Test of Independence (NIPS).
  47. Estimating mutual information. Physical Review E.
  48. Estimating Mutual Information (arXiv:cond-mat/0305641, May 28, 2003).
  49. A new look at measuring dependence (Department of Statistics, PDF).
  50. A comparative study of statistical methods used to identify … (August 20, 2013).