Tensor contraction
In multilinear algebra, tensor contraction is a fundamental operation that reduces the rank of a tensor by two through the summation of products over a paired contravariant (upper) index and covariant (lower) index, effectively applying the canonical pairing between a vector space and its dual space.[1][2] This process, often denoted using the Einstein summation convention by repeating indices, generalizes familiar operations such as the trace of a matrix (for a type (1,1) tensor) or the inner product of vectors, yielding a scalar or lower-rank tensor while preserving multilinearity.[1][3]
Tensor contractions play a central role in tensor analysis by enabling the extraction of invariants and the simplification of complex expressions in higher-dimensional spaces.[2] For instance, contracting a type (1,3) tensor like the Riemann curvature tensor over one upper and one lower index produces the type (0,2) Ricci tensor, a key component in the Einstein field equations of general relativity.[2] In practical computations, such as those in fluid dynamics, contraction of a force-velocity dyad yields the scalar power, representing the rate of energy dissipation.[3]
Beyond physics, tensor contractions are essential in numerical multilinear algebra for efficient computation in high-dimensional data analysis and machine learning, where they facilitate operations on multidimensional arrays while minimizing computational complexity.[1] The operation's basis-independent nature ensures its applicability across coordinate systems, making it indispensable for coordinate transformations and the study of geometric properties in manifold theory.[2]
In multilinear algebra, tensor contraction originates from the canonical pairing between a finite-dimensional vector space V over a field k and its dual space V^*, which consists of all linear functionals \phi: V \to k. This pairing induces a natural bilinear map \mathrm{ev}: V \otimes V^* \to k defined by \mathrm{ev}(v \otimes \phi) = \phi(v) for v \in V and \phi \in V^*, extended linearly to the entire tensor product.[4]
This construction generalizes to tensors of higher type. A tensor of type (m,n) belongs to the space T^m_n(V) = (V)^{\otimes m} \otimes (V^*)^{\otimes n}, where \otimes denotes the tensor product of vector spaces. The contraction operation along the p-th contravariant factor (from V) and the q-th covariant factor (from V^*) is the linear map c_{p,q}: T^m_n(V) \to T^{m-1}_{n-1}(V) obtained by applying the evaluation map \mathrm{ev} to those specific factors while leaving the others unchanged. For instance, if T = v_1 \otimes \cdots \otimes v_m \otimes \phi_1 \otimes \cdots \otimes \phi_n, then c_{p,q}(T) contracts v_p with \phi_q via \mathrm{ev}(v_p \otimes \phi_q), resulting in a tensor of type (m-1, n-1). This reduces the total rank (order) of the tensor by 2.[5][4]
Tensor contraction extends the classical trace operation on endomorphisms of V. A (1,1)-tensor A \in T^1_1(V) = V \otimes V^* corresponds to a linear endomorphism via the identification \mathrm{End}(V) \cong V \otimes V^*, and its contraction c(A) = \mathrm{ev}(A) \in k is precisely the trace \mathrm{Tr}(A). In basis terms, if \{e_i\} is a basis for V and \{\phi_i\} its dual basis, then \mathrm{Tr}(A) = \sum_i \phi_i(A e_i), but the abstract definition relies solely on the pairing without coordinates.[4][5]
As a concrete illustration, consider a (1,1)-tensor T = \sum_{i=1}^{\dim V} e_i \otimes \phi_i, which represents the identity endomorphism. Its contraction yields c(T) = \sum_{i=1}^{\dim V} \phi_i(e_i) = \dim V, the dimension of V. This example underscores contraction as a scalar-valued invariant generalizing the trace.[4]
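This evaluation-map picture can be sketched numerically; in the following minimal Python sketch, V = R^n is represented by coordinate vectors and covectors, and the helper name contract_11 is illustrative rather than taken from the references:

import numpy as np

# Illustrative helper: contract a (1,1)-tensor stored as a list of (vector, covector)
# pairs by applying the evaluation map ev(v ⊗ phi) = phi(v) to each pair and summing.
def contract_11(pairs):
    return sum(float(np.dot(phi, v)) for v, phi in pairs)

n = 3
e = np.eye(n)        # basis vectors e_i of V = R^n (rows)
phi = np.eye(n)      # dual basis covectors phi_i with phi_i(e_j) = delta_ij

# The identity endomorphism as sum_i e_i ⊗ phi_i; its contraction equals dim V.
identity_tensor = [(e[i], phi[i]) for i in range(n)]
print(contract_11(identity_tensor))   # 3.0 = dim V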
Contraction in index notation
In tensor analysis, the Einstein summation convention provides a compact notation for expressing contractions by implying summation over repeated indices, without explicitly writing the summation symbol. This convention, introduced by Albert Einstein, stipulates that whenever an index appears twice in a term—once as an upper (contravariant) index and once as a lower (covariant) index—it is automatically summed over all possible values, typically from 1 to the dimension of the space.[6] The use of upper and lower indices distinguishes the type of tensor components, ensuring that contractions pair contravariant and covariant indices appropriately to maintain the tensorial nature of the result.[7]
A basic example of contraction arises when pairing a covector (a (0,1) tensor) f_\gamma with a vector (a (1,0) tensor) v^\gamma, yielding the scalar f_\gamma v^\gamma. In an n-dimensional space, this expands to f_\gamma v^\gamma = \sum_{i=1}^n f_i v^i, where the repeated index \gamma (or i) is summed over, effectively computing the inner product.[6] This operation reduces the total rank from 2 to 0, illustrating how contraction eliminates the shared index dimension.[7]
For a general single contraction within a mixed tensor T^{i_1 \dots i_m}_{j_1 \dots j_n}, one identifies a specific upper index, say i_k, and a specific lower index, say j_l, setting them equal to a repeated index \alpha and summing over \alpha. The result is a tensor of lower rank: S^{i_1 \dots \hat{i_k} \dots i_m}_{j_1 \dots \hat{j_l} \dots j_n} = \sum_\alpha T^{i_1 \dots i_k=\alpha \dots i_m}_{j_1 \dots j_l=\alpha \dots j_n}, where hats denote omitted indices and the rank decreases by 2.[7] This process preserves the multilinearity and transformation properties of the original tensor.[7]
Contractions may also pair indices belonging to different tensors: an upper index of one tensor is matched with a lower index of another, as in the product T^{ab}_c U^c_d, which contracts over the repeated index c to produce the (2,1) tensor S^{ab}_d = \sum_c T^{ab}_c U^c_d. Because the summation pairs a contravariant position with a covariant one, no metric tensor is needed to raise or lower indices.[7]
In a d-dimensional space, each such contraction involves exactly d terms in the sum, as the repeated index ranges from 1 to d, directly tying the computational cost to the space's dimensionality.[6]
A simple illustration treats a (1,1) tensor as a matrix T^i_j, where contraction over both indices yields the trace T^i_i = \sum_{i=1}^d T^i_i, equivalent to the matrix trace operation. This scalar invariant highlights contraction's role in extracting dimension-independent quantities from higher-rank objects.[7]
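These index expressions map directly onto NumPy's einsum; a small illustrative check (array values are arbitrary):

import numpy as np

d = 4
f = np.random.rand(d)         # covector components f_gamma
v = np.random.rand(d)         # vector components v^gamma
T = np.random.rand(d, d)      # (1,1)-tensor components T^i_j

pairing = np.einsum('g,g->', f, v)   # f_gamma v^gamma: contraction to a scalar
trace = np.einsum('ii->', T)         # T^i_i: contraction of the upper with the lower index

print(np.isclose(pairing, f @ v), np.isclose(trace, np.trace(T)))   # True True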
Variants of Contraction
Metric contraction
Metric contraction refers to the process of contracting a tensor by first using the metric tensor to raise or lower indices, thereby enabling summation over indices of the same variance type—contravariant with contravariant or covariant with covariant. The metric tensor g_{\mu\nu}, a symmetric non-degenerate (0,2) tensor, defines the inner product on the tangent space and serves as the fundamental tool for index manipulation in Riemannian or pseudo-Riemannian geometries.[8] Its inverse, the contravariant metric g^{\mu\nu}, satisfies g^{\mu\sigma} g_{\sigma\nu} = \delta^\mu_\nu, allowing the transformation of tensor components while preserving the tensor's type under coordinate changes.[9]
To perform a metric contraction, one typically raises or lowers an index with the metric before summing over repeated indices. For instance, given a mixed tensor T^\mu_{\ \nu\rho}, lowering the first index yields T_{\sigma\nu\rho} = g_{\sigma\mu} T^\mu_{\ \nu\rho}; the two covariant indices of the lowered tensor can then be contracted with the inverse metric, as in S_\rho = g^{\mu\nu} T_{\mu\nu\rho}, producing a (0,1) tensor.[9] The process reduces the tensor rank by two and relies on the metric's non-degeneracy so that the index manipulation is invertible, distinguishing it from direct contractions, which pair one contravariant and one covariant index without any metric.[8]
A representative example is the trace of a symmetric (0,2) tensor, such as the stress-energy tensor T_{\mu\nu} in general relativity, contracted with the inverse metric to form a scalar: T = g^{\mu\nu} T_{\mu\nu}. This operation yields a Lorentz-invariant quantity essential for characterizing energy-momentum distributions.[9]
In pseudo-Riemannian manifolds, common in relativity with signature (-,+,+,+), the metric's indefinite nature introduces sign effects in contractions; for example, the trace may include negative contributions from timelike components, influencing physical interpretations like causality.[8]
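A hedged numerical sketch of such a metric contraction with signature (-,+,+,+), applied to an arbitrary symmetric T_{\mu\nu} (the values are illustrative only):

import numpy as np

eta_inv = np.diag([-1.0, 1.0, 1.0, 1.0])   # inverse Minkowski metric g^{mu nu}, signature (-,+,+,+)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
T_low = (A + A.T) / 2                      # an arbitrary symmetric (0,2) tensor T_{mu nu}

trace = np.einsum('mn,mn->', eta_inv, T_low)   # metric contraction T = g^{mu nu} T_{mu nu}
# The timelike component contributes with a minus sign, as noted above:
print(np.isclose(trace, -T_low[0, 0] + T_low[1, 1] + T_low[2, 2] + T_low[3, 3]))   # True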
The concept originated in the development of general relativity, where Einstein employed metric contractions to simplify gravitational field equations, such as forming the trace-reversed Ricci tensor in the Einstein tensor G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R, ensuring covariance and simplifying expressions for spacetime curvature.[10]
Contraction of two tensors
The contraction of two tensors is performed by first forming their tensor product and then summing over a pair of compatible indices, specifically one covariant index from the first tensor and one contravariant index from the second, thereby reducing the total tensor rank by two.[11] This operation, rooted in multilinear algebra, arises from the natural pairing between a vector space and its dual, enabling the evaluation of multilinear maps on shared dimensions.[11] For a tensor T of type (p, q+1) and a tensor U of type (r, s) over a vector space of dimension n, with the contracted indices ranging from 1 to n, contracting the last covariant index of T with the first contravariant index of U yields a tensor of type (p + r - 1, q + s), so that q counts only the covariant indices of T that are not contracted.[11]
In component notation, the contraction is expressed as
(T \cdot U)^{\alpha_1 \cdots \alpha_p \, \beta_1 \cdots \beta_{r-1}}_{\gamma_1 \cdots \gamma_q \, \delta_1 \cdots \delta_{s}} = \sum_{\kappa=1}^n T^{\alpha_1 \cdots \alpha_p}_{\gamma_1 \cdots \gamma_q \, \kappa} \, U^{\kappa \, \beta_1 \cdots \beta_{r-1}}_{\delta_1 \cdots \delta_{s}},
where the summation is over the paired index \kappa, and the remaining indices retain their variance.[11] This pairing is inherently mixed, as it contracts a lower (covariant) index of T with an upper (contravariant) index of U, preserving the overall multilinearity of the result.[11] A canonical example is matrix multiplication, which contracts two tensors of type (1,1): for A^\alpha_\beta and B^\beta_\gamma, the product yields N^\alpha_\gamma = \sum_\beta A^\alpha_\beta B^\beta_\gamma, a (1,1)-tensor representing the composition of linear maps.[11]
While a single contraction operation pairs exactly one such index pair, extensions to multiple shared indices are achieved through successive contractions on the tensor product, applying the operation iteratively to reduce the rank further by two for each pair.[11] These successive contractions are associative in the sense that the order of pairing can be rearranged without altering the final result, owing to the associativity of the underlying tensor product: (T \otimes U) \otimes V \cong T \otimes (U \otimes V), which extends to the contracted forms.[12] This property facilitates efficient computation and reordering in applications involving chains of tensor operations.[12]
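A short check that this contraction reproduces matrix multiplication, and that successive pairwise contractions can be regrouped without changing the result (shapes and values are arbitrary):

import numpy as np

rng = np.random.default_rng(1)
d = 5
A = rng.standard_normal((d, d))   # components A^alpha_beta
B = rng.standard_normal((d, d))   # components B^beta_gamma
C = rng.standard_normal((d, d))   # components C^gamma_delta

# Contracting two (1,1)-tensors over the shared index reproduces matrix multiplication.
N = np.einsum('ab,bc->ac', A, B)
print(np.allclose(N, A @ B))      # True

# Successive pairwise contractions can be regrouped without changing the result.
left = np.einsum('ac,cd->ad', np.einsum('ab,bc->ac', A, B), C)
right = np.einsum('ab,bd->ad', A, np.einsum('bc,cd->bd', B, C))
print(np.allclose(left, right))   # True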
Applications in Geometry and Physics
Tensor fields
In the context of differential geometry, tensor contraction on tensor fields is defined pointwise: for tensor fields T and S on a smooth manifold M, the contraction is performed at each point x \in M using the values T(x) and S(x) in the tensor spaces at x, yielding a new tensor field whose components are obtained by summing over paired indices in local coordinates. For example, the trace of a smooth (1,1)-tensor field T, given by U(x) = T^i{}_i(x) where the Einstein summation convention applies, produces a smooth scalar field U: M \to \mathbb{R}.[13]
This construction is compatible with the transformation laws of tensor fields under coordinate changes, as contractions commute with pullbacks and pushforwards induced by diffeomorphisms, ensuring the result transforms as a tensor of the appropriate type.[14]
In physics, a canonical application occurs with the electromagnetic field strength tensor F^{\mu\nu}, an antisymmetric (2,0)-tensor field on Minkowski spacetime; its full contraction F_{\mu\nu} F^{\mu\nu} (after lowering indices with the metric) forms a smooth Lorentz scalar field, invariant under Poincaré transformations and equal to 2(B^2 - E^2) in natural units, i.e., proportional to the difference between the squared magnetic and electric field strengths that enter the electromagnetic energy density.[15]
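A hedged numerical check of this invariant at a single spacetime point, building F_{\mu\nu} from arbitrary E and B values with signature (-,+,+,+) (the sign conventions and values are illustrative assumptions):

import numpy as np

E = np.array([0.3, -1.2, 0.5])    # electric field components (arbitrary values)
B = np.array([0.7, 0.1, -0.4])    # magnetic field components (arbitrary values)

# Covariant field-strength tensor F_{mu nu} in natural units with signature (-,+,+,+),
# assuming the convention F_{0i} = E_i and F_{ij} = epsilon_{ijk} B_k.
F = np.zeros((4, 4))
F[0, 1:] = E
F[1:, 0] = -E
F[1, 2], F[2, 3], F[3, 1] = B[2], B[0], B[1]
F[2, 1], F[3, 2], F[1, 3] = -B[2], -B[0], -B[1]

eta_inv = np.diag([-1.0, 1.0, 1.0, 1.0])
F_up = np.einsum('ma,nb,ab->mn', eta_inv, eta_inv, F)   # raise both indices: F^{mu nu}
invariant = np.einsum('mn,mn->', F, F_up)               # full contraction F_{mu nu} F^{mu nu}
print(np.isclose(invariant, 2 * (B @ B - E @ E)))       # True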
The operation preserves smoothness: if the input tensor fields are smooth (i.e., their components are smooth functions in every coordinate chart), the contracted field is smooth, as it arises from finite sums and products of these components via the contraction mapping.[13]
From a coordinate-independent viewpoint, contractions on tensor fields align with morphisms between tensor bundles and commute with pullbacks \phi^* and pushforwards \phi_* for diffeomorphisms \phi: M \to N, facilitating global definitions on manifolds.[14]
In the framework of differential forms, where tensor fields may be alternating multivectors or forms, specific contractions correspond to the interior product \iota_X \omega, which inserts a vector field X into a k-form \omega to yield a (k-1)-form by contracting the first slot.[16]
Tensor divergence
The tensor divergence, or covariant divergence, of a contravariant vector field V^\alpha on a pseudo-Riemannian manifold is defined as the scalar obtained by contracting the covariant derivative with the vector:
\operatorname{div} V = \nabla_\alpha V^\alpha.
This operation measures the local expansion or flux of the vector field, analogous to the divergence in vector calculus but adapted to curved spaces.[17]
For higher-rank tensors, the divergence generalizes by contracting the covariant derivative on a contravariant index, reducing the rank by one. For a (1,2) tensor T^\alpha_{\beta\gamma}, the divergence is the (0,2) tensor
(\operatorname{div} T)_{\beta\gamma} = \nabla_\alpha T^\alpha_{\beta\gamma},
where the covariant derivative \nabla_\alpha acts on the entire tensor, incorporating connection terms.[18] This applies to tensor fields, providing a differential operator that maps tensor fields to lower-rank tensors while preserving tensorial character.
Unlike the contraction with partial derivatives \partial_\alpha V^\alpha, the covariant divergence includes Christoffel symbols \Gamma^\lambda_{\alpha\beta} in its expansion,
\nabla_\alpha V^\alpha = \partial_\alpha V^\alpha + \Gamma^\alpha_{\alpha\lambda} V^\lambda,
ensuring tensor covariance in non-coordinate bases and accounting for the manifold's geometry.[19] In conservation laws, such as in general relativity, the divergence-free condition \nabla_\mu T^{\mu\nu} = 0 for the stress-energy tensor T^{\mu\nu} encodes local energy-momentum conservation, derived from diffeomorphism invariance of the action.[20]
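Using the standard identity \Gamma^\alpha_{\alpha\lambda} = \partial_\lambda \ln\sqrt{|g|}, the covariant divergence of a vector field takes the compact coordinate form (a standard result, stated here for concreteness)
\nabla_\alpha V^\alpha = \frac{1}{\sqrt{|g|}} \, \partial_\alpha\!\left(\sqrt{|g|}\, V^\alpha\right),
which in spherical coordinates (r, \theta, \varphi) on Euclidean \mathbb{R}^3, where \sqrt{g} = r^2 \sin\theta, reduces to
\nabla_\alpha V^\alpha = \frac{1}{r^2}\,\partial_r\!\left(r^2 V^r\right) + \frac{1}{\sin\theta}\,\partial_\theta\!\left(\sin\theta\, V^\theta\right) + \partial_\varphi V^\varphi.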
The divergence satisfies a Leibniz rule for tensor products: for tensors T and S with compatible indices,
\operatorname{div}(T \otimes S) = (\operatorname{div} T) \otimes S + T \otimes (\operatorname{div} S),
following from the Leibniz property of the covariant derivative \nabla(T \otimes S) = (\nabla T) \otimes S + T \otimes (\nabla S). An integral form, generalizing the divergence theorem, states that for a tensor field on a compact manifold with boundary, the volume integral of the divergence equals a boundary flux term: \int_M (\operatorname{div} T) \, d\mathrm{vol}_g = \int_{\partial M} T(n, \cdot) \, d\sigma_g, where n is the outward normal.[21]
Examples in differential geometry
In differential geometry, tensor contraction plays a central role in constructing curvature invariants on Riemannian manifolds, particularly through the Riemann curvature tensor and its decompositions. The Ricci tensor, a fundamental object, is obtained by contracting the Riemann tensor R^\lambda_{\mu\lambda\nu}, which traces the first and third indices to yield the (0,2)-tensor R_{\mu\nu} = R^\lambda_{\mu\lambda\nu}. This contraction captures the volumetric aspect of curvature, representing the trace of how geodesics deviate in volume along certain directions.[22]
Further contraction of the Ricci tensor with the inverse metric g^{\mu\nu} produces the scalar curvature R = g^{\mu\nu} R_{\mu\nu}, a single invariant that summarizes the overall intrinsic curvature of the manifold. This metric contraction is essential for simplifying higher-order expressions and appears in many geometric theorems, such as those involving the Gauss-Bonnet integrand. In the decomposition of the Riemann tensor into Weyl, Ricci, and scalar parts, the Weyl tensor C_{\mu\nu\rho\sigma} remains traceless, and its full contraction C_{\mu\nu\rho\sigma} C^{\mu\nu\rho\sigma} forms a conformally invariant quadratic scalar, highlighting the conformal structure independent of local scaling.[22][23]
Another prominent full contraction is the Kretschmann scalar R_{\mu\nu\rho\sigma} R^{\mu\nu\rho\sigma}, which contracts all indices of the Riemann tensor pairwise, providing a coordinate-invariant measure of total curvature strength that does not rely on the trace parts of the curvature. This invariant is particularly useful in analyzing singularities, since it diverges at genuine curvature singularities while remaining finite at mere coordinate singularities, and it decomposes into contributions from the Weyl and Ricci parts. In the context of general relativity on pseudo-Riemannian manifolds with the Levi-Civita connection, such contractions underpin the Einstein field equations through the Einstein tensor G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu}, which inherits its divergence-free property \nabla^\mu G_{\mu\nu} = 0 from the contracted second Bianchi identity, ensuring consistency with conservation laws.[24][25][26]
These contractions extend to non-metric aspects in the Levi-Civita framework, where the identity implies that tensors like the Einstein tensor are divergence-free solely due to the metric-compatible, torsion-free nature of the connection, without additional metric contractions in the divergence operator itself. This property facilitates the study of geometric flows and stability in manifolds, emphasizing contraction's role in revealing intrinsic symmetries.[26]
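As a concrete numerical check of the Ricci and scalar-curvature contractions above, the following sketch evaluates them for a 2-sphere of radius a at a single point, starting from the standard component R_{\theta\varphi\theta\varphi} = a^2 \sin^2\theta (the chosen point and index conventions are illustrative assumptions):

import numpy as np

a, theta = 2.0, np.pi / 3                       # sphere radius and an arbitrary point
g = np.diag([a**2, a**2 * np.sin(theta)**2])    # metric g_{mu nu} in coordinates (theta, phi)
g_inv = np.linalg.inv(g)                        # inverse metric g^{mu nu}

# Fully covariant Riemann tensor R_{mu nu rho sigma} with its index antisymmetries.
R_low = np.zeros((2, 2, 2, 2))
val = a**2 * np.sin(theta)**2
R_low[0, 1, 0, 1] = R_low[1, 0, 1, 0] = val
R_low[0, 1, 1, 0] = R_low[1, 0, 0, 1] = -val

R_mixed = np.einsum('la,amrn->lmrn', g_inv, R_low)   # raise the first index: R^lambda_{mu rho nu}
ricci = np.einsum('lmln->mn', R_mixed)               # contract 1st and 3rd slots: R_{mu nu}
scalar = np.einsum('mn,mn->', g_inv, ricci)          # metric contraction: R = g^{mu nu} R_{mu nu}

print(np.isclose(scalar, 2 / a**2))                  # True: R = 2/a^2 for the 2-sphere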
Computational Aspects
Algorithms for tensor contraction
The naive algorithm for tensor contraction implements the summation over repeated indices through explicit nested loops, directly following the Einstein summation convention. For a contraction involving k summed indices in a tensor of dimension d, this approach has a time complexity of O(d^k), as it iterates over all possible combinations of index values without optimization.[27] This method is straightforward for small-scale problems but becomes computationally prohibitive for higher-order tensors or large dimensions due to the exponential growth in operations.[28]
To address the limitations of the naive approach, optimized index ordering algorithms determine the sequence of pairwise contractions that minimizes the sizes of intermediate tensors, thereby reducing overall computational cost. These methods model the tensor network as a hypergraph, where tensors are nodes and indices are hyperedges, and employ techniques such as hypergraph partitioning or simulated annealing to find low-cost contraction paths.[29] For instance, dynamic programming or greedy heuristics can evaluate contraction orders by estimating the floating-point operations required at each step, achieving speedups of orders of magnitude over naive summation for complex networks.
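NumPy's einsum_path (mentioned again below) exposes this kind of contraction-order search directly; a brief sketch with arbitrary shapes:

import numpy as np

# Ask NumPy for a low-cost pairwise ordering of the three-tensor contraction
# A_ij B_jk C_kl -> D_il before evaluating it.
A = np.random.rand(8, 256)
B = np.random.rand(256, 8)
C = np.random.rand(8, 128)

path, report = np.einsum_path('ij,jk,kl->il', A, B, C, optimize='greedy')
print(report)                                          # estimated FLOPs, naive vs. optimized
D = np.einsum('ij,jk,kl->il', A, B, C, optimize=path)  # evaluate with the chosen ordering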
In symbolic computation, libraries like SymPy facilitate tensor contraction through automated index manipulation and summation, enabling exact algebraic expressions without numerical evaluation. SymPy's tensor module supports the creation of indexed objects and applies contractions via the Einstein convention, handling the raising and lowering of indices with a metric and simplifying results symbolically.[30] This is particularly useful for deriving analytical expressions in theoretical contexts, where the Tensor class manages contractions through methods like contract_metric.[31]
For numerical implementations, libraries such as NumPy and TensorFlow provide efficient routines for batched tensor contractions, leveraging optimized linear algebra kernels. NumPy's einsum function evaluates contractions specified in Einstein notation, supporting automatic path optimization via einsum_path to minimize intermediate memory usage.[32] Similarly, TensorFlow's tensordot operation performs contractions along specified axes, integrating seamlessly with GPU acceleration for large-scale computations in machine learning workflows.[33] These tools handle broadcasting and parallelization implicitly, making them suitable for high-throughput applications.
Handling sparsity in tensor contractions avoids unnecessary dense computations by exploiting zero entries, using algorithms that process only non-zero elements during summation. Approaches like those in the Sparta framework employ element-wise sparse accumulation, indexing the non-zeros to reduce the memory footprint and reporting speedups of 28 to 576 times over traditional sparse-accumulator-based contraction methods on tensors with irregular sparsity patterns.[34] Such techniques map contractions to sparse matrix multiplications or hash-based lookups, preserving efficiency for tensors with low fill-in.
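A minimal sketch of the sparse-matrix-multiplication mapping, applied to the same pairwise contraction S^{ik} = T^{ij} U_{jk} used in the example below (SciPy-based; the sizes and density are arbitrary, and this is not the Sparta implementation):

import numpy as np
from scipy import sparse

# Mostly-zero operands stored in compressed sparse row format; the j-summation in
# S^{ik} = T^{ij} U_{jk} then touches only the stored non-zero entries.
rng = np.random.default_rng(0)
d = 2000
T = sparse.random(d, d, density=0.001, format='csr', random_state=rng)
U = sparse.random(d, d, density=0.001, format='csr', random_state=rng)

S = T @ U                    # sparse-sparse matrix product performs the contraction
print(S.nnz, S.shape)        # number of non-zeros in the result and its shape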
As an illustrative example, consider the pairwise contraction S^{ik} = T^{ij} U_{jk}, implemented in pseudocode as follows:
function contract_pairwise(T, U, d):
    S = zeros(d, d)
    for i in 0 to d-1:
        for k in 0 to d-1:
            temp = 0
            for j in 0 to d-1:
                temp += T[i][j] * U[j][k]
            S[i][k] = temp
    return S
This naive loop structure can be optimized by reordering indices or vectorizing in libraries like NumPy via einsum('ij,jk->ik', T, U).[32]
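A runnable check that the loop version and the einsum call agree (this Python translation is illustrative, not taken from the cited sources):

import numpy as np

def contract_pairwise(T, U, d):
    # Naive triple loop computing S^{ik} = sum_j T^{ij} U_{jk}.
    S = np.zeros((d, d))
    for i in range(d):
        for k in range(d):
            temp = 0.0
            for j in range(d):
                temp += T[i, j] * U[j, k]
            S[i, k] = temp
    return S

d = 6
T = np.random.rand(d, d)
U = np.random.rand(d, d)
print(np.allclose(contract_pairwise(T, U, d), np.einsum('ij,jk->ik', T, U)))   # True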
Complexity and implementations
The computational complexity of tensor contraction is a critical factor in its scalability, particularly for high-dimensional tensors. For a single contraction between an m-index tensor and an n-index tensor over one shared index of dimension d, the time complexity is O(d^{m+n-1}), arising from the summation over d terms, each involving multiplications across the remaining indices.[35] In multi-contraction scenarios, such as those involving networks of multiple tensors, the complexity grows exponentially with the number of indices and the contraction width, often scaling as O(2^t) where t is the treewidth of the contraction graph, making exact computation infeasible for large systems.[36]
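For example, contracting a 3-index tensor with a 2-index tensor (m = 3, n = 2) over one shared index in dimension d = 10 leaves m + n - 2 = 3 free indices, so the operation count is
d^{m+n-2} \cdot d = 10^{3} \cdot 10 = 10^{4} = d^{m+n-1},
one summation of d terms for each of the d^{m+n-2} components of the result.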
Space complexity also poses significant challenges, especially in sequential contractions where intermediate tensors must be stored. Sequential pairwise contraction requires space bounded below by \Omega(2^c), with c denoting the contraction width, as each step merges tensors and accumulates large intermediates.[36] In contrast, parallel approaches, such as index slicing, decompose the network into independent subtasks, reducing space demands to fit within a single computational unit while distributing the workload, though they introduce minor overhead from task partitioning.[36]
To mitigate these costs, parallelization on specialized hardware is essential for large-scale contractions. Graphics processing units (GPUs) and tensor processing units (TPUs) accelerate tensor operations through tensor parallelism, where indices are sharded across devices to parallelize summations and multiplications, achieving speedups of up to 10x over CPU-based methods for batched contractions.[37] This is particularly effective for deep learning and simulation workloads, where TPUs optimize matrix-like tensor cores for mixed-precision arithmetic.[38]
Several libraries provide robust implementations for tensor contraction, balancing efficiency and usability. The TensorNetwork library in Python, developed by Google, supports contractions via a graph-based interface compatible with backends like TensorFlow and JAX, enabling automatic optimization of contraction paths for networks up to hundreds of tensors.[39] ITensor, a C++ library, uses "intelligent indices" that automatically sum over matching indices during multiplication (via the * operator), with features like prime levels to control contraction order, making it suitable for high-performance simulations in quantum physics.[40] In Mathematica, the TensorContract function performs contractions on symbolic or numeric tensors by specifying slot pairs, supporting arbitrary-rank operations with built-in optimization for algebraic simplifications.[41]
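As a brief illustration of the graph-based interface, the following sketch contracts two vectors over a shared edge in the style of the TensorNetwork quickstart (treat the exact calls as assumptions of this sketch rather than a definitive API reference):

import numpy as np
import tensornetwork as tn

# Two nodes sharing one edge; contracting that edge sums over the shared index.
a = tn.Node(np.ones(10))
b = tn.Node(np.ones(10))
edge = a[0] ^ b[0]            # connect index 0 of a with index 0 of b
result = tn.contract(edge)    # perform the summation over the shared index
print(result.tensor)          # 10.0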
Key challenges in implementations include numerical overflow and loss of precision, where accumulated sums in fixed-precision arithmetic exceed representable bounds or lose significant digits, leading to instability in long contraction sequences.[42] For very large tensors, approximation methods such as low-rank decompositions—where intermediate tensors are projected onto binary tree networks with controlled rank—enable feasible computation by bounding errors while reducing costs asymptotically.[43] Recent advances, such as hyperoptimized approximate contractions using randomized protocols (as of 2024), further mitigate costs by finding high-quality paths for complex networks.[44]
For instance, contracting 4-index tensors, such as those arising in general relativity simulations, on modern GPU frameworks can achieve high throughputs with careful index ordering, offering significant speedups over sequential CPU execution.
Generalizations and Modern Contexts
Algebraic generalizations
Tensor contraction generalizes to the setting of modules over a commutative ring R, where the operation relies on bilinear pairings between a module and its dual. For an R-module M, the dual module is \hom_R(M, R), and the canonical bilinear pairing \hom_R(M, R) \times M \to R induces contractions on the tensor powers M^{\otimes k} \otimes (\hom_R(M, R))^{\otimes l}, reducing the rank by pairing covariant and contravariant factors. This construction is well-defined for free modules of finite rank, where the tensor algebra admits a basis allowing explicit contractions analogous to the vector space case, but extends more broadly via the universal property of the tensor product over R. Such generalizations are crucial in commutative algebra for studying invariants and resolutions.[45]
In the sheaf-theoretic framework, tensor contractions operate on sections of tensor bundles over ringed spaces, such as topological spaces equipped with a sheaf of rings or schemes. For sheaves of O_X-modules \mathcal{F} and \mathcal{G} on a space X, the tensor product sheaf \mathcal{F} \otimes_{O_X} \mathcal{G} has sections that locally behave like tensor products of modules, and contractions are defined fiberwise via pairings on stalks, which are local rings. This yields a sheaf morphism from sections of mixed tensor bundles to lower-rank sheaves, preserving the sheaf structure. On schemes, this aligns with operations in coherent sheaf theory, enabling contractions in derived categories.
A specific example arises in complex geometry: on a complex manifold, holomorphic tensor fields are global sections of holomorphic tensor bundles, and contraction between a holomorphic covector field (section of the dual bundle) and a holomorphic vector field produces a holomorphic 0-form, i.e., a holomorphic function, thereby preserving analyticity due to the holomorphic nature of the pairing on fibers. This property ensures that contractions map the space of holomorphic tensors to holomorphic sections of the resulting bundle.[46]
In non-commutative algebras, tensor contractions generalize to traces that project onto the center, with properties governed by Hochschild cohomology, which classifies infinitesimal deformations and extensions where contractions play a role in defining cyclic invariants. This framework extends classical traces to non-commutative settings, linking contractions to cyclic homology via the Connes-Hochschild-Kostant-Rosenberg theorem.[47]
From a categorical viewpoint, tensor contractions in the category of vector spaces over a field manifest as natural transformations, particularly the counit of the adjunction between the tensor product functor - \otimes V and the internal Hom functor \hom(V, -): the counit has components \varepsilon_W: \hom(V, W) \otimes V \to W given by evaluation, and the contraction V^* \otimes V \to k is its component at W = k, extended multiplicatively to higher tensors. This perspective unifies contractions across dimensions and emphasizes their functorial coherence.[48]
The algebraic generalizations of tensor contraction trace their origins to Élie Cartan's development of exterior differential systems in the 1920s, where he extended the exterior algebra by introducing the interior product—a contraction operator pairing forms with vector fields—facilitating invariant formulations in differential geometry.[49]
Applications in quantum computing and machine learning
In quantum computing, tensor contractions play a central role in tensor network methods, such as matrix product states (MPS) and projected entangled pair states (PEPS), which represent quantum many-body states efficiently while respecting entanglement area laws.[50] These networks enable the computation of expectation values of local operators by contracting the tensor network with the operator tensor, scaling polynomially with system size for one-dimensional systems via MPS. For two-dimensional systems, PEPS contractions approximate expectation values and entanglement measures such as the von Neumann entropy using boundary-MPS methods, consistent with entanglement that scales with the boundary area. For instance, a two-qubit gate, represented as a four-index tensor with indices corresponding to input and output qubits, is incorporated into the network through contractions that propagate the state evolution.[50]
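A minimal sketch of such a sequential network contraction, computing the norm \langle\psi|\psi\rangle of a random open-boundary MPS by absorbing one site at a time (the tensor layout and bond dimensions are illustrative assumptions):

import numpy as np

# MPS tensors A with index order (left bond, physical, right bond); shapes are illustrative.
rng = np.random.default_rng(0)
L, d, chi = 6, 2, 4
shapes = [(1, d, chi)] + [(chi, d, chi)] * (L - 2) + [(chi, d, 1)]
mps = [rng.standard_normal(s) for s in shapes]

# Contract site by site: the boundary "environment" E_{l l'} absorbs A and its conjugate,
# E_{a b} <- sum_{l, l', s} E_{l l'} A_{l s a} conj(A)_{l' s b}.
E = np.ones((1, 1))
for A in mps:
    E = np.einsum('lm,lsa,msb->ab', E, A, np.conj(A))

norm_squared = E[0, 0]   # <psi|psi>, obtained without forming the full 2^L-dimensional state
print(norm_squared)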
Variational Monte Carlo algorithms leverage approximate tensor contractions to optimize ground-state wavefunctions in strongly correlated quantum systems. By parameterizing the wavefunction as a tensor network and sampling configurations via Monte Carlo, contractions evaluate the energy expectation value, enabling efficient ground-state searches for lattice models like the Hubbard model. This approach combines the variational principle with tensor network ansätze to achieve accuracies comparable to exact diagonalization for small systems while scaling to larger ones through bond dimension truncation.[51]
In machine learning, tensor contractions underpin decompositions like CANDECOMP/PARAFAC (CP) and Tucker, which reduce high-dimensional data tensors for tasks such as image compression and feature extraction. The CP decomposition approximates a tensor as a sum of rank-one tensors, with contractions computing the factor matrices via alternating least squares to minimize reconstruction error, achieving dimensionality reduction by capturing low-rank structure in multi-way arrays. Tucker decomposition extends this by contracting a core tensor with factor matrices, facilitating multilinear subspace learning in applications like recommender systems.[52] In transformer models, self-attention mechanisms rely on tensor contractions between query, key, and value matrices to compute scaled dot-product attention scores, enabling the model to weigh token interactions dynamically across sequences.[53]
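The attention-score computation is itself a batched contraction; a minimal NumPy sketch (the shapes, the absence of masking and multiple heads, and the function name are simplifying assumptions):

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V have shape (batch, tokens, depth).
    d_k = Q.shape[-1]
    scores = np.einsum('btd,bsd->bts', Q, K) / np.sqrt(d_k)    # contraction over the depth index
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)             # softmax over source tokens
    return np.einsum('bts,bsd->btd', weights, V)               # contraction over source tokens

Q, K, V = (np.random.rand(2, 5, 8) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)             # (2, 5, 8)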
Post-2020 developments have integrated tensor contractions into quantum machine learning hybrids, such as tensorized neural networks that embed quantum-inspired tensor structures into classical architectures for enhanced expressivity. These networks use contractions to parameterize layers with low-rank tensors, reducing parameters while improving generalization in tasks like generative modeling, as demonstrated in variational quantum circuits hybridized with neural networks. Efficient contraction protocols, including randomized path optimizations, have accelerated simulations in these hybrids, enabling scalable training on NISQ devices. Recent advances as of 2025 include hybrid quantum-classical algorithms for approximating ground states of two-dimensional systems using isometric tensor networks and neuralized fermionic tensor networks that improve accuracy in simulating Fermi-Hubbard models by orders of magnitude.[54][55][51]
A key challenge in these applications is the scalability of contractions for high-rank tensors in quantum simulations, where exponential growth in intermediate tensor sizes limits classical feasibility. For example, simulating Google's Sycamore processor's 53-qubit, 20-cycle random circuits requires hyper-optimized contraction paths to manage the computational cost, which can exceed supercomputer resources without approximations like lightcone slicing. One illustrative example is contracting a quantum circuit tensor network to compute measurement probabilities: the circuit is represented as a sequence of gate tensors, and full contraction yields the probability distribution over output bitstrings, as achieved in big-batch methods that parallelize over output subspaces for circuits up to 60 qubits.[50]