
Tensor network

A tensor network is a graphical and algebraic representation of a high-dimensional tensor obtained by contracting a collection of lower-dimensional tensors according to a specific connectivity pattern, enabling efficient approximation and manipulation of complex data structures such as quantum many-body wave functions in systems exhibiting limited entanglement. Originating from efforts in condensed matter physics to simulate strongly correlated quantum systems, tensor networks trace their roots to the density matrix renormalization group (DMRG) algorithm developed by Steven R. White in 1992, which provided a variational method for finding ground states of one-dimensional Hamiltonians using a matrix product ansatz. This approach was later reinterpreted in the language of quantum information theory during the early 2000s, leading to the formalization of matrix product states (MPS) as a canonical one-dimensional tensor network form, where the wave function coefficients are parameterized by a chain of matrices with a fixed bond dimension controlling the representational power and entanglement. Extensions to higher dimensions include projected entangled pair states (PEPS) for two-dimensional lattice models, which embed local tensors on sites connected by virtual bonds to capture area-law entanglement, and the multiscale entanglement renormalization ansatz (MERA) for critical systems with scale-invariant correlations. Tensor networks facilitate numerical algorithms such as variational optimization, time evolution via tensor contractions, and entanglement renormalization, making them indispensable for studying ground states, excited states, and thermal properties of quantum systems that are intractable by exact diagonalization. Beyond quantum physics, they have found applications in classical statistical mechanics for computing partition functions, quantum chemistry for molecular simulations, and emerging fields like machine learning for dimensionality reduction and generative modeling.

Fundamentals

Definition and Basic Concepts

A tensor is a multi-dimensional array of numbers, generalizable from scalars (rank 0, no indices), vectors (rank 1, one index), and matrices (rank 2, two indices) to higher ranks with multiple indices that label the array elements. These indices typically represent dimensions or degrees of freedom, such as particle positions or local states in physical systems. A tensor network is a decomposition of a high-dimensional (high-rank) tensor into a collection of interconnected lower-rank tensors, where the connections are defined by contractions over shared indices. Tensor contraction involves summing over the values of one or more shared indices between tensors, effectively reducing the overall rank or computing composite quantities like scalars or lower-rank tensors. The topology of the network—its graph-like structure of tensors and contraction paths—encodes the logical or physical connectivity, enabling efficient storage and manipulation of the original high-rank tensor. This approach addresses the curse of dimensionality: the number of elements in a rank-r tensor with dimension d per index grows as d^r, rendering direct computation infeasible for large r in multi-particle quantum systems or high-dimensional data. By decomposing into lower-rank components, tensor networks exploit structure, such as low entanglement or sparsity, to approximate and process these tensors with polynomial resources. For illustration, consider a simple two-site tensor network representing a rank-4 tensor T_{ijkl} as a contraction of two rank-3 tensors A_{ijm} and B_{mkl}: T_{ijkl} = \sum_m A_{ijm} B_{mkl}, where i,j,k,l are uncontracted (open) indices and m is the contracted (bond) index linking the tensors. This reduces storage from d^4 to 2d^3 elements (assuming dimension d per index), a gain that compounds for larger networks.
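A minimal numerical sketch of this two-site example, assuming NumPy and arbitrary illustrative dimensions, shows how the contraction over the bond index m recovers the full rank-4 tensor from its two factors:

```python
# Sketch: two-site tensor network T_{ijkl} = sum_m A_{ijm} B_{mkl} (NumPy assumed).
import numpy as np

d, chi = 3, 4                      # physical dimension per open index, bond dimension (illustrative)
A = np.random.rand(d, d, chi)      # rank-3 tensor A_{ijm}
B = np.random.rand(chi, d, d)      # rank-3 tensor B_{mkl}

# Contract over the shared bond index m to recover the rank-4 tensor T_{ijkl}.
T = np.einsum('ijm,mkl->ijkl', A, B)

# Storage comparison: the factored form holds 2*d^2*chi numbers instead of d^4.
print(T.shape, A.size + B.size, d**4)
```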

Diagrammatic Notation

Diagrammatic notation provides a graphical language for representing tensor networks, originally developed by Roger Penrose for abstract tensor calculus and later adapted to quantum many-body systems and tensor contractions. In this notation, tensors are depicted as boxes or nodes, with lines emanating from them to represent indices; connected lines between boxes indicate contractions over those indices, corresponding to summations in the algebraic expression. This graphical approach simplifies the manipulation of high-order tensors by leveraging spatial intuition, making complex contractions more accessible without explicit index tracking. Standard rules govern the diagrams to ensure consistency and unambiguous interpretation. Physical indices, which correspond to the local degrees of freedom in quantum systems, are typically represented by vertical lines extending from the tensor boxes. Virtual or bond indices, which connect adjacent tensors, are shown as horizontal lines. Bending rules allow for diagrammatic equivalences, such as the "cup" and "cap" operators that raise or lower indices, and the snake equation, which equates a bent line to a straight one, analogous to the identity \sum_j \delta_{ij} \delta_{jk} = \delta_{ik}. These conventions facilitate the derivation of algebraic identities visually, reducing errors in computations involving multiple summations. Simple examples illustrate the notation's utility. A basic scalar contraction involves two rank-2 tensors (matrices) with all indices connected pairwise, yielding a scalar value equivalent to the trace of their product. For a more involved case, a closed-loop network—such as a cycle of tensors with all indices contracted—evaluates to a trace over the effective matrix formed by the network. The advantages of this notation lie in its ability to reveal structural properties of tensor networks. It visually encodes the entanglement structure through the topology of connections and the dimensions of bond indices, denoted by \chi, which control approximation accuracy via truncation in methods like singular value decomposition. This supports efficient handling of large networks, as the diagram highlights contraction orders that minimize computational cost. Notation conventions distinguish between open and closed indices to clarify the network's output. Open indices remain as unconnected lines, representing free variables or outputs like quantum state components, while closed indices form loops through contractions, yielding invariants such as scalars. In quantum contexts, bra-ket-like representations depict states as kets |\psi\rangle with downward-pointing physical legs and bras \langle\psi| with upward-pointing ones, aligning the diagrams with Dirac notation for wave functions and operators.
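The closed-loop example translates directly into index notation; the following sketch (NumPy assumed, matrices random and purely illustrative) checks that a ring of three matrices with every leg contracted reduces to a single scalar, the trace of their product:

```python
# Sketch: a closed ring of three matrices, all indices contracted, equals Tr(ABC).
import numpy as np

A, B, C = (np.random.rand(4, 4) for _ in range(3))

# Diagram: A-B-C arranged in a loop; no open indices remain, so the result is a scalar.
loop = np.einsum('ij,jk,ki->', A, B, C)
print(np.allclose(loop, np.trace(A @ B @ C)))   # True
```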

Types and Constructions

Matrix Product States

Matrix product states (MPS) represent quantum many-body wave functions in one dimension as a chain of tensors, providing an efficient parametrization that exploits the area-law structure of entanglement in low-dimensional systems. For a chain of N sites with local physical dimension d, an MPS is defined as |\psi\rangle = \sum_{\{s_i\}} \sum_{\{\alpha_i\}} \prod_{i=1}^N A^{(i)}_{s_i \alpha_{i-1} \alpha_i} |s_1 \dots s_N\rangle, where s_i labels the local basis states, the auxiliary bond indices \alpha_i run from 1 to \chi (the bond dimension), and the tensors A^{(i)} have dimensions d \times \chi \times \chi. This form, introduced in the context of the thermodynamic limit of density-matrix renormalization, allows exact representation of any state once \chi reaches d^{\lfloor N/2 \rfloor}, but enables approximations with much smaller \chi for states obeying entanglement area laws. MPS are often expressed in canonical forms to facilitate computations like normalization and expectation values. In the left-canonical form, the tensors satisfy \sum_{s_i} [A^{(i)}_{s_i}]^\dagger A^{(i)}_{s_i} = I up to site i, ensuring orthonormality from the left; the right-canonical form analogously satisfies \sum_{s_i} B^{(i)}_{s_i} [B^{(i)}_{s_i}]^\dagger = I. These forms are interconverted using singular value decomposition (SVD), where a bipartitioned tensor is decomposed as M = U \Lambda V^\dagger, with U and V isometric and \Lambda diagonal containing the singular values; truncation discards singular values below a threshold, reducing the bond dimension while bounding the error \| |\psi\rangle - |\psi_{\rm trunc}\rangle \|^2 \leq 2 \sum_k \epsilon_k^2, where \epsilon_k are the discarded values. Key properties of MPS include efficient storage and computation, scaling as O(N \chi^2 d) in the number of parameters, far superior to the exponential O(d^N) of full wave functions, enabling simulations of systems with hundreds of sites. They approximate ground states of local Hamiltonians via the density matrix renormalization group (DMRG) algorithm, which variationally minimizes the energy \langle \psi | H | \psi \rangle / \langle \psi | \psi \rangle by iteratively optimizing site tensors through sweeps, solving effective eigenvalue problems of dimension O(d \chi^2). MPS can be constructed exactly for product states (\chi=1) or built variationally from initial guesses, with seminal examples including the Affleck-Kennedy-Lieb-Tasaki (AKLT) state for spin-1 chains, an exact MPS with \chi=2 given by matrices A^{0} = -\frac{1}{\sqrt{2}} \sigma^y, A^{\pm} = \frac{1}{2} (\sigma^x \pm i \sigma^z) \sigma^y / \sqrt{2}, where \sigma^\mu are Pauli matrices; this state is the unique ground state of the AKLT Hamiltonian and exhibits short-range correlations. A limitation of MPS is the growth of the required bond dimension \chi: in gapped systems, \chi effectively saturates because entanglement obeys an area law, but in critical (gapless) systems \chi must grow algebraically with N (roughly \chi \sim N^{(c+1)/6} for central charge c) to capture the logarithmic growth of entanglement entropy, demanding larger \chi for accuracy.
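A hedged sketch of how an MPS arises in practice: the helper below (hypothetical name state_to_mps, NumPy assumed, sizes chosen only for illustration) converts a generic state vector into a chain of left-canonical tensors by sweeping SVDs from left to right and truncating to a maximum bond dimension, exactly the operation the canonical-form discussion above describes.

```python
# Sketch: decompose a state vector into an MPS via sequential truncated SVDs.
import numpy as np

def state_to_mps(psi, d, N, chi_max):
    """Return N tensors of shape (chi_left, d, chi_right) approximating psi (length d**N)."""
    mps, chi_l = [], 1
    M = psi.reshape(chi_l * d, -1)
    for site in range(N - 1):
        U, S, Vh = np.linalg.svd(M, full_matrices=False)
        chi_r = min(chi_max, len(S))
        U, S, Vh = U[:, :chi_r], S[:chi_r], Vh[:chi_r]      # discard small singular values
        mps.append(U.reshape(chi_l, d, chi_r))              # left-canonical site tensor
        M = (np.diag(S) @ Vh).reshape(chi_r * d, -1)        # carry the remainder to the right
        chi_l = chi_r
    mps.append(M.reshape(chi_l, d, 1))
    return mps

N, d = 6, 2
psi = np.random.rand(d**N)
psi /= np.linalg.norm(psi)
mps = state_to_mps(psi, d, N, chi_max=8)
print([t.shape for t in mps])
```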

Projected Entangled Pair States

Projected entangled pair states (PEPS) represent a class of tensor network states designed to efficiently describe quantum many-body systems on two-dimensional lattices, such as square or honeycomb grids. In this framework, the state is constructed by first creating a network of maximally entangled virtual pairs across the bonds of the lattice and then applying local projection operators at each site to map these virtual degrees of freedom onto the physical Hilbert space. For a lattice with sites labeled by i, each site hosts a tensor A^{(i)} with one physical index of dimension d (corresponding to the local physical degrees of freedom, e.g., a spin-1/2 with d=2) and virtual indices of bond dimension D connecting to neighboring sites. The resulting state takes the form |\psi\rangle = \sum_{\{s\}} \left( \prod_i A^{(i)}_{s_i \alpha_1 \dots \alpha_{z_i}} \right) |\{s\}\rangle, where \{s\} denotes the physical basis states, the \alpha_k are virtual indices summed over (contracted) along the bonds, and z_i is the coordination number of site i. This construction generalizes one-dimensional matrix product states to higher dimensions by embedding entanglement through these virtual pairs before projection, allowing PEPS to capture short-range correlations efficiently while respecting the lattice geometry. The bond dimension D controls the amount of entanglement per bond, with larger D enabling more complex correlations at the cost of increased computational resources. For translationally invariant systems on infinite lattices, a single tensor A suffices, repeated across sites, which simplifies optimizations and simulations. PEPS naturally satisfy an area law for entanglement entropy, where the entropy S(\rho_R) of a subsystem R scales linearly with the boundary area |\partial R|, specifically bounded by S(\rho_R) \leq |\partial R| \log D for injective PEPS (those where the projection is full rank). This property makes them suitable for gapped systems but also highlights their limitation in exactly representing states with long-range entanglement, such as those at quantum critical points, though approximations with finite D can capture power-law correlations. Unlike one-dimensional MPS, which allow polynomial-time contractions, exact PEPS contractions in two dimensions scale exponentially with the system size and are usually approximated, for example by treating the boundary as an effective matrix product state. Representative examples include PEPS encodings of the two-dimensional Ising model, where tensors with bond dimension D=2 represent the state obtained by projecting from a classical Gibbs ensemble and reproduce its correlation functions, including the critical point. Another key application is in topologically ordered systems, such as the toric code, whose ground state is represented exactly as a PEPS with D=2 that enforces the stabilizer constraints and exhibits a topological contribution to the entanglement independent of system size. These states demonstrate the ability of PEPS to encode protected subspaces for fault-tolerant quantum computation. Computing physical observables poses significant challenges due to the inherent cost of tensor network contractions in two dimensions: exact contraction is #P-hard, and even approximate evaluation of expectation values like \langle \psi | O | \psi \rangle scales as O(D^{10}) or worse on finite lattices. Partial tracing to obtain reduced density matrices for entanglement measures or local observables is similarly demanding, requiring sophisticated approximations such as boundary-MPS methods or coarse-graining techniques to achieve feasible scaling, typically limiting simulations to bond dimensions D \leq 10 for moderate lattice sizes. These hurdles underscore the trade-off between representational power and simulability in higher-dimensional systems.
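To make the construction concrete, the sketch below (NumPy assumed; tensor leg ordering and the tiny 2x2 open-boundary lattice are illustrative choices) contracts four random PEPS site tensors over their shared virtual bonds into the full four-site wave function. This brute-force contraction is only feasible for such a tiny lattice, which is precisely the scaling problem discussed above.

```python
# Sketch: exact contraction of a 2x2 open-boundary PEPS into the dense state.
import numpy as np

d, D = 2, 2
A = np.random.rand(d, D, D)   # top-left:     (physical, right, down)
B = np.random.rand(d, D, D)   # top-right:    (physical, left, down)
C = np.random.rand(d, D, D)   # bottom-left:  (physical, up, right)
E = np.random.rand(d, D, D)   # bottom-right: (physical, up, left)

# Contract the four virtual bonds (a, b, c, e); the physical legs s,t,u,v stay open.
psi = np.einsum('sab,tac,ube,vce->stuv', A, B, C, E)

norm = np.linalg.norm(psi.ravel())   # sqrt(<psi|psi>) from the dense amplitudes
print(psi.shape, norm)
```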

Tree Tensor Networks and Multiscale Variants

Tree tensor networks (TTNs) extend the concept of matrix product states to hierarchical, branched structures, enabling the representation of quantum states on irregular geometries such as tree-like lattices or dendrimers. In a TTN, the wavefunction is expressed as a contraction of tensors arranged in a binary tree topology, where leaf nodes correspond to physical sites and internal nodes are branching tensors connecting subtrees. This structure incorporates isometries—tensors that preserve inner products during coarse-graining—allowing efficient truncation of bond dimensions while maintaining accuracy for systems with bounded entanglement. The binary tree configuration reduces the path length between any two sites to O(\log n), facilitating simulations of systems with nonlocal correlations that are challenging for linear tensor networks. The multiscale entanglement renormalization ansatz (MERA) builds on similar hierarchical principles but introduces a layered architecture designed specifically for capturing scale-invariant correlations in critical systems. MERA consists of alternating layers of disentanglers—unitary tensors that remove short-range entanglement—and isometries that perform coarse-graining, effectively implementing a real-space renormalization group transformation. A ternary MERA variant, where each isometry coarse-grains three sites into one, is particularly suited for one-dimensional critical models, as it aligns with the scaling dimensions of operators in conformal field theories. Unlike TTNs, MERA's causal-cone structure ensures that local observables can be computed within a light cone of fixed width, independent of system size. Both TTNs and MERA exhibit logarithmic scaling of entanglement with subsystem size, S \approx (c/3) \log l (where c is the central charge and l the subsystem length), making them efficient for scale-invariant systems like critical points, where the logarithmic violation of the area law would overwhelm chain-like representations. The bond dimension \chi controls the expressive power, with typical values of \chi = 4-16 sufficient for one-dimensional critical systems, while the number of layers in MERA scales as O(\log N) for N sites, leading to storage costs of O(\chi^3 N) and contraction times of O(\chi^6 N). These networks excel in handling multi-scale correlations by recursively partitioning the system, preserving essential physics across length scales. Construction of these networks proceeds via recursive decomposition: starting from the full Hilbert space, singular value decompositions are applied along tree edges to identify isometries and truncate to bond dimension \chi, iteratively building the hierarchy from leaves to root. For the one-dimensional critical Ising model, a ternary MERA constructed this way reproduces the expected conformal invariance, with correlation functions decaying as r^{-q} (q the scaling dimension) and entanglement entropy matching the analytic form S = (c/3) \log l + \text{const}, demonstrating its ability to encode universal critical behavior with fixed \chi. Variants of TTNs include branching structures for three-dimensional systems, where the tree topology is adapted to volumetric lattices by allowing variable coordination at nodes to match geometric irregularities, improving efficiency over planar networks for bulk simulations. Adaptive topologies further enhance flexibility by dynamically adjusting branchings and weights during optimization, particularly for disordered or inhomogeneous quantum many-body states, reducing the required \chi by up to 50% in benchmarks on random lattices.
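One elementary building block of the recursive construction above is a single coarse-graining step: merging two sites and keeping only the \chi dominant states of their block density matrix. The sketch below (NumPy assumed; the environment dimension and \chi are arbitrary illustrative numbers) builds such an isometry and verifies its defining property.

```python
# Sketch: one tree coarse-graining step -- build an isometry from the block density matrix.
import numpy as np

d, chi, rest = 2, 3, 16
psi = np.random.rand(d * d, rest)
psi /= np.linalg.norm(psi)                     # two-site block + the rest of the system

rho = psi @ psi.T                              # reduced density matrix of the two-site block
evals, evecs = np.linalg.eigh(rho)
w = evecs[:, ::-1][:, :chi]                    # keep the chi dominant eigenvectors

iso = w.reshape(d, d, chi)                     # isometry: (site1, site2) -> coarse-grained index
check = np.einsum('abk,abl->kl', iso, iso)     # should equal the identity on the kept space
print(np.allclose(check, np.eye(chi)))
```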

Applications in Physics

Simulation of Quantum Many-Body Systems

Tensor networks provide powerful numerical tools for simulating strongly correlated quantum many-body systems, particularly by representing quantum states with low entanglement in a compact form. In one dimension, matrix product states (MPS) serve as efficient ansätze for ground states and low-energy excitations of local Hamiltonians, enabling variational optimization to approximate solutions with controlled accuracy. This approach has revolutionized the study of 1D quantum models, where exact diagonalization becomes infeasible for large system sizes. The density matrix renormalization group (DMRG) algorithm, formulated in the language of MPS, finds ground states through iterative variational optimization. For a local Hamiltonian H = \sum_i h_i acting on nearest neighbors, DMRG employs a sweeping procedure: it optimizes site tensors sequentially across the chain, truncating the bond dimension \chi based on the dominant eigenvectors of the reduced density matrix to retain the essential entanglement. This process converges to the variational minimum within the MPS manifold, achieving exponential accuracy in \chi for gapped systems due to the area-law scaling of entanglement. The original formulation demonstrated its efficacy for spin chains, with subsequent refinements improving stability and efficiency. For real-time evolution, the time-evolving block decimation (TEBD) method approximates the unitary e^{-iHt} via Trotterization, decomposing the evolution operator into local gates applied sequentially to MPS tensors, followed by singular value decompositions to truncate the bonds. This preserves low entanglement during short-time dynamics, with errors scaling as the square of the time step. Complementarily, the time-dependent variational principle (TDVP) projects the Schrödinger equation onto the tangent space of the MPS manifold, solving local equations of motion for tensor updates without explicit time slicing, offering second-order accuracy for longer evolutions in gapped phases. In higher dimensions, projected entangled pair states (PEPS) extend the MPS framework to 2D lattices, representing states as networks of local tensors with physical and virtual indices, optimized variationally for ground states of 2D Hamiltonians. The infinite PEPS (iPEPS) variant uses translational invariance and boundary mean-field approximations to simulate infinite systems, though contraction costs scale steeply with bond dimension, necessitating approximations like simple updates or corner transfer matrices. For critical systems with scale-invariant correlations, the multiscale entanglement renormalization ansatz (MERA) efficiently captures logarithmic entanglement via layered isometries and disentanglers, enabling accurate simulations near quantum phase transitions. Benchmarks illustrate the precision of these methods. For the 1D Heisenberg antiferromagnet, DMRG with \chi \approx 500 yields ground-state energies per site accurate to 10^{-6} relative to the exact value of 1/4 - \ln 2 \approx -0.443147, with truncation errors decaying exponentially as e^{-\sqrt{\chi}} for gapped excitations. In the 1D Hubbard model at half-filling and intermediate coupling U/t=4, DMRG achieves energies converging to within 10^{-8} for chains up to 100 sites, capturing ground-state properties with bond dimensions \chi \sim 1000. In two dimensions, iPEPS simulations of the Heisenberg model on the square lattice yield energies per site within approximately 1.5% of quantum Monte Carlo reference results at bond dimension 4, though accuracy degrades for frustrated cases.
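A hedged sketch of the elementary TEBD step described above (NumPy assumed): a two-site Trotter gate is absorbed into two neighbouring MPS tensors, and the enlarged block is split back with a truncated SVD. The Ising-type coupling, time step, and bond dimension are illustrative assumptions; the gate is diagonal here, so it can be exponentiated entry-wise without an external matrix-exponential routine.

```python
# Sketch: core TEBD update -- apply a two-site gate to an MPS bond and re-truncate.
import numpy as np

d, chi, dt = 2, 8, 0.05
A = np.random.rand(chi, d, chi)                 # MPS tensor on site i   (left, phys, right)
B = np.random.rand(chi, d, chi)                 # MPS tensor on site i+1 (left, phys, right)

sz = np.diag([1.0, -1.0])
h = np.kron(sz, sz)                             # illustrative two-site term (diagonal)
gate = np.diag(np.exp(-1j * dt * np.diag(h))).reshape(d, d, d, d)   # exp(-i h dt)

# Absorb the gate into the two-site block theta, then split with a truncated SVD.
theta = np.einsum('lpr,rqm,abpq->labm', A, B, gate).reshape(chi * d, d * chi)
U, S, Vh = np.linalg.svd(theta, full_matrices=False)
keep = min(chi, len(S))
U, S, Vh = U[:, :keep], S[:keep], Vh[:keep]
S /= np.linalg.norm(S)                          # renormalise the state after truncation

A_new = U.reshape(chi, d, keep)
B_new = (np.diag(S) @ Vh).reshape(keep, d, chi)
print(A_new.shape, B_new.shape)
```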
A key advantage of tensor network methods is that they sidestep the fermion sign problem: expectation values are obtained by deterministic variational contraction rather than stochastic sampling, so there are no phase cancellations of the kind that plague quantum Monte Carlo. This enables reliable simulations of models like the transverse-field Ising chain or AKLT states, though exponential entanglement growth in highly frustrated or gapless systems limits applicability beyond low dimensions. Recent advances include scalable tensor network algorithms for finite-temperature properties of quantum many-body systems, enabling studies of entanglement and phase transitions, and tensor networks that incorporate noise models for simulating near-term quantum devices. Neuralized fermionic tensor networks have also emerged for efficient approximation of dynamics in fermionic many-body systems.

Quantum Information and Entanglement

Tensor networks provide powerful tools for quantifying entanglement in quantum states, particularly through their inherent connection to the Schmidt decomposition. In matrix product states (MPS), a quantum state is represented as a chain of tensors, where the bond dimension \chi bounds the maximum entanglement across any bipartition. The Schmidt decomposition arises naturally when contracting the MPS across a cut, yielding singular values that correspond to the Schmidt coefficients, which quantify the entanglement between subsystems. This allows efficient computation of entanglement measures, such as the von Neumann entanglement entropy S = -\mathrm{Tr} (\rho \log \rho), where \rho is the reduced density matrix obtained from the squared Schmidt coefficients. For example, a Bell pair state can be exactly represented as a simple MPS with bond dimension 2, exhibiting the maximal entanglement entropy of \log 2 across the bipartition, while multipartite states like GHZ states are likewise captured with small bond dimension and still adhere to area-law scaling in one dimension. In higher dimensions, projected entangled pair states (PEPS) extend this framework, enabling the study of entanglement structures beyond one-dimensional chains. PEPS naturally capture area-law entanglement for gapped systems, while critical or gapless phases, whose entanglement exceeds a strict area law, require larger bond dimensions or different network geometries. This distinction highlights tensor networks' utility in distinguishing physical regimes of quantum matter, with the diagrammatic notation depicting entanglement links directly as tensor contractions. Tensor networks also facilitate the representation and simulation of quantum circuits and gates, encoding unitary evolutions as sequences of tensor operations. For instance, one-dimensional quantum circuits can be mapped to MPS evolutions under local gates, preserving entanglement bounds during time propagation. In measurement-based quantum computation (MBQC), multiscale entanglement renormalization ansatz (MERA) networks model the resource states and measurement patterns, allowing efficient classical simulation of the computation graph through layer-by-layer contractions. This approach leverages the hierarchical structure of MERA to handle logarithmic-depth circuits with controlled entanglement growth. For quantum error correction, tensor networks provide a natural encoding of topological codes, such as the Kitaev toric code, where stabilizer constraints are embedded in the tensor projections. Error syndromes are decoded by contracting the network to find the most likely correction, often using approximate contraction schemes such as boundary matrix product states to identify logical operators. This contraction-based decoding scales favorably for local errors, achieving thresholds comparable to exact methods in two dimensions and extending to higher-dimensional codes. State preparation in tensor networks involves encoding target quantum states by optimizing tensor parameters to maximize the overlap with the desired output. Variational algorithms initialize tensors with simple product states and iteratively refine them via contractions that compute overlaps and gradients, ensuring high-fidelity approximation within the network's entanglement capacity. For entangled targets like cluster states, this process injects correlations through successive gate applications represented as tensor updates, with fidelity optimized under bond dimension constraints to balance accuracy and computational cost.
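The Bell-pair example can be checked in a few lines (NumPy assumed): reshaping the state across the bipartition and taking an SVD yields the Schmidt coefficients, from which the von Neumann entropy \log 2 follows directly.

```python
# Sketch: Schmidt decomposition of a Bell pair and its entanglement entropy.
import numpy as np

bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)              # (|00> + |11>)/sqrt(2)
schmidt = np.linalg.svd(bell.reshape(2, 2), compute_uv=False)    # Schmidt coefficients

p = schmidt**2                                                   # spectrum of the reduced density matrix
S = -np.sum(p * np.log(p))
print(S, np.log(2))                                              # both ~0.6931
```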

Historical Development

Origins in Quantum Mechanics

The conceptual foundations of tensor networks in quantum mechanics emerged from early efforts to represent and approximate entangled quantum states in many-body systems. In the 1930s, Linus Pauling and others developed valence bond theory (VBT), which described chemical bonding through localized electron pairs forming singlet states, providing an intuitive framework for understanding quantum correlations in molecules and solids. Pauling extended this in 1949 with the resonating valence bond (RVB) concept, proposing that metals and insulators could be modeled as superpositions of valence bond coverings, capturing delocalized entanglement without full many-body wavefunction expansions. These ideas laid groundwork for later tensor network representations of valence bond solids (VBS), such as the 1987 Affleck-Kennedy-Lieb-Tasaki (AKLT) model, where exact ground states of spin chains are constructed from projected singlets, prefiguring matrix product states. Graphical methods for handling multilinear algebraic structures further influenced tensor network origins. In 1971, Roger Penrose introduced a diagrammatic notation for tensors, representing them as boxes with lines denoting indices and contractions as connected lines, originally motivated by applications in general relativity and spinor calculus but adaptable to quantum many-body calculations. This notation simplified the visualization of tensor contractions, enabling compact depictions of quantum states and operators, and directly inspired the diagrammatic language of modern tensor networks for tracking entanglement patterns. By the mid-1970s, numerical techniques addressed intractable problems in quantum impurity models. Kenneth Wilson developed the numerical renormalization group (NRG) in 1975 to solve the Kondo problem, in which a magnetic impurity interacts with conduction electrons, leading to logarithmic divergences unresolvable by perturbation theory. NRG iteratively truncates high-energy states while preserving low-energy physics, approximating ground-state wavefunctions through successive diagonalizations—a strategy that prefigures tensor network renormalization by exploiting scale separation and entanglement locality. These pre-1990s developments, driven by Pauling's concepts, Penrose's diagrams, and Wilson's numerics, motivated tensor networks as efficient tools for otherwise insoluble quantum models.

Key Advances and Milestones

The density matrix renormalization group (DMRG) algorithm, introduced by Steven R. White in 1992, marked a pivotal advancement in simulating one-dimensional quantum many-body systems by providing a variational method to approximate ground states with high accuracy, drastically improving upon earlier techniques. This method revolutionized computational approaches to strongly correlated systems, enabling precise calculations of properties like correlation functions in low-dimensional lattices that were previously intractable. In the mid-2000s, the development of projected entangled pair states (PEPS) by Frank Verstraete and J. Ignacio Cirac extended tensor network representations to two-dimensional systems, allowing efficient approximations of ground states while respecting area-law entanglement scaling. Building on this, Guifre Vidal introduced the multiscale entanglement renormalization ansatz (MERA) in 2007, a hierarchical tensor structure particularly suited for critical systems with scale-invariant correlations, facilitating the study of conformal field theories and quantum phase transitions. Tensor networks saw significant extensions for real-time dynamics starting in the early 2000s with the time-evolving block decimation (TEBD) method, which applies Trotter decomposition to evolve matrix product states under local Hamiltonians, and further in the 2010s with the time-dependent variational principle (TDVP), which optimizes within the manifold of variational states for more accurate long-time simulations. These advancements also enabled approximations in higher dimensions via techniques like coarse-graining and boundary methods, broadening applications beyond ground states. Concurrently, open-source libraries such as ITensor emerged around 2010 and were formalized in the 2020s, providing robust tools for implementing these algorithms in C++ and Julia, thus democratizing access to tensor network simulations. Key contributors like Ulrich Schollwöck advanced DMRG through comprehensive reviews and extensions to matrix product states, while Norbert Schuch contributed to PEPS symmetries and entanglement theory, and Román Orús provided influential overviews that bridged tensor networks to broader quantum applications. From 2020 to 2025, hybrid quantum-classical tensor methods integrated variational tensor networks with noisy intermediate-scale quantum (NISQ) devices, enhancing simulations of open systems and reducing classical computational overhead. These approaches found use in quantum advantage experiments, where tensor networks tested claims in Gaussian boson sampling by efficiently contracting circuits that other classical methods struggled with. Additionally, machine learning-driven advances enabled automated design of optimal tensor network topologies, originating from physics-inspired optimizations and improving efficiency in high-dimensional representations.

Connections to Machine Learning

Tensor Decompositions in Data Representation

In machine learning, tensor decompositions based on tensor networks offer a powerful framework for representing high-dimensional data, such as multidimensional arrays arising in images, video, or user-item interactions, by exploiting low-rank structure to reduce storage and computational demands while preserving essential correlations. These methods extend beyond pairwise relationships captured by matrices, enabling the modeling of intricate multi-way dependencies in data that would otherwise suffer from the curse of dimensionality. The tensor train (TT) decomposition represents a d-dimensional tensor T \in \mathbb{R}^{n_1 \times \cdots \times n_d} as a sequential contraction of three-dimensional core tensors G_k \in \mathbb{R}^{r_{k-1} \times n_k \times r_k} for k=1,\dots,d, with r_0 = r_d = 1 and ranks r_k controlling the approximation quality: T_{i_1 \dots i_d} = G_1(i_1)_{1,j_1} G_2(i_2)_{j_1,j_2} \cdots G_d(i_d)_{j_{d-1},1}, where the indices j_k run over the bond dimensions r_k and are summed over. This format, analogous to matrix product states in quantum physics, achieves linear storage complexity O(d n r^2) in the tensor order d and mode sizes n, assuming bounded ranks r, making it suitable for compressing multidimensional arrays like video sequences or genomic data. Hierarchical Tucker (HT) formats extend this idea to tree-structured networks, organizing the tensor into a hierarchy of low-rank subspaces that further exploit structure in high-dimensional data. In HT, the tensor is decomposed along a dimension tree whose leaves correspond to modes and whose internal nodes carry transfer tensors of reduced rank, cutting parameter counts from O(n^d) to near-linear O(d n r^2) by capturing nested correlations. This structure is particularly effective for sparse tensors, such as those arising in recommendation or network analysis, where traditional full-rank representations are infeasible. These decompositions find applications in image and video compression, where TT formats enable efficient storage by approximating correlations across spatial and temporal dimensions, achieving significant bitrate reductions without substantial quality loss. In recommender systems, TT and HT models handle user-item-context tensors to predict preferences, outperforming matrix-based methods by incorporating multi-way interactions like time or contextual factors. The bond dimensions r_k in these formats are tuned to balance compression and fidelity, often via cross-validation to minimize reconstruction error. Compared to principal component analysis (PCA) or singular value decomposition (SVD), which flatten multi-way data into vectors or matrices and thus overlook higher-order interactions, tensor network decompositions better capture multi-way correlations, leading to improved compression and analysis in domains like fMRI data, where spatial, temporal, and subject variabilities are preserved more accurately. For instance, tensor methods yield lower reconstruction errors in medical imaging tasks by maintaining structural dependencies that PCA disrupts. Initialization often employs the TT-SVD algorithm, which sequentially applies SVD to tensor unfoldings to obtain a quasi-optimal decomposition before rounding to the target ranks.
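A hedged sketch of the TT-SVD idea mentioned above (NumPy assumed): sequential SVDs of unfoldings produce TT cores of shape (r_{k-1}, n_k, r_k). Here a fixed maximum rank r_max is imposed for illustration; in practice the ranks are usually chosen adaptively from an error tolerance.

```python
# Sketch: TT-SVD -- build TT cores from sequential truncated SVDs of unfoldings.
import numpy as np

def tt_svd(T, r_max):
    dims, cores, r_prev = T.shape, [], 1
    M = T.reshape(r_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        U, S, Vh = np.linalg.svd(M, full_matrices=False)
        r = min(r_max, len(S))
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))          # core G_k
        M = (np.diag(S[:r]) @ Vh[:r]).reshape(r * dims[k + 1], -1)  # remainder for the next mode
        r_prev = r
    cores.append(M.reshape(r_prev, dims[-1], 1))
    return cores

T = np.random.rand(4, 5, 6, 7)
cores = tt_svd(T, r_max=3)

# Reconstruct by contracting the shared rank indices and report the relative error.
approx = cores[0]
for G in cores[1:]:
    approx = np.tensordot(approx, G, axes=([-1], [0]))
approx = approx.reshape(T.shape)
print([G.shape for G in cores], np.linalg.norm(T - approx) / np.linalg.norm(T))
```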

Algorithms and Optimization Techniques

Alternating least squares (ALS) is a foundational iterative algorithm for optimizing tensor train (TT) and hierarchical Tucker (HT) decompositions in machine learning tasks, particularly for tensor completion and low-rank approximation of high-dimensional data. The method proceeds by cyclically fixing all but one core tensor and solving a least-squares problem for the remaining core, involving tensor contractions to form unfolding matrices followed by singular value decomposition (SVD) to update the core while enforcing rank constraints. This process minimizes the Frobenius norm error between the original tensor and its low-rank approximation, making it suitable for compressing multi-way data arrays in applications like recommender systems and signal processing; a minimal numerical sketch of such a sweep is given at the end of this section. Variants of ALS, such as those incorporating overrelaxation, enhance convergence speed for incomplete tensors by accelerating updates beyond the standard least-squares solution. For HT decompositions, ALS extends naturally to the tree-structured format, where updates propagate through hierarchical clusters, enabling efficient handling of exponentially large tensors in probabilistic modeling. In unsupervised learning, density matrix renormalization group (DMRG)-inspired techniques adapt the sweeping optimization from quantum many-body methods to tensor networks, optimizing TT representations for tasks like clustering high-dimensional datasets. The TT-DMRG algorithm iteratively sweeps through the TT cores, performing local optimizations akin to DMRG's eigenvalue problem on reduced density matrices, which identifies cluster centroids by minimizing reconstruction error while controlling the effective dimensionality through the bond dimensions. This approach excels at discovering latent structures in datasets, such as grouping similar patterns in image or genomic data, by leveraging the entanglement structure to prune irrelevant correlations. For clustering, TT-DMRG initializes with random TT approximations and refines them via successive SVD-based truncations, achieving scalable performance on datasets with thousands of features. Gradient-based optimization methods have also emerged for tensor networks in machine learning, enabling end-to-end training through automatic differentiation (AD) of network contractions and parameter updates. AD computes exact gradients by reverse-mode propagation through the tensor network graph, allowing gradient descent to minimize loss functions in tasks like supervised classification or regression. A key application is neural network compression via tensorization, where pre-trained weight matrices are decomposed into TT or HT formats and fine-tuned using AD to recover performance with reduced parameters—achieving significant compression ratios, such as 10-20x on convolutional layers in some implementations, while maintaining accuracy. These methods integrate with differentiable-programming frameworks, supporting tensor operations for scalable training on large models. Cross-validation techniques are essential for tuning tensor network hyperparameters, particularly balancing tensor rank against reconstruction fidelity to prevent overfitting in generative models. In tensor train variational autoencoders (TT-VAEs), k-fold cross-validation assesses this trade-off by evaluating evidence lower bound (ELBO) scores across held-out data splits, selecting ranks that maximize generalization while minimizing KL divergence in the latent space.
For instance, in modeling continuous data distributions, cross-validation guides bond dimension selection, ensuring that the TT-encoded decoder generates samples with low reconstruction error on validation sets, as demonstrated in tasks where higher ranks improve fidelity but risk memorization. This process often involves a grid search over ranks, with stopping criteria based on validation-loss plateaus. Recent advances include quantum-inspired libraries like Quimb, which facilitate tensor network optimization through GPU-accelerated contractions and integration with AD frameworks for hybrid quantum-classical workflows. Quimb's modular design supports custom loss functions and scalable simulations, enabling efficient training of TT-based classifiers on datasets exceeding 10^6 samples. Complementing this, GPU implementations such as those in tn4ml and ExaTN leverage tensor cores for parallel decomposition and contraction operations, achieving significant speedups in gradient computations for large-scale tensorized neural networks. Further developments include the THOR (Tensors for High-dimensional Object Representation) framework for efficient compression of high-dimensional objects using tensor networks, applications in explainable AI for cybersecurity, and perspectives on integrating tensor networks into AI-for-science workflows. These tools emphasize interoperability with mainstream machine-learning frameworks, broadening adoption in generative modeling and beyond.
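The sketch referenced above illustrates a single ALS sweep in the TT format (NumPy assumed). It fits a fixed-rank TT to a small full tensor rather than solving a completion problem, and each core update uses the closed-form least-squares solution G = L^+ T R^+ built from the interfaces of the frozen cores; ranks, tensor sizes, and the number of sweeps are illustrative assumptions.

```python
# Sketch: one ALS sweep fitting a fixed-rank tensor train to a full tensor.
import numpy as np

def interfaces(cores, k):
    """Left matrix (prod n_<k, r_{k-1}) and right matrix (r_k, prod n_>k) from the frozen cores."""
    L = np.ones((1, 1))
    for G in cores[:k]:
        L = np.tensordot(L, G, axes=([-1], [0])).reshape(-1, G.shape[-1])
    R = np.ones((1, 1))
    for G in reversed(cores[k + 1:]):
        R = np.tensordot(G, R, axes=([-1], [0])).reshape(G.shape[0], -1)
    return L, R

def als_sweep(T, cores):
    dims = T.shape
    for k in range(len(cores)):
        L, R = interfaces(cores, k)
        Tk = T.reshape(L.shape[0], dims[k], R.shape[1])
        # Closed-form least-squares update of core k: G[:, i, :] = pinv(L) @ Tk[:, i, :] @ pinv(R)
        cores[k] = np.einsum('al,lis,sb->aib', np.linalg.pinv(L), Tk, np.linalg.pinv(R))
    return cores

T = np.random.rand(4, 4, 4, 4)
r, dims = 3, T.shape
cores = [np.random.rand(1, dims[0], r)] + \
        [np.random.rand(r, n, r) for n in dims[1:-1]] + \
        [np.random.rand(r, dims[-1], 1)]

for sweep in range(5):
    cores = als_sweep(T, cores)

# Reconstruct the TT and report the relative fit error after a few sweeps.
full = cores[0]
for G in cores[1:]:
    full = np.tensordot(full, G, axes=([-1], [0]))
print(np.linalg.norm(T - full.reshape(T.shape)) / np.linalg.norm(T))
```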
