Computing the permanent
The permanent of an n \times n matrix A = (a_{ij}) with entries in \mathbb{R} or \mathbb{C} is a scalar-valued function defined as
\operatorname{per}(A) = \sum_{\sigma \in S_n} \prod_{i=1}^n a_{i,\sigma(i)},
where S_n is the symmetric group on \{1, 2, \dots, n\}, and the sum runs over all n! permutations \sigma. This multilinear function is analogous to the determinant but omits the sign \operatorname{sgn}(\sigma) from each term; as a consequence, the permanent is not preserved by the elementary row operations, such as adding a multiple of one row to another, that make the determinant easy to compute.
Computing the permanent exactly is a cornerstone problem in algebraic and counting complexity theory, shown by Valiant in 1979 to be #P-complete even for matrices with 0-1 entries, implying it is at least as hard as counting the solutions of NP-complete problems; no polynomial-time algorithm is known despite extensive study.[1] In contrast, the determinant can be evaluated in O(n^3) time via Gaussian elimination, or in O(n^\omega) time using fast matrix multiplication, where \omega \leq 2.371339 (as of 2024) is the matrix multiplication exponent, highlighting the permanent's computational intractability.[2]
The permanent arises in diverse applications, notably in combinatorics where \operatorname{per}(A) for a 0-1 biadjacency matrix A of a bipartite graph counts the number of perfect matchings, with implications for matching theory and network flows.[3] It also features in statistical mechanics for partition functions of dimer models, in probability for evaluating expectations in certain stochastic processes, and in invariant theory for studying polynomial representations.[3] All known exact algorithms for general n \times n matrices take exponential time in the worst case, the most practical being Ryser's inclusion-exclusion formula at O(n \cdot 2^n) arithmetic operations, while dynamic programming variants achieve similar bounds for structured inputs.[2]
For approximation, fully polynomial randomized approximation schemes (FPRAS) exist: Jerrum and Sinclair introduced a scheme running in O(n^9 / \epsilon^2) time for dense 0-1 matrices in 1989, later extended by Jerrum, Sinclair, and Vigoda to arbitrary nonnegative matrices in 2004 with running time O(n^{10} \operatorname{poly}(\log n, 1/\epsilon)), enabling (1+\epsilon)-approximations with high probability.[4][5] Recent advances include deterministic approximations within factors like e^n for certain dense matrices and optimized implementations for moderate n, though exact computation remains infeasible in general for n > 50.[6]
Fundamentals
Definition of the permanent
The permanent of an n \times n matrix A = (a_{ij}) with entries in a commutative ring is defined as
\operatorname{per}(A) = \sum_{\sigma \in S_n} \prod_{i=1}^n a_{i,\sigma(i)},
where S_n denotes the symmetric group consisting of all permutations of \{1, \dots, n\}.[7] This summation extends over the n! permutations, with each term in the product corresponding to a selection of entries from distinct rows and columns, without the alternating sign that distinguishes the determinant.[7]
The permanent shares several algebraic properties with the determinant. It is multilinear in the rows (or columns): if a single row is written as a linear combination of two vectors, the permanent equals the corresponding linear combination of the permanents of the two resulting matrices, with all other rows held fixed.[8] Additionally, it is homogeneous of degree n: scaling the entire matrix by a scalar \lambda multiplies the permanent by \lambda^n.[8]
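For instance, in the 2 \times 2 case, multilinearity in the first row reads
\operatorname{per}\begin{pmatrix} \alpha a + \beta a' & \alpha b + \beta b' \\ c & d \end{pmatrix} = \alpha \operatorname{per}\begin{pmatrix} a & b \\ c & d \end{pmatrix} + \beta \operatorname{per}\begin{pmatrix} a' & b' \\ c & d \end{pmatrix},
since both sides expand to \alpha(ad + bc) + \beta(a'd + b'c).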
The concept originated in the work of Augustin-Louis Cauchy, who introduced "fonctions symétriques permanentes" in his 1812 memoir on symmetric functions related to permutations, as a counterpart to alternating functions like the determinant.
To illustrate, consider a 2 \times 2 matrix
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.
The permanent is \operatorname{per}(A) = ad + bc, summing the two permutation products without signs.[7] For a $3 \times 3$ matrix
B = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix},
the permanent expands to
\operatorname{per}(B) = a_{11}a_{22}a_{33} + a_{11}a_{23}a_{32} + a_{12}a_{21}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} + a_{13}a_{22}a_{31},
capturing all six permutation terms additively.[7]
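As a concrete numeric check, the matrix with rows (1, 2) and (3, 4) has \operatorname{per} = 1 \cdot 4 + 2 \cdot 3 = 10, while \det = 1 \cdot 4 - 2 \cdot 3 = -2, illustrating how the absence of signs prevents cancellation.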
Combinatorial interpretation and relation to determinant
The permanent of a nonnegative matrix A = (a_{ij}) admits a natural combinatorial interpretation as a weighted count of perfect matchings in the complete bipartite graph K_{n,n} with bipartition (U, V), where U = \{u_1, \dots, u_n\} and V = \{v_1, \dots, v_n\}, and the weight of edge (u_i, v_j) is a_{ij}. Specifically, \operatorname{per}(A) equals the sum of the products of edge weights over all perfect matchings in this graph.[9] When A is a 0-1 matrix representing the biadjacency matrix of a bipartite graph G (with no weights, so a_{ij} = 1 if edge (u_i, v_j) exists and 0 otherwise), the permanent simplifies to the unweighted count of perfect matchings in G. This interpretation underscores the permanent's role in enumerative combinatorics, such as counting the number of ways to assign n distinct tasks to n agents under compatibility constraints encoded by the graph.[9]
In contrast, the determinant of a matrix A shares a structurally similar Leibniz formula but incorporates an alternating sign:
\det(A) = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{i=1}^n a_{i,\sigma(i)},
where S_n is the symmetric group on n elements and \operatorname{sgn}(\sigma) = (-1)^{\operatorname{inv}(\sigma)} is the sign of the permutation \sigma, with \operatorname{inv}(\sigma) denoting the number of inversions. This sign function introduces antisymmetry: swapping two rows (or columns) multiplies the determinant by -1, enabling row reduction techniques like Gaussian elimination that preserve the value up to sign. The permanent, lacking this alternation, has all terms positive, resulting in no such symmetry or cancellation between summands.[10] Consequently, while the determinant benefits from efficient algebraic methods—such as LU decomposition, where \det(A) = \det(L) \det(U) as the product of diagonal entries after factorization, or spectral methods involving eigenvalues—the permanent resists analogous decompositions or factorizations due to the absence of signs, precluding cancellation and linear algebra tools that exploit antisymmetry.[10]
This structural disparity manifests clearly in simple examples. Consider the n \times n all-ones matrix J_n, where every entry is 1. Here, \operatorname{per}(J_n) = n!, counting the n! permutations in S_n (or equivalently, all possible perfect matchings in the complete bipartite graph K_{n,n}). In contrast, \det(J_n) = 0 for n > 1, as the rows are linearly dependent, leading to complete cancellation of the signed terms. This example highlights how the permanent accumulates without interference, yielding large values, whereas the determinant's signs often result in zeros or smaller magnitudes through destructive interference.[10]
Exact Computation Algorithms
Naive algorithm via permutation summation
The naive algorithm for computing the permanent of an n \times n matrix A = (a_{i,j}) directly implements its definition by enumerating all n! permutations \sigma of \{1, 2, \dots, n\} and summing the products \prod_{i=1}^n a_{i,\sigma(i)} for each permutation.[11] This approach requires generating each permutation explicitly, computing the corresponding product (which involves n multiplications), and accumulating the sum with n! additions overall.[11]
The time complexity of this method is O(n! \cdot n), arising from the n! permutations, each processed in O(n) time for the product calculation and summation.[11] Space complexity is O(n) beyond the input matrix, to store the current partial permutation and the set of used columns, plus the O(n)-deep recursion stack in recursive implementations.[11]
A recursive outline for the algorithm, which builds permutations column by column while avoiding reuse, is as follows:
def permanent(A, n, row=0, used=None, perm=None):
    # Initialize the bookkeeping arrays on the outermost call.
    if used is None:
        used, perm = [False] * n, [0] * n
    if row == n:
        # A complete assignment: multiply the selected entries.
        prod = 1
        for i in range(n):
            prod *= A[i][perm[i]]  # perm[i] is the column chosen for row i
        return prod
    total = 0
    for col in range(n):
        if not used[col]:
            used[col] = True       # reserve this column for the current row
            perm[row] = col
            total += permanent(A, n, row + 1, used, perm)
            used[col] = False      # backtrack
    return total
This pseudocode starts with an empty partial permutation and branches on available columns at each row, yielding the full sum upon reaching depth n.[11]
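For cross-checking small instances, the definition can also be evaluated directly with Python's standard library; the following is a minimal sketch of such a brute-force evaluator (the function name is illustrative):

from itertools import permutations
from math import prod

def permanent_direct(A):
    # Sum prod_i A[i][sigma(i)] over all n! permutations sigma.
    n = len(A)
    return sum(prod(A[i][sigma[i]] for i in range(n))
               for sigma in permutations(range(n)))

# permanent_direct([[1, 2], [3, 4]]) == 1*4 + 2*3 == 10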
Due to the factorial growth in time complexity, the algorithm is practically feasible only for small n \leq 10, where 10! \approx 3.6 \times 10^6 terms can be processed in seconds on modern hardware; beyond this, computation becomes prohibitively slow even on high-performance systems.[11] For instance, an n = 10 case with modest entry values completes rapidly, but n = 12 already involves 12! \approx 4.8 \times 10^8 permutations, often taking minutes or more depending on implementation efficiency.[11]
This direct summation method mirrors the permanent's original definition and traces its computational roots to early 19th-century work by Binet and Cauchy in 1812, who introduced the function in the context of symmetric multivariable expressions.[5]
Ryser formula
The Ryser formula provides an exact method for computing the permanent of an n \times n matrix A = (a_{ij}) using inclusion-exclusion over subsets S of the column index set \{1, \dots, n\}. Introduced by Herbert J. Ryser in 1963, this approach significantly improves upon naive permutation summation by reducing the computational burden from factorial to exponential time, making it the foundational exact algorithm for general matrices.[12][13]
The derivation applies the principle of inclusion-exclusion to selections of one entry per row from a restricted set of columns. For each subset S \subseteq \{1, \dots, n\}, define the row sum R_i(S) = \sum_{j \in S} a_{ij} for row i; the product \prod_{i=1}^n R_i(S) then counts all ways to pick one entry per row using only columns in S, including selections that repeat columns. The alternating sum over subsets cancels every selection that misses at least one column, leaving exactly the contributions of permutations, which use each column once.[14][15]
The explicit formula is
\operatorname{per}(A) = \sum_{S \subseteq \{1, \dots, n\}} (-1)^{n - |S|} \prod_{i=1}^n R_i(S),
where the empty subset contributes the term (-1)^n \prod_{i=1}^n 0 = 0 if n > 0. Equivalently, it can be written as
\operatorname{per}(A) = (-1)^n \sum_{S \subseteq \{1, \dots, n\}} (-1)^{|S|} \prod_{i=1}^n \sum_{j \in S} a_{ij}.
[13][14]
A naive implementation iterates over all 2^n subsets, computing each of the n row sums in O(n) time and the product in O(n) time, yielding O(2^n n^2) overall time complexity. An optimization uses Gray-code ordering of subsets, in which consecutive subsets differ by exactly one element, allowing each row sum to be updated in O(1) time per subset after the initial computation and reducing the complexity to O(2^n n). This traversal minimizes redundant calculations by flipping the inclusion of one column at a time, akin to efficient enumeration in combinatorial search.[13][15]
In practice, a Gray-code implementation generates subsets in binary-reflected order (e.g., via bitwise XOR for transitions) and maintains an array of current row sums, updating each row sum only for the affected column when it is added or removed. For illustration, consider a 3 \times 3 matrix A with entries a_{ij}. The subsets in binary-reflected Gray-code order are \emptyset, \{1\}, \{1,2\}, \{2\}, \{2,3\}, \{1,2,3\}, \{1,3\}, \{3\}. Starting with all row sums zero, adding column 1 gives R_1 = a_{11}, R_2 = a_{21}, R_3 = a_{31}, contributing the signed product (-1)^{3-1} a_{11} a_{21} a_{31}. The transition to \{1,2\} adds column 2, updating each row sum by a_{i2}, and so on, accumulating the signed products to yield \operatorname{per}(A). Only three row-sum updates (one per row) occur per transition for n = 3, scaling efficiently.[16][15]
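As a concrete illustration, here is a minimal Python sketch of the Gray-code variant (the function name and layout are illustrative, not a reference implementation); columns are 0-indexed:

def ryser_permanent(A):
    # per(A) = (-1)^n * sum over nonempty S of (-1)^{|S|} * prod_i R_i(S);
    # subsets are visited in Gray-code order so each step flips one column.
    n = len(A)
    row_sums = [0] * n                      # R_i(S) for the current subset S
    total, gray_prev = 0, 0
    for k in range(1, 1 << n):
        gray = k ^ (k >> 1)                 # consecutive Gray codes differ in one bit
        bit = (gray ^ gray_prev).bit_length() - 1
        step = 1 if gray >> bit & 1 else -1 # column `bit` added or removed
        for i in range(n):
            row_sums[i] += step * A[i][bit]
        prod = 1
        for s in row_sums:
            prod *= s
        total += (-1) ** bin(gray).count("1") * prod
        gray_prev = gray
    return (-1) ** n * total

# ryser_permanent([[1, 2], [3, 4]]) == 10, matching the naive evaluation.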
On modern hardware, including GPUs and supercomputers, the optimized Ryser formula remains practical for matrices up to approximately n \approx 50--56, with computations feasible in hours to days depending on matrix density and precision requirements; larger n may require weeks or more even on extensive resources.[17]
Balasubramanian–Bax–Franklin–Glynn formula
The Balasubramanian–Bax–Franklin–Glynn (BBFG) formula provides an exact method for computing the permanent of an n \times n matrix A = (a_{ij}) over fields of characteristic not equal to 2, given by
\operatorname{per}(A) = 2^{-n} \sum_{S \subseteq \{1, \dots, n\}} (-1)^{|S|} \prod_{i=1}^n \left( r_i - 2 \sum_{j \in S} a_{ij} \right),
where r_i = \sum_{j=1}^n a_{ij} is the i-th row sum.[18] This expression leverages inclusion-exclusion principles similar to earlier methods but reformulates the summation using row-sum differences, which can be implemented with 2^{n-1} terms by pairing each subset S with its complement to exploit symmetry.[18]
The derivation builds directly on Ryser's formula by re-expressing each row's contribution as the difference between the full row sum and twice the subset sum over S, which transforms the alternating signed products into a form amenable to vectorized evaluation of outer products between row vectors and indicator vectors for subsets.[18] This approach maintains the exponential scaling but facilitates parallel computation on modern processors by precomputing full row sums and updating subset sums incrementally.[18]
With a time complexity of O(2^n n), the formula matches Ryser's asymptotic cost but can be up to twice as fast in practice for optimized implementations, owing to the effective reduction to 2^{n-1} terms via complement pairing, which halves the number of distinct products needed while preserving accuracy.[18] For example, in numerical experiments on dense matrices up to n = 20, vectorized versions outperform standard Ryser by factors of 1.5–2 due to fewer floating-point operations per iteration.
In implementation, bitwise operations (such as XOR for complement generation) are used to traverse subsets efficiently, often in Gray code order to minimize sum updates between consecutive terms.[18] For floating-point arithmetic, numerical stability is a concern, as large positive and negative terms can cause catastrophic cancellation; mitigation strategies include scaling rows to unit sums or using higher-precision libraries like MPFR.
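Under the same conventions as the Ryser sketch above, a hedged Python sketch of the BBFG/Glynn evaluation follows, pairing each sign vector with its negation by pinning the first sign to +1 (names are illustrative):

from math import prod

def glynn_permanent(A):
    # per(A) = 2^{1-n} * sum over delta in {+1,-1}^n with delta[0] = +1
    # of (prod_k delta[k]) * prod_i (sum_j delta[j] * A[i][j]).
    n = len(A)
    sums = [sum(row) for row in A]   # inner sums with all delta = +1
    sign, total = 1, prod(sums)
    gray_prev = 0
    for k in range(1, 1 << (n - 1)):
        gray = k ^ (k >> 1)
        bit = (gray ^ gray_prev).bit_length() - 1
        j = bit + 1                  # delta[0] stays +1; flip delta[j]
        new_delta = -1 if gray >> bit & 1 else 1
        for i in range(n):
            sums[i] += 2 * new_delta * A[i][j]
        sign = -sign                 # one delta entry changed sign
        total += sign * prod(sums)
        gray_prev = gray
    return total // (1 << (n - 1))   # exact for integer entries; use / for floats

# glynn_permanent([[1, 2], [3, 4]]) == 10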
The formula traces its origins to a series of contributions starting with Balasubramanian's 1980 PhD thesis on combinatorial matrix methods, refined by Bax and Franklin's 1997 finite-difference sieve for sparse cases, and finalized by Glynn in 2010 through invariant theory over fields.[19][18] Its development saw further generalizations around 2017 for quantum optics applications, particularly in efficiently simulating Boson Sampling probabilities via repeated row permanents.
Special Cases for Exact Computation
Planar bipartite graphs and K_{3,3}-free graphs
For planar bipartite graphs, the permanent of the biadjacency matrix can be computed exactly in polynomial time using Kasteleyn's method, which expresses the number of perfect matchings as the absolute value of the Pfaffian of a suitably oriented skew-symmetric matrix derived from the graph's adjacency matrix.[20] Specifically, a Pfaffian orientation assigns signs to the entries of the skew-symmetric adjacency matrix \tilde{A} so that \operatorname{per}(A) = |\operatorname{Pf}(\tilde{A})| = |\det(\tilde{A})|^{1/2}, where A is the biadjacency matrix.[21] Such an orientation exists for every planar graph and can be constructed in linear time by ensuring that each bounded face has an odd number of edges oriented clockwise. The Pfaffian is then computed via Gaussian elimination on the skew-symmetric matrix, yielding a running time of O(n^3) for an n \times n matrix, or O(n^\omega) using fast matrix multiplication, where \omega \leq 2.371339.[22]
The Fisher–Kasteleyn–Temperley (FKT) algorithm provides a practical implementation of this approach tailored to counting perfect matchings in planar bipartite graphs, directly yielding the permanent for 0-1 biadjacency matrices. It involves constructing the Pfaffian orientation, forming the signed matrix, and evaluating its determinant, whose square root gives the exact count of perfect matchings, equivalent to the permanent. This method is particularly effective for lattice structures, such as grid graphs, where the permanent enumerates dimer coverings (tilings by 1 \times 2 dominoes) and admits closed-form expressions in some cases, such as the m \times n grid.[23]
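As a small worked check (a sketch with a hand-built Kasteleyn orientation, not a general FKT implementation), the 4-cycle C_4 has exactly two perfect matchings:

import numpy as np

# Skew-symmetric adjacency matrix of C_4 (vertices 1-2-3-4-1) with the
# orientation 1->2, 2->3, 3->4, 1->4, so the single bounded face has an
# odd number (three) of clockwise edges, as Kasteleyn's condition requires.
K = np.array([
    [ 0,  1,  0,  1],
    [-1,  0,  1,  0],
    [ 0, -1,  0,  1],
    [-1,  0, -1,  0],
], dtype=float)

matchings = round(abs(np.linalg.det(K)) ** 0.5)
print(matchings)  # 2, agreeing with per([[1, 1], [1, 1]]) = 2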
This technique extends beyond strictly planar graphs to K_{3,3}-minor-free graphs, which admit Pfaffian orientations, allowing the permanent (or perfect matching count) to be computed in polynomial time via similar determinant evaluations. Little proved that every K_{3,3}-minor-free graph is Pfaffian orientable, generalizing Kasteleyn's result, and Vazirani developed an NC algorithm for counting perfect matchings in this class by decomposing the graph and applying oriented determinant computations.[24] For bipartite inputs this subsumes the planar bipartite case, and the extension applies to non-bipartite K_{3,3}-minor-free graphs as well, using the full Pfaffian framework.
Representative examples include even cycles, where the permanent is 2 (two perfect matchings), and rectangular grid graphs, which model statistical mechanics problems like the dimer model on a lattice. These cases highlight the method's utility in combinatorial enumeration, with the permanent directly counting the number of ways to pair vertices without shared edges.
A key limitation is the need to verify the graph's planarity or K_{3,3}-minor-freeness, which can be done in linear time using algorithms like Hopcroft-Tarjan for planarity testing. However, constructing the Pfaffian orientation for non-planar K_{3,3}-minor-free graphs may require more involved decomposition steps, though still polynomial-time overall.
Computation modulo a prime or integer
Computing the permanent of an n \times n matrix modulo a prime p or a general integer m leverages number-theoretic tools to reduce the problem to computations over finite fields or prime power rings. For a composite modulus m, the Chinese Remainder Theorem (CRT) allows decomposition into computations modulo the prime power factors p_i^{k_i} of m, followed by recombination to obtain the result modulo m. This approach is efficient when the factors are small, as the overall time depends on the individual prime power computations.
For prime powers p^k, specialized algorithms outperform the general exponential-time methods. In particular, for p = 2, the permanent modulo 2^k can be computed in time O(n^{k + O(1)}), improving upon the standard O(2^n n^2) Ryser formula, which can be adapted directly for modular arithmetic. A randomized variant based on Ryser's formula with tabulation achieves expected time 2^n / \exp(\Omega(n^{1/3}/(2 \ln n))) for 0-1 matrices.[25] For general odd primes p, similar techniques yield time 2^n / \exp(\Omega(\gamma^2 n / p \log p)) for fixed \gamma > 0.[26]
In characteristic 2, the permanent modulo 2 coincides with the determinant modulo 2, computable in O(n^3) time via Gaussian elimination over \mathbb{F}_2. For characteristic 3, polynomial-time algorithms exist for specific matrix classes. A matrix over \mathbb{F}_3 is k-semi-unitary if its Gram matrix differs from the identity by rank at most k; the permanent of a 1-semi-unitary matrix can be computed in O(n^4) time using linear algebra techniques. More generally, the permanent over fields of characteristic 3 is polynomial-time computable for unitary (0-semi-unitary) and 1-semi-unitary matrices, but becomes hard, complete for the corresponding modular counting class \mathrm{Mod}_3\mathrm{P}, for k-semi-unitary matrices with k \geq 2.[27]
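A minimal sketch of the characteristic-2 shortcut, computing \operatorname{per}(A) \bmod 2 as \det(A) \bmod 2 by Gaussian elimination over \mathbb{F}_2 with each row packed into a Python integer (names are illustrative):

def permanent_mod2(A):
    # Over GF(2) the permanent equals the determinant, which is 1 iff
    # the rows are linearly independent mod 2.
    n = len(A)
    rows = [sum((A[i][j] & 1) << j for j in range(n)) for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r] >> col & 1), None)
        if pivot is None:
            return 0                    # rows dependent: det = 0 mod 2
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            if rows[r] >> col & 1:
                rows[r] ^= rows[col]    # XOR is row addition mod 2
    return 1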
Adapting the Ryser formula to arithmetic modulo m yields time O(2^n n^2) naively, or O(2^n n) with Gray-code updates, and optimizations using fast arithmetic can reduce it to O(2^n n / \log m) when m is smooth. Hardness results from the 1990s, building on Valiant's reductions, established that computing the permanent modulo a fixed prime p > 2 is complete for the corresponding modular counting class \mathrm{Mod}_p\mathrm{P}, highlighting the hardness while motivating modular techniques for practical computation.[28]
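The following sketch combines the two ideas: the Gray-code Ryser evaluation from earlier is rerun with arithmetic reduced modulo each pairwise-coprime factor, and the residues are recombined by CRT (helper names are illustrative):

def permanent_mod(A, m):
    # Gray-code Ryser with every operation reduced modulo m.
    n = len(A)
    row_sums, total, gray_prev = [0] * n, 0, 0
    for k in range(1, 1 << n):
        gray = k ^ (k >> 1)
        bit = (gray ^ gray_prev).bit_length() - 1
        step = 1 if gray >> bit & 1 else -1
        for i in range(n):
            row_sums[i] = (row_sums[i] + step * A[i][bit]) % m
        prod = 1
        for s in row_sums:
            prod = prod * s % m
        total = (total + (-1) ** bin(gray).count("1") * prod) % m
        gray_prev = gray
    return (-1) ** n * total % m

def permanent_mod_crt(A, factors):
    # factors: pairwise-coprime moduli (e.g., the prime-power factors of m).
    from math import prod
    m = prod(factors)
    x = 0
    for mi in factors:
        Mi = m // mi
        x += permanent_mod(A, mi) * Mi * pow(Mi, -1, mi)  # CRT recombination
    return x % m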
Approximation and Randomized Methods
Fully polynomial randomized approximation scheme (FPRAS)
A fully polynomial randomized approximation scheme (FPRAS) for computing the permanent of an n \times n matrix with nonnegative entries was introduced by Jerrum, Sinclair, and Vigoda in 2004, providing a multiplicative (1 + \epsilon)-approximation with probability at least 1 - \delta in polynomial time. The algorithm relies on a Markov chain Monte Carlo (MCMC) method that samples from a distribution over weighted matchings in the associated bipartite graph, where the permanent corresponds to the sum of weights over all perfect matchings. This approach leverages the combinatorial interpretation of the permanent as counting weighted perfect matchings, enabling estimation through repeated sampling rather than exact enumeration. The method requires the matrix entries to be nonnegative to ensure the weights are valid probabilities in the MCMC transitions, and it outputs a multiplicative approximation rather than an additive one, scaling with the permanent's magnitude.
The core of the algorithm involves constructing a Markov chain on the space of matchings (including imperfect ones with "holes") of the bipartite graph, with a stationary distribution that approximates uniformity over perfect matchings. It starts with an empty matching and proposes moves to add, remove, or shift edges using the Metropolis-Hastings rule, where transition probabilities are determined by the edge weights from the matrix entries and auxiliary "hole weights" to maintain detailed balance. These hole weights are computed iteratively to adjust the chain's bias toward perfect matchings. The chain is run for a number of steps sufficient to reach approximate stationarity, typically O(n^5 / \epsilon^2) steps per sample, with multiple independent samples used to estimate the partition function via a telescoping product that relates the permanent to the expected weight of sampled matchings. The overall time complexity of the original algorithm, with improved analysis, is O(n^7 \operatorname{poly}(\log n, 1/\epsilon, \log(1/\delta))).[29]
Subsequent variants have improved the runtime for specific cases, such as very dense matrices where most entries are positive, achieving O(n^4 \operatorname{poly}(\log n / \epsilon)) time by refining the sampling procedure and bounding techniques.[30] For instance, in approximating the permanent of the adjacency matrix of a random bipartite graph with edge probability p = \Theta(1), the FPRAS efficiently estimates the expected number of perfect matchings by tuning \epsilon and \delta appropriately, demonstrating practical utility despite the #P-hardness of exact computation. This randomized scheme thus provides a viable tool for applications in statistical mechanics, network reliability, and combinatorial optimization where exact values are intractable.
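The JSV chain itself is intricate; for flavor, the following is a sketch of a far simpler unbiased estimator, due to Rasmussen, for 0-1 matrices, which greedily matches each row to a random available column. It carries none of the FPRAS guarantees (its variance can be exponential in n), but it illustrates the sampling-based approach; names are illustrative:

import random

def rasmussen_sample(A):
    # One unbiased sample: E[value] = per(A) for a 0-1 matrix A.
    n = len(A)
    free_cols = set(range(n))
    value = 1
    for i in range(n):
        choices = [j for j in free_cols if A[i][j]]
        if not choices:
            return 0
        value *= len(choices)                 # importance weight for row i
        free_cols.remove(random.choice(choices))
    return value

def rasmussen_estimate(A, samples=10_000):
    # Average independent samples; accuracy degrades as variance grows.
    return sum(rasmussen_sample(A) for _ in range(samples)) / samples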
Quantum-inspired approximation algorithms
Quantum-inspired approximation algorithms draw from the computational challenges posed by boson sampling, a model of quantum computation where the probability of detecting photons in specific output modes after passing through a linear optical network is given by the square of the permanent of a submatrix of the unitary matrix describing the network. Classical simulations of boson sampling often rely on optimized variants of the Ryser formula to estimate these permanents, particularly for unitary matrices, enabling verification of quantum experiments by computing collision probabilities.[31]
A key quantum-inspired classical algorithm for approximating the permanent of Hermitian positive semidefinite (PSD) matrices was introduced in 2017 by Chakhmakhchyan, Cerf, and Garcia-Patrón. This method exploits the analogy to boson sampling with thermal input states, where the permanent corresponds to the expected value of a random variable derived from photon-counting statistics in a linear optical interferometer. The algorithm estimates the permanent via Monte Carlo sampling, achieving an additive error of \pm \epsilon (\lambda_{\max})^n with high probability, where \lambda_{\max} is the largest eigenvalue and n is the matrix dimension, and runs in polynomial time O(n^2 (n + 1/\epsilon^2)) for PSD matrices satisfying specific spectral conditions, such as bounded eigenvalues ensuring the product of normalized eigenvalues remains above a constant threshold.[32]
Practical classical emulations of related quantum sampling tasks, including those involving permanent computations, have leveraged tensor network contractions to achieve approximations with controlled error bounds, particularly for structured inputs like those in Gaussian boson sampling. These tensor network approaches, such as matrix product states, enable efficient simulation by decomposing the probability distribution into low-entanglement representations, with error scaling exponentially in the bond dimension but polynomial in system size for sparse or low-depth networks.
Recent advances as of 2025 include improved approximation algorithms for permanents of PSD matrices, such as the work by Oveis Gharan providing new bounds on approximability within constant factors in polynomial time.[33] Such algorithms find applications in quantum optics simulations, where estimating permanents of PSD or unitary matrices computes transition amplitudes for multi-photon states in interferometric setups, aiding in the design and validation of photonic quantum devices. These methods are most effective for low-rank or spectrally structured PSD matrices, where the approximation quality improves due to reduced effective dimensionality, but they do not extend efficiently to general non-negative matrices without additional assumptions.[32][31]
Computational Complexity
#P-completeness
The class #P, introduced by Valiant, consists of the counting functions associated with decision problems in NP; formally, it includes all functions f: \{0,1\}^* \to \mathbb{N} such that there exists a nondeterministic polynomial-time Turing machine whose number of accepting paths on input x is exactly f(x). The permanent of an n \times n nonnegative matrix A belongs to #P because \mathrm{per}(A) equals the number of perfect matchings in the bipartite graph with adjacency matrix A, and counting perfect matchings reduces to counting accepting paths in a suitable nondeterministic machine that verifies permutations.
In a seminal result, Valiant proved that computing the permanent of a 0-1 matrix is #P-complete, establishing its fundamental hardness. His proof proceeds in two main steps: first, a reduction from #SAT (counting satisfying assignments of a Boolean formula) to the permanent of integer matrices, in which the permanent of the constructed matrix equals the number of satisfying assignments multiplied by an easily computed factor; second, a further reduction to 0-1 permanents via modular arithmetic and gadget constructions that recover the count. Similar count-preserving gadget techniques recur in reductions between related counting problems throughout counting complexity.
This #P-completeness implies that no polynomial-time algorithm exists for exact computation of the permanent unless \mathrm{P} = \mathrm{NP}, as solving a #P-complete problem would allow efficient solution of all #P functions, including those underlying NP decisions. In stark contrast, the determinant of a matrix over finite fields (or integers via modular computation) lies in P and can be evaluated in O(n^3) time using Gaussian elimination, highlighting the algebraic intractability introduced by the permanent's lack of sign cancellations.
The hardness of the permanent extends to broader complexity hierarchies via Toda's theorem, which shows that \mathrm{PH} \subseteq \mathrm{P}^{\#P}, meaning every problem in the polynomial hierarchy is polynomial-time reducible to a #P oracle; thus, #P-completeness of the permanent positions it as a complete problem for accessing the full power needed to resolve PH.
Inapproximability results
The computation of the permanent exhibits strong inapproximability properties, particularly for positive-semidefinite matrices. A seminal result establishes that there is no polynomial-time algorithm to approximate the permanent of an n \times n positive-semidefinite matrix within a multiplicative factor of 2^{n^{1-\epsilon}} for any constant \epsilon > 0, unless \mathsf{P} = \mathsf{NP}.[34] This hardness stems from the #P-completeness of the problem and the self-reducibility of the permanent, which allows reductions that preserve the approximation gap. For nonnegative matrices, an FPRAS exists, but the case for matrices with possible negative entries, such as positive-semidefinite matrices, remains intractable under this assumption.
Building on #P-hardness, gap amplification techniques further demonstrate that distinguishing permanents differing by a multiplicative factor of e^{n / \log n} is computationally hard.[35] These results imply that even modest approximation guarantees are infeasible without resolving major open questions in complexity theory, such as the relationship between NP and randomized polynomial time.
Recent advancements have extended inapproximability to structured matrices. For positive-semidefinite (PSD) matrices, it is NP-hard to approximate the permanent within any constant factor, implying no polynomial-time approximation scheme (PTAS) exists unless \mathsf{P} = \mathsf{NP}.[34] Moreover, assuming the exponential time hypothesis (ETH), no PTAS exists even under subexponential time bounds.
Similar hardness holds for variants like the hafnian, the undirected analog of the permanent defined on symmetric matrices. These results underscore the broader challenges in approximating permanental functions across matrix classes.
Despite these barriers, tight bounds on general approximation ratios remain an open question, with ongoing research seeking to close the gap between known algorithms and hardness thresholds.[4]