L0

The L0 norm, often denoted \|\mathbf{x}\|_0 for a vector \mathbf{x} \in \mathbb{R}^n, is a pseudo-norm in mathematics that quantifies the sparsity of \mathbf{x} by counting the number of its nonzero entries, formally defined as \|\mathbf{x}\|_0 = \sum_{i=1}^n \mathbf{1}_{\{x_i \neq 0\}}, where \mathbf{1} is the indicator function. Unlike true norms, it fails to satisfy absolute homogeneity, and it is a non-convex, discontinuous function that is challenging to optimize directly. This measure is foundational in sparsity-promoting techniques, distinguishing it from p-norms for p > 0, which instead penalize the magnitudes of nonzero components.

In optimization and signal processing, the L0 norm plays a central role in sparse recovery problems, where the goal is to minimize \|\mathbf{x}\|_0 subject to linear constraints such as \mathbf{y} = \mathbf{Ax}, seeking the sparsest solution that explains the observed data \mathbf{y}. This formulation underpins compressive sensing, a paradigm introduced in the mid-2000s that enables efficient acquisition and reconstruction of sparse signals from far fewer measurements than traditional sampling theorems require, with exact recovery guaranteed under conditions such as the restricted isometry property when tractable proxies for the L0 objective are used. However, direct L0 minimization is NP-hard, prompting the development of convex relaxations, most notably the L1 norm (basis pursuit), which provably recover the L0 minimizer for sufficiently incoherent measurement matrices.

Beyond compressive sensing, the L0 norm is pivotal in machine learning for inducing sparsity in high-dimensional models, such as feature selection in regression or pruning of neural networks, where minimizing \|\mathbf{w}\|_0 over parameters \mathbf{w} promotes compact, interpretable architectures without sacrificing performance. Recent advances include differentiable approximations and variational methods that enable gradient-based optimization of L0-regularized losses, facilitating sparse neural networks that reduce computational overhead in resource-constrained settings like edge devices. Applications span diverse domains, including portfolio optimization for selecting minimal asset sets, geophysical inversion for sparse density models, and federated learning for privacy-preserving sparse representations. Despite computational hurdles, ongoing research leverages branch-and-bound algorithms and penalty methods to solve exact L0 problems in moderate dimensions, underscoring its enduring value in promoting parsimonious solutions.

Definition and Notation

Vector L0 Norm

The \ell_0 norm of a vector x \in \mathbb{R}^n, denoted \|x\|_0, is defined as the number of non-zero entries in x, formally given by \|x\|_0 = \bigl| \bigl\{ i : x_i \neq 0 \bigr\} \bigr|. This measure, often referred to as the sparsity index or pseudo-norm, quantifies the support size of the vector rather than its magnitude, distinguishing it from traditional norms, which scale with the entry values. For instance, consider the vector x = (0, 2, 0, 3); here, \|x\|_0 = 2 since only the second and fourth components are non-zero. Similarly, for the zero vector x = (0, 0), \|x\|_0 = 0. These examples illustrate how the \ell_0 norm acts as a counting function, ignoring the actual values of non-zero elements and focusing solely on their presence. Alternative notations for the \ell_0 norm include \operatorname{card}(x), which denotes the cardinality of the support of x, i.e., the size of the set where x_i \neq 0. This terminology emphasizes its combinatorial nature. The concept of the \ell_0 norm emerged in optimization contexts during the 1970s and 1980s, particularly in efforts to identify sparse solutions in signal processing and related fields. Geophysicists applied sparsity-promoting techniques, including precursors to \ell_0 minimization, to problems such as seismic deconvolution.
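The counting definition is straightforward to implement. The sketch below is a minimal NumPy illustration (the helper name l0_norm and the optional tolerance argument are not part of the formal definition; the tolerance is merely convenient for floating-point data) that reproduces the examples above.

```python
import numpy as np

def l0_norm(x, tol=0.0):
    """Count the nonzero entries of a vector; entries with magnitude
    at most tol are treated as zero (useful for floating-point data)."""
    x = np.asarray(x, dtype=float)
    return int(np.sum(np.abs(x) > tol))

print(l0_norm([0, 2, 0, 3]))          # 2: only two entries are nonzero
print(l0_norm([0, 0]))                # 0: the zero vector
print(l0_norm([1e-12, 5], tol=1e-9))  # 1: tiny entry treated as zero
```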

Matrix and General L0 Norm

The L0 norm for a matrix A \in \mathbb{R}^{m \times n} is defined as the total number of nonzero entries in the matrix, counting each individual element across all rows and columns. This extends the vector case by treating the matrix as a flattened collection of entries, so that \|A\|_0 = \sum_{i=1}^m \sum_{j=1}^n \mathbf{1}_{\{A_{ij} \neq 0\}}, with \mathbf{1} denoting the indicator function. Alternative definitions of the matrix L0 norm appear in contexts emphasizing structural sparsity, such as the cardinality of the support set, which aligns with the nonzero entry count but focuses on the positions of the nonzero elements. In low-rank matrix approximation problems, a related formulation expresses the rank of A as the \ell_0 norm of its vector of singular values \sigma(A), i.e., \operatorname{rank}(A) = \|\sigma(A)\|_0, the number of nonzero singular values in the singular value decomposition of A. This rank-based interpretation promotes low-dimensional structure rather than element-wise sparsity. For a concrete illustration, consider the matrix A = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}; here, \|A\|_0 = 2, as there are two nonzero entries. The L0 norm also generalizes to function spaces: for a measurable function f on a domain \Omega equipped with a measure \mu, \|f\|_0 is defined as the measure of the set \{ x \in \Omega : f(x) \neq 0 \}, i.e., \|f\|_0 = \mu(\operatorname{supp}(f)). This captures the "size" of the region where the function is active, often using the Lebesgue measure in continuous settings to quantify sparsity in infinite-dimensional spaces.
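Both the entrywise count and the rank-as-\ell_0-of-singular-values view can be checked numerically. The minimal NumPy sketch below uses the 2x2 example above (the 1e-10 tolerance on singular values is an illustrative choice, not part of the definition).

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 2.0]])

# Entrywise L0 "norm": total number of nonzero entries.
entrywise_l0 = int(np.count_nonzero(A))             # 2

# Rank expressed as the L0 norm of the singular-value vector.
sigma = np.linalg.svd(A, compute_uv=False)
rank_as_l0 = int(np.sum(sigma > 1e-10))             # 2

print(entrywise_l0, rank_as_l0, np.linalg.matrix_rank(A))  # 2 2 2
```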

Properties

Basic Characteristics

The \ell_0 norm of a vector \mathbf{x} \in \mathbb{R}^n, denoted \|\mathbf{x}\|_0, counts the number of nonzero entries in \mathbf{x}. It satisfies non-negativity, meaning \|\mathbf{x}\|_0 \geq 0 for all \mathbf{x}, with equality if and only if \mathbf{x} = \mathbf{0}. The \ell_0 norm exhibits homogeneity of degree zero: for any scalar c \neq 0, \|c \mathbf{x}\|_0 = \|\mathbf{x}\|_0. However, it fails the standard norm homogeneity property \|c \mathbf{x}\|_0 = |c| \|\mathbf{x}\|_0, which must hold for all scalars c, since the left side remains \|\mathbf{x}\|_0 for c \neq 0 while the right side scales by |c| \neq 1 in general. For example, take \mathbf{x} = (0, 1)^T; then \|3\mathbf{x}\|_0 = \|(0, 3)^T\|_0 = 1 = \|\mathbf{x}\|_0, but 3 \|\mathbf{x}\|_0 = 3. Note that the property holds trivially for c = 0, as both sides equal zero. The \ell_0 norm does satisfy the triangle inequality \|\mathbf{x} + \mathbf{y}\|_0 \leq \|\mathbf{x}\|_0 + \|\mathbf{y}\|_0, since the support of \mathbf{x} + \mathbf{y} is contained in the union of the supports of \mathbf{x} and \mathbf{y}. Strict inequality occurs when overlapping supports cancel: let \mathbf{x} = (1, 0)^T and \mathbf{y} = (-1, 0)^T; then \mathbf{x} + \mathbf{y} = \mathbf{0}, so \|\mathbf{x} + \mathbf{y}\|_0 = 0 < 2 = \|\mathbf{x}\|_0 + \|\mathbf{y}\|_0. Equality holds for disjoint supports, such as \mathbf{x} = (1, 0)^T and \mathbf{y} = (0, 1)^T, where \|\mathbf{x} + \mathbf{y}\|_0 = 2 = \|\mathbf{x}\|_0 + \|\mathbf{y}\|_0. The \ell_0 norm is discontinuous at every point of \mathbb{R}^n that has at least one zero entry, including the origin, and is locally constant (hence continuous) at points with no zero entries. The discontinuities arise because small perturbations can alter the nonzero count: at any \mathbf{x} with at least one zero entry, an arbitrarily small change can make that entry nonzero, increasing the norm discontinuously. Because of the failure of homogeneity, the \ell_0 "norm" is not a true norm, although it is often loosely termed a pseudo-norm or quasi-norm in the literature.
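These properties are easy to verify numerically; the short sketch below (a minimal illustration, with the helper l0 not taken from the source) checks scale invariance, the failure of |c|-homogeneity, and the triangle inequality with and without cancellation.

```python
import numpy as np

def l0(v):
    return int(np.count_nonzero(v))

x = np.array([0.0, 1.0])
print(l0(3 * x), l0(x), 3 * l0(x))   # 1 1 3: scale-invariant, not |c|-homogeneous

a, b = np.array([1.0, 0.0]), np.array([-1.0, 0.0])
print(l0(a + b), l0(a) + l0(b))      # 0 2: triangle inequality holds strictly (cancellation)

c, d = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(l0(c + d), l0(c) + l0(d))      # 2 2: equality for disjoint supports
```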

Comparison to Other Lp Norms

The \ell_p norms, for 1 \leq p < \infty, are defined as \|x\|_p = \left( \sum_{i=1}^n |x_i|^p \right)^{1/p}, while the \ell_\infty norm is \|x\|_\infty = \max_i |x_i|. These norms provide measures of vector magnitude based on powered sums or maxima, forming a convex family of functions that satisfy the properties of norms. In contrast, the \ell_0 norm counts the number of nonzero entries in x, \|x\|_0 = \sum_{i=1}^n \mathbf{1}_{\{x_i \neq 0\}}, and does not satisfy absolute homogeneity, rendering it a pseudonorm rather than a true norm. A fundamental distinction lies in convexity: the \ell_p norms for p \geq 1 are convex, enabling efficient optimization via convex programming, whereas the \ell_0 norm is nonconvex, leading to combinatorial challenges in minimization. The \ell_1 norm specifically serves as a convex relaxation of the \ell_0 norm, promoting sparsity by penalizing the magnitudes of components while approximating the ideal count of nonzeros; this substitution transforms intractable \ell_0-constrained problems into solvable linear programs, with equivalence guaranteed under conditions such as the restricted isometry property. For example, in sparse regression methods such as the lasso, the \ell_1 penalty replaces \ell_0 to induce sparsity without exact equivalence, since \ell_1 minimization recovers the sparsest solution only when the signal support satisfies incoherence bounds. The \ell_2 norm, equivalent to the Euclidean norm, quantifies vector energy as \|x\|_2 = \sqrt{\sum_{i=1}^n x_i^2} and exhibits rotation invariance, preserving the norm under orthogonal transformations (\|Ux\|_2 = \|x\|_2 for orthogonal U). The \ell_0 norm lacks this invariance, as rotations can redistribute energy across components, potentially increasing or decreasing the number of nonzeros depending on the basis. Unlike \ell_2, which penalizes all components in proportion to their squared magnitude, \ell_0 strictly enforces the cardinality of the support, ignoring magnitudes entirely. Theoretically, the \ell_0 norm connects to the \ell_p family through the limit \lim_{p \to 0^+} \|x\|_p^p = \|x\|_0, since |x_i|^p \to 1 for x_i \neq 0 and |x_i|^p = 0 otherwise, linking sparsity counting to infinitesimal powering. This relation underscores why \ell_p approximations for small p < 1 can closely mimic \ell_0 behavior in sparsity induction, though they remain nonconvex.
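The limiting relation \|x\|_p^p \to \|x\|_0 as p \to 0^+ can be observed numerically, as in the following small sketch (illustrative only; the convention 0^p = 0 is applied explicitly).

```python
import numpy as np

x = np.array([0.0, 2.0, 0.0, 3.0])   # ||x||_0 = 2

def lp_to_the_p(x, p):
    """Return ||x||_p^p = sum_i |x_i|^p, with 0^p taken as 0."""
    ax = np.abs(x)
    return float(np.sum(np.where(ax > 0, ax ** p, 0.0)))

for p in [1.0, 0.5, 0.1, 0.01]:
    print(p, lp_to_the_p(x, p))   # values approach 2 as p -> 0+
```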

Applications

Sparsity Promotion in Optimization

In optimization problems, the L0 norm promotes sparsity by seeking solutions with the minimal number of non-zero elements, particularly in high-dimensional settings where the underlying structure is expected to be sparse. A standard formulation constrains the sparsity directly, \min f(x) subject to \|x\|_0 \leq k; an alternative regularized form is \min f(x) + \lambda \|x\|_0, where f(x) is a smooth objective function capturing data fidelity or loss, and \lambda > 0 is a tuning parameter that balances sparsity against fit to the data. The L0 norm plays a crucial role in feature selection within machine learning and statistics, encouraging models to rely on a small subset of relevant variables, which reduces overfitting, enhances interpretability, and lowers computational demands. By penalizing the cardinality of the coefficient vector, the L0 approach identifies informative predictors without assuming any grouping structure among them. In linear regression, the problem is \min_\beta \|y - X\beta\|_2^2 + \lambda \|\beta\|_0, driving many coefficients to zero for automated variable selection. L0 minimization is NP-hard, necessitating approximations such as greedy algorithms. Orthogonal matching pursuit (OMP), sketched below, iteratively selects the feature most correlated with the current residual and re-fits by projecting onto the selected columns. The role of the L0 norm in sparsity gained prominence in the 2000s through the compressed sensing work of Candès and Tao and of Donoho. Recent applications include proton therapy treatment planning, where L0-norm minimization selects sparse sets of energy layers and spots to reduce treatment delivery time while preserving dose quality.
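A minimal OMP sketch follows (illustrative only; the function name omp and the random test problem are assumptions, not taken from the source).

```python
import numpy as np

def omp(A, y, k, tol=1e-10):
    """Greedy approximation to min ||x||_0 s.t. Ax = y:
    select at most k columns of A, re-fitting by least squares at each step."""
    n = A.shape[1]
    residual = y.astype(float).copy()
    support, x = [], np.zeros(n)
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))   # column most correlated with residual
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        x[:] = 0.0
        x[support] = coef                            # project y onto the selected columns
        residual = y - A @ x
        if np.linalg.norm(residual) < tol:
            break
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
x_true = np.zeros(50); x_true[[3, 17]] = [1.5, -2.0]
x_hat = omp(A, A @ x_true, k=2)
print(np.flatnonzero(x_hat))   # typically recovers the true support [3, 17]
```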

Compressive Sensing and Signal Processing

Compressive sensing (CS) reconstructs sparse signals from fewer measurements than the ambient dimension, leveraging inherent sparsity. For a k-sparse signal \mathbf{x} \in \mathbb{R}^n with \|\mathbf{x}\|_0 = k \ll n, measurements are \mathbf{y} = \Phi \mathbf{x}, where \Phi \in \mathbb{R}^{m \times n} and m \ll n. The recovery problem is the \ell_0-minimization: \min_{\mathbf{z}} \|\mathbf{z}\|_0 \quad \text{subject to} \quad \mathbf{y} = \Phi \mathbf{z}. This seeks the sparsest vector consistent with the measurements, which coincides with the original signal when the solution is unique. Exact recovery holds if \Phi satisfies the restricted isometry property (RIP) of order 2k with \delta_{2k} < 1, which implies that the spark of \Phi exceeds 2k, i.e., every set of 2k or fewer columns is linearly independent, ensuring that k-sparse signals are the unique \ell_0 solutions. In applications, CS with \ell_0 sparsity is valuable for signals that are sparse in known bases. In magnetic resonance imaging (MRI), images are sparse in the wavelet domain; undersampled k-space data \mathbf{y} = \Phi \Psi \mathbf{s} (with \mathbf{s} sparse and \Psi the wavelet basis) are recovered by minimizing the sparsity of \mathbf{s}, accelerating scans. In radar, sparse targets in spatial or Doppler domains enable compressive radar for efficient detection from limited measurements. Similarly, in ultrasound, L0-norm optimization with Hankel structured low-rank and sparse representation reconstructs compressed plane-wave signals, improving imaging quality and speed. For example, a k-sparse signal can be exactly retrieved by \ell_0 minimization from m \geq 2k measurements whenever every set of 2k columns of \Phi is linearly independent.
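Because the \ell_0 program is combinatorial, it can be solved exactly only by exhaustive search over supports on very small instances; the sketch below (a toy illustration with assumed random data, not a practical recovery algorithm) does exactly that.

```python
import itertools
import numpy as np

def l0_min_bruteforce(Phi, y, tol=1e-9):
    """Search supports of increasing size for the sparsest z with Phi z = y.
    Exponential cost in n; feasible only for tiny problems."""
    m, n = Phi.shape
    for k in range(n + 1):
        for support in itertools.combinations(range(n), k):
            z = np.zeros(n)
            if k > 0:
                coef, *_ = np.linalg.lstsq(Phi[:, list(support)], y, rcond=None)
                z[list(support)] = coef
            if np.linalg.norm(Phi @ z - y) < tol:
                return z   # first feasible z found has minimal ||z||_0
    return None

rng = np.random.default_rng(1)
Phi = rng.standard_normal((4, 8))          # m = 4 measurements, n = 8 unknowns
x = np.zeros(8); x[[1, 5]] = [2.0, -1.0]   # k = 2 sparse, so m >= 2k
z = l0_min_bruteforce(Phi, Phi @ x)
print(np.flatnonzero(z))                   # [1 5]: the original support
```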

Computational Challenges

Approximation Methods

Due to the NP-hard nature of exact L0 norm minimization, practical approximation methods are employed to solve sparse recovery problems efficiently. These techniques replace or surrogate the combinatorial L0 objective with tractable alternatives while preserving its sparsity-promoting properties.

Convex relaxations form a foundational class of approximations, primarily using the L1 norm as a convex surrogate for the L0 norm. Under conditions like the restricted isometry property (RIP) of the measurement matrix, L1 minimization recovers k-sparse signals exactly from m = O(k log(n/k)) measurements, where n is the signal dimension, by solving a linear program whose geometry promotes sparsity. This approach, introduced in the context of error-correcting codes and extended to compressive sensing, enables polynomial-time solutions via interior-point methods or basis pursuit algorithms. To refine the L1 approximation and better align with the L0 "step function," the iteratively reweighted L1 (IRL1) method solves a sequence of weighted L1 problems, where weights are updated as w_i = 1 / (|x_i| + \epsilon) to push small coefficients toward zero. IRL1 converges in few iterations (typically 2–4) and enhances sparsity by iteratively approximating a logarithmic penalty.

Greedy algorithms provide nonconvex, iterative approximations by sequentially building the support of the sparse solution. Matching pursuit, for example, selects the dictionary atom most correlated with the current residual at each step and subtracts its projection to reduce the approximation error, thereby greedily minimizing an approximation error over overcomplete dictionaries. This method decomposes signals into sparse linear expansions without solving global optimizations, though it may suffer from suboptimal selections in coherent dictionaries.

Continuous relaxations substitute the L0 norm with differentiable functions that closely mimic its indicator-like behavior. The log-sum penalty, \sum_i \log(|x_i| + \epsilon), serves as a smooth, nonconvex surrogate for small \epsilon > 0, transitioning from L1-like behavior near zero to a nearly constant value for large entries, thus approximating the L0 norm more closely than the L1 norm alone. Optimization proceeds via majorization-minimization or iteratively reweighted schemes, enabling exact sparse recovery when the measurement matrix satisfies an RIP of order 2k with constant \delta_{2k} < \sqrt{2} - 1, often outperforming L1 for signals with equal-magnitude nonzeros.

Heuristic methods offer simple, low-complexity approximations focused on direct sparsity enforcement. Hard thresholding zeros out entries below a threshold \tau in each iteration, approximating L0 minimization by constraining the support size to s elements. The Iterative Hard Thresholding (IHT) algorithm updates via x^{t+1} = H_s(x^t + A^T (y - A x^t)), where H_s retains the s largest entries, providing a computationally efficient proxy for L0-constrained least squares. IHT converges linearly under RIP conditions of order 3s with \delta_{3s} < 1/8, yielding recovery errors bounded by \|x - \hat{x}\|_2 \leq C \epsilon / \sqrt{s} in noisy compressive sensing with noise level \epsilon.

In compressive sensing simulations, these methods exhibit varying recovery errors depending on signal structure and noise. Nonconvex approximations like IRL1 and log-sum penalties often outperform L1 for certain signal classes, while greedy and thresholding methods provide efficient recovery competitive with convex relaxations.
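A minimal IHT sketch is shown below (illustrative only; the step size 1/\|A\|_2^2, the iteration count, and the random test data are assumptions chosen to keep the projected gradient step stable, not values from the source).

```python
import numpy as np

def iht(A, y, s, iters=500):
    """Iterative Hard Thresholding for min ||y - Ax||_2^2 s.t. ||x||_0 <= s.
    H_s keeps the s largest-magnitude entries of its argument and zeros the rest."""
    n = A.shape[1]
    step = 1.0 / np.linalg.norm(A, 2) ** 2      # conservative step size
    x = np.zeros(n)
    for _ in range(iters):
        g = x + step * (A.T @ (y - A @ x))      # gradient step on the least-squares loss
        keep = np.argsort(np.abs(g))[-s:]       # indices of the s largest entries
        x = np.zeros(n)
        x[keep] = g[keep]                       # hard-thresholding projection H_s
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((40, 100)) / np.sqrt(40)   # Gaussian matrix, RIP-friendly scaling
x_true = np.zeros(100)
x_true[rng.choice(100, 5, replace=False)] = rng.standard_normal(5)
x_hat = iht(A, A @ x_true, s=5)
print(np.linalg.norm(x_hat - x_true))              # small when recovery succeeds
```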

Relation to NP-Hard Problems

The minimization of the \ell_0 norm subject to the linear constraint Ax = b, which seeks the sparsest solution to an underdetermined system, is NP-hard. This result establishes that no polynomial-time algorithm exists for solving the problem exactly unless P = NP. The proof relies on a polynomial-time reduction from the Exact Cover by 3-Sets (X3C) problem, an NP-complete variant of set cover in which the universe has 3q elements and all subsets have size 3. In the reduction, given an X3C instance with universe S of m = 3q elements and n triples, construct b as the all-ones vector in \mathbb{R}^m and A \in \mathbb{R}^{m \times n} as the incidence matrix with a_{ij} = 1 if element s_i belongs to triple c_j, and 0 otherwise. An exact cover corresponds to a solution x^* with \|x^*\|_0 = q satisfying Ax^* = b and x^*_j \in \{0,1\}, while any \ell_0-minimal solution of cardinality q yields such a cover. This correspondence preserves the decision version: deciding whether a solution with \|x\|_0 \leq q exists is equivalent to solving X3C. The construction runs in polynomial time, confirming NP-hardness. The result implies that exact \ell_0 minimization is computationally intractable for large n and m, with exponential worst-case complexity, which motivates the use of approximation methods as practical workarounds. Related problems, such as noise-aware sparse approximation (finding x minimizing \|x\|_0 subject to \|Ax - b\|_2 \leq \epsilon), inherit this hardness via similar reductions. Dictionary learning with \ell_0 constraints, which jointly optimizes a dictionary and sparse codes to minimize representation error under sparsity penalties, is also NP-hard, as it embeds multiple \ell_0 minimizations over the codes. As of 2025, recent advances offer partial relief but do not resolve the underlying hardness. Quantum algorithms for sparse recovery, including quantum variants of orthogonal matching pursuit, achieve quadratic speedups over classical methods in query complexity for certain overcomplete dictionaries, enabling faster approximate solutions on quantum hardware. AI-based heuristics, such as curriculum learning organized into trees of \ell_0 problems of increasing difficulty, provide effective approximations for moderate-scale instances by training neural networks to guide searches, though exact solvability remains elusive for general cases.
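The reduction's construction is easy to illustrate on a toy instance; the sketch below (hypothetical data, chosen only to show the correspondence) builds the incidence matrix for six elements and four triples and checks that an exact cover gives a q-sparse solution of Ax = b.

```python
import numpy as np

# Toy X3C instance: universe {0,...,5} (so 3q = 6, q = 2) and four triples.
# Triples 0 and 2 together cover every element exactly once (an exact cover).
q = 2
triples = [(0, 1, 2), (1, 2, 3), (3, 4, 5), (2, 4, 5)]

# Incidence matrix A: a_ij = 1 iff element i belongs to triple j; b = all-ones.
A = np.zeros((6, len(triples)))
for j, t in enumerate(triples):
    A[list(t), j] = 1.0
b = np.ones(6)

# The 0/1 indicator of the exact cover is a q-sparse solution of Ax = b.
x = np.zeros(len(triples))
x[[0, 2]] = 1.0
print(np.allclose(A @ x, b), int(np.count_nonzero(x)) == q)   # True True
```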
