Ellipsoid method

The ellipsoid method is an iterative algorithm for solving convex optimization problems, particularly for minimizing a convex function over a convex set, by maintaining and shrinking a sequence of ellipsoids guaranteed to contain an optimal solution until the volume reduction yields a sufficiently accurate approximate minimizer. It operates by querying a separation oracle at the ellipsoid's center to obtain a subgradient or separating hyperplane, which cuts the ellipsoid and updates it to a smaller ellipsoid enclosing the remaining half, with each iteration multiplying the volume by a factor of at most e^{-1/(2n+2)} in n dimensions. The method requires O(n^2 \log(R G / \epsilon)) iterations to achieve an \epsilon-accurate solution, where R is the initial ellipsoid radius and G is the Lipschitz constant of the objective, assuming the optimum lies within a ball of radius R and the subgradients are bounded by G. Developed in the mid-1970s in the Soviet Union independently by Naum Z. Shor and by David B. Yudin and Arkadi S. Nemirovski for general convex programming, the ellipsoid method gained prominence when Leonid G. Khachiyan adapted it in 1979 to solve linear programs in polynomial time, making it the first such algorithm after the simplex method's worst-case exponential complexity had been established. Khachiyan's version addresses linear programming feasibility directly through cutting planes, requiring O(n^4 \log(R / r)) arithmetic operations for an n-variable program with constraints bounding the feasible set between balls of radii R and r. This breakthrough resolved a key question in computational complexity by placing linear programming in the class P, influencing subsequent developments like interior-point methods while showing that combinatorial problems such as maximum weight matching and matroid intersection can be solved in polynomial time whenever a polynomial-time separation oracle is available. Despite its theoretical significance, the method is rarely used in practice due to high constant factors and slower performance compared to the simplex or interior-point algorithms on typical instances.

History

Origins and Key Developments

The ellipsoid method traces its origins to the mid-1970s in the Soviet Union, where it was developed by David B. Yudin and Arkadi S. Nemirovski as an iterative technique for approximating solutions to convex optimization problems by enclosing the feasible set within successively smaller ellipsoids. Their work, published in 1976, laid the foundational framework for using ellipsoidal approximations to manage high-dimensional search spaces efficiently in convex extremal problems. Independently, Naum Z. Shor contributed similar ideas around the same period, further advancing subgradient-based ellipsoid updates for nonsmooth convex minimization. In 1979, Leonid G. Khachiyan, a researcher at the Computing Center of the Academy of Sciences of the USSR in Moscow, adapted the method specifically to linear programming, proving that it could decide the feasibility of systems of linear inequalities in polynomial time. This breakthrough was driven by the unresolved question in optimization theory and computational complexity about the polynomial-time solvability of linear programming, which had been practically addressed since 1947 by George B. Dantzig's simplex method but lacked worst-case polynomial-time guarantees because of the simplex method's potential exponential behavior on certain inputs. Khachiyan's formulation centered on iteratively shrinking an initial ellipsoid containing the feasible region using cutting planes, ensuring volume reduction in high dimensions while maintaining computational tractability. Khachiyan's results were first disseminated in preliminary form at the 10th International Symposium on Mathematical Programming in Montreal in 1979, marking a pivotal moment that confirmed linear programming resides in the complexity class P and dispelled lingering doubts about its tractability. The full proof appeared later that year in Doklady Akademii Nauk SSSR, establishing the ellipsoid method's role as the first explicit polynomial-time algorithm for this foundational problem. This development not only highlighted the method's theoretical elegance but also underscored its potential for broader applications, though practical implementations would later reveal challenges in efficiency.

Impact on Optimization Theory

The ellipsoid method marked a pivotal shift in optimization theory from reliance on practical heuristics like the simplex method, which lacked guaranteed polynomial-time performance, to a rigorous theoretical framework demonstrating that linear programming (LP) belongs to the complexity class P. By adapting earlier work on the ellipsoid method for convex optimization, Khachiyan showed in 1979 that LP could be solved in polynomial time using an ellipsoid-based algorithm, resolving a long-standing open question in computational complexity and shaping the broader study of which optimization problems admit polynomial-time algorithms. This breakthrough underscored the power of separation oracles in certifying feasibility for convex sets, providing a theoretical foundation for analyzing the solvability of optimization problems without enumerating all constraints. The method's theoretical elegance inspired subsequent developments in approximation algorithms and combinatorial optimization. In particular, it enabled proofs of polynomial-time solvability for combinatorial problems whenever a polynomial-time separation oracle exists, as demonstrated by Grötschel, Lovász, and Schrijver, who extended the ellipsoid framework to yield algorithms and bounds for problems like the traveling salesman problem relaxation and vertex packing. This oracle-based approach also laid groundwork for semidefinite programming (SDP) relaxations in approximation algorithms, where the ellipsoid method theoretically solves SDPs in polynomial time, influencing high-impact results such as the 0.878-approximation for maximum cut. Khachiyan, along with Yudin and Nemirovski, received the 1982 Fulkerson Prize for their contributions to the ellipsoid method, while Grötschel, Lovász, and Schrijver received a separate prize the same year for their work on its consequences in combinatorial optimization, recognizing its profound theoretical impact. The method's success in establishing polynomial-time solvability for linear programming further catalyzed research into efficient algorithms, notably paving the way for Narendra Karmarkar's 1984 projective interior-point algorithm, which built on this theoretical assurance to deliver a practically viable polynomial-time solver. Despite its practical limitations, the ellipsoid method's long-term legacy endures as a cornerstone for understanding polynomial-time solvability in convex and combinatorial optimization, emphasizing the value of theoretical tools in guiding algorithmic innovation even when direct implementations falter in efficiency.

Algorithm Description

Core Mechanism and Ellipsoid Updates

The ellipsoid method operates in n-dimensional Euclidean space by maintaining a sequence of ellipsoids that contain the feasible region of interest, iteratively shrinking this region until a solution is isolated. An ellipsoid E_k at iteration k is defined as the set E_k = \{ x \in \mathbb{R}^n \mid (x - x_k)^T A_k^{-1} (x - x_k) \leq 1 \}, where x_k \in \mathbb{R}^n is the center and A_k is a symmetric positive definite matrix that determines the shape and orientation of the ellipsoid. This representation leverages the factorization of a positive definite matrix (A_k = B^T B for some nonsingular B) to describe an affine image of the unit ball, ensuring the set is convex and bounded. The method assumes the feasible set is a convex body, drawing on properties of such sets to guarantee progress. Initialization begins with a large ellipsoid E_0 that contains the entire feasible set, often taken as a ball centered at an arbitrary point within a known bounding box for the problem. To optimize the starting volume and improve convergence, the initial ellipsoid can be chosen as the Löwner-John ellipsoid, which is the unique ellipsoid of minimal volume enclosing the convex feasible set; by John's theorem, this ellipsoid has volume at most n^n times that of the feasible body itself, providing a tight initial enclosure. The matrix A_0 is selected accordingly, such as A_0 = R^2 I for a ball of radius R if bounds are known. At each iteration, the center x_k is tested against the constraints. If x_k lies within the feasible set, the algorithm may terminate or proceed based on the objective; otherwise, a separating hyperplane is identified via the separation oracle, providing a direction c \in \mathbb{R}^n such that the half-space \{ x \mid c^T x \leq c^T x_k \} (passing through x_k) contains all feasible points. The ellipsoid is then updated to E_{k+1}, the minimum-volume ellipsoid containing E_k intersected with this half-space. Specifically, normalize the cut direction as b = A_k c / \sqrt{c^T A_k c}, then set the new center x_{k+1} = x_k - \frac{1}{n+1} b and the new matrix A_{k+1} = \frac{n^2}{n^2 - 1} \left( A_k - \frac{2}{n+1} b b^T \right). This update keeps the feasible side of the hyperplane while preserving the ellipsoidal shape through an affine transformation. The factor \frac{n^2}{n^2 - 1} rescales the volume appropriately, and the rank-one update b b^T adjusts for the cut direction. Each update reduces the volume of the ellipsoid, ensuring convergence to a small region containing a feasible point (or certifying emptiness). The volume ratio satisfies \frac{\mathrm{Vol}(E_{k+1})}{\mathrm{Vol}(E_k)} \leq \exp\left( -\frac{1}{2(n+1)} \right) \approx 1 - \frac{1}{2n} for large n, with the volume related to the determinant of A_k by \mathrm{Vol}(E_k) = V_n \sqrt{\det A_k}, where V_n is the volume of the unit ball. This geometric contraction bounds the number of iterations needed to reduce the volume below a threshold, typically O(n^2 \log(1/\epsilon)) for precision \epsilon. The method's reliance on this volume reduction stems from the original polynomial-time framework for linear programming.
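In code, one update step is a handful of linear-algebra operations. The following Python/NumPy sketch is illustrative only (the function name and the small test are not taken from any particular library); it applies the formulas above once and reports the resulting volume contraction.

```python
import numpy as np

def ellipsoid_update(x, A, c):
    """One central-cut step keeping the half-space {z : c^T z <= c^T x}.

    x : (n,) current center; A : (n, n) symmetric positive definite shape matrix
    (as in E = {z : (z - x)^T A^{-1} (z - x) <= 1}); c : (n,) cut direction from
    the separation or subgradient oracle. Returns the center and matrix of the
    minimum-volume ellipsoid enclosing the kept half of E.
    """
    n = x.size
    Ac = A @ c
    b = Ac / np.sqrt(c @ Ac)                       # normalized cut direction
    x_new = x - b / (n + 1)
    A_new = (n ** 2 / (n ** 2 - 1.0)) * (A - (2.0 / (n + 1)) * np.outer(b, b))
    return x_new, A_new

# Tiny check: start from the unit ball in R^2 and cut with c = e_1.
x1, A1 = ellipsoid_update(np.zeros(2), np.eye(2), np.array([1.0, 0.0]))
ratio = np.sqrt(np.linalg.det(A1))   # Vol(E) is proportional to sqrt(det A), so this is Vol(E_1)/Vol(E_0)
print(ratio)                         # about 0.77, below the bound exp(-1/(2(n+1))) = 0.85 for n = 2
```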

Separation Oracle and Cutting Planes

The separation oracle is a fundamental subroutine in the ellipsoid method that, given a point x \in \mathbb{R}^n, either certifies that x belongs to the feasible set S or returns a separating hyperplane, typically in the form \{ y \mid c^T (y - x) \leq 0 \}, such that S lies entirely on one side of the hyperplane (i.e., c^T (y - x) \leq 0 for all y \in S). This oracle enables the algorithm to handle sets defined implicitly, without requiring an explicit listing of all constraints, by providing a certificate of infeasibility whenever the queried point lies outside S. In the ellipsoid method, the oracle is queried at each iteration on the current ellipsoid's center x_k; if x_k \in S, the algorithm proceeds accordingly (e.g., toward feasibility certification or optimization), but if x_k \notin S, the oracle supplies the separating hyperplane, which defines a cutting plane that intersects the ellipsoid and discards the infeasible portion. The cutting plane is then incorporated to update the ellipsoid, shrinking its volume while retaining the feasible set, thereby refining the search region iteratively. This integration ensures that the method converges by systematically eliminating regions guaranteed to exclude feasible points. Cutting planes generated by the oracle can be classified by their geometric impact on the ellipsoid: standard central cuts pass through the center and discard exactly half of the current ellipsoid, while deep cuts, introduced in later refinements, pass farther from the center on the infeasible side to excise a larger volume more aggressively. Deep cuts do not alter the worst-case polynomial complexity but can improve practical performance in some settings. The theoretical foundation of the separation oracle and cutting planes rests on the properties of convex sets, where the oracle exploits the separating hyperplane theorem to separate points from closed convex sets. For feasibility problems over a set S, the oracle alone suffices to determine membership or emptiness by iteratively applying cuts until the ellipsoid shrinks sufficiently. In optimization contexts, such as minimizing a convex function over S, the oracle is combined with evaluations of the objective function at feasible centers to guide convergence toward an optimum. A representative example arises in linear programming, where the feasible set S = \{ x \mid A x \leq b \} is a polyhedron; the separation oracle, given x, checks whether A x \leq b holds componentwise: if a violation occurs at index i (i.e., a_i^T x > b_i), it returns the hyperplane a_i^T (y - x) \leq 0 (i.e., a_i^T y \leq a_i^T x), whose half-space contains the feasible set. For implicitly described or exponentially many constraints, the oracle may itself solve an auxiliary problem to identify the most violated inequality, or it may apply Farkas' lemma to detect global infeasibility: a certificate y \geq 0 with y^T A = 0 and y^T b < 0 proves that S is empty, since any feasible z would have to satisfy 0 = y^T A z \leq y^T b < 0, which is a contradiction.
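For the polyhedral example just given, the oracle reduces to a componentwise check. The sketch below (NumPy assumed; names are illustrative) either certifies membership or returns the most violated row, which can be passed directly as the cut direction c of the update step described earlier.

```python
import numpy as np

def separation_oracle(A, b, x):
    """Oracle for S = {x : A x <= b}.

    Returns (True, None) if x is feasible; otherwise (False, a_i), where a_i is
    the most violated row, so every y in S satisfies a_i^T (y - x) <= 0.
    """
    violations = A @ x - b
    i = int(np.argmax(violations))
    if violations[i] <= 0:
        return True, None
    return False, A[i]

# Example: the polyhedron x <= 1, y <= 1, x + y >= 0.5.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([1.0, 1.0, -0.5])
print(separation_oracle(A, b, np.array([2.0, 0.0])))   # (False, array([1., 0.]))
```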

Applications in Optimization

Unconstrained Minimization

The ellipsoid method can be adapted to solve unconstrained minimization problems of the form \min_{x \in \mathbb{R}^n} f(x), where f: \mathbb{R}^n \to \mathbb{R} is a convex function and the search is conducted over a bounded domain known to contain a minimizer. An initial ellipsoid E^0 is selected such that it encloses this domain, with radius R bounding the distance from the center to any point in the domain. At each iteration k, the algorithm queries a subgradient oracle at the center x_k of the current ellipsoid E^k, which returns a subgradient g_k \in \partial f(x_k). This subgradient generates a supporting-hyperplane cut f(x) \geq f(x_k) + g_k^T (x - x_k) for all x \in \mathbb{R}^n, leveraging the convexity of f. The adaptation treats the sublevel sets \{x \mid f(x) \leq f(x_k)\} as analogous to feasible regions in the standard formulation. The subgradient oracle serves as a separation mechanism: if g_k = 0, then x_k is a minimizer and the algorithm terminates; otherwise, the returned subgradient defines a hyperplane through x_k beyond which no point can improve on f(x_k). Specifically, the halfspace \{x \mid g_k^T (x - x_k) \leq 0\} contains the sublevel set \{x \mid f(x) \leq f(x_k)\}, and hence every minimizer, because by the subgradient inequality the excluded halfspace contains only points with f(x) > f(x_k). The updated ellipsoid E^{k+1} is then the minimum-volume ellipsoid containing E^k \cap \{x \mid g_k^T (x - x_k) \leq 0\}. Unlike feasibility problems with explicit polyhedral constraints, unconstrained minimization imposes no hard boundaries, relying instead on the objective cuts to progressively refine the enclosure around the minimizer. However, the core volume reduction mechanism remains intact: each update reduces the volume of the ellipsoid by a factor of at most e^{-1/(2n+2)}, ensuring geometric contraction independent of the specific form of f beyond its convexity. The algorithm terminates when the ellipsoid is sufficiently small, yielding an \epsilon-approximate minimizer x^* such that f(x^*) \leq f^* + \epsilon, where f^* is the optimal value. This is achieved after O(n^2 \log(R G / \epsilon)) iterations, where the dependence on dimension n dominates due to the per-iteration volume shrinkage, and \log(R G / \epsilon) accounts for the initial size, subgradient bound G, and desired precision. This bound holds for general convex functions accessible via subgradient oracles and establishes the theoretical efficiency of the method for high-dimensional unconstrained problems.
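Combining the subgradient cut with the update of the previous section gives a complete minimization loop. The sketch below is a minimal illustration (NumPy assumed; the function names, the iteration cap, and the l1-norm example are arbitrary choices, not a reference implementation); it records the best center seen so far, which is the point the approximation guarantee applies to.

```python
import numpy as np

def ellipsoid_minimize(f, subgrad, x0, R, iters=500):
    """Minimize a convex f, assuming some minimizer lies in the ball ||x - x0|| <= R.

    f(x) evaluates the objective and subgrad(x) returns any subgradient at x.
    The cut f(y) >= f(x) + g^T (y - x) shows that points with g^T (y - x) > 0
    cannot improve on x, so the half-space {y : g^T (y - x) <= 0} is kept.
    """
    n = x0.size
    x, A = np.asarray(x0, float), (R ** 2) * np.eye(n)
    x_best, f_best = x.copy(), f(x)
    for _ in range(iters):
        g = np.asarray(subgrad(x), float)
        if not np.any(g):                      # zero subgradient: x is optimal
            return x
        Ag = A @ g
        b = Ag / np.sqrt(g @ Ag)
        x = x - b / (n + 1)
        A = (n ** 2 / (n ** 2 - 1.0)) * (A - (2.0 / (n + 1)) * np.outer(b, b))
        if f(x) < f_best:
            x_best, f_best = x.copy(), f(x)
    return x_best

# Example: the nonsmooth convex function f(x) = ||x - (1, 2)||_1.
target = np.array([1.0, 2.0])
sol = ellipsoid_minimize(lambda x: np.abs(x - target).sum(),
                         lambda x: np.sign(x - target),
                         x0=np.zeros(2), R=10.0)
print(sol)   # approaches (1, 2)
```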

Constrained Minimization Problems

The ellipsoid method addresses constrained minimization problems of the form \min_x f(x) subject to g_i(x) \leq 0 for i = 1, \dots, m, where f: \mathbb{R}^n \to \mathbb{R} and each g_i: \mathbb{R}^n \to \mathbb{R} is a convex function. This formulation encompasses general convex programs, in which the objective and constraints may lack smoothness or differentiability, relying instead on oracle access for queries. To solve such problems, the method reformulates the optimization as a feasibility task in the epigraph set \{(x, t) \mid f(x) \leq t, \; g_i(x) \leq 0\}, seeking the minimal t that admits a feasible point, or equivalently uses duality to incorporate constraints into a penalized objective. The approach proceeds in two phases to handle both feasibility and optimality. In Phase 1, the method determines a feasible point in the constraint set \{x \mid g_i(x) \leq 0, \, i=1,\dots,m\} by applying the ellipsoid algorithm to the feasibility problem, using a feasibility oracle that, given a point x, either confirms all g_i(x) \leq 0 or identifies a violated constraint g_j(x) > 0 and returns a separating hyperplane based on a subgradient of g_j at x. If the feasible set is nonempty, this phase yields a point x^0 inside it; otherwise, it certifies infeasibility. Phase 2 then optimizes the objective starting from x^0, employing a combined oracle that integrates the constraint feasibility check with the objective: for a query point x, if infeasible, it returns a constraint cut as in Phase 1; if feasible, it provides a subgradient of f at x to generate a cutting plane excluding suboptimal directions. The oracles are central to the method's operation for constrained problems. The constraint oracle returns either confirmation of feasibility or a violated g_i along with a subgradient s \in \partial g_i(x), enabling a halfspace cut \{y \mid g_i(x) + \langle s, y - x \rangle \leq 0\}. For the optimization phase, the objective oracle supplies a subgradient g \in \partial f(x) when x is feasible, yielding a cut \{y \mid f(x) + \langle g, y - x \rangle \leq f^{\mathrm{ub}}\} (given a current upper bound f^{\mathrm{ub}} on the optimal value, or its deep-cut variant), thus refining the search toward the minimum. These oracles are assumed to run in polynomial time relative to the problem's encoding, mirroring the separation-oracle framework. To ensure containment, the algorithm initializes with a large ellipsoid E_0 known to enclose the feasible set and optimal solution, such as a ball of radius R centered at the origin under the assumption \|x^*\| \leq R. Termination occurs when the ellipsoid's diameter falls below a tolerance \epsilon > 0, guaranteeing an \epsilon-approximate solution within the final ellipsoid, with the center serving as the candidate minimizer. This bounding strategy relies on the ellipsoids' volume reduction at each step, though the precise update mechanics are detailed elsewhere. In general convex programs, the method applies directly via these oracles, for instance, minimizing a convex loss over a region defined by convex inequalities. The feasibility phase connects to strong alternatives in duality theory, such as convex variants of Farkas' lemma, which certify emptiness of \{x \mid g_i(x) \leq 0\} via a nonnegative combination of the constraints yielding a separating hyperplane, underpinning the method's ability to resolve infeasibility. The unconstrained minimization case arises as a special instance with no constraints, reducing to a single-phase optimization over sublevel sets.
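The two oracles can be merged into the single combined oracle described above. A minimal sketch (NumPy assumed; the names and the example problem are illustrative): given the current center, it returns either a feasibility cut from a violated constraint or an optimality cut from the objective subgradient, and the returned direction is then consumed by the standard ellipsoid update.

```python
import numpy as np

def combined_oracle(x, f_subgrad, constraints):
    """Return a cut direction and its type at the query point x.

    constraints is a list of (g, g_subgrad) pairs for the conditions g(x) <= 0.
    A violated constraint yields a feasibility cut; otherwise the objective
    subgradient yields an optimality cut. In both cases every point that could
    still be optimal lies in the half-space {y : c^T (y - x) <= 0}.
    """
    for g, g_subgrad in constraints:
        if g(x) > 0:
            return np.asarray(g_subgrad(x), float), "feasibility"
    return np.asarray(f_subgrad(x), float), "optimality"

# Example: minimize f(x) = x1 + x2 subject to g(x) = ||x||^2 - 1 <= 0.
f_subgrad = lambda x: np.array([1.0, 1.0])               # gradient of the linear objective
constraint = (lambda x: x @ x - 1.0, lambda x: 2.0 * x)  # g and its gradient
print(combined_oracle(np.array([2.0, 0.0]), f_subgrad, [constraint]))
# -> (array([4., 0.]), 'feasibility'): the query point violates the ball constraint
```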

Theoretical Performance

Complexity Guarantees for Convex Programs

The ellipsoid method provides polynomial-time guarantees for solving general convex optimization problems, assuming access to a separation or subgradient oracle. For minimizing a convex function over a convex set, the algorithm achieves an ε-approximate solution in a number of iterations bounded by O(n² log(R G / ε)), where n is the dimension of the problem, R is the radius of a ball containing the initial ellipsoid and the optimum, G is the Lipschitz constant of the objective, and ε is the desired accuracy. This bound holds under the assumption that the feasible set is bounded and contained within the initial ellipsoid, and that the objective function is Lipschitz continuous with constant G, ensuring that subgradients can be used to generate valid cutting planes. Unlike interior-point methods, the ellipsoid method does not require self-concordant barrier functions, relying instead on volumetric contraction via ellipsoidal approximations. For convex programs with rational input data of bit length L, the total computational complexity is measured in terms of arithmetic operations, yielding O(n⁴ L²) operations to obtain an ε-optimal solution, assuming the oracles operate in constant time. This bit-complexity bound accounts for the precision required in handling the rational coefficients and the logarithmic terms in the iteration count, ensuring polynomial-time solvability in the Turing machine model. The assumptions include a bounded feasible region and Lipschitz continuity of the objective, with the separation oracle providing a hyperplane that separates the current ellipsoid center from the feasible set whenever necessary. The proof of the iteration complexity relies on the geometric properties of ellipsoids under affine transformations. Each update reduces the volume of the enclosing ellipsoid by a factor of at most exp(-1/(2(n+1))), ensuring that after k iterations, the volume is at most Vol(E₀) · exp(-k/(2(n+1))). To guarantee that an ε-approximate minimizer has been found, the volume must shrink sufficiently to exclude the region of ε-suboptimal points, which requires k = O(n² log(R G / ε)) steps. The volume halves approximately every O(n) iterations due to the multiplicative reduction factor, and each individual ellipsoid update involves O(n²) arithmetic operations for the rank-one matrix modification and rescaling. Combining this with the iteration bound yields the overall O(n⁴ L²) arithmetic complexity, as the bit operations per arithmetic step scale with L² for rational data handling. Theoretically, the ellipsoid method marked the first demonstration of a polynomial-time algorithm for general convex programming in the oracle model, establishing that such problems are tractable there and paving the way for its application to linear programming feasibility. This breakthrough, originating from the work of Yudin and Nemirovski, highlighted the power of cutting-plane methods in high dimensions without relying on smoothness or barrier techniques.
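The iteration count can be made concrete with the standard volume-comparison sketch given here (the constants are illustrative and differ slightly between references). Suppose the feasible set K lies in the initial ball E_0 of radius R, f is G-Lipschitz, and x^* minimizes f over K. For \alpha = \epsilon / (2 G R), every point y = x^* + \alpha (z - x^*) of the shrunken copy K_\alpha = x^* + \alpha (K - x^*) satisfies f(y) \leq f^* + \alpha (f(z) - f^*) \leq f^* + \alpha G \cdot 2R = f^* + \epsilon by convexity and the Lipschitz bound. A central objective cut taken at a center x_j that is not ε-optimal removes only points y with f(y) \geq f(x_j) > f^* + \epsilon, so it removes no point of K_\alpha; hence if no ε-optimal center has been queried in the first k iterations, K_\alpha \subseteq E_k and therefore \alpha^n \mathrm{Vol}(K) = \mathrm{Vol}(K_\alpha) \leq \mathrm{Vol}(E_k) \leq e^{-k/(2(n+1))} \mathrm{Vol}(E_0). Rearranging gives k \leq 2(n+1) \left( n \ln \frac{2GR}{\epsilon} + \ln \frac{\mathrm{Vol}(E_0)}{\mathrm{Vol}(K)} \right), so an ε-optimal center must appear after O(n² log(R G / ε)) iterations once the volume ratio Vol(E_0)/Vol(K) is bounded, for example when K contains a ball of fixed radius.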

Analysis for Linear Programming

The ellipsoid method addresses linear programming (LP) by first reducing the optimization problem to a feasibility problem. Consider the standard LP formulation: minimize c^T x subject to Ax \leq b and x \geq 0, where A \in \mathbb{R}^{m \times n}, b \in \mathbb{R}^m, c \in \mathbb{R}^n, with all data rational and encoded in L bits. This can be transformed into a feasibility problem over a polyhedron, for example by combining the primal constraints with the dual constraints and the strong-duality equation asserting equality of the primal and dual objectives, or by bisecting on the objective value; slack variables and homogenization can additionally be used to put the system in a standard form, and the search space is bounded within an initial ball containing the feasible region. The method then iteratively shrinks the ellipsoid until it either contains a feasible point or certifies emptiness. The separation oracle for LP feasibility checks whether a query point x lies in the feasible set or identifies a violated constraint. Given x, one evaluates Ax - b and -x; if all components are non-positive, x is feasible. Otherwise, a positive component provides a separating hyperplane directly. For optimization, an auxiliary subproblem might involve maximizing a violation over the constraints (e.g., \max_i (a_i^T x - b_i)^+) or using duality to find directions, which can invoke the ellipsoid method recursively. However, the recursion depth is bounded by the number of iterations, ensuring polynomial time overall, as the total calls remain controlled within O(n^2 L). A key reduction employs weak alternatives via Farkas' lemma to generate valid cuts without invoking a full solver at each step. Farkas' lemma states that either the system Ax \leq b, x \geq 0 has a solution, or there exists y \geq 0 such that y^T A \geq 0 and y^T b < 0, providing an infeasibility certificate as a cutting plane. In the ellipsoid context, if no feasible point is found after sufficient iterations, this alternative system (itself an LP feasibility problem) can be solved to derive the certificate, avoiding deep recursion by leveraging the shrinking ellipsoid's volume bound (less than 2^{-L}). This ensures cuts are of polynomial size and maintains the method's efficiency. The complexity analysis yields O(n^2 L) iterations for the main algorithm, where n is the number of variables and L the bit length. Each iteration requires O(n^3) time for the oracle (via matrix-vector products or recursive calls) and the ellipsoid update. The initial overall time complexity is thus O(n^6 L^2), accounting for arithmetic and bit operations in rational computations. Subsequent improvements, such as optimized matrix representations and volumetric updates, reduce this to roughly O(n^{3.5} L). Theoretically, this establishes that LP lies in P, via the first polynomial-time algorithm for the problem, but in practice the method is not competitive with the simplex algorithm because of large constant factors and high iteration counts, often exceeding thousands even for moderate n. It excels in theoretical proofs of complexity rather than numerical efficiency.
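One of the reductions mentioned above, bisection on the objective value, is easy to sketch end to end. The code below (NumPy assumed; the names, the radius R, the fixed iteration budget, and the example are illustrative simplifications, with the budget standing in for the rigorous 2^{-L}-type volume stopping rule) runs an ellipsoid feasibility test inside each bisection step.

```python
import numpy as np

def ellipsoid_feasible(A_aug, b_aug, R=100.0, iters=500):
    """Crude feasibility test for {x : A_aug x <= b_aug} within the ball ||x|| <= R."""
    n = A_aug.shape[1]
    x, P = np.zeros(n), (R ** 2) * np.eye(n)
    for _ in range(iters):
        v = A_aug @ x - b_aug
        i = int(np.argmax(v))
        if v[i] <= 0:
            return True, x                        # every inequality holds at x
        c = A_aug[i]                              # most violated row supplies the cut
        Pc = P @ c
        gamma = float(c @ Pc)
        if gamma <= 1e-20:                        # ellipsoid numerically collapsed
            break
        bdir = Pc / np.sqrt(gamma)
        x = x - bdir / (n + 1)
        P = (n ** 2 / (n ** 2 - 1.0)) * (P - (2.0 / (n + 1)) * np.outer(bdir, bdir))
    return False, None                            # treated as empty once the budget is spent

def lp_by_bisection(c, A, b, lo, hi, tol=1e-4):
    """min c^T x s.t. A x <= b, x >= 0, assuming the optimal value lies in [lo, hi]."""
    n = len(c)
    base_A = np.vstack([A, -np.eye(n)])           # encode A x <= b and -x <= 0
    base_b = np.concatenate([b, np.zeros(n)])
    x_best = None
    while hi - lo > tol:
        t = 0.5 * (lo + hi)                       # is {A x <= b, x >= 0, c^T x <= t} nonempty?
        ok, x = ellipsoid_feasible(np.vstack([base_A, c]),
                                   np.concatenate([base_b, [t]]))
        if ok:
            hi, x_best = t, x
        else:
            lo = t
    return x_best

# Example: min x1 + x2 s.t. x1 + 2*x2 >= 2, x >= 0 (optimal value 1 at (0, 1)).
A = np.array([[-1.0, -2.0]]); b = np.array([-2.0]); c = np.array([1.0, 1.0])
print(lp_by_bisection(c, A, b, lo=0.0, hi=4.0))   # a point close to (0, 1)
```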

Practical Considerations

Implementation Challenges

Implementing the ellipsoid method presents several technical challenges, primarily stemming from numerical stability issues in finite-precision arithmetic. The matrix A_k (or equivalently B_k) that defines the ellipsoid undergoes repeated rank-one updates, which can lead to ill-conditioning and indefiniteness due to roundoff errors, potentially causing the computed quadratic form a^T B_k a to become zero or negative. To address this, stable implementations employ Cholesky or LDL^T factorization to maintain positive definiteness, updating the factors in O(n^2) operations while tracking the ellipsoid volume through log-determinant computations rather than direct determinant evaluation. Additionally, floating-point precision limitations, such as those requiring up to 38nL bits for accuracy in proofs, can result in irrational intermediate values (e.g., from square roots in radius or direction computations) that must be rounded, risking loss of polytope containment or premature algorithm termination without detecting infeasibility. A blow-up factor, such as \epsilon = 1 + \frac{1}{4(n+1)^2}, is often introduced to compensate for rounding errors, though this slows volume reduction. Data representation further complicates high-dimensional implementations. The ellipsoid is typically stored via its center x_k and a generating positive definite matrix A_k, where the set is \{ z \mid (z - x_k)^T A_k (z - x_k) \leq 1 \} (note that this matrix is the inverse of the one appearing in the definition used earlier; either convention can be maintained directly), requiring O(n^2) storage for the matrix. Each update, involving a rank-one modification like A_{k+1} = \delta A_k (I - \tau a_k a_k^T) with \delta = n^2 / (n^2 - 1) and \tau = 1/(n+1), demands O(n^2) arithmetic operations, leading to poor scaling beyond dimensions n \approx 100 due to the quadratic cost per iteration. The separation oracle's implementation poses another hurdle, particularly for custom convex programs. Efficient oracle coding is essential to minimize per-iteration overhead, as generic solvers for evaluating the separating hyperplane introduce substantial computational delays and fail to exploit problem-specific structure like sparsity. Early software implementations of the ellipsoid method, dating to the late 1970s and 1980s, were primarily in Fortran, often integrated with numerical libraries like NAG for linear programming tests. Modern realizations appear in Python (e.g., the ellalgo library for linear and convex optimization) and MATLAB, with interfaces to modeling tools like CVXPY that allow embedding the method in broader convex solvers, though its adoption remains limited owing to large practical constants in runtime bounds.
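The log-determinant bookkeeping mentioned above is particularly simple for central cuts, because each update changes log det A_k by the same fixed amount, so the volume test never requires forming a determinant that may underflow. A small illustrative sketch (NumPy assumed; names arbitrary):

```python
import numpy as np

def central_cut_logdet_increment(n):
    """Exact change of log det A per central cut, from
    det A_{k+1} = (n^2/(n^2-1))^n * ((n-1)/(n+1)) * det A_k."""
    return n * np.log(n ** 2 / (n ** 2 - 1.0)) + np.log((n - 1.0) / (n + 1.0))

n, R, target_radius = 10, 1e3, 1e-6
logdet = 2 * n * np.log(R)                         # det A_0 = R^(2n) for A_0 = R^2 I
k = 0
while 0.5 * logdet > n * np.log(target_radius):    # i.e. sqrt(det A_k) > target_radius^n
    logdet += central_cut_logdet_increment(n)
    k += 1
print(k)   # number of central cuts before the volume-based stopping test fires
```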

Empirical Efficiency and Limitations

The ellipsoid method, despite its theoretical polynomial-time guarantees, demonstrates limited empirical efficiency due to substantial constants in its iteration complexity. Computational experiments indicate that the number of iterations scales approximately as n^{1.7} in practice, leading to hundreds of iterations for dimensions around n=60 and over 30,000 for n=500 in linear programming feasibility problems. Each iteration involves O(n^2) arithmetic operations for ellipsoid maintenance, resulting in total runtimes that become impractical for dimensions exceeding 20, where numerical stability and slow volume reduction exacerbate the slowdown. Benchmark studies consistently show the ellipsoid method underperforming compared to the simplex algorithm on standard linear programming instances. Early computational tests revealed it to be much slower on average-case problems, with convergence rates insufficient to compete despite modifications like deep cuts. More recent evaluations confirm this disparity, positioning the method as suitable mainly for theoretical demonstrations of solvability rather than routine practical use. Practical limitations further hinder adoption, including high sensitivity to the initial ellipsoid selection, where strategies like big-M or two-phase initializations yield varying iteration counts without a clear optimum. Standard implementations use central cuts that achieve only marginal volume reduction per step (the volume shrinks to roughly 1 - 1/(2n) of its previous value), necessitating deep cuts for acceleration, though even these fail to overcome inherent inefficiencies in high dimensions. Since the 1980s, the ellipsoid method has been supplanted by faster alternatives like interior-point methods for most applications, rendering it outdated in general optimization workflows. Post-2000 developments in robust optimization have explored ellipsoidal variants theoretically, but comprehensive empirical studies validating practical gains remain limited. Recent efforts include the Oblivious Ellipsoid Algorithm (2020), which ensures polynomial iterations for feasibility problems using a potential function, and a hybrid ellipsoid-subgradient method (2022) designed for high-dimensional optimization, potentially improving scalability though empirical validation is ongoing. Its enduring strength is worst-case reliability, avoiding the exponential pivoting pitfalls of the simplex method on contrived instances.

Variants and Extensions

Alternative Cutting Strategies

One prominent alternative to the standard central cuts in the ellipsoid method involves deep cuts, which shift the separating hyperplane to maximize the volume reduction of the ellipsoid while maintaining theoretical guarantees. In this approach, the cut is positioned not at the center of the current ellipsoid but at a location that achieves a more substantial decrease in the feasible region's approximation, potentially reducing the number of iterations by a factor of O(n) in n dimensions. This strategy was proposed by Frenk, Gromicho, and Zhang, who demonstrated its applicability to convex programming problems without increasing the per-iteration computational cost beyond that of central cuts. In nonsmooth optimization settings, subgradient bundling aggregates multiple subgradients obtained from oracle queries at nearby points to form a stronger, more informative cutting plane that underestimates the objective function more tightly than any single linearization. Rather than relying on a single subgradient for a shallow linear cut, bundling constructs a piecewise linear underestimator, enabling deeper incisions into the ellipsoid and accelerating progress toward the optimum. This approach, developed in bundle methods tailored for the ellipsoid framework, was advanced by Kiwiel, who established its descent properties and polynomial-time convergence for convex nonsmooth minimization. To address computational bottlenecks in exact separation, oracle enhancements incorporate probabilistic or weak approximate separation oracles, which provide separating hyperplanes with bounded error or only with high probability, rather than exact ones. These oracles relax the need for precise solves at each step, allowing the ellipsoid method to proceed with approximate cuts that still guarantee volume reduction within a controlled margin, thereby reducing the overall query complexity for large-scale problems. Grötschel, Lovász, and Schrijver formalized the use of such weak oracles in their framework for geometric optimization, proving that the modified ellipsoid algorithm maintains polynomial-time solvability for convex programs under these relaxed assumptions.
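For concreteness, the deep-cut update admits closed-form formulas that generalize the central-cut ones; the parametrization below is the one commonly used in localization-method lecture notes and is shown here as an illustrative sketch (NumPy assumed; names arbitrary) rather than the exact scheme of any single paper. Setting alpha = 0 recovers the central cut.

```python
import numpy as np

def deep_cut_update(x, A, c, beta):
    """Minimum-volume ellipsoid containing E(x, A) intersected with {y : c^T y <= beta}."""
    n = x.size
    Ac = A @ c
    denom = np.sqrt(c @ Ac)
    alpha = (c @ x - beta) / denom          # depth of the cut relative to the ellipsoid
    if alpha >= 1.0:
        raise ValueError("cut excludes the whole ellipsoid: infeasibility certificate")
    if alpha <= -1.0 / n:
        return x, A                         # cut too shallow to shrink the ellipsoid
    b = Ac / denom
    tau = (1.0 + n * alpha) / (n + 1.0)
    sigma = 2.0 * (1.0 + n * alpha) / ((n + 1.0) * (1.0 + alpha))
    delta = (n ** 2 / (n ** 2 - 1.0)) * (1.0 - alpha ** 2)
    return x - tau * b, delta * (A - sigma * np.outer(b, b))

# With beta = c^T x the cut passes through the center (alpha = 0, central cut);
# with beta < c^T x it cuts deeper and the volume drops faster.
x1, A1 = deep_cut_update(np.zeros(2), np.eye(2), np.array([1.0, 0.0]), beta=-0.25)
print(np.sqrt(np.linalg.det(A1)))   # about 0.56, below the central-cut ratio of about 0.77
```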

Modifications to Ellipsoid Shapes

One notable modification to the ellipsoid method involves restricting the ellipsoids to Euclidean balls, which are special isotropic cases where the defining matrix is a scalar multiple of the identity. This "ball method" simplifies the update procedure significantly, as maintaining and updating the shape requires only adjustments to the center and radius, avoiding the need for full matrix inversions or decompositions. However, this comes at the cost of a slower volume reduction factor of 1 - \frac{1}{n} per iteration, compared to the more efficient exponential shrinkage in the general ellipsoid case, making it suitable for problems where computational simplicity outweighs the need for rapid convergence. Another key adaptation utilizes the John ellipsoid, defined as the unique ellipsoid of maximum volume inscribed within a given convex body, grounded in John's theorem. Scaled by a factor of n about its center, this inscribed ellipsoid circumscribes the body, so it serves as a tight initial bounding set for polytopes in linear programming applications of the ellipsoid method, providing a better starting enclosure than a generic ball or box, which can reduce the number of iterations required for convergence. The theorem guarantees that for any convex body in \mathbb{R}^n, there exists such an ellipsoid whose volume is at least n^{-n} times that of the body, enabling efficient initialization even for complex feasible regions. Randomized ellipsoids introduce stochastic perturbations or noisy separation oracles into the update process, tailored for stochastic convex optimization where the objective function or constraints are subject to random variations. These variants accelerate convergence by leveraging probabilistic guarantees, often achieving expected linear rates in low dimensions or under bounded variance assumptions, which is advantageous in settings like robust control with parameter uncertainty. For instance, multiple-update stochastic ellipsoid methods adaptively scale the ellipsoid based on sampled data, improving robustness over deterministic approaches in noisy environments. Beyond traditional optimization, post-2010 developments have extended ellipsoid shape modifications to machine learning, notably in sketching techniques for dimensionality reduction and coreset construction. Ellipsoid sketching uses variants like John ellipsoids to approximate high-dimensional data distributions efficiently, as seen in sparse dictionary learning where inscribed ellipsoids define compact summaries of feasible sets, enabling scalable approximations with provable error bounds. These applications highlight the method's adaptability to data-driven tasks, bridging classical convex optimization with modern statistical learning.

Cutting-Plane Algorithms

Cutting-plane algorithms constitute a broad class of optimization techniques designed to solve convex programs by iteratively refining an outer approximation of the feasible set through the addition of linear constraints, or "cuts," derived from a separation oracle. These methods begin with an initial polyhedral or other convex approximation containing the optimum and, in each iteration, evaluate the current approximation at its center or another point; if the point is feasible, it may update the objective bound, but if infeasible, the oracle provides a separating hyperplane that excludes the point while preserving the optimal solution within the updated set. This process continues until the approximation is sufficiently small or the optimum is identified. Seminal contributions include J. E. Kelley's 1960 method, which generates cuts using subgradients of the objective function to solve general convex programs, and A. Ya. Levin's 1965 algorithm, which employs a linear oracle to add supporting hyperplanes for minimizing convex functions over convex domains. Earlier ideas in cutting-plane methods trace back to integer programming contexts, where George B. Dantzig introduced cutting planes in 1960 to resolve fractional solutions in linear programs by adding valid inequalities that tighten the relaxation toward the integer hull. Dantzig's approach, detailed in his book Linear Programming and Extensions, laid foundational concepts for using cuts to handle discreteness, influencing subsequent continuous convex extensions by emphasizing iterative refinement of relaxations. These precursors highlighted the potential of cuts for non-polynomial but practical convergence in structured problems. The ellipsoid method represents a key specialization within cutting-plane algorithms, replacing the traditional polyhedral outer approximation with a sequence of shrinking ellipsoids to bound the feasible set. Unlike polyhedral methods that accumulate facets and may suffer from exponential growth in representation complexity, the ellipsoid approach updates the bounding ellipsoid via an affine transformation after adding a cut, ensuring a controlled volume reduction that guarantees polynomial-time convergence in terms of oracle calls for problems with rational data. This volume-based analysis, achieving a reduction factor of e^{-1/(2(n+1))} per iteration in n dimensions, enables theoretical polynomiality for general convex optimization, positioning the ellipsoid method as a bridge between practical cutting-plane heuristics and rigorous complexity bounds. A primary strength of cutting-plane algorithms, including the ellipsoid variant, lies in their applicability to general convex problems where only a separation oracle is available, without requiring explicit functional forms or differentiability. They excel in high-dimensional settings with sparse oracles, as seen in applications to combinatorial optimization via linear programming reductions. However, a notable weakness is their reliance on an efficient separation oracle; poor oracle performance can lead to slow convergence, and in practice, the accumulation of cuts in polyhedral variants often causes numerical instability or high computational overhead.
In comparison to fractional cutting-plane methods, which aim to excise a fixed volume fraction (e.g., half) of the current approximation regardless of cut depth, the ellipsoid method is more theoretically oriented, prioritizing guaranteed volume shrinkage for worst-case polynomial bounds over empirical speed in low dimensions.
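As a contrast with the ellipsoid specialization, a bare-bones version of Kelley's polyhedral cutting-plane method can be written in a few lines. The sketch below assumes SciPy's linprog is available to solve the LP master problem; the function names, the box domain, and the quadratic test function are illustrative choices, not a canonical implementation.

```python
import numpy as np
from scipy.optimize import linprog

def kelley(f, grad, lo, hi, iters=50, tol=1e-6):
    """Minimize convex f over the box lo <= x <= hi by accumulating subgradient cuts."""
    n = len(lo)
    cuts_A, cuts_b = [], []                       # each cut: g^T x - t <= g^T x_k - f(x_k)
    x, best = np.array(lo, float), np.inf
    for _ in range(iters):
        fx, g = f(x), np.asarray(grad(x), float)
        best = min(best, fx)                      # upper bound from evaluated points
        cuts_A.append(np.append(g, -1.0))         # decision variables are (x, t)
        cuts_b.append(g @ x - fx)
        obj = np.append(np.zeros(n), 1.0)         # minimize t, the model's lower bound
        bounds = [(lo[i], hi[i]) for i in range(n)] + [(None, None)]
        res = linprog(obj, A_ub=np.array(cuts_A), b_ub=np.array(cuts_b), bounds=bounds)
        x, lower = res.x[:n], res.x[n]
        if best - lower < tol:                    # upper and lower bounds have met
            break
    return x, best

# Example: f(x) = ||x||^2 over the box [-1, 2]^2 (minimum 0 at the origin).
sol, val = kelley(lambda x: float(x @ x), lambda x: 2.0 * x,
                  lo=[-1.0, -1.0], hi=[2.0, 2.0])
print(sol, val)
```

Note that the master LP here accumulates one cut per iteration, so its size grows with the iteration count, whereas the ellipsoid method keeps a fixed-size state (x_k, A_k); this is the representation-growth trade-off discussed above.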

Interior-Point and Barrier Methods

Interior-point methods represent a class of algorithms for convex optimization that navigate through the interior of the feasible set rather than along its boundary, contrasting with the ellipsoid method's external approximation approach. These methods typically follow a "central path," a trajectory of points that remain strictly feasible while approaching the optimal solution as a barrier parameter decreases. The seminal polynomial-time interior-point solver for linear programming was developed by Narendra Karmarkar in 1984, with a total complexity of roughly O(n^{3.5} L) arithmetic operations, where n is the dimension and L is the bit length of the input. This breakthrough built on the theoretical foundation established by the ellipsoid method, motivating the search for more efficient polynomial-time algorithms. Barrier methods form the core of many interior-point approaches, incorporating logarithmic barrier functions to penalize proximity to constraint boundaries and ensure iterates stay in the interior. For instance, the barrier formulation for inequality constraints Ax \leq b adds terms like -\mu \sum_i \log(b_i - a_i^T x) to the objective, where \mu > 0 controls the penalty strength. The theory of self-concordant barrier functions, introduced by Nesterov and Nemirovski in 1994, provides a framework for analyzing these methods, enabling Newton steps with provable progress along the central path. Using self-concordant barriers, interior-point methods achieve an iteration complexity of O(\sqrt{\nu} L), where \nu is the barrier parameter (often O(n) for linear programs), significantly improving upon the ellipsoid method's O(n^2 L). The ellipsoid method relies on separation oracles and needs no feasible starting point, whereas interior-point methods work with explicit constraint functions and iterates that stay strictly feasible (or use infeasible-start variants); they also differ in strategy, the ellipsoid method iteratively shrinking an outer ellipsoid while interior-point methods dampen barrier penalties to move inward along the central path. The ellipsoid method's success in proving polynomial-time solvability for linear programming inspired the development of interior-point methods, as it demonstrated the feasibility of polynomial-time algorithms for LP and spurred research into practical variants. In modern optimization solvers, such as Gurobi, interior-point methods dominate for large-scale linear and quadratic programs due to their superior empirical performance, while the ellipsoid method serves primarily as a theoretical tool. Post-2000 research has explored hybrid approaches combining ellipsoid-like exterior approximations with other techniques, particularly in semidefinite programming, to improve efficiency for approximate solutions. For example, the algorithm of Arora, Hazan, and Kale from the mid-2000s integrates an ellipsoid-inspired exterior-point method with multiplicative weights updates, achieving faster convergence for low-rank semidefinite programs compared to pure interior-point solvers. These hybrids address challenges in high-dimensional settings where standard interior-point methods scale poorly, filling gaps in earlier theoretical frameworks.
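A small numerical sketch of the log-barrier idea for linear inequalities follows (NumPy assumed; the damping rule, parameters, and example are simplified illustrative choices, not a production interior-point solver).

```python
import numpy as np

def barrier_lp(c, A, b, x0, t0=1.0, mu=10.0, outer=8, inner=50):
    """Approximately solve min c^T x s.t. A x <= b; x0 must satisfy A x0 < b strictly."""
    x, t, m = np.asarray(x0, float), t0, len(b)
    for _ in range(outer):
        for _ in range(inner):                    # centering: Newton on t*c^T x + barrier
            s = b - A @ x                         # slacks, kept strictly positive
            grad = t * c + A.T @ (1.0 / s)
            hess = A.T @ np.diag(1.0 / s ** 2) @ A
            dx = np.linalg.solve(hess, -grad)
            step = 1.0
            while np.min(b - A @ (x + step * dx)) <= 0:
                step *= 0.5                       # backtrack to keep iterates strictly feasible
            x = x + step * dx
            if np.linalg.norm(grad) < 1e-8:
                break
        t *= mu                                   # tighten the barrier (move along the central path)
        if m / t < 1e-9:                          # duality-gap-style stopping heuristic
            break
    return x

# Example: min x1 + x2 over the box 0 <= x <= 1, written as A x <= b.
A = np.vstack([np.eye(2), -np.eye(2)]); b = np.array([1.0, 1.0, 0.0, 0.0])
print(barrier_lp(np.array([1.0, 1.0]), A, b, x0=np.array([0.5, 0.5])))   # near (0, 0)
```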
