Quadratic programming
Quadratic programming (QP) is a type of mathematical optimization problem in which the objective function is quadratic and the constraints are linear, making it a special case of nonlinear programming that generalizes linear programming by incorporating quadratic terms in the objective.[1][2] The standard formulation of a quadratic program seeks to minimize \frac{1}{2} x^T Q x + c^T x, where x is the vector of decision variables, Q is a symmetric n \times n matrix, and c is a vector, subject to linear constraints such as Ax \leq b, A_{eq}x = b_{eq}, and bounds l \leq x \leq u.[1][2] If the matrix Q is positive semidefinite, the problem is convex, guaranteeing that any local minimum is a global minimum and enabling efficient solution methods; otherwise, it may be nonconvex with multiple local minima.[1][2] Quadratic programming has broad applications across fields, including finance for optimal portfolio selection as pioneered by Harry Markowitz, who received the 1990 Nobel Memorial Prize in Economic Sciences for this work; machine learning for training support vector machines; logistics in solving quadratic knapsack problems; and engineering for power management in plug-in hybrid electric vehicles.[1][3] Common algorithms for solving quadratic programs include the active-set method, which iteratively adjusts the set of active constraints; interior-point methods, such as predictor-corrector approaches that navigate the interior of the feasible region; and trust-region-reflective algorithms that solve subproblems within a trust region using preconditioned conjugate gradients.[1][2] The origins of quadratic programming trace back to 1943 with H. B. Mann's work, with significant advancements in the 1950s by researchers including Barankin, Dorfman, Wolfe, and Frank, establishing it as a foundational tool in optimization theory.[1]
Problem Formulation
Standard Form
Quadratic programming (QP) in its standard form involves minimizing a quadratic objective function subject to linear equality constraints and simple bound constraints on the variables. The problem is formulated as \min_x \ \frac{1}{2} x^T Q x + c^T x subject to A x = b, \quad l \leq x \leq u, where x \in \mathbb{R}^n is the vector of decision variables, Q \in \mathbb{R}^{n \times n} is a symmetric matrix, c \in \mathbb{R}^n is a constant vector, A \in \mathbb{R}^{m \times n} (with m \leq n) is the constraint matrix of full row rank, b \in \mathbb{R}^m is a constant vector, and l, u \in \mathbb{R}^n define the lower and upper bounds on x (which may be -\infty or \infty for unbounded variables).[4][5] The matrix Q can be taken to be symmetric without loss of generality, since x^T Q x = \frac{1}{2} x^T (Q + Q^T) x for any square matrix Q; assuming symmetry makes the quadratic form unambiguous and simplifies derivative expressions. The factor of \frac{1}{2} in the objective is included for computational and analytical convenience: the gradient of the objective with respect to x is then Q x + c, simplifying derivatives and updates in solution algorithms compared to the form without \frac{1}{2}, where the gradient would be 2 Q x + c. This scaling also aligns the objective with the canonical representation of quadratic models in optimization, where the Hessian (here exactly Q) appears without a factor of 2 in the first-order conditions.[5][6]

For the problem to be convex, Q must be positive semidefinite (PSD), meaning that x^T Q x \geq 0 for all x \in \mathbb{R}^n. To see why this ensures convexity, note that the objective function f(x) = \frac{1}{2} x^T Q x + c^T x is twice continuously differentiable, with Hessian matrix \nabla^2 f(x) = Q. A twice-differentiable function is convex if and only if its Hessian is PSD everywhere, which holds here since Q is constant. Thus, if Q \succeq 0, f(x) is convex, and any local minimizer over the convex feasible set is a global minimizer. If Q is not PSD, the problem may be nonconvex, potentially having multiple local minima. This form is canonical because it generalizes linear programming (where Q = 0) while preserving structure for efficient solution methods, and it arises directly in subproblems of nonlinear optimization.[5]

The dimensions reflect a problem with n variables and m equality constraints, with A assumed to have full row rank to avoid redundant constraints. Feasibility requires that the set \{x \in \mathbb{R}^n \mid A x = b, \, l \leq x \leq u\} is nonempty, which can be checked via linear programming or phase-one methods. For the minimum to exist and be finite, the objective must be bounded below on the feasible set; this holds, for example, if Q \succeq 0 and either the objective is coercive on the feasible set or the feasible set is bounded, since then there is no recession direction along which f(x) decreases without bound. If the feasible set is unbounded and Q has directions of negative curvature, the problem may be unbounded below.[5][4]

This standard form originated in the 1950s as an extension of linear programming to handle quadratic objectives, with seminal work by Marguerite Frank and Philip Wolfe introducing an algorithm for solving such problems in 1956. A special case arises in constrained least squares, where Q = A^T A.[5]
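As a concrete illustration of this standard form, the following sketch solves a small convex QP with one equality constraint and simple bounds using SciPy's general-purpose constrained minimizer. The particular values of Q, c, A, b, and the bounds are illustrative choices made up for this example, not data from the cited sources.

```python
# A minimal sketch: solve min 0.5*x^T Q x + c^T x  s.t.  A x = b,  l <= x <= u
# with SciPy's 'trust-constr' method. All numerical data below are illustrative.
import numpy as np
from scipy.optimize import minimize, LinearConstraint, Bounds

Q = np.array([[4.0, 1.0],
              [1.0, 2.0]])              # symmetric positive definite, so the QP is convex
c = np.array([-1.0, -2.0])
A = np.array([[1.0, 1.0]])              # one equality constraint: x1 + x2 = 1
b = np.array([1.0])
l, u = np.zeros(2), np.full(2, 10.0)    # simple bounds 0 <= x <= 10

def obj(x):
    return 0.5 * x @ Q @ x + c @ x

def grad(x):
    return Q @ x + c                    # gradient Qx + c, matching the 1/2 scaling above

res = minimize(obj, x0=np.array([0.5, 0.5]), jac=grad,
               constraints=[LinearConstraint(A, b, b)],   # lb = ub = b encodes A x = b
               bounds=Bounds(l, u), method="trust-constr")
print(res.x)                            # approximate minimizer
```

Dedicated QP solvers exploit the constant Hessian Q directly; the general-purpose call above is only meant to show how the standard-form data map onto a solver interface.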
Constrained Least Squares
Constrained least squares arises in data fitting scenarios where a linear model must satisfy additional linear equality constraints, such as fixed parameters or continuity conditions. The problem is to minimize the squared Euclidean norm of the residual vector subject to these constraints: \min_x \| A x - b \|_2^2 \quad \text{subject to} \quad C x = d, where A \in \mathbb{R}^{m \times n} is the design matrix, b \in \mathbb{R}^m is the observation vector, C \in \mathbb{R}^{p \times n} specifies the constraints with right-hand side d \in \mathbb{R}^p, and typically m \geq n > p, so that the fit is overdetermined while the constraints leave some degrees of freedom.[7][8]

This formulation is directly equivalent to an equality-constrained quadratic program, as the objective expands algebraically into a quadratic function. Expanding the norm gives \| A x - b \|_2^2 = x^T A^T A x - 2 b^T A x + b^T b. The constant b^T b does not affect the minimizer, so the problem reduces to minimizing the quadratic \frac{1}{2} x^T Q x + c^T x subject to C x = d, where Q = 2 A^T A (positive semidefinite) and c = -2 A^T b. This matches the standard convex quadratic programming form with a positive semidefinite Hessian, ensuring the problem is convex and amenable to QP solvers.[7]

A representative application is fitting a straight line y = m x + k to data points while fixing the y-intercept k to a known value, such as k = 0 for models through the origin. Consider the data points (x_i, y_i) = (1, 2), (2, 3), (3, 5). With k = 0, the model simplifies to y = m x, and the constraint fixes the second component of x = [m, k]^T at zero, i.e., C = \begin{bmatrix} 0 & 1 \end{bmatrix} and d = 0. Reformulating in terms of the free parameter m gives A = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, b = \begin{bmatrix} 2 \\ 3 \\ 5 \end{bmatrix}, yielding the least-squares estimate m = (A^T A)^{-1} A^T b = 14^{-1} \cdot 23 \approx 1.64. The residual sum of squares is \| A (23/14) - b \|_2^2 \approx 0.21. For comparison, the unconstrained line fit (allowing free k) gives m = 1.5, k \approx 0.33, with a lower residual of approximately 0.17, illustrating the trade-off imposed by the constraint. This equivalence allows solving via QP methods like active-set or interior-point algorithms.[7][9]

The solution to the constrained least squares problem is unique provided the constraints are consistent (i.e., there exists some x satisfying C x = d) and the augmented matrix \begin{bmatrix} A \\ C \end{bmatrix} has full column rank, ensuring the quadratic objective is strictly convex over the affine feasible set. In such cases, the Karush-Kuhn-Tucker (KKT) conditions yield a unique primal-dual pair via the nonsingular linear system \begin{bmatrix} 2 A^T A & C^T \\ C & 0 \end{bmatrix} \begin{bmatrix} x \\ \lambda \end{bmatrix} = \begin{bmatrix} 2 A^T b \\ d \end{bmatrix}, where \lambda are the Lagrange multipliers.[7][8]
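The worked line-fitting example above can be reproduced by forming and solving the KKT linear system just stated. The sketch below does so with NumPy, using the three data points and the intercept constraint k = 0 from the example; the variable names are chosen for this illustration.

```python
# Solve the equality-constrained least squares fit of y = m*x + k to the points
# (1,2), (2,3), (3,5) with k fixed to 0, via the KKT system
# [[2*A^T A, C^T], [C, 0]] [x; lam] = [2*A^T b; d].
import numpy as np

A = np.array([[1.0, 1.0],
              [2.0, 1.0],
              [3.0, 1.0]])      # design matrix; columns correspond to x = [m, k]
b = np.array([2.0, 3.0, 5.0])
C = np.array([[0.0, 1.0]])      # constraint C x = d enforces k = 0
d = np.array([0.0])

n, p = A.shape[1], C.shape[0]
K = np.block([[2 * A.T @ A, C.T],
              [C, np.zeros((p, p))]])
rhs = np.concatenate([2 * A.T @ b, d])

sol = np.linalg.solve(K, rhs)
x, lam = sol[:n], sol[n:]
print(x)                          # approximately [1.643, 0.0], i.e. m = 23/14, k = 0
print(np.sum((A @ x - b) ** 2))   # residual sum of squares, approximately 0.21
```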
Generalizations to Inequalities
The generalization of quadratic programming to inequality constraints allows for modeling real-world scenarios involving bounds on variables or one-sided restrictions, building directly on the equality-constrained case by incorporating linear inequalities alongside equalities. The full problem formulation is to minimize \frac{1}{2} x^T Q x + c^T x subject to A x \leq b, G x = h, and l \leq x \leq u, where x \in \mathbb{R}^n is the decision variable, Q \in \mathbb{R}^{n \times n} is symmetric, c \in \mathbb{R}^n, A \in \mathbb{R}^{m_A \times n}, b \in \mathbb{R}^{m_A}, G \in \mathbb{R}^{m_G \times n}, h \in \mathbb{R}^{m_G}, and l, u \in \mathbb{R}^n define componentwise bounds.[4] A standard technique to handle these inequalities is the introduction of nonnegative slack variables, which transform the problem into an equivalent equality-constrained form. For the inequality A x \leq b, slack variables s \geq 0 are added such that A x + s = b; similarly, the bounds l \leq x \leq u can be expressed using additional slacks. This results in an augmented equality-constrained quadratic program in the extended variables (x, s), preserving the original structure while enforcing feasibility through nonnegativity.[4]

The feasible region for this generalized problem, defined by the linear equalities and inequalities, forms a polyhedron as the intersection of half-spaces and hyperplanes in \mathbb{R}^n.[10] This polyhedral structure ensures the feasible set is convex, regardless of the objective function. When Q is positive semidefinite, the quadratic objective is convex, rendering the entire problem convex and guaranteeing that any local minimizer is global.[11] This extension to inequalities was pioneered in the late 1950s by E. M. L. Beale to address practical optimization challenges with linear constraints.[12] Solutions to these problems satisfy the Karush-Kuhn-Tucker optimality conditions, which extend the equality case to account for active inequalities.[11]
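To make the slack-variable transformation concrete, the sketch below builds the augmented matrices for a small inequality-constrained QP; the particular Q, c, A, and b are made-up illustrative values. The resulting data describe an equality-constrained QP with bound constraints in the extended variable z = (x, s), which could then be handed to a standard-form solver such as the one sketched earlier.

```python
# A minimal sketch of the slack-variable reformulation: A x <= b becomes
# A x + s = b with s >= 0, giving an equality-constrained QP in z = (x, s).
# All numerical data are illustrative.
import numpy as np

Q = np.array([[2.0, 0.0],
              [0.0, 2.0]])
c = np.array([-2.0, -5.0])
A = np.array([[1.0, 2.0],
              [3.0, 1.0]])      # two inequality constraints A x <= b
b = np.array([4.0, 6.0])

n, m = Q.shape[0], A.shape[0]

# Augmented objective 0.5*z^T Q_aug z + c_aug^T z: the slacks do not enter the objective.
Q_aug = np.block([[Q, np.zeros((n, m))],
                  [np.zeros((m, n)), np.zeros((m, m))]])
c_aug = np.concatenate([c, np.zeros(m)])

# Equality constraints [A  I] z = b and bounds: x left unbounded, s >= 0.
A_eq = np.hstack([A, np.eye(m)])
lower = np.concatenate([np.full(n, -np.inf), np.zeros(m)])

print(A_eq)      # constraint matrix acting on the extended variable z = (x, s)
```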
Theoretical Foundations
Optimality Conditions
In quadratic programming (QP), the first-order optimality conditions for a local minimum are given by the Karush-Kuhn-Tucker (KKT) conditions, which extend the method of Lagrange multipliers to problems with inequality constraints. Because the constraints are linear, these conditions are necessary at any local minimum, and they are sufficient for global optimality when the QP is convex (Q positive semidefinite). Consider the standard QP formulation: minimize \frac{1}{2} x^T Q x + c^T x subject to equality constraints A x = b and inequality constraints G x \leq h, where Q \in \mathbb{R}^{n \times n}, c \in \mathbb{R}^n, A \in \mathbb{R}^{m \times n}, b \in \mathbb{R}^m, G \in \mathbb{R}^{p \times n}, and h \in \mathbb{R}^p. The KKT conditions for a point x^* to be optimal require the existence of multipliers \lambda^* \in \mathbb{R}^m and \mu^* \in \mathbb{R}^p such that the following hold (a numerical illustration is sketched after the list):
- Stationarity: Q x^* + c + A^T \lambda^* + G^T \mu^* = 0. This requires the gradient of the Lagrangian to vanish at x^*; equivalently, the negative objective gradient -(Q x^* + c) is a linear combination of the constraint gradients, with nonnegative weights \mu^* on the inequality constraints.
- Primal feasibility: A x^* = b and G x^* \leq h. These ensure x^* lies in the feasible set.
- Dual feasibility: \mu^* \geq 0. This requires nonnegative multipliers for inequalities.
- Complementary slackness: \mu^{*T} (G x^* - h) = 0, or componentwise \mu_i^* (G_i x^* - h_i) = 0 for i = 1, \dots, p. This implies that if an inequality is inactive (G_i x^* < h_i), then \mu_i^* = 0.
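As a small numerical illustration of these conditions, the sketch below evaluates each one at a candidate primal-dual pair for a convex QP with a single inequality constraint and no equalities (so the A^T \lambda^* term drops out). The problem data and the candidate point are assumptions constructed for this check rather than values from the cited sources.

```python
# Check the KKT conditions at a candidate point for
# min 0.5*x^T Q x + c^T x  s.t.  G x <= h   (illustrative data, no equality constraints).
import numpy as np

Q = np.eye(2)
c = np.array([-1.0, -1.0])
G = np.array([[1.0, 1.0]])        # single inequality: x1 + x2 <= 1
h = np.array([1.0])

x_star = np.array([0.5, 0.5])     # candidate minimizer
mu_star = np.array([0.5])         # candidate multiplier for the inequality

stationarity = Q @ x_star + c + G.T @ mu_star           # should be ~ 0
primal_feasibility = G @ x_star - h                     # should be <= 0
dual_feasibility = mu_star                              # should be >= 0
complementary_slackness = mu_star * (G @ x_star - h)    # should be ~ 0

print(stationarity, primal_feasibility, dual_feasibility, complementary_slackness)
```

Here the inequality is active at x^*, so the multiplier \mu^* = 0.5 is strictly positive and complementary slackness holds with G x^* - h = 0; all four printed quantities satisfy the conditions listed above.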