Nonlinear programming
Nonlinear programming is a subfield of mathematical optimization that addresses problems where the objective function to be minimized or maximized, or one or more of the constraints, involves nonlinear relationships among the decision variables.[1][2] These problems are typically formulated as finding a vector \mathbf{x} that minimizes f(\mathbf{x}) subject to equality constraints h_i(\mathbf{x}) = 0 and inequality constraints g_j(\mathbf{x}) \leq 0, where f, h_i, or g_j are nonlinear functions.[1] Unlike linear programming, which assumes linearity in both objectives and constraints, nonlinear programming accommodates real-world complexities such as quadratic costs, exponential growth, or trigonometric dependencies, but often results in non-convex problems that may have multiple local optima.[2][3]

The formal development of nonlinear programming began in the mid-20th century, building on advances in operations research during and after World War II.[4] Key foundational work includes William Karush's 1939 master's thesis, which derived necessary conditions for optimality under inequality constraints, though it remained unpublished for decades.[5] In 1948, Fritz John published related results on convexity and necessary conditions.[5] The field was decisively launched in 1950 when Harold W. Kuhn and Albert W. Tucker presented the Kuhn-Tucker conditions, now known as the Karush-Kuhn-Tucker (KKT) conditions, at the Second Berkeley Symposium on Mathematical Statistics and Probability, providing first-order necessary conditions for local optima in nonlinear programs.[5][4] These conditions, supported by military-funded research through the Office of Naval Research, established nonlinear programming as a distinct discipline separate from linear programming.[5]

Solving nonlinear programs requires specialized algorithms due to the potential absence of global optima and challenges in constraint handling.[6] Common methods include gradient-based techniques like the Frank-Wolfe algorithm, which iteratively solves linear approximations of the problem, and sequential quadratic programming (SQP), which approximates the nonlinear problem with a series of quadratic subproblems.[2][7] Interior-point methods, originating in the 1960s for barrier functions and revitalized in the 1980s for linear programming, extend to nonlinear cases by navigating the interior of the feasible region while penalizing constraint violations.[8] Penalty and augmented Lagrangian approaches transform constrained problems into unconstrained ones by adding terms that penalize infeasibility.
For convex nonlinear programs, global optimality can often be guaranteed, but general cases rely on local search methods like Newton's method or trust-region strategies.[3] Classification of problems, such as separable, quadratic, or geometric programming, guides the choice of algorithm, with software implementations like IPOPT or KNITRO enabling practical computation for large-scale instances.[6][9]

Nonlinear programming finds broad applications across engineering, economics, and the sciences, where linear models prove insufficient.[2] In engineering design, it optimizes structures under nonlinear material behaviors or fluid dynamics constraints.[7] Optimal control problems, such as trajectory planning in aerospace or robotics, model nonlinear dynamics with differential equations as constraints.[10] In finance, portfolio optimization balances expected returns against nonlinear risk measures like variance.[2] Other uses include production planning with nonlinear costs, resource allocation in networks, and chemical process design to minimize energy use subject to reaction kinetics.[11][6] The versatility of nonlinear programming underscores its role in addressing complex, real-world decision-making under uncertainty and interdependence.[7]
Fundamentals
Definition and Formulation
Nonlinear programming (NLP) is the subfield of mathematical optimization concerned with finding values of decision variables x \in \mathbb{R}^n that minimize or maximize an objective function f(x), where f is nonlinear, subject to a set of nonlinear constraints defining the feasible region. The feasible region is the collection of all points x satisfying the constraints, which may include inequalities g_i(x) \leq 0 for i = 1, \dots, m and equalities h_j(x) = 0 for j = 1, \dots, p, with g_i and h_j also nonlinear. These problems arise in diverse fields requiring optimization beyond linear models, such as engineering design and economic modeling, where relationships between variables are inherently nonlinear.[2][7]

The standard formulation of an NLP is

\min_{x \in \mathbb{R}^n} \, f(x)
subject to g_i(x) \leq 0, \quad i = 1, \dots, m,
h_j(x) = 0, \quad j = 1, \dots, p,

assuming f, g_i, and h_j are continuous and differentiable (typically twice continuously differentiable for second-order analysis). This form encompasses maximization by negating the objective and handles both equality and inequality constraints; nonnegativity bounds x_k \geq 0 can be incorporated as special cases of g_i. Unlike linear programming, where the objective and constraints are affine functions yielding a convex feasible region and a unique global optimum (if it exists), in NLP nonlinearity in f, g_i, or h_j often results in non-convex feasible regions with multiple local optima, complicating the search for global solutions.[2][12][13]

A cornerstone of NLP theory is the Karush-Kuhn-Tucker (KKT) conditions, which provide first-order necessary conditions for a point x^* to be a local minimizer, provided a constraint qualification (such as the gradients of active constraints being linearly independent) holds to ensure the existence of multipliers. These conditions generalize the method of Lagrange multipliers from equality-constrained problems to include inequalities. To derive them, form the Lagrangian

\mathcal{L}(x, \lambda, \mu) = f(x) + \sum_{i=1}^m \lambda_i g_i(x) + \sum_{j=1}^p \mu_j h_j(x),

where \lambda \in \mathbb{R}^m and \mu \in \mathbb{R}^p are Lagrange multipliers for the inequalities and equalities, respectively. For the equality-constrained case (ignoring inactive inequalities), a local minimizer x^* satisfies \nabla_x \mathcal{L}(x^*, \lambda^*, \mu^*) = 0 under regularity, meaning \nabla f(x^*) = -\sum \lambda_i^* \nabla g_i(x^*) - \sum \mu_j^* \nabla h_j(x^*), so the objective gradient balances the constraint gradients. To handle inequalities, introduce slack variables s_i \geq 0 such that g_i(x) + s_i = 0, transforming to equalities with nonnegativity on s_i; the multipliers \lambda_i^* for these must satisfy \lambda_i^* \geq 0. Complementary slackness then enforces \lambda_i^* g_i(x^*) = 0 for each i, ensuring inactive constraints (g_i(x^*) < 0) have zero multipliers, while active ones (g_i(x^*) = 0) may have positive multipliers. The full KKT conditions at x^* are thus:
- Stationarity: \nabla f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla g_i(x^*) + \sum_{j=1}^p \mu_j^* \nabla h_j(x^*) = 0,
- Primal feasibility: g_i(x^*) \leq 0 for all i, h_j(x^*) = 0 for all j,
- Dual feasibility: \lambda_i^* \geq 0 for all i,
- Complementary slackness: \lambda_i^* g_i(x^*) = 0 for all i.
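As a concrete illustration, the four conditions can be verified numerically at a candidate point. The following is a minimal sketch in Python, assuming an illustrative two-variable problem, a hand-picked candidate point, and a known multiplier; none of these come from a specific reference.

```python
import numpy as np

# Toy problem (illustrative): minimize f(x) = x1^2 + x2^2
# subject to g(x) = 1 - x1 - x2 <= 0.
# Candidate KKT point: x* = (0.5, 0.5) with multiplier lambda* = 1.

def grad_f(x):
    return 2 * x                       # gradient of the objective

def g(x):
    return 1.0 - x[0] - x[1]           # inequality constraint, g(x) <= 0

def grad_g(x):
    return np.array([-1.0, -1.0])      # constraint gradient

x_star, lam_star = np.array([0.5, 0.5]), 1.0

print("stationarity residual:", grad_f(x_star) + lam_star * grad_g(x_star))  # ~ [0, 0]
print("primal feasibility:   ", g(x_star) <= 1e-12)                          # True (active)
print("dual feasibility:     ", lam_star >= 0)                               # True
print("complementary slack.: ", abs(lam_star * g(x_star)) < 1e-12)           # True
```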
Historical Development
The origins of nonlinear programming trace back to the 1940s, amid the development of linear programming during World War II efforts in operations research. John von Neumann's work on game theory and duality theory for linear programs, particularly his 1947 manuscript "Discussion of the Maximization Problem," provided foundational insights that highlighted the limitations of linearity and inspired extensions to nonlinear cases. Similarly, Garrett Birkhoff's contributions to polyhedral theory and combinatorial optimization, including the Birkhoff-von Neumann decomposition theorem from 1946, influenced early efforts to handle nonlinear constraints in optimization problems related to assignment and transportation. These developments, alongside George Dantzig's simplex method for linear programming in 1947, set the stage for addressing nonlinear extensions, though practical algorithms for such problems remained elusive until the 1950s.[18]

A pivotal milestone occurred in 1951 with the publication of the Karush-Kuhn-Tucker (KKT) conditions, which established necessary optimality conditions for nonlinear programs with inequality constraints. William Karush had formulated these conditions in his 1939 master's thesis, but they gained prominence through the independent work of Harold W. Kuhn and Albert W. Tucker in their seminal paper "Nonlinear Programming," presented at the Second Berkeley Symposium on Mathematical Statistics and Probability. Independently, Fritz John published related necessary conditions in 1948, which do not require constraint qualifications. This theorem generalized Lagrange multipliers to inequalities, launching nonlinear programming as a distinct field and enabling rigorous analysis of constrained optimization problems. Kuhn and Tucker were motivated by von Neumann's informal suggestions during discussions at Princeton in the late 1940s.[18][19]

The 1950s and 1960s saw algorithmic advancements building on these theoretical foundations. In 1956, Marguerite Frank and Philip Wolfe introduced the Frank-Wolfe algorithm in their paper "An Algorithm for Quadratic Programming," providing an iterative method for solving convex quadratic programs, a key subclass of nonlinear programming that arises in applications like portfolio optimization. This method, which linearizes the objective at each step and solves a linear subproblem, marked an early practical approach and was popularized in the 1960s as computational tools improved. Quadratic programming itself emerged as a focal area, with works like Beale's 1961 contributions further refining solvers for this structured form of nonlinear problems.

By the 1970s and 1980s, attention shifted toward global optimization for nonconvex nonlinear programs, where local methods could miss the optimum. Branch-and-bound techniques, initially developed for integer programming by Land and Doig (1960), were extended to nonconvex cases in the 1970s; for instance, John A. Tomlin's 1970 work applied branch-and-bound to integer and nonconvex programming, enabling enumeration of feasible regions while bounding suboptimal branches. This era also witnessed the resurgence of interior-point methods, sparked by Narendra Karmarkar's 1984 polynomial-time algorithm for linear programming, which used barrier functions and projective transformations.
Although initially developed for the linear case, these methods were rapidly extended to nonlinear and convex quadratic programs in the late 1980s, offering efficient alternatives to simplex-based approaches for large-scale problems.

In the 2010s and 2020s, nonlinear programming has integrated deeply with machine learning, addressing large-scale nonconvex optimization in AI systems such as neural network training and hyperparameter tuning. Seminal surveys highlight how second-order and stochastic methods from nonlinear programming, like quasi-Newton approximations, enhance scalability in deep learning, where objectives involve millions of parameters. This fusion has driven high-impact applications, with tools like interior-point solvers adapted for constrained ML formulations, marking an evolution from theoretical milestones to ubiquitous computational frameworks in artificial intelligence optimization.
Theoretical Foundations
Convexity and Optimality Conditions
A convex set in \mathbb{R}^n is defined as a set C such that for any x, y \in C and \lambda \in [0, 1], the point \lambda x + (1 - \lambda) y \in C.[20] Convex sets are closed under convex combinations, meaning that if \sum_{i=1}^k \lambda_i x_i = x with \sum_{i=1}^k \lambda_i = 1 and \lambda_i \geq 0 for all i, and each x_i \in C, then x \in C.[21] The intersection of any collection of convex sets is convex, which ensures that feasible regions defined by convex constraints remain convex.[20]

A function f: \mathbb{R}^n \to \mathbb{R} is convex if its epigraph \{(x, t) \mid t \geq f(x)\} is a convex set, or equivalently, if for all x, y in the domain and \lambda \in [0, 1], f(\lambda x + (1 - \lambda) y) \leq \lambda f(x) + (1 - \lambda) f(y).[20] This property is characterized by Jensen's inequality: for a convex function f and weights \lambda_i \geq 0 with \sum_{i=1}^k \lambda_i = 1, f\left( \sum_{i=1}^k \lambda_i x_i \right) \leq \sum_{i=1}^k \lambda_i f(x_i).[22] If the inequality is strict for \lambda \in (0, 1) and x \neq y, the function is strictly convex.[20] Twice-differentiable convex functions satisfy \nabla^2 f(x) \succeq 0 (positive semi-definite Hessian) for all x.[20]

In nonlinear programming, a problem is convex if the objective function is convex (for minimization) and all inequality constraints define convex sets (i.e., g_i(x) \leq 0 where each g_i is convex), with equality constraints being affine.[23] For a convex nonlinear program, any local minimum is a global minimum, and if the objective is strictly convex, the minimum is unique.[2] This follows from the fact that the sublevel sets of convex functions are convex, ensuring no better feasible points exist beyond local optima.[24]

For unconstrained minimization of a differentiable function f, first-order optimality requires \nabla f(x^*) = 0 at a local minimum x^*, meaning the gradient vanishes.[25] For equality-constrained problems \min f(x) subject to h(x) = 0, where h is vector-valued, the Lagrange multiplier condition states that there exist \lambda such that \nabla f(x^*) + \lambda^T \nabla h(x^*) = 0, assuming regularity like linear independence of \nabla h(x^*).[26] For general nonlinear programs with inequalities, the Karush-Kuhn-Tucker (KKT) conditions provide necessary conditions for optimality under constraint qualifications.[15] Consider \min f(x) subject to g_i(x) \leq 0 (i=1,\dots,m) and h_j(x) = 0 (j=1,\dots,p), with f, g_i, h_j differentiable. The Lagrangian is \mathcal{L}(x, \lambda, \mu) = f(x) + \sum_{i=1}^m \lambda_i g_i(x) + \sum_{j=1}^p \mu_j h_j(x). The KKT conditions at a local minimum x^* are:
- Stationarity: \nabla_x \mathcal{L}(x^*, \lambda^*, \mu^*) = 0, i.e., \nabla f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla g_i(x^*) + \sum_{j=1}^p \mu_j^* \nabla h_j(x^*) = 0.
- Primal feasibility: g_i(x^*) \leq 0 for all i, h_j(x^*) = 0 for all j.
- Dual feasibility: \lambda_i^* \geq 0 for all i.
- Complementary slackness: \lambda_i^* g_i(x^*) = 0 for all i.
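The second-order test \nabla^2 f(x) \succeq 0 discussed above lends itself to a simple numerical experiment. Below is a minimal sketch, assuming an illustrative two-variable function with a hand-coded Hessian; sampling can refute convexity on a region but cannot prove it.

```python
import numpy as np

# Heuristic convexity check: test positive semi-definiteness of the Hessian
# at many sample points, per the characterization nabla^2 f(x) >= 0.
# One failing sample disproves convexity; all-pass is only evidence.

def hessian(x):
    # Hessian of f(x1, x2) = x1^4 + x2^2, a convex function chosen for the demo
    return np.array([[12.0 * x[0] ** 2, 0.0],
                     [0.0, 2.0]])

rng = np.random.default_rng(0)
samples = rng.uniform(-5.0, 5.0, size=(1000, 2))
psd = all(np.linalg.eigvalsh(hessian(x)).min() >= -1e-10 for x in samples)
print("Hessian PSD on all samples:", psd)   # True for this f
```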
Duality and Sensitivity Analysis
In nonlinear programming, Lagrangian duality provides a framework for reformulating the primal optimization problem to derive bounds and insights into its structure. The Lagrangian function for a problem minimizing f(x) subject to inequality constraints g(x) \leq 0 and equality constraints h(x) = 0 is defined as L(x, \lambda, \mu) = f(x) + \lambda^T g(x) + \mu^T h(x), where \lambda \geq 0 are multipliers for the inequalities and \mu are unrestricted multipliers for the equalities.[32] The dual function (written q here to avoid a clash with the constraint function g) is then constructed as q(\lambda, \mu) = \inf_x L(x, \lambda, \mu), which yields a lower bound on the primal objective value when the infimum is finite.[33] The dual problem is formulated as maximizing q(\lambda, \mu) subject to \lambda \geq 0, and it is a convex optimization problem even if the primal is nonconvex. Weak duality holds universally, stating that the optimal primal value is at least the optimal dual value, i.e., \inf \{f(x) \mid g(x) \leq 0, h(x) = 0\} \geq \sup \{q(\lambda, \mu) \mid \lambda \geq 0\}.[32] Strong duality, where the duality gap is zero and the optimal values coincide, requires additional conditions such as convexity of the primal and a constraint qualification like Slater's condition, which requires the existence of a strictly feasible point satisfying g(x) < 0 (with the equalities holding).[33] In nonconvex problems, a positive duality gap may persist, complicating the recovery of primal solutions from dual optima.[32]

Sensitivity analysis leverages dual variables to assess how solutions respond to perturbations in problem data. The optimal dual multipliers \lambda^* and \mu^* serve as shadow prices, representing the marginal change in the optimal objective value per unit relaxation of the corresponding constraint; for instance, \lambda_i^* indicates the rate of improvement in the objective when the i-th inequality bound is relaxed.[34] Perturbation analysis examines variations in parameters such as right-hand-side values or objective coefficients by considering the perturbed problem and deriving first-order expansions of the optimal solution around the nominal point, often using implicit function theorems under regularity conditions like second-order sufficiency.[35]

Duality plays a high-level role in optimization algorithms by generating lower bounds for branch-and-bound methods and aiding convergence proofs through dual ascent or subgradient techniques.[33] However, in nonconvex settings, the potential for nonzero duality gaps limits its reliability for exact solutions, necessitating hybrid approaches or relaxations.[32]
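Weak and strong duality can be seen concretely on a one-dimensional convex program. The following sketch assumes the illustrative problem \min x^2 subject to 1 - x \leq 0, whose dual function works out to q(\lambda) = \lambda - \lambda^2/4 (the inner infimum is attained at x = \lambda/2).

```python
import numpy as np

# Primal: minimize x^2 subject to 1 - x <= 0; optimum x* = 1, f* = 1.
# Dual function: q(lambda) = inf_x [x^2 + lambda*(1 - x)] = lambda - lambda^2/4.

f_star = 1.0
lam = np.linspace(0.0, 5.0, 501)
q = lam - lam ** 2 / 4.0

print("weak duality (q <= f* everywhere):", bool(np.all(q <= f_star + 1e-12)))
print("best dual bound:", q.max(), "at lambda =", lam[q.argmax()])   # 1.0 at 2.0
# The zero duality gap reflects strong duality: the problem is convex and
# x = 2 is strictly feasible, so Slater's condition holds.
```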
Solution Approaches
Local Optimization Methods
Local optimization methods in nonlinear programming seek to find local optima by iteratively improving an initial guess through descent directions, typically assuming the objective and constraints are smooth and differentiable. These methods are particularly effective for problems where the solution landscape features well-behaved local minima, such as convex or mildly nonconvex cases, but they do not guarantee global optimality. They form the backbone of many practical solvers due to their efficiency in high dimensions when started near a solution.[36]

For unconstrained nonlinear programs of the form \min_x f(x), where f: \mathbb{R}^n \to \mathbb{R} is continuously differentiable, the steepest descent method, also known as gradient descent, proceeds by moving in the direction of the negative gradient at each iterate x_k, which points toward the locally fastest decrease in f. The algorithm initializes at x_0 and computes the search direction d_k = -\nabla f(x_k); if d_k = 0, it stops, as a stationary point has been reached. Otherwise, a step length \alpha_k > 0 is chosen via exact or inexact line search to reduce f(x_k + \alpha_k d_k), and the update is x_{k+1} = x_k + \alpha_k d_k. This method, first proposed by Cauchy in 1847, converges to a stationary point under mild conditions, such as f being continuously differentiable on a compact set, though its linear convergence rate can be slow for ill-conditioned problems like quadratics with disparate eigenvalues.[37][38]

A more rapid alternative for unconstrained optimization is Newton's method, which approximates f locally by its second-order Taylor expansion m_k(p) = f(x_k) + \nabla f(x_k)^T p + \frac{1}{2} p^T H(x_k) p, where H(x_k) is the Hessian matrix, assumed positive definite near the solution. The minimizer of this quadratic model solves H(x_k) d_k = -\nabla f(x_k), yielding the Newton direction d_k = -H(x_k)^{-1} \nabla f(x_k). The update is then x_{k+1} = x_k + \alpha_k d_k, often with \alpha_k = 1 for full steps when close to the optimum, or damped via line search otherwise. Under twice continuous differentiability of f, strict positive definiteness of H near a local minimum x^* where \nabla f(x^*) = 0, and Lipschitz continuity of H, Newton's method exhibits local quadratic convergence: \|x_{k+1} - x^*\| \leq C \|x_k - x^*\|^2 for some constant C > 0 and iterates sufficiently close to x^*.[39][40]

Constrained nonlinear programs \min_x f(x) subject to c(x) = 0 and g(x) \leq 0 require handling feasibility. One approach is the quadratic penalty method, which transforms the problem into an unconstrained sequence by augmenting the objective with a term penalizing constraint violations. The penalty function is P(x, \rho) = f(x) + \frac{\rho}{2} \|c(x)\|^2 + \frac{\rho}{2} \sum_i [\max(0, g_i(x))]^2, where \rho > 0 is a penalty parameter increased monotonically to infinity across iterations. Each subproblem \min_x P(x, \rho_k) is solved using an unconstrained method like steepest descent or Newton's method, and the sequence of minimizers approaches a solution of the original problem as \rho_k \to \infty. The quadratic penalty idea, commonly traced to Courant's 1943 work, ensures convergence to a Karush-Kuhn-Tucker (KKT) point under suitable regularity, though ill-conditioning arises for large \rho due to near-singular Hessians.[41][42]
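The penalty loop is short enough to sketch directly. The code below is a minimal illustration, assuming a toy equality-constrained problem and using SciPy's BFGS as the inner unconstrained solver; the penalty schedule is arbitrary.

```python
import numpy as np
from scipy.optimize import minimize

# Quadratic penalty sketch: minimize x1 + x2 subject to
# h(x) = x1^2 + x2^2 - 1 = 0. Known solution: x* = (-1/sqrt(2), -1/sqrt(2)).

def f(x):
    return x[0] + x[1]

def h(x):
    return x[0] ** 2 + x[1] ** 2 - 1.0

x = np.array([2.0, 0.0])                  # starting point
for rho in [1.0, 10.0, 100.0, 1000.0]:    # monotonically increasing penalty
    # Solve the unconstrained subproblem min_x P(x, rho) with BFGS,
    # warm-starting at the previous minimizer.
    P = lambda x, rho=rho: f(x) + 0.5 * rho * h(x) ** 2
    x = minimize(P, x, method="BFGS").x
    print(f"rho={rho:7.1f}  x={np.round(x, 4)}  |h(x)|={abs(h(x)):.2e}")
# The iterates approach the constrained minimizer as rho grows, while the
# subproblems become increasingly ill-conditioned.
```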
Sequential Quadratic Programming (SQP) addresses constrained problems more directly by iteratively approximating the nonlinear program with a quadratic program (QP) subproblem. At iterate x_k, the QP minimizes a quadratic model of the Lagrangian \mathcal{L}(x, \lambda) = f(x) + \lambda^T c(x), using the Hessian (or an approximation) of \mathcal{L} and linearizing the constraints: \min_d \frac{1}{2} d^T H_k d + \nabla f(x_k)^T d subject to \nabla c(x_k)^T d + c(x_k) = 0 and \nabla g(x_k)^T d + g(x_k) \leq 0. The solution (d_k, \lambda_{k+1}) provides the step and updated multipliers; a line search along this direction ensures descent in a merit function incorporating KKT residuals. Originating in Wilson's 1963 thesis and advanced by Han in 1976 with proofs of superlinear convergence using quasi-Newton updates, SQP achieves local quadratic convergence when exact Hessians are used and second-order sufficiency holds.[36]

Interior-point methods handle inequality constraints via barrier functions that prevent iterates from crossing the constraint boundaries. For \min_x f(x) subject to c(x) = 0 and g(x) \geq 0, the logarithmic barrier subproblem is \min_x f(x) - \mu \sum_i \log(g_i(x)) with the equality constraints retained, where the barrier parameter \mu > 0 decreases to zero. Primal-dual variants solve perturbed KKT systems simultaneously for the primal x, duals \lambda, \nu, and slack variables, using Newton steps on the barrier KKT conditions. These methods, extended from linear to nonlinear programs in works like those of Overton and Wright (1998), exhibit local superlinear or quadratic convergence near strict local solutions under linear independence and second-order conditions.[43]
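The barrier mechanism can be seen in one dimension. Below is a minimal sketch assuming the illustrative problem \min (x+1)^2 subject to x \geq 0, whose solution x^* = 0 lies on the boundary; each subproblem is solved with a bounded scalar minimizer rather than the Newton steps a production solver would use.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Logarithmic barrier sketch: minimize (x + 1)^2 subject to x >= 0.
# Barrier subproblem: B(x; mu) = (x + 1)^2 - mu * log(x), mu -> 0.

for mu in [1.0, 0.1, 0.01, 0.001]:          # decreasing barrier parameter
    res = minimize_scalar(
        lambda x, mu=mu: (x + 1.0) ** 2 - mu * np.log(x),
        bounds=(1e-12, 10.0), method="bounded")
    print(f"mu={mu:6.3f}  x(mu)={res.x:.6f}")
# The minimizers x(mu) trace the central path toward the constrained
# solution x* = 0 while remaining strictly interior (x > 0).
```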
Global Optimization Methods
Global optimization methods seek to identify the global minimum (or maximum) of nonconvex nonlinear programs, where the objective function and constraints may exhibit multiple local optima, complicating the search process. Unlike local methods, these approaches systematically explore the feasible region to ensure optimality guarantees or high-confidence approximations, often at significant computational cost. Deterministic methods provide rigorous proofs of global optimality through exhaustive partitioning and bounding, while stochastic methods offer heuristic exploration suitable for high-dimensional problems.
Deterministic Methods
Branch-and-bound (B&B) algorithms form the cornerstone of deterministic global optimization for nonlinear programs, systematically partitioning the variable space and using relaxations to bound subproblems. Spatial branching divides the domain of continuous variables into subintervals, creating a tree of nodes where each node represents a rectangular subdomain defined by tightened bounds on the variables. Lower bounds are computed by solving convex relaxations of the original nonconvex problem over each subdomain, such as linear programming approximations or convex underestimators that enclose the feasible objective values from below; upper bounds are obtained from feasible solutions, often via local solvers. If a node's lower bound exceeds the current best upper bound, the node is pruned, eliminating unpromising regions from further exploration. This process continues until all remaining nodes are fathomed, ensuring the global optimum is found. Node selection in B&B follows strategies like the best-bound rule, which prioritizes the node with the smallest lower bound to efficiently converge on the global solution by focusing computational effort on the most competitive subregions.

For problems involving bilinear terms, such as w = xy where x \in [x^L, x^U] and y \in [y^L, y^U], McCormick relaxations provide tight convex underestimators forming the convex envelope:
\begin{align*} w &\geq x^L y + y^L x - x^L y^L, \\ w &\geq x^U y + y^U x - x^U y^U. \end{align*}
These inequalities, along with analogous concave overestimators, enable spatial branching to refine bounds and reduce the search space effectively.

Complementary techniques, such as spatial branch-and-reduce, enhance B&B by incorporating range reduction rules to contract variable bounds prior to branching, further tightening relaxations and accelerating convergence. The αBB (α-based branch-and-bound) method exemplifies branch-and-reduce for twice-differentiable nonconvex functions, constructing convex underestimators via a quadratic perturbation. For a nonconvex function f: \mathbb{R}^n \to \mathbb{R} over a box subdomain [x^L, x^U], the underestimator is f^\alpha(x) = f(x) + \alpha \sum_{i=1}^n (x_i^L - x_i)(x_i^U - x_i), where \alpha \geq \frac{1}{2} \sup_{\xi \in \text{subdomain}} \max\{0, -\lambda_{\min}(\nabla^2 f(\xi))\} ensures convexity, with \lambda_{\min} the smallest eigenvalue of the Hessian; because the added quadratic term is nonpositive on the box, f^\alpha underestimates f, and the relaxation converges to the global optimum as partitioning refines.[44] Such methods, often combined with linear relaxations for constraints, have been pivotal in solving industrial-scale nonconvex problems.
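The αBB construction is easy to verify numerically in one dimension. The sketch below assumes the illustrative choice f(x) = \sin(x) on [0, 2\pi], where f''(x) = -\sin(x) \geq -1, so \alpha = 1/2 meets the bound above.

```python
import numpy as np

# alphaBB underestimator check for f(x) = sin(x) on [0, 2*pi]:
#   f_alpha(x) = f(x) + alpha * (xL - x) * (xU - x),  alpha = 0.5.
# The added term is <= 0 on the box, and f_alpha'' = 1 - sin(x) >= 0.

xL, xU, alpha = 0.0, 2.0 * np.pi, 0.5
x = np.linspace(xL, xU, 10001)
f = np.sin(x)
f_alpha = f + alpha * (xL - x) * (xU - x)

print("underestimates f: ", bool(np.all(f_alpha <= f + 1e-12)))
second_diff = np.diff(f_alpha, 2)            # discrete convexity test
print("convex on the box:", bool(np.all(second_diff >= -1e-9)))
```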
Stochastic Methods
Stochastic methods approximate global optima through randomized search, forgoing optimality guarantees in favor of scalability to complex landscapes. Genetic algorithms (GAs) operate on a population of candidate solutions, evolving them via natural selection principles: selection favors fitter individuals (lower objective values), crossover recombines pairs to generate offspring, and mutation introduces random perturbations to maintain diversity and escape local minima. This population-based evolution mimics Darwinian processes, progressively improving solutions over generations until convergence criteria are met, making GAs robust for multimodal nonlinear programs.

Simulated annealing (SA) draws from metallurgical annealing, starting with a high "temperature" parameter that allows probabilistic acceptance of worse solutions to explore broadly, then cooling gradually to refine toward a minimum. At each iteration, a neighbor solution is proposed; it is accepted with probability \min\{1, \exp(-\Delta f / T)\}, where \Delta f is the objective change and T the current temperature following a cooling schedule like geometric decay. This mechanism probabilistically tunnels through energy barriers, providing a heuristic for global search in nonconvex settings.
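The acceptance rule and cooling schedule fit in a few lines. The following is a minimal sketch, assuming an illustrative multimodal one-dimensional objective, a Gaussian proposal, and a geometric schedule; all parameters are arbitrary choices for the demo.

```python
import numpy as np

# Simulated annealing on f(x) = x^2 + 10*sin(3x), which has many local minima.
def f(x):
    return x ** 2 + 10.0 * np.sin(3.0 * x)

rng = np.random.default_rng(2)
x, fx = 4.0, f(4.0)
best_x, best_f = x, fx
T = 10.0
while T > 1e-3:
    cand = x + rng.normal(scale=0.5)               # propose a neighbor
    delta = f(cand) - fx
    if delta < 0 or rng.random() < np.exp(-delta / T):
        x, fx = cand, f(cand)                      # accept, possibly uphill
        if fx < best_f:
            best_x, best_f = x, fx
    T *= 0.99                                      # geometric cooling
print(f"best point ~ {best_x:.4f}, best value ~ {best_f:.4f}")
```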
Challenges
Global optimization of nonlinear programs is NP-hard in general: even minimizing a nonconvex quadratic objective subject to linear constraints, or deciding whether a feasible point exists, can require exponential time in the worst case. Additionally, these methods suffer from the curse of dimensionality, where the volume of the search space grows exponentially with the number of variables, rendering exhaustive enumeration impractical for high-dimensional problems despite advances in bounding and reduction techniques.
Applications and Implementations
Real-World Applications
Nonlinear programming (NLP) finds extensive application in engineering disciplines, where it addresses complex design and control problems involving nonlinear constraints and objectives. In structural optimization, NLP is employed to minimize the weight of truss structures while satisfying stress and displacement constraints, enabling efficient designs for bridges, buildings, and aerospace components.[45] For instance, hybrid optimization techniques combining NLP with buckling considerations have been used to enhance the performance of elastic truss systems under load.[46] In chemical process control, NLP optimizes nonlinear reactor models to maximize yield or minimize energy consumption, accounting for reaction kinetics and heat transfer dynamics that introduce nonlinearity.[47] These applications adapt the general NLP formulation by incorporating domain-specific models, such as differential equations for reactor behavior, to achieve operational efficiency.[48]

In economics and finance, NLP supports decision-making under uncertainty and strategic interactions. Portfolio optimization often involves minimizing nonlinear risk measures like Value-at-Risk (VaR), which captures tail risks in asset returns through nonlinear formulations, leading to more robust investment strategies.[49] Additionally, equilibrium models in game theory are solved using NLP to find generalized Nash equilibria in finite normal-form games, where players' strategies interact nonlinearly to determine market outcomes or pricing.[50] These methods provide insights into competitive dynamics, such as oligopolistic markets, by formulating payoffs as nonlinear functions of mixed strategies.

Machine learning leverages NLP for core training and tuning tasks, treating model development as optimization problems. Neural network training is formulated as an unconstrained NLP, minimizing a nonlinear loss function over network parameters to fit data patterns, with methods like primal-dual interior-point approaches accelerating convergence.[51] Hyperparameter tuning, which selects optimal learning rates or layer sizes, is also cast as an NLP problem when the performance metric exhibits nonlinear dependencies on hyperparameters, enabling automated refinement for improved model accuracy.

In the energy sector, NLP is critical for managing power systems with nonlinear physics. Optimal power flow (OPF) in electrical grids minimizes generation costs or losses subject to nonlinear alternating current (AC) models, ensuring voltage stability and line limits across transmission networks.[52] This application has enabled real-time dispatch in large-scale grids, adapting NLP to handle the non-convexity inherent in power balance equations.[53]

Historically, NASA applied NLP for trajectory optimization during the 1960s Apollo missions, solving nonlinear problems to minimize entry heating while meeting entry corridor constraints for lunar returns.[54] In modern contexts, post-2010 advancements in GPU acceleration have scaled NLP for large-scale applications in AI, such as training massive neural networks, by parallelizing nonlinear solvers to handle millions of variables efficiently.[55]
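As a small concrete instance of the portfolio problems mentioned above, the sketch below minimizes portfolio variance under budget, target-return, and long-only constraints with SciPy's SLSQP; all data are made up for illustration, and variance stands in for more elaborate nonlinear risk measures such as VaR.

```python
import numpy as np
from scipy.optimize import minimize

mu = np.array([0.08, 0.12, 0.10])          # expected returns (illustrative)
Sigma = np.array([[0.10, 0.02, 0.01],      # covariance matrix (illustrative)
                  [0.02, 0.12, 0.03],
                  [0.01, 0.03, 0.08]])
target = 0.10                              # required expected return

res = minimize(
    lambda w: w @ Sigma @ w,               # nonlinear (quadratic) risk objective
    x0=np.full(3, 1.0 / 3.0),
    method="SLSQP",
    bounds=[(0.0, 1.0)] * 3,               # long-only positions
    constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0},
                 {"type": "ineq", "fun": lambda w: mu @ w - target}])
print("weights:", res.x.round(4), " variance:", round(res.fun, 5))
```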
Software Tools and Algorithms
Software tools for nonlinear programming encompass a range of open-source and commercial solvers that implement algorithms such as interior-point methods and sequential quadratic programming (SQP) to address constrained optimization problems.[56][57][58] These tools support both local and global optimization, with capabilities for handling sparse structures and large-scale instances involving millions of variables.[59][60]

Among open-source options, IPOPT (Interior Point OPTimizer) is a widely used package for large-scale nonlinear optimization, employing a primal-dual interior-point filter line-search algorithm that excels in sparse, constrained problems.[56] It supports interfaces in multiple languages, including C++, Fortran, and Python, and is particularly effective for continuous nonlinear programs with equality and inequality constraints.[59] SciPy.optimize, part of the Python-based SciPy library, provides accessible functions like minimize with the SLSQP method for nonlinear constrained optimization, making it suitable for medium-scale problems in scientific computing.[57] NLopt offers a lightweight C library with wrappers for languages like Python and Julia, integrating over 20 algorithms for local and global nonlinear optimization, including derivative-free options for noisy objectives.
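NLopt's Python wrapper illustrates the typical solver calling convention, in which objective and constraint callbacks fill gradient arrays in place. The problem below is an illustrative toy, not drawn from the NLopt documentation; the nlopt package must be installed separately.

```python
import numpy as np
import nlopt

# Minimize (x0 - 1)^2 + (x1 - 2)^2 subject to x0 + x1 <= 2 with NLopt's
# gradient-based SLSQP. Expected solution: (0.5, 1.5), objective 0.5.

def objective(x, grad):
    if grad.size > 0:
        grad[:] = [2.0 * (x[0] - 1.0), 2.0 * (x[1] - 2.0)]
    return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2

def constraint(x, grad):                    # NLopt convention: fc(x) <= 0
    if grad.size > 0:
        grad[:] = [1.0, 1.0]
    return x[0] + x[1] - 2.0

opt = nlopt.opt(nlopt.LD_SLSQP, 2)
opt.set_min_objective(objective)
opt.add_inequality_constraint(constraint, 1e-8)
opt.set_xtol_rel(1e-8)
x = opt.optimize(np.array([0.0, 0.0]))
print("x =", x, " f =", opt.last_optimum_value())
```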
Commercial solvers provide robust performance for industrial applications. MATLAB's fmincon function solves general constrained nonlinear multivariable optimization problems using SQP or interior-point algorithms, with options for trust-region methods and support for parallel evaluation of objectives and constraints.[58] In the GAMS modeling system, CONOPT serves as a specialized nonlinear solver based on generalized reduced gradient and active-set strategies, optimized for large, sparse models with highly nonlinear constraints.[61][62] KNITRO, developed by Artelys, implements advanced interior-point and active-set methods for nonlinear and mixed-integer nonlinear programs, featuring multistart capabilities for global optimization and parallelization for faster convergence on large instances.[60][63]
Key features across these tools include efficient handling of sparse matrices to reduce memory usage and computation time, as seen in IPOPT's support for sparse linear algebra routines from the HSL library, such as MA57.[64] Parallelization is integrated in solvers like KNITRO and SciPy's options for multiprocessing, enabling distributed evaluation of function gradients on multi-core systems.[60][57] Algorithm selection is flexible; for instance, NLopt allows users to choose from local (e.g., augmented Lagrangian) or global (e.g., differential evolution) methods via a unified API.
Integration with high-level modeling languages enhances usability for nonlinear programming. Pyomo, an open-source Python package, facilitates declarative modeling of nonlinear problems and interfaces with solvers like IPOPT and KNITRO through the AMPL Solver Library format. JuMP, a Julia domain-specific language, supports nonlinear expressions and automatic differentiation for second-order derivatives, seamlessly linking to NLopt and IPOPT for solving large-scale models.[65] These frameworks handle problems with millions of variables in modern applications, leveraging sparse data structures and GPU acceleration in emerging extensions.[56][66]
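A Pyomo model shows how these modeling layers separate the problem statement from the solver choice. The sketch below assumes Pyomo is installed together with an ipopt executable on the path; the model itself (a Rosenbrock objective with one nonlinear constraint) is illustrative.

```python
from pyomo.environ import (ConcreteModel, Var, Objective, Constraint,
                           SolverFactory, value, minimize)

# Declarative NLP: Rosenbrock objective restricted to a disk, solved by IPOPT.
m = ConcreteModel()
m.x = Var(initialize=1.5)
m.y = Var(initialize=1.5)
m.obj = Objective(expr=(1 - m.x) ** 2 + 100 * (m.y - m.x ** 2) ** 2,
                  sense=minimize)
m.disk = Constraint(expr=m.x ** 2 + m.y ** 2 <= 2.0)   # nonlinear constraint

SolverFactory("ipopt").solve(m)                        # hand off to the solver
print("x =", value(m.x), " y =", value(m.y), " obj =", value(m.obj))
```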
Recent developments as of 2025 emphasize stochastic extensions of traditional algorithms and AI integration for solver enhancement. For example, stochastic sequential quadratic programming (SQP) methods address optimization under noisy objectives, common in machine learning applications.[67] Additionally, coevolutionary approaches combining global exploration with local refinement have advanced solutions for large-scale combinatorial nonlinear problems.[68] Such innovations improve scalability and reliability for real-time and uncertain environments.
Hybrid solvers that combine local and global techniques continue to enhance reliability, particularly in mixed-integer nonlinear extensions. BARON (Branch-And-Reduce Optimization Navigator) exemplifies this by using branch-and-bound with NLP relaxations to guarantee global optimality in nonconvex problems, supporting large-scale MINLPs through presolving and symmetry detection.[69][70] Such hybrids, including GPU-accelerated variants, have improved scalability for real-time optimization in the 2020s.[55]