Fact-checked by Grok 2 weeks ago

Benders decomposition

Benders decomposition is an exact optimization algorithm for solving mixed-integer linear programming (MILP) problems, which decomposes a complex model into a master problem involving complicating (typically integer) variables and simpler subproblems for the remaining variables, iteratively generating feasibility and optimality cuts to converge to an optimal solution.^[1] Introduced by Jacques F. Benders in 1962 as a partitioning procedure for mixed-variable programming, the method exploits problem structure to decentralize computation and handle large-scale instances efficiently.^[1]^[2] The algorithm operates through an iterative process grounded in projection and dualization: the master problem provides trial solutions for the complicating variables, which are fixed in the subproblems to check feasibility and optimality, yielding dual multipliers that form linear cuts added back to the master to tighten its relaxation.^[2] Originally formulated for linear programs with mixed continuous and integer variables, Benders decomposition has been extended over decades to nonlinear, stochastic, multi-stage, and bilevel optimization contexts, with foundational enhancements like generalized versions for convex problems.^[1]^[2] Its convergence relies on the formulation's projection onto complicating variables and outer linearization of the value function, making it particularly effective for problems with block-angular structures in fields such as supply chain planning, energy systems, and network design.^[2] Despite potential slow convergence in practice due to weak cuts, numerous acceleration techniques—including cut selection, aggregation, and hybrid metaheuristics—have been developed to enhance its performance in combinatorial optimization.^[2]

Overview

Core Concept

Benders decomposition is a partitioning technique designed to solve mixed-integer linear programming (MILPs) problems by dividing them into a restricted master problem, which manages the complicating variables—often the integer or binary ones—and one or more subproblems that explore the feasible region of the continuous variables once the master decisions are fixed.^[3] This approach exploits the structure of problems where the complicating variables couple the constraints, allowing the subproblems to become tractable linear programs after fixing those variables.^[4] The high-level workflow begins with solving the master problem to generate candidate values for the complicating variables, followed by solving the subproblems to assess the quality of these candidates and produce cuts—either optimality cuts that tighten the objective bound or feasibility cuts that eliminate invalid solutions—which are then incorporated into the master problem for subsequent iterations.^[3] This iterative refinement leverages duality in the subproblems to generate valid inequalities without resolving the entire problem from scratch.^[4] Named after Jacques F. Benders, the method was introduced in 1962 for deterministic mixed-integer linear programming (MILP) problems, and has since been extended to stochastic contexts.^[5] It proves particularly effective for optimization scenarios featuring a limited number of complicating variables alongside expansive continuous decision spaces, substantially lowering computational demands relative to monolithic MILP solvers.^[4]

Advantages and Limitations

Benders decomposition is particularly advantageous for structured mixed-integer linear programming (MILP) problems where the complicating variables—typically a small number of integer variables—can be separated from the continuous ones, allowing the subproblems to become significantly easier to solve once these integers are fixed.^[4] This approach exploits the inherent structure of the problem, decentralizing the computational burden and enabling efficient handling of large-scale instances in fields such as transportation and energy planning.^[4] Furthermore, the method generates strong optimality and feasibility cuts that effectively prune the search space in the master problem, accelerating convergence to the global optimum compared to exhaustive enumeration methods.^[4] The parallelizability of subproblems represents another key strength, as multiple subproblems can be solved independently and concurrently, which is especially beneficial for distributed computing environments and reduces overall solution time.^[6] In stochastic programming contexts, Benders decomposition naturally accommodates multi-stage and scenario-based extensions by decomposing the problem into a master for first-stage decisions and subproblems for recourse actions, making it a foundational tool for such applications without requiring extensive reformulation.^[4] Despite these benefits, Benders decomposition often exhibits slow convergence in early iterations due to the generation of weak initial cuts, leading to a "tailing-off" effect where progress stalls and the number of iterations increases substantially.^[4] The algorithm's reliance on the solvability of subproblems—typically assuming linear programming duality for cut derivation—can fail if subproblems are infeasible or unbounded, necessitating additional handling mechanisms that complicate implementation.^[4] It is highly sensitive to problem structure; performance degrades markedly when there are many integer complicating variables, as the master problem then becomes computationally intensive, or when variables are tightly coupled, reducing the decomposition's effectiveness.^[4] Additionally, numerical instability in dual solutions, arising from ill-conditioned subproblems or solver tolerances, can cause erratic bound progression and unreliable cuts.^[7] In comparison to alternatives like branch-and-bound, Benders decomposition can outperform on decomposable problems by avoiding full enumeration, but it may underperform on tightly coupled instances where the benefits of decomposition are minimal.^[4] Practically, while modern solvers such as CPLEX and Gurobi facilitate implementation through callbacks for cut generation, custom coding is often required for tailored cut selection and stabilization, increasing development effort for non-standard problems.^[8]

Mathematical Formulation

Primal Problem Setup

Benders decomposition addresses mixed-integer linear programming (MILP) problems where a subset of decision variables introduces combinatorial complexity, hindering direct solution methods. The primal problem is formulated as follows:

\begin{align*} \min_{x, y} \quad & \mathbf{c}^\top \mathbf{x} + \mathbf{d}^\top \mathbf{y} \\ \text{s.t.} \quad & A \mathbf{x} + B \mathbf{y} \geq \mathbf{b}, \\ & \mathbf{x} \in X, \\ & \mathbf{y} \in Y, \end{align*}

where \mathbf{x} represents the vector of complicating integer variables belonging to a discrete set X \subseteq \mathbb{Z}^{n_x}_+ (typically non-negative integers or binary), \mathbf{y} is the vector of continuous variables in the non-negative orthant Y = \mathbb{R}^{n_y}_+, \mathbf{c} \in \mathbb{R}^{n_x}, \mathbf{d} \in \mathbb{R}^{n_y}, A \in \mathbb{R}^{m \times n_x}, B \in \mathbb{R}^{m \times n_y}, and \mathbf{b} \in \mathbb{R}^m. This structure originates from the partitioning approach for mixed-variables programs, adapted to modern notation where integer variables are designated as complicating to facilitate decomposition.^[1] The problem assumes linearity in both the objective function and constraints, ensuring convexity in \mathbf{y} for any fixed \mathbf{x}, a bounded feasible region to guarantee finite optima, and attainment of strong duality in the resulting linear programs when \mathbf{x} is fixed (zero duality gap). These properties enable the separation of integer and continuous decisions without loss of optimality. The formulation further assumes that the subproblem is bounded whenever feasible, meaning no recession direction \mathbf{y}^r \geq \mathbf{0} exists with B \mathbf{y}^r \geq \mathbf{0} and \mathbf{d}^\top \mathbf{y}^r < 0, ensuring the recourse function \theta(\mathbf{x}) is finite for feasible \mathbf{x}.^[1] The rationale for decomposition lies in the interdependence: the choice of \mathbf{x} defines the feasible set for \mathbf{y}, rendering the joint problem computationally intractable for large instances due to the integer requirements on \mathbf{x}. By fixing \mathbf{x} to a candidate value \bar{\mathbf{x}}, the subproblem in \mathbf{y} reduces to a standard linear program, which is efficiently solvable and provides dual information to refine the integer choices. This partitioning exploits the structure to iteratively approximate the original problem.^[1]

Master Problem

In Benders decomposition, the master problem serves as an iterative approximation to the original mixed-integer linear program, concentrating on the complicating integer variables x while using an auxiliary variable \eta to estimate the contribution from the continuous variables. The initial restricted master problem is typically formulated as the linear programming relaxation over x \in X, where X represents the feasible domain for the integer decisions (often starting as an LP relaxation without full integrality enforced initially). It takes the form

\min_{x, \eta} \ c^T x + \eta

subject to

x \in X, \quad \eta \in \mathbb{R},

with no initial constraints projecting the continuous subproblem feasibility onto x (such as A x \geq b), rendering it loose or potentially unbounded. Here, \eta acts as a lower bound on the optimal value of the subproblem for the selected x. This setup originates from the partitioning approach in the seminal work, which separates mixed-variable problems into a master over discrete choices and subproblems over continuous ones.^[1] To refine the lower bound on the subproblem value, optimality cuts are incorporated iteratively based on dual solutions from feasible subproblems. For a candidate solution x^k obtained from solving the current master, the subproblem yields dual multipliers \pi^k. The resulting optimality cut is added as

\eta \geq \pi^{k\top} (b - A x),

ensuring that for any feasible x, \eta underestimates the true subproblem recourse function \theta(x) via the supporting hyperplane defined by \pi^k, and is tight at x^k since \theta(x^k) = \pi^{k\top} (b - A x^k). These cuts, derived from extreme points of the dual subproblem polyhedron, progressively tighten the approximation of the value function.^[1]^[9] When the subproblem proves infeasible for a candidate x^k, a feasibility cut is generated using an extreme ray \pi^k from the unbounded dual, excluding regions of x that render the overall problem infeasible. This cut takes the form

\pi^{k\top} (b - A x) \leq 0,

which carves out invalid x by enforcing the projection of the joint feasible set onto X. Such cuts stem from Farkas' lemma applied to the subproblem's infeasibility certificate, where \pi^k \geq 0 and \pi^k B \leq 0^\top.^[1]^[9] The master problem begins as a coarse relaxation—often an LP over the projected integer space—and evolves through the accumulation of these optimality and feasibility cuts, transforming into a strengthened mixed-integer program that converges to the global optimum as the cuts span the dual polyhedron and recession cone. This iterative strengthening maintains finite convergence under standard assumptions like bounded polyhedra, with the final master solution satisfying all generated cuts and aligning with the subproblem value.^[1]^[10]

Subproblem and Dual

In Benders decomposition, once a candidate solution \mathbf{x} is obtained from the master problem, the complicating variables \mathbf{x} are fixed, reducing the original mixed-integer linear program to a pure linear programming subproblem in the continuous variables \mathbf{y}. This subproblem, often referred to as the primal subproblem, is formulated as:

\begin{align*} \min_{\mathbf{y}} &\quad \mathbf{d}^T \mathbf{y} \\ \text{s.t.} &\quad B \mathbf{y} \geq \mathbf{b} - A \mathbf{x}, \\ &\quad \mathbf{y} \geq \mathbf{0}, \end{align*}

where B is the coefficient matrix for \mathbf{y}, \mathbf{d} is the cost vector for \mathbf{y}, and the right-hand side is adjusted based on the fixed \mathbf{x}. This subproblem is solvable as a standard linear program (LP) using methods like the simplex algorithm, and its optimal value, denoted \theta(\mathbf{x}), represents the recourse cost for the given \mathbf{x}. The formulation assumes the subproblem is feasible for the current \mathbf{x}; cases of infeasibility or unboundedness are addressed separately to generate appropriate cuts. The dual of this primal subproblem plays a central role in generating valid cuts for the master problem, leveraging the structure of linear programming duality. The dual subproblem is:

\begin{align*} \max_{\boldsymbol{\pi}} &\quad \boldsymbol{\pi}^T (\mathbf{b} - A \mathbf{x}) \\ \text{s.t.} &\quad \boldsymbol{\pi}^T B \leq \mathbf{d}^T, \\ &\quad \boldsymbol{\pi} \geq \mathbf{0}, \end{align*}

where \boldsymbol{\pi} are the unrestricted dual variables (in sign) associated with the inequality constraints B \mathbf{y} \geq \mathbf{b} - A \mathbf{x}. If the primal subproblem is feasible and bounded, strong duality theorem from linear programming ensures that the optimal primal objective \theta(\mathbf{x}) equals the optimal dual objective value, \boldsymbol{\pi}^{*T} (\mathbf{b} - A \mathbf{x}), where \boldsymbol{\pi}^* is an optimal dual solution. This equality validates the optimality cut added to the master problem: \eta \geq \boldsymbol{\pi}^{*T} (\mathbf{b} - A \mathbf{x}), which provides a linear underestimator of the recourse function \theta(\mathbf{x}) and is tight at the current \mathbf{x}. In cases where the primal subproblem is infeasible for the fixed \mathbf{x} (i.e., no \mathbf{y} \geq \mathbf{0} satisfies B \mathbf{y} \geq \mathbf{b} - A \mathbf{x}), a phase-I feasibility subproblem is solved to certify infeasibility and generate a feasibility cut. This involves introducing artificial slack variables \mathbf{s} \geq \mathbf{0} to form an auxiliary LP:

\begin{align*} \min_{\mathbf{y}, \mathbf{s}} &\quad \mathbf{1}^T \mathbf{s} \\ \text{s.t.} &\quad B \mathbf{y} + \mathbf{s} \geq \mathbf{b} - A \mathbf{x}, \\ &\quad \mathbf{y} \geq \mathbf{0}, \, \mathbf{s} \geq \mathbf{0}, \end{align*}

where the objective minimizes the total artificial violation. If the optimal value is positive, the problem is infeasible. The dual of this phase-I problem is then maximized over \boldsymbol{\pi} \geq \mathbf{0} subject to \boldsymbol{\pi}^T B \leq \mathbf{0} and \|\boldsymbol{\pi}\| \leq 1 (or a similar normalization, such as \boldsymbol{\pi}^T \mathbf{1} = 1 for scaling), yielding \boldsymbol{\pi}^T (\mathbf{b} - A \mathbf{x}) > 0. This generates the feasibility cut \boldsymbol{\pi}^T (\mathbf{b} - A \mathbf{x}) \leq 0 for the master problem, which excludes the current infeasible \mathbf{x} while remaining valid for the original feasible set. The normalization ensures the cut is non-trivial and computationally stable. The reliance on strong duality for both optimality and feasibility cuts is fundamental to the validity of Benders decomposition, as it guarantees that the generated inequalities are supporting hyperplanes to the recourse function \theta(\mathbf{x}) without requiring the primal subproblem to be solved to optimality in every iteration—dual solutions suffice for cut generation. This dual perspective also facilitates efficient implementation, as modern LP solvers provide optimal dual solutions alongside primal ones.

Algorithm Procedure

Initialization

The initialization phase of the Benders decomposition algorithm establishes the starting point for the iterative process by setting up the initial master problem, solving the corresponding subproblem, and defining initial bounds on the optimal value. This setup ensures the algorithm begins with a valid relaxation that can be progressively tightened through cuts. The initial master problem is constructed as the linear programming relaxation of the restricted master, disregarding integer constraints on the complicating variables x to yield a feasible starting solution x^0. The auxiliary variable \eta, which approximates the recourse function, is initialized to -\infty (or a sufficiently large negative value) to provide a lower underestimation of the second-stage contribution. Alternatively, in bounded formulations, \eta^0 may be set to a big-M value, where M is a constant larger than any anticipated subproblem objective to prevent unboundedness while allowing initial iterations to proceed.^[1] With x^0 in hand, the subproblem is formulated and solved to assess feasibility for this first-stage decision. If feasible, the optimal dual solution \pi^0 is recovered, enabling the generation of the first optimality cut to refine the master approximation. If infeasible, an extreme ray from the dual subproblem produces the initial feasibility cut, ensuring the master excludes this infeasible x^0 in subsequent solves. Bounds are initialized to frame the optimization gap: the lower bound z_{LB}^0 is taken as the objective value from the initial master solution, c^T x^0 + \eta^0, reflecting a valid relaxation. The upper bound z_{UB}^0 derives from any incumbent feasible solution to the full problem, often obtained via a simple heuristic, relaxation, or trial point such as x = 0 if nonnegative and satisfying supply constraints.^[11] Early-stage degeneracy, arising from multiple optimal dual solutions in the subproblem, can cause cycling or weak cuts; to mitigate this, small perturbations to the subproblem objective coefficients or stabilization techniques—such as aggregate cuts or proximal bundle methods—may be applied to promote diverse extreme points and numerical robustness. The big-M parameter, when used to cap \eta, is chosen as an overestimate of the maximum possible recourse cost (e.g., based on problem-specific bounds like total demand times unit cost), ensuring it does not exclude optimal solutions while maintaining solvability.

Iterative Steps

The iterative steps of Benders decomposition constitute the main loop of the algorithm, where the restricted master problem and subproblem are alternately solved to generate and incorporate cutting planes, progressively tightening the bounds on the optimal solution.^[1] This process refines the approximation of the recourse function until the duality gap closes, leveraging the duality between the primal subproblem and its dual to produce valid cuts.^[4] In the first step of each iteration k, the current restricted master problem is solved to optimality, yielding the solution x^k for the complicating variables and \eta^k as an underestimate of the recourse cost.^[1] This provides a lower bound on the overall objective value, given by c^\top x^k + \eta^k, where c are the costs associated with x.^[4] Next, the subproblem is formed by fixing the complicating variables to x^k and solving either the primal or dual formulation to assess feasibility and compute the recourse value v^k.^[1] If the subproblem is feasible, the optimal dual multipliers \pi^k are used to evaluate v^k; should v^k > \eta^k, an optimality cut is generated as

\eta \geq \zeta^k + \pi^k (b - A x),

where \zeta^k is the constant term derived from the dual objective at x^k, ensuring the cut supports the underestimation of the recourse function.^[4] This cut is added to the master problem, and the total objective c^\top x^k + v^k updates the upper bound, as it represents a feasible solution to the original problem.^[1] If the subproblem is instead infeasible for x^k, a feasibility-checking subproblem (or dual unbounded ray) is solved to obtain multipliers \pi^k satisfying the dual constraints with \pi^k (A x^k - b) > 0.^[4] A feasibility cut

\pi^k (A x - b) \leq 0

is then generated and appended to the master problem, eliminating the infeasible x^k and similar points from future consideration.^[1] The master problem is updated by incorporating the new cut(s)—either optimality or feasibility—and resolved to initiate the next iteration, with lower and upper bounds tracked throughout to monitor progress toward optimality.^[4] To accelerate convergence, techniques such as selecting the most violated cut from a pool or aggregating multiple cuts can be applied, reducing the number of constraints while preserving validity.^[12]

Convergence and Termination

Benders decomposition achieves finite convergence for mixed-integer linear programs where the complicating variables are integer-valued and the subproblem feasible region is a bounded polyhedron. This finite termination arises because the dual of the subproblem, formulated as a linear program, possesses a finite number of extreme points and extreme rays, each generating a distinct optimality or feasibility cut that eliminates a portion of the master problem's feasible region.^[13]^[4] Throughout the iterations, the algorithm tightens bounds on the optimal value: the lower bound \eta^k from the master problem solution is non-decreasing across iterations k, while the upper bound, given by the minimum objective value v over all feasible primal solutions encountered, is non-increasing. The optimality gap diminishes as these bounds converge, with termination upon satisfying \eta^k \geq \min v.^[4]^[14] Practical termination criteria include declaring optimality when the relative gap (\min v - \eta^k)/\min v \leq \epsilon, where \epsilon is a small tolerance such as 0.1%, to balance solution quality with computational effort. Infeasibility is detected and terminated if the master problem admits no feasible solution for x after incorporating all necessary feasibility cuts from unbounded dual rays.^[4] In the worst case, the algorithm may generate an exponential number of cuts, corresponding to the potentially exponential enumeration of subproblem dual extremes. Despite this theoretical bound, empirical performance often approximates polynomial time, particularly when cuts effectively prune the search space.^[15] The tolerance \epsilon influences near-optimality, with tighter values yielding solutions closer to exact but risking excessive iterations due to numerical precision limits. Dual degeneracy in the subproblem can exacerbate convergence delays by producing redundant or weak cuts from multiple optimal dual solutions yielding identical multipliers.^[14]^[4]

Applications and Extensions

Practical Examples

Benders decomposition finds application in various domains where mixed-integer linear programs feature a small number of complicating integer variables and a large set of continuous variables, such as production planning in manufacturing.^[4]

Example 1: Production Planning

Consider a simplified uncapacitated facility location problem, which models production planning decisions where binary variables indicate whether to open production facilities and continuous variables represent customer demand assignments to those facilities. The goal is to minimize the sum of fixed opening costs for facilities and transportation costs for assigning demands. With 3 potential facilities (plants) and 5 customers, fixed costs are $2, $3, $3 for plants 1, 2, 3, respectively, and transportation costs per unit demand are given by the matrix:

Plant/Customer	1	2	3	4	5
1	2	3	4	5	7
2	4	3	1	2	6
3	5	4	2	1	3

Each customer has unit demand. The integer variables y_i \in \{0,1\} decide facility openings, and continuous x_{ij} \geq 0 assign demand from facility i to customer j, with x_{ij} \leq y_i and \sum_i x_{ij} = 1 for each j.^[16] In the initial iteration, solve the master problem relaxing the subproblem to \theta = 0, yielding y = (1,0,0) with lower bound z^{LP} = 2 (fixed cost only). The subproblem for this y is feasible, with optimal assignment x_{1j} = 1 for all j, giving recourse cost 21 (total transportation $2+3+4+5+7), upper bound z^{UP} = 23, and gap 21. The dual multipliers for demand constraints are v = (2,3,4,5,7) (costs from open plant 1), and for linking constraints w = (0,0,0,0,0, 0,0,3,3,1, 0,0,2,4,4) based on closed plants 2 and 3. This generates the optimality cut \theta \geq 21 + 2y_1 - 4y_2 - 7y_3.^[16] Adding the cut to the master and resolving gives y = (0,1,1), \theta = 10, lower bound 16. The subproblem yields recourse cost 12 (assignments using plants 2 and 3), upper bound 18, gap 2. A new dual solution produces another optimality cut, such as adjustments for the new y. After 2-3 iterations, convergence occurs at optimal y = (1,0,1), with assignments balancing costs, total objective 16 (fixed 5 + transportation 11). This demonstrates how cuts tighten the master approximation iteratively.^[16]

Example 2: Network Design

In network design, binary variables select edges to build, while continuous variables handle flows subject to conservation and capacity constraints tied to edge selections. A small example involves routing flow from node 1 to sink 8 across 8 nodes, with arc capacities in matrix G (e.g., G_{1,2}=3, G_{1,3}=2, G_{1,4}=2, others 0-5), minimizing 0.1 times number of arcs plus negative outflow from source (to maximize flow, equivalent to min -flow). Binary x_{ij} opens arcs, continuous y_{ij} \geq 0 carries flow, with y_{ij} \leq G_{ij} x_{ij}, total arcs ≤11, and flow balance at intermediate nodes.^[17] The initial master (relaxed θ ≥ -sum G) solves to lower bound -29, selecting several arcs. The subproblem is feasible with outflow 5.9, upper bound 0 (no fixed cost considered yet), adding feasibility cut none but optimality cut $3x_{1,2} + 2x_{1,3} + 2x_{1,4} + \theta \geq 0 from dual π on balance constraints. In iteration 2, master lower bound -6.7, subproblem outflow 4.7, cut $5x_{2,5} + x_{3,5} + x_{2,6} + \dots + \theta \geq 0. Iteration 3 adds x_{3,7} + 4x_{5,8} + 2x_{6,8} + \theta \geq 0, tightening to lower -6.5. By iteration 7, after adding cuts like $3x_{1,2} + x_{3,5} + 2x_{6,8} + \dots + \theta \geq 0, bounds meet at -5.1, optimal arcs include (1,2),(1,3),(1,4),(2,5),(3,5),(3,6),(4,6),(5,8),(6,8), flows e.g., 3 on (1,2), 2 on (1,3), total flow 5.1. This highlights feasibility via flow balance and optimality cuts from duals, with infeasible x generating feasibility cuts if no flow path exists.^[17] Benders decomposition suits domains like logistics (e.g., warehouse location), energy systems (e.g., unit commitment with network constraints), and scheduling (e.g., task assignment to machines), where integer decisions are few and continuous parts scale large.^[4]^[18]^[19] For solver integration, such as in Python with Gurobi, the process involves iteratively solving a master MIP with Gurobi, extracting fixed integer x, solving the LP subproblem, and adding cuts via callbacks or manual loops. A pseudocode snippet for cut generation in the uncapacitated facility location follows:

import gurobipy as gp

# Master model setup (binary y_i for facilities, theta)
master = gp.Model("master")
y = master.addVars(num_facilities, vtype=gp.GRB.BINARY, name="y")
theta = master.addVar(lb=-gp.GRB.INFINITY, name="theta")
master.setObjective(fixed_costs @ y + theta, gp.GRB.MINIMIZE)
# Add initial cuts, e.g., theta >= 0
master.optimize()

while not converged:
    x_fixed = [y[i].X for i in facilities]
    # Subproblem LP for fixed x_fixed
    sub = gp.Model("sub")
    assign = sub.addVars(customers, facilities, lb=0, name="assign")
    sub.setObjective(transport @ assign, gp.GRB.MINIMIZE)
    for j in customers:
        sub.addConstr(gp.quicksum(assign[j,i] for i in facilities) == 1)
    for i in facilities:
        for j in customers:
            sub.addConstr(assign[j,i] <= x_fixed[i])
    sub.optimize()
    if sub.status == gp.GRB.INFEASIBLE:
        # Add feasibility cut from IIS or dual rays
        sub.computeIIS()
        # Generate and add to master: sum pi * (constraints for y)
        master.cbLazy(feas_cut)
    else:
        duals = [con.Pi for con in sub.getConstrs()]
        opt_cut = sum(duals_demand @ b - duals_link @ A @ y) <= theta
        master.addConstr(opt_cut)
    master.optimize()
    # Check gap between theta.X and sub.objVal
import gurobipy as gp

# Master model setup (binary y_i for facilities, theta)
master = gp.Model("master")
y = master.addVars(num_facilities, vtype=gp.GRB.BINARY, name="y")
theta = master.addVar(lb=-gp.GRB.INFINITY, name="theta")
master.setObjective(fixed_costs @ y + theta, gp.GRB.MINIMIZE)
# Add initial cuts, e.g., theta >= 0
master.optimize()

while not converged:
    x_fixed = [y[i].X for i in facilities]
    # Subproblem LP for fixed x_fixed
    sub = gp.Model("sub")
    assign = sub.addVars(customers, facilities, lb=0, name="assign")
    sub.setObjective(transport @ assign, gp.GRB.MINIMIZE)
    for j in customers:
        sub.addConstr(gp.quicksum(assign[j,i] for i in facilities) == 1)
    for i in facilities:
        for j in customers:
            sub.addConstr(assign[j,i] <= x_fixed[i])
    sub.optimize()
    if sub.status == gp.GRB.INFEASIBLE:
        # Add feasibility cut from IIS or dual rays
        sub.computeIIS()
        # Generate and add to master: sum pi * (constraints for y)
        master.cbLazy(feas_cut)
    else:
        duals = [con.Pi for con in sub.getConstrs()]
        opt_cut = sum(duals_demand @ b - duals_link @ A @ y) <= theta
        master.addConstr(opt_cut)
    master.optimize()
    # Check gap between theta.X and sub.objVal

This loop adds optimality cuts from subproblem duals and feasibility cuts for invalid y, terminating on small gap.^[20]^[21]

Variants and Improvements

Logic-based Benders decomposition extends the classical method to handle non-convex or combinatorial subproblems where linear programming duals are unavailable or inadequate. Instead of relying on dual multipliers, it generates cuts through logical inference and constraint propagation on the subproblem, such as no-good cuts that exclude infeasible or suboptimal integer solutions. This approach, formalized by Hooker, treats classical Benders cuts as a special case of inference duals and has been applied to large-scale mixed-integer problems like scheduling and planning. For instance, in wildfire suppression models, logic-based cuts strengthen no-good formulations to improve bounding and convergence.^[22]^[23]^[24] Accelerated variants address slow convergence in standard Benders by enhancing cut generation and master problem stability. Bundle methods aggregate multiple cuts into a single surrogate to reduce the number of constraints while preserving approximation quality, as seen in multicommodity network design. Stabilization techniques, such as trust regions on the recourse variable \eta, restrict master solutions to neighborhoods of promising points, mitigating oscillations in large-scale combinatorial optimization. Hybrid approaches integrate Benders with branch-and-bound for integer master problems, iteratively probing to tighten bounds. Pareto-optimal cuts further improve efficiency by selecting non-dominated cuts that provide the strongest lower bounds, often reducing iterations by up to 50% in capacitated facility location problems.^[25]^[26]^[27]^[28] Generalized Benders decomposition, proposed by Geoffrion, adapts the method for nonlinear convex problems by deriving cuts from supporting hyperplanes of the value function rather than LP duals. It separates complicating variables into a master problem and solves nonlinear subproblems to generate feasibility and optimality cuts based on convex duality theory. This extension ensures finite convergence under convexity assumptions and has been extended to nonconvex cases with branch-and-cut frameworks for global optimization in stochastic mixed-integer nonlinear programs.^[29]^[30]^[31] Stochastic Benders decomposition handles multi-stage problems with uncertainty by incorporating scenario trees, where nested decompositions solve forward and backward passes to propagate cuts across stages. For recourse problems, progressive hedging augments the decomposition with penalty terms to enforce non-anticipativity, accelerating convergence in scenario-based models like energy planning. Enhanced variants, such as scaled cuts, further stabilize multi-stage mixed-integer programs by adjusting cut slopes based on scenario probabilities.^[32]^[33]^[34] Modern improvements leverage machine learning for cut selection, training classifiers like support vector machines to identify effective cuts from historical data, reducing added constraints by 30-70% in transmission expansion planning. Reinforcement learning surrogates approximate the integer master to guide cut generation, while parallel solving of subproblems across cores or distributed systems scales to thousands of scenarios in stochastic security-constrained unit commitment. These 2020s developments integrate with ML-enhanced solvers, addressing tail convergence in mixed-integer settings through data-driven Pareto-optimal selection. As of 2025, advances include perspective Benders cuts for fixed-charge networks and accelerated variants for reliable facility location problems.^[35]^[36]^[37]^[38]^[39]^[40]

References

[1]
https://doi.org/10.1007/BF01386316
[2]
https://doi.org/10.1016/j.ejor.2016.12.005
[3]
[PDF] An introduction to Benders decomposition
Benders decomposition. We define the recourse function (second-stage value function, slave problem) f (x) = min{qT y : Wy = h − Tx, y ≥ 0}. (2). If for some ...Missing: explanation | Show results with:explanation
[4]
[PDF] The Benders Decomposition Algorithm: A Literature Review - Cirrelt
This paper presents a state-of-the-art survey of this algorithm, with an emphasis on its use in combinatorial optimization. We discuss the classical algorithm, ...
[5]
Partitioning procedures for solving mixed-variables programming ...
Benders, J.F. Partitioning procedures for solving mixed-variables programming problems. Numer. Math. 4, 238–252 (1962). https://doi.org/10.1007/BF01386316.
[6]
An asynchronous parallel benders decomposition method for ...
The BD method decomposes stochastic problems into one master problem and multiple disjoint subproblems. It thus lends itself readily to parallelization.Missing: few | Show results with:few
[7]
The Benders decomposition algorithm: A literature review
### Extracted Content
[8]
Benders Decomposition Then and Now - OR in an OB World
Oct 9, 2011 · So here is a quick (?) explanation. The original paper by Jack Benders [1] dealt with both continuous and discrete optimization, linear and ...
[9]
Benders Decomposition
Using the convention that an infeasible minimization problem has optimal value ∞ , we can write ( P ) as min c T x + z ( x ) ( P ′ ) s.t. x ∈ X where z ( x ) = ...
[10]
Jacques Benders and his decomposition algorithm - ScienceDirect
Sep 2, 2025 · The present paper is based on two talks at the workshop, an overview of the history and significance of the decomposition method by the first ...
[11]
[PDF] Benders' Decomposition Methods for Structured Optimization ...
The idea behind delayed constraint generation is to try to solve FMP using only a small subset of the constraints, and to check afterward whether any of the non ...
[12]
[PDF] On the selection of Benders' cuts - dei.unipd
Benders, J.: Partitioning procedures for solving mixed-variables programming problems. Numerische Mathematik 4, 238–252 (1962). 6. Cadoux, F.: Computing deep ...<|control11|><|separator|>
[13]
Partitioning procedures for solving mixed-variables programming ...
For these problems a slight modi- fication of the combinatorial procedure for solving pure "zero-one" problems, developed by BENDERS, CATCHPOLE and I~UIKEN [3] ...
[14]
Accelerating Benders Decomposition: Algorithmic Enhancement ...
This paper proposes methodology for improving the performance of Benders decomposition when applied to mixed integer programs.Missing: Magnanti Wong 1981 PDF
[15]
A Modified Benders' Partitioning Algorithm for Mixed Integer ...
3, November 1977. Printed in USA. A MODIFIED BENDERS' PARTITIONING ALGORITHM. FOR MIXED INTEGER PROGRAMMING*. DALE MCDANIEL† AND MIKE DEVINE.
[16]
[PDF] Benders Decomposition
DTU-Management / Operations Research. Example 10.6: “Hand” calculation of duals vj u = [v,w] where v are the demand duals and w are the open/closed facility ...
[17]
Benders decomposition - JuMP-dev
Benders decomposition is a useful algorithm for solving convex optimization problems with a large number of variables. It works best when a larger problem can ...Theory · Monolithic problem · Iterative method · Callback methodMissing: paper | Show results with:paper<|control11|><|separator|>
[18]
[PDF] Benders.pdf - Iowa State University
Mar 25, 2021 · You need optimization background to understand decomposition. • Decomposition is highly applicable in power system planning problems. • Sections ...
[19]
[PDF] Benders Decomposition - John Hooker
These Benders cuts are added to the master problem in each iteration h. Benders cuts. Min Number of Late Tasks. Page 85. 85.<|control11|><|separator|>
[20]
JadeLiu13/BendersDecomposition: This is a benders decomposition ...
This is a benders decomposition implementation in Python using Gurobi for solving the Uncapacitated Facility Location (UFL) Problem.
[21]
Benders decomposition example in Python - Gurobi Support
Mar 18, 2023 · A Benders decomposition algorithm in Gurobi is basically just a branch-and-cut algorithm where you add Benders cuts in lazy-constraint and/or user-cut ...Benders decomposition in GurobiHow can Benders decomposition be faster then Gurobi?More results from support.gurobi.com
[22]
Logic-based Benders decomposition for large-scale optimization
Oct 25, 2019 · This chapter presents the basic theory of LBBD and explains how classical Benders decomposition is a special case. It also describes branch and ...
[23]
[PDF] Logic-Based Benders Decomposition - John Hooker
Benders decomposition partitions the problem variables into master problem variables and subproblem variables.
[24]
Logic-based benders decomposition for wildfire suppression
In this paper we propose a new exact algorithm utilising logic-based Benders decomposition. We benchmark the new approach against mixed-integer programming and ...
[25]
[PDF] The Benders by batch algorithm: design and stabilization of an ...
Dec 15, 2022 · This paper introduces a new exact algorithm to solve two-stage stochastic linear programs. Based on the multicut Benders reformulation of such ...
[26]
Stabilized Benders Methods for Large-Scale Combinatorial ...
Jan 23, 2020 · To overcome this drawback, we present a stabilized Benders decomposition whose master is restricted to a neighborhood of successful candidates ...
[27]
Accelerating the Benders Decomposition Method: Application to ...
To successfully apply the Benders method to these instances, we propose various acceleration techniques, including the use of cutting planes, partial ...
[28]
[PDF] Cut Selection for Benders Decomposition - arXiv
May 11, 2020 · In particular, we refer to Pareto-optimality and facet-defining cuts and observe that each of these criteria can be matched to a particular ...
[29]
[PDF] Generalized Benders Decomposition 1
See the original paper or. ReL 1, Section 4.1, for details. The main result of this paper is an extension of Benders' approach to a more general class of ...
[30]
Generalized Benders Decomposition Method to Solve Big Mixed ...
The Generalized Benders Decomposition (GBD) method is a basic approach to solve big mixed-integer nonlinear optimization problems with convex objectives and ...
[31]
Nonconvex Generalized Benders Decomposition for Stochastic ...
Jul 9, 2011 · This paper considers deterministic global optimization of scenario-based, two-stage stochastic mixed-integer nonlinear programs (MINLPs)
[32]
[PDF] Benders, Nested Benders and Stochastic Programming
This article aims to explain the Nested Benders algorithm for the solution of large-scale stochastic programming problems in a way that is intelligible to.
[33]
A Modified Progressive Hedging Approach for Multistage Intraday ...
Progressive hedging algorithm (PHA) is a scenario decomposition based approach which is useful in solving stochastic programming problems with complicating ...
[34]
[PDF] Benders decomposition with scaled cuts for multistage stochastic ...
Jun 30, 2024 · We consider Benders decomposition algorithms for multistage stochastic mixed-integer programs (SMIPs) with general mixed-integer decision ...
[35]
Benders Cut Classification via Support Vector Machines for Solving ...
We propose to integrate machine learning techniques into the traditional Benders decomposition framework to learn cut characteristics and selectively generate ...
[36]
Machine Learning-Enhanced Benders Decomposition Approach for ...
The proposed ML-enhanced Multicut Benders Decomposition algorithm improves computational efficiency by identifying effective and ineffective optimality cuts.
[37]
Accelerating L-shaped Two-stage Stochastic SCUC with Learning ...
Nov 17, 2023 · This paper takes advantage of machine learning and proposes enhanced variants of Benders decomposition for solving two-stage stochastic security ...
[38]
[PDF] Accelerating Benders Decomposition via Reinforcement Learning ...
In this paper we propose a method of accelerating BD with the aid of a surrogate model in place of an NP-hard integer master problem.