
Linear–quadratic regulator

The linear–quadratic regulator (LQR) is a foundational method in optimal control theory for designing state-feedback controllers that stabilize linear time-invariant dynamical systems while minimizing a quadratic cost function, which balances the magnitude of state deviations from a desired trajectory against the energy expended by control inputs. This approach assumes full knowledge of the system state and relies on solving the algebraic Riccati equation to compute optimal feedback gains, ensuring closed-loop stability under controllability and observability conditions. The LQR problem is typically formulated for infinite-horizon operation, where the cost is the integral of a quadratic form involving the state vector x(t) and control vector u(t), expressed as \int_0^\infty (x^T Q x + u^T R u) \, dt, with positive semi-definite Q and positive definite R weighting matrices. The resulting control law takes the linear form u(t) = -K x(t), where K = R^{-1} B^T P and P solves the Riccati equation A^T P + P A - P B R^{-1} B^T P + Q = 0 for the system \dot{x} = A x + B u.

Introduced by Rudolf E. Kalman in 1960 as part of his seminal contributions to modern control theory, the LQR built on earlier dynamic programming ideas from Richard Bellman and addressed the need for systematic feedback design in linear systems. Kalman's work formalized the problem using Hamilton-Jacobi-Bellman theory, deriving the Riccati equation as the key computational tool and introducing controllability as a necessary condition for solvability. By the late 1960s, extensions incorporated finite-horizon variants and observer-based implementations for partial state observation, evolving into the linear-quadratic-Gaussian framework when combined with Kalman filtering for noisy environments. These developments established LQR as a benchmark for optimal feedback design, with inherent guarantees of asymptotic stability and performance bounds derived from the eigenvalues of the closed-loop system.

In practice, LQR finds widespread application in engineering domains requiring precise regulation, such as aerospace engineering for attitude control of spacecraft, robotics for trajectory stabilization, and structural engineering for vibration suppression in buildings and bridges. Its quadratic cost structure allows tunable trade-offs between tracking accuracy and control effort via the choice of Q and R, often refined through loop-shaping or numerical optimization to meet specific robustness requirements against model uncertainties. Despite assumptions of linearity and perfect state knowledge, modern variants extend LQR to nonlinear, constrained, and stochastic settings, underscoring its enduring influence in fields like autonomous vehicles and power systems.

Introduction

Definition and Purpose

The linear–quadratic regulator (LQR) is a classical optimal control technique that computes a state-feedback control law to minimize a quadratic cost function for linear dynamical systems. This approach assumes the system dynamics are linear and time-invariant, with the cost reflecting a trade-off between state deviations and control efforts. The primary purpose of the LQR is to stabilize unstable or marginally stable systems by driving the states toward a desired equilibrium, such as the origin, while also enabling reference tracking and disturbance rejection. By balancing the penalties on state errors (via a positive semi-definite weighting matrix) and control inputs (via a positive definite weighting matrix), it achieves efficient regulation without excessive energy expenditure. Under the assumptions of linear dynamics and quadratic costs, the LQR guarantees global optimality and asymptotic stability for controllable systems, with computational tractability achieved through solutions to Riccati equations. These properties make it a foundational method in control engineering, offering a robust starting point for more complex nonlinear or uncertain systems. For instance, in applications like spacecraft attitude control, the LQR designs minimum-energy feedback laws to align vehicle orientation with thrust commands, ensuring precise maneuvering.

Historical Development

The linear–quadratic regulator (LQR) emerged in the mid-20th century amid the development of modern control theory during the 1950s and 1960s, marking a transition from frequency-domain methods to state-space approaches for system analysis and design. This period was shaped by the space race and advances in computing, which necessitated efficient control strategies for complex dynamic systems. Key figures such as Rudolf E. Kalman and Richard S. Bucy played pivotal roles; Kalman's work laid the groundwork by framing control problems in terms of minimizing quadratic cost functions for linear systems. A seminal milestone occurred in 1960 when Kalman published his paper "Contributions to the Theory of Optimal Control," introducing the solution to the linear quadratic problem via the Riccati equation and establishing the core framework for LQR design. This formulation provided a systematic method to compute optimal feedback gains, influencing subsequent research in multivariable control. The evolution of LQR continued in the 1960s through its integration with state estimation via the Kalman-Bucy filter, leading to the linear-quadratic-Gaussian (LQG) framework that handled stochastic disturbances in real-world applications. Computational advances in the 1970s and 1980s, driven by the proliferation of digital computers and microprocessors, enabled the real-time solution of Riccati equations and implementation of LQR controllers in embedded systems, broadening its practicality beyond theoretical analysis. Into the 21st century, LQR remains a foundational tool in robotics and autonomous vehicles, valued for its robustness and optimality in path tracking and stabilization tasks, with ongoing refinements but no fundamental shifts since the late 20th century.

Mathematical Prerequisites

Linear State-Space Models

The linear state-space model provides a fundamental framework for representing the dynamics of linear time-invariant (LTI) systems in control theory. In this representation, the system's behavior is described by a set of first-order differential (or difference) equations that capture the evolution of the internal state over time in response to inputs. For continuous-time systems, the state-space form is given by \dot{\mathbf{x}}(t) = A \mathbf{x}(t) + B \mathbf{u}(t), where \mathbf{x}(t) \in \mathbb{R}^n is the state vector, \mathbf{u}(t) \in \mathbb{R}^m is the input vector, A \in \mathbb{R}^{n \times n} is the system matrix describing the intrinsic dynamics, and B \in \mathbb{R}^{n \times m} is the input matrix that maps inputs to state changes. For discrete-time systems, the analogous form is \mathbf{x}_{k+1} = A \mathbf{x}_k + B \mathbf{u}_k, with \mathbf{x}_k and \mathbf{u}_k denoting states and inputs at time step k. This notation uses boldface to distinguish vectors and matrices, and time dependence is explicit where relevant to highlight the evolution. The state-space approach, introduced as a unified framework for analyzing linear dynamical systems, shifts focus from input-output relations to internal state trajectories, enabling comprehensive system analysis.

Key assumptions underlying the linear state-space model include linearity, meaning the equations contain no higher-order terms in states and inputs, and time-invariance, where the matrices A and B remain constant over time. These properties ensure that superposition holds, allowing solutions to be scaled and added linearly. Additionally, controllability is a critical assumption, defined as the ability to drive the state from any initial condition to the origin in finite time using admissible inputs; it requires the rank of the controllability matrix [B, AB, \dots, A^{n-1}B] to be full (equal to n). In the context of regulator problems, controllability guarantees that stabilizing feedback controllers can be designed. The input matrix B plays a pivotal role here, as its columns span the directions in state space directly influenced by inputs, determining the extent to which external controls can affect the system's evolution.

Properties of the state-space model facilitate stability analysis, primarily through the eigenvalues of the system matrix A. For continuous-time LTI systems, asymptotic stability requires all eigenvalues of A to have negative real parts, ensuring that the unforced state \mathbf{x}(t) = e^{At} \mathbf{x}(0) converges to zero as t \to \infty. In discrete time, stability demands that all eigenvalues lie inside the unit circle in the complex plane. The matrix B does not directly affect open-loop stability but modulates how inputs can counteract or enhance the natural dynamics dictated by A. These characteristics make the state-space form indispensable for understanding system behavior before applying feedback control strategies.
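As a brief illustration of the rank test described above, the following Python sketch builds the controllability matrix for a hypothetical double-integrator model (the matrices A and B are assumed purely for illustration) and checks whether it has full rank:

```python
import numpy as np

# Hypothetical double-integrator model (A, B chosen purely for illustration).
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
n = A.shape[0]

# Controllability matrix [B, AB, ..., A^(n-1) B]; full rank (= n) means controllable.
ctrb = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])
print("rank:", np.linalg.matrix_rank(ctrb))  # prints 2, so the pair (A, B) is controllable
```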

Quadratic Cost Functions

In optimal control, particularly for the linear-quadratic regulator (LQR), the performance of a control strategy is evaluated using a quadratic cost function, which quantifies the trade-off between state regulation and control effort. For the infinite-horizon continuous-time case, the cost is typically defined as the integral J = \int_0^\infty \left( \mathbf{x}(t)^T Q \mathbf{x}(t) + \mathbf{u}(t)^T R \mathbf{u}(t) \right) dt, where Q \in \mathbb{R}^{n \times n} is a positive semi-definite matrix weighting the state deviations (often chosen to penalize specific states such as position or tracking errors), and R \in \mathbb{R}^{m \times m} is a positive definite matrix weighting the control inputs (reflecting actuator energy or effort limits). The definiteness of these matrices ensures the cost is convex, facilitating analytical solutions via dynamic programming or the calculus of variations. For finite-horizon problems, the integral is taken from 0 to T, potentially with a terminal cost \mathbf{x}(T)^T P_f \mathbf{x}(T) to account for endpoint conditions. In discrete time, a sum replaces the integral: J = \sum_{k=0}^\infty \left( \mathbf{x}_k^T Q \mathbf{x}_k + \mathbf{u}_k^T R \mathbf{u}_k \right). The choice of Q and R is application-specific: larger Q elements emphasize tighter state tracking, while larger R elements reduce aggressive control to conserve resources. Observability of the pair (A, \sqrt{Q}) (where \sqrt{Q} is a matrix square root) ensures the cost penalizes all controllable modes, complementing the controllability assumption. This framework, rooted in optimal control theory, provides a measurable objective for deriving optimal feedback laws that minimize J subject to the system dynamics.
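To make the weighting concrete, the short Python sketch below evaluates the discrete-time cost for an assumed two-state, one-input example; the particular values of Q, R, and the trajectories are illustrative choices, not prescribed by the theory:

```python
import numpy as np

# Illustrative weights for a two-state, one-input system (values are arbitrary choices).
Q = np.diag([10.0, 1.0])   # penalize the first state more heavily
R = np.array([[0.1]])      # cheap control, allowing more aggressive inputs

def quadratic_cost(xs, us, Q, R):
    """Discrete cost sum_k x_k^T Q x_k + u_k^T R u_k for trajectories xs (N x n), us (N x m)."""
    return sum(x @ Q @ x + u @ R @ u for x, u in zip(xs, us))

xs = np.array([[1.0, 0.0], [0.5, -0.2], [0.1, -0.1]])   # example state trajectory
us = np.array([[-1.0], [-0.4], [-0.1]])                 # example input trajectory
print(quadratic_cost(xs, us, Q, R))
```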

Problem Formulation

General Setup

The linear–quadratic regulator (LQR) addresses the optimal control of linear dynamical systems by minimizing a quadratic performance index that penalizes deviations in state from a desired trajectory and excessive control effort. This formulation assumes a continuous-time, time-invariant system with perfect state measurement available, distinguishing it from extensions like the linear–quadratic-Gaussian (LQG) controller, which incorporates process and measurement noise. The state-space representation and quadratic cost structure draw from general linear systems theory, as outlined in the mathematical prerequisites. The core LQR problem is to determine a control input u(t) that minimizes the cost functional J = \int_0^T \left( x(t)^\top Q x(t) + u(t)^\top R u(t) \right) \, dt + x(T)^\top P_f x(T), subject to the linear dynamics \dot{x}(t) = A x(t) + B u(t), with given initial condition x(0) = x_0, where Q \succeq 0 and R \succ 0 are symmetric weighting matrices for states and inputs, respectively, and P_f \succeq 0 is an optional terminal cost matrix. Terminal constraints may be incorporated via P_f, but the problem often assumes none for simplicity. The system matrices A \in \mathbb{R}^{n \times n} and B \in \mathbb{R}^{n \times m} describe the state evolution, with the pair (A, B) assumed controllable to ensure solvability. Under these assumptions, the optimal control takes the state-feedback form u(t) = -K(t) x(t), where K(t) is a time-varying gain matrix designed to balance the cost terms. This feedback law renders the closed-loop system \dot{x} = (A - B K) x asymptotically stable, driving the state to the origin while respecting the quadratic penalties. The resulting gain is computed offline via dynamic programming, yielding a closed-form feedback solution that requires no online optimization.

Finite-Horizon Variant

In the finite-horizon variant of the linear-quadratic regulator (LQR), the control problem is formulated over a fixed time interval [0, T] with T < \infty, aiming to minimize a quadratic cost that balances state deviations, control effort, and a terminal penalty. The cost functional is given by J = x(T)^T P_T x(T) + \int_0^T \left( x(t)^T Q(t) x(t) + u(t)^T R(t) u(t) \right) dt, where P_T \geq 0 is the terminal weighting matrix penalizing the final state x(T), and Q(t) \geq 0, R(t) > 0 are symmetric, possibly time-varying matrices weighting the state and control penalties, respectively. This setup assumes linear dynamics \dot{x}(t) = A(t) x(t) + B(t) u(t) with x(0) = x_0, allowing for optimization of transient behavior without assuming perpetual operation. The optimal policy in this finite-horizon LQR takes the form of a time-varying state feedback u(t) = -K(t) x(t), where the gain matrix K(t) is determined by solving the differential Riccati equation backward in time from the terminal condition at t = T to the initial time t = 0. This backward integration ensures the gain adapts to the approaching horizon, providing precise adjustments that diminish in influence as the terminal time nears, particularly when P_T emphasizes final accuracy. Unlike steady-state approaches, this time-varying nature captures the evolving priorities over the finite interval. Finite-horizon LQR finds applications in short-term tasks requiring precise terminal accuracy, such as missile guidance systems where the goal is to minimize intercept error within a bounded time. In these scenarios, the method enables integrated guidance and control by shaping trajectories to meet terminal constraints like impact angle and miss distance. Compared to the infinite-horizon variant, it lacks an asymptotic steady-state gain, instead demanding repeated backward computations for each horizon length, which elevates the computational load, especially for extended finite intervals.

Infinite-Horizon Variant

The infinite-horizon variant of the linear–quadratic regulator (LQR) addresses regulation over an unbounded time interval, where the horizon T approaches infinity and no terminal cost term is included in the objective function. This formulation minimizes the accumulated cost indefinitely, emphasizing steady-state performance and long-term stability rather than behavior over a fixed interval. It is particularly suited for systems requiring persistent regulation without predefined termination, such as continuous process control. For the infinite-horizon cost to remain finite and the optimal gain to converge, the system must satisfy specific structural conditions: the pair (A, B) must be stabilizable to ensure controllability of all unstable modes, and the pair (A, C) must be detectable, where Q = C^\top C, to guarantee observability of those modes. These assumptions prevent divergence of the cost functional and ensure the existence of a stabilizing feedback law. Without detectability, unobservable unstable modes could grow unbounded while leaving the cost unaffected, rendering the problem ill-posed. Under these conditions, the optimal control policy simplifies to a time-invariant linear feedback gain K, applied constantly after any initial transient phase, yielding u(t) = -K x(t). This constant gain promotes asymptotic stability, driving the state to the origin over time while minimizing the ongoing quadratic penalties on state deviations and control effort. The infinite-horizon LQR thus provides a stationary policy ideal for perpetual operation. As the finite-horizon length T increases, the time-varying solution to the associated differential Riccati equation converges to a steady-state form known as the algebraic Riccati solution, establishing equivalence between the limiting finite-horizon and infinite-horizon problems. This asymptotic behavior underscores the practicality of the infinite-horizon approach for applications like chemical process control or power system stabilization, where indefinite operation demands robust, unchanging regulation strategies.

Optimal Control Derivation

Hamilton-Jacobi-Bellman Approach

The Hamilton-Jacobi-Bellman (HJB) approach to the linear-quadratic regulator (LQR) derives the optimal control policy using principles of dynamic programming, where the value function represents the minimum cost-to-go from any state at a given time. This method frames the LQR problem as a continuous-time optimal control task with linear dynamics \dot{x} = Ax + Bu and quadratic stage cost x^T Q x + u^T R u, assuming Q \succeq 0 and R \succ 0. The value function V(x, t) is defined as the infimum over admissible controls of the cost-to-go from time t to the horizon T, comprising the integral of the stage cost plus a terminal quadratic penalty x(T)^T Q_f x(T). Applying the dynamic programming principle leads to the HJB equation, which enforces optimality at every state and time: \min_u \left[ \frac{\partial V}{\partial t} + \frac{\partial V}{\partial x} (Ax + Bu) + x^T Q x + u^T R u \right] = 0, with the terminal condition V(x, T) = x^T Q_f x. This equation characterizes the value function as the solution to a boundary-value problem, connecting the LQR to broader optimal control theory through the dynamic programming formulation. Given the quadratic structure of the cost and the linearity of the dynamics, the value function is assumed to take the quadratic form V(x, t) = x^T P(t) x, where P(t) is a symmetric matrix that evolves backward in time. Substituting this into the HJB equation and minimizing the expression with respect to u yields the Riccati differential equation. The minimization step involves taking the derivative with respect to u and setting it to zero, resulting in the state-feedback law u^* = -R^{-1} B^T P x. This linear gain directly ties the HJB approach to the LQR's closed-loop structure, highlighting its role as a cornerstone of optimal control theory.
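To spell out the minimization step under the quadratic ansatz: with V(x, t) = x^T P(t) x, the gradient is \frac{\partial V}{\partial x} = 2 x^T P(t), so the bracketed expression to minimize over u contains 2 x^T P(t)(Ax + Bu) + x^T Q x + u^T R u. Setting its derivative with respect to u to zero gives 2 B^T P(t) x + 2 R u = 0, hence u^* = -R^{-1} B^T P(t) x. Substituting u^* back into the HJB equation and collecting the quadratic form in x yields -\dot{P} = A^T P + P A - P B R^{-1} B^T P + Q, which is the Riccati differential equation developed in the next subsection.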

Riccati Equation Solution

In the Hamilton-Jacobi-Bellman (HJB) framework for the linear-quadratic regulator (LQR) problem, the optimal value function is assumed to take a quadratic form V(x, t) = x^T P(t) x, where P(t) is a symmetric matrix that evolves with time. This assumption reduces the nonlinear HJB equation to an ordinary differential equation for P(t). Substituting the quadratic value function into the HJB equation and minimizing the Hamiltonian with respect to the control input u yields the continuous-time differential Riccati equation: \dot{P}(t) + A^T P(t) + P(t) A - P(t) B R^{-1} B^T P(t) + Q = 0, where A and B are the state and input matrices of the system, Q is the state weighting matrix, and R is the input weighting matrix, assumed positive definite. This equation, first derived in the context of optimal control for linear systems, describes the backward evolution of P(t) from the terminal time to the initial time. For the infinite-horizon LQR problem, where the time horizon extends to infinity and the system is asymptotically stable under optimal control, the solution seeks a constant positive definite matrix P. Setting \dot{P} = 0 in the differential Riccati equation results in the algebraic Riccati equation: A^T P + P A - P B R^{-1} B^T P + Q = 0. A unique positive definite solution P exists under the assumptions of controllability of the pair (A, B) and positive definiteness of Q and R. The boundary condition for the finite-horizon case is P(T) = P_T, typically P_T = Q_f to account for terminal state costs, ensuring the solution remains positive semi-definite for all t \in [0, T]. For the infinite-horizon case, the positive definiteness of P guarantees the asymptotic stability of the closed-loop system. Once P is obtained by solving the appropriate Riccati equation, the optimal feedback gain matrix is computed as K(t) = R^{-1} B^T P(t) for the time-varying case or K = R^{-1} B^T P for the steady-state case, yielding the linear control law u = -K x.
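For the infinite-horizon case, numerical libraries solve the algebraic Riccati equation directly. The Python sketch below uses SciPy's solve_continuous_are on an assumed double-integrator example (the matrices are illustrative, not from any particular source) and recovers the gain K = R^{-1} B^T P:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Assumed double-integrator example and weights (illustrative only).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Stabilizing solution of A^T P + P A - P B R^{-1} B^T P + Q = 0.
P = solve_continuous_are(A, B, Q, R)

# Steady-state gain K = R^{-1} B^T P, giving the control law u = -K x.
K = np.linalg.solve(R, B.T @ P)
print("K =", K)

# Closed-loop eigenvalues of A - B K should have negative real parts.
print("eig(A - B K):", np.linalg.eigvals(A - B @ K))
```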

Specific Implementations

Continuous-Time Finite-Horizon

The continuous-time finite-horizon linear-quadratic regulator (LQR) provides an optimal feedback control law for linear dynamical systems over a specified time interval [0, T], minimizing the quadratic cost \int_0^T (x(t)^T Q x(t) + u(t)^T R u(t)) \, dt, where Q \succeq 0 and R \succ 0 weight state and input penalties, respectively, assuming no terminal cost for simplicity. The solution relies on the time-varying matrix P(t) that satisfies the differential Riccati equation, derived from the Hamilton-Jacobi-Bellman equation. The Riccati differential equation is given by \dot{P}(t) = -A^T P(t) - P(t) A + P(t) B R^{-1} B^T P(t) - Q, with the terminal condition P(T) = 0. This ordinary differential equation (ODE) is integrated backward in time from t = T to t = 0 to obtain P(t) for all t \in [0, T], ensuring the cost-to-go function V(t, x) = x^T P(t) x captures the optimal value from time t onward. If a nonzero terminal cost matrix Q_f \succeq 0 is included, the terminal condition becomes P(T) = Q_f, but the equation form remains unchanged. The time-varying optimal feedback gain is then computed as K(t) = R^{-1} B^T P(t), yielding the control law u(t) = -K(t) x(t). This gain adjusts dynamically over the horizon, typically becoming more aggressive early in the interval to drive the state toward zero and less so near t = T under the zero-terminal-cost assumption. Substituting into the dynamics \dot{x}(t) = A x(t) + B u(t) produces the closed-loop form \dot{x}(t) = (A - B K(t)) x(t), which, for controllable systems, steers the state toward the origin by the end of the horizon. Computationally, the Riccati ODE is solved via numerical backward integration methods, such as Runge-Kutta solvers, starting from P(T) and propagating to the initial time; this backward recursion avoids forward simulation errors accumulating over long horizons. The solution is sensitive to the horizon T: shorter T results in more conservative gains that prioritize feasibility over long-run optimality, while longer T allows more nearly optimal but computationally intensive trajectories, with numerical accuracy typically good for moderate T under well-conditioned A and B.
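A minimal sketch of the backward integration, assuming an illustrative double integrator, horizon T = 5, and zero terminal cost, uses SciPy's solve_ivp to integrate the Riccati ODE from t = T down to t = 0 and then evaluates the time-varying gain K(t):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Assumed double-integrator data; T, Q, R, and the zero terminal cost are illustrative.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
T, n = 5.0, 2

def riccati_rhs(t, p_flat):
    # dP/dt = -A^T P - P A + P B R^{-1} B^T P - Q
    P = p_flat.reshape(n, n)
    dP = -(A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T) @ P + Q)
    return dP.ravel()

# Integrate backward from t = T (P(T) = 0) to t = 0; solve_ivp accepts decreasing spans.
sol = solve_ivp(riccati_rhs, [T, 0.0], np.zeros(n * n), dense_output=True)

def gain(t):
    P = sol.sol(t).reshape(n, n)
    return np.linalg.solve(R, B.T @ P)   # K(t) = R^{-1} B^T P(t)

print("K(0) =", gain(0.0))
```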

Continuous-Time Infinite-Horizon

In the continuous-time infinite-horizon linear-quadratic regulator (LQR) problem, the goal is to design a state-feedback control law for the system \dot{x}(t) = A x(t) + B u(t) that minimizes the quadratic cost functional J = \int_0^\infty \left( x(t)^T Q x(t) + u(t)^T R u(t) \right) dt, where Q \succeq 0 and R \succ 0 are symmetric weighting matrices. This formulation assumes no terminal cost and requires the cost integral to converge, which holds under stabilizability conditions on the system. The optimal control takes the form of a constant linear state feedback u(t) = -K x(t), where the gain matrix K is time-invariant, contrasting with the time-varying gains in finite-horizon variants. The optimal gain K is derived by solving the Hamilton-Jacobi-Bellman equation in steady state, yielding the continuous-time algebraic Riccati equation (ARE): A^T P + P A - P B R^{-1} B^T P + Q = 0, where P \succeq 0 is the unique stabilizing solution. If the pair (A, B) is controllable and the pair (Q^{1/2}, A) is observable, then P is positive definite, ensuring the closed-loop matrix A - B K has all eigenvalues with negative real parts, thus guaranteeing asymptotic stability of the origin for any initial state. The feedback gain is then computed as K = R^{-1} B^T P. Several numerical methods exist for solving the ARE. A direct approach involves forming the Hamiltonian matrix H = \begin{pmatrix} A & -B R^{-1} B^T \\ -Q & -A^T \end{pmatrix} and computing its Schur decomposition to identify the stable invariant subspace, from which P is obtained via the corresponding deflating subspace; this approach is robust and widely implemented in software libraries. Alternatively, iterative solvers such as the Newton–Kleinman algorithm start from an initial stabilizing gain K_0 (for example, K_0 = 0 when A is already stable) and successively solve the Lyapunov equation (A - B K_k)^T P_{k+1} + P_{k+1} (A - B K_k) + Q + K_k^T R K_k = 0, updating K_{k+1} = R^{-1} B^T P_{k+1} until convergence, leveraging the monotonicity of the resulting sequence under the aforementioned assumptions. These techniques ensure computational efficiency for systems of moderate dimension while preserving the theoretical guarantees of uniqueness and stability.
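The Newton–Kleinman iteration can be sketched in a few lines of Python; the system here is an assumed open-loop-stable example so that K_0 = 0 is a valid stabilizing initial gain, and each step solves a continuous-time Lyapunov equation:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Assumed open-loop-stable system so that K0 = 0 is a valid stabilizing start.
A = np.array([[-1.0, 1.0], [0.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

K = np.zeros((1, 2))   # initial stabilizing gain (A itself is Hurwitz here)
for _ in range(50):
    Acl = A - B @ K
    # Lyapunov step: Acl^T P + P Acl = -(Q + K^T R K)
    P = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))
    K_new = np.linalg.solve(R, B.T @ P)
    if np.allclose(K_new, K, atol=1e-10):
        break
    K = K_new

print("K =", K)
```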

Discrete-Time Finite-Horizon

The discrete-time finite-horizon linear quadratic regulator (LQR) provides an optimal control strategy for linear systems over a predetermined number of time steps, commonly applied in sampled-data contexts where continuous processes are discretized for computational implementation. This formulation arises from dynamic programming principles, minimizing a performance index subject to linear dynamics, and yields time-varying gains computed offline via recursive equations. Unlike its continuous-time counterpart, which relies on differential equations, the discrete version employs matrix algebra suitable for digital processors, enabling precise regulation in applications like robotics and process control. The underlying system is modeled by the linear discrete-time dynamics x_{k+1} = A x_k + B u_k, where x_k \in \mathbb{R}^n denotes the state vector at time step k, u_k \in \mathbb{R}^m is the control input, and A, B are the state transition and input matrices, respectively, assumed constant over the horizon. The control objective is to minimize the finite-horizon quadratic cost functional J = \sum_{k=0}^{N-1} \left( x_k^T Q x_k + u_k^T R u_k \right), with symmetric state weighting Q \geq 0 and positive definite input weighting R > 0, penalizing deviations from the origin without a terminal cost term. The optimal control takes the linear state-feedback form u_k = -K_k x_k, where the time-varying gain K_k balances state regulation against control effort. To derive the gains, the solution involves backward recursion through the discrete-time Riccati difference equation, starting from the horizon end with P_N = 0: P_k = A^T P_{k+1} A + Q - A^T P_{k+1} B (R + B^T P_{k+1} B)^{-1} B^T P_{k+1} A, \quad k = N-1, \dots, 0. Each P_k \geq 0 represents the value function matrix at step k, capturing the cumulative future cost-to-go. The corresponding optimal gain is then K_k = (R + B^T P_{k+1} B)^{-1} B^T P_{k+1} A, ensuring the feedback u_k = -K_k x_k achieves the minimum cost, assuming (A, B) is stabilizable and (A, Q^{1/2}) is detectable to guarantee convergence properties over the horizon. This recursion is efficiently solved numerically, with computational complexity scaling as \mathcal{O}(N n^3) for horizon length N and state dimension n, making it feasible for moderate-sized systems. In practice, the discrete-time finite-horizon LQR is integral to digital control systems, where analog plants are discretized via methods like zero-order hold, allowing implementation on microcontrollers for tasks such as trajectory tracking in unmanned vehicles or inventory management in operations research. The approach's optimality holds under the linearity and quadratic cost assumptions, providing a benchmark for robust and adaptive extensions in uncertain environments.
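The backward recursion is straightforward to implement; the sketch below assumes an illustrative discretized double integrator (0.1 s sample time), a horizon of N = 50 steps, and zero terminal cost, and stores the time-varying gains K_k:

```python
import numpy as np

# Assumed discretized double integrator (0.1 s sample time) and illustrative weights.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.diag([1.0, 0.1])
R = np.array([[0.01]])
N = 50   # horizon length

# Backward recursion: P_N = 0, then P_k and K_k for k = N-1, ..., 0.
P = np.zeros_like(Q)
gains = [None] * N
for k in reversed(range(N)):
    S = R + B.T @ P @ B
    K = np.linalg.solve(S, B.T @ P @ A)       # K_k = (R + B^T P_{k+1} B)^{-1} B^T P_{k+1} A
    P = A.T @ P @ A + Q - A.T @ P @ B @ K     # P_k from the Riccati difference equation
    gains[k] = K

print("K_0 =", gains[0])
```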

Discrete-Time Infinite-Horizon

In the discrete-time infinite-horizon linear-quadratic regulator (LQR) problem, the goal is to find a time-invariant state-feedback law that minimizes the infinite sum of a quadratic cost for a linear system x_{k+1} = A x_k + B u_k, where the cost is \sum_{k=0}^{\infty} (x_k^T Q x_k + u_k^T R u_k), with Q \geq 0 and R > 0 symmetric. This formulation assumes the pair (A, B) is stabilizable and (A, Q^{1/2}) is detectable, ensuring the existence of a unique stabilizing solution to the associated algebraic Riccati equation. The optimal cost-to-go matrix P satisfies the discrete algebraic Riccati equation (ARE): P = A^T P A + Q - A^T P B (R + B^T P B)^{-1} B^T P A, which represents the steady-state limit of the finite-horizon recursion as the number of steps approaches infinity. The corresponding optimal gain is then given by K = (R + B^T P B)^{-1} B^T P A, yielding the control law u_k = -K x_k. To solve the discrete ARE numerically, iterative methods such as fixed-point (value) iteration on the Riccati recursion can be employed, which converges to the stabilizing solution P under the stabilizability and detectability assumptions by successively updating P_{i+1} = A^T P_i A + Q - A^T P_i B (R + B^T P_i B)^{-1} B^T P_i A, starting from P_0 = 0. Alternatively, the Schur decomposition method transforms the associated symplectic matrix pencil into an ordered form to extract the stable invariant subspace, from which P is recovered, providing a direct and reliable computation for systems where iterative convergence may be slow. The resulting closed-loop system dynamics are x_{k+1} = (A - B K) x_k, and under the given assumptions, the matrix A - B K is Schur stable, meaning all its eigenvalues lie strictly inside the unit disk, ensuring asymptotic stability and finite cost.
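In practice the discrete ARE is solved with a library routine; the Python sketch below uses SciPy's solve_discrete_are on assumed illustrative matrices, forms the steady-state gain, and verifies Schur stability of the closed loop:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Assumed discretized system and weights (illustrative values only).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.diag([1.0, 0.1])
R = np.array([[0.01]])

# Stabilizing solution of the discrete algebraic Riccati equation.
P = solve_discrete_are(A, B, Q, R)

# Steady-state gain K = (R + B^T P B)^{-1} B^T P A, control law u_k = -K x_k.
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Schur stability: all closed-loop eigenvalues strictly inside the unit disk.
print("|eig(A - B K)| =", np.abs(np.linalg.eigvals(A - B @ K)))
```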

Constraints and Limitations

Unconstrained Assumptions

The standard linear–quadratic regulator (LQR) is formulated under the assumption that inputs and state variables are unbounded, permitting the optimization process to select any real-valued inputs and allowing states to evolve without prescribed limits. This unconstrained nature simplifies the mathematical derivation, enabling closed-form solutions via dynamic programming or variational methods. Full state availability is another core assumption, requiring that all components of the system state are directly measurable and accessible for feedback, without reliance on output estimation or observers. Furthermore, the LQR framework assumes deterministic dynamics, free from external disturbances, process noise, or measurement errors, ensuring that the model perfectly captures the system's behavior. These assumptions imply that the optimal gain, derived as a linear state-feedback law u = -Kx from the solution to the associated Riccati equation, may demand control efforts that saturate physical actuators in practical applications, such as exceeding voltage limits in electric motors or thrust bounds in aerospace systems. Consequently, when inputs saturate, the closed-loop system can deviate from the predicted optimal trajectory, potentially resulting in suboptimal performance or even instability if the unconstrained design pushes the system beyond safe operating regimes.

A primary limitation of these unconstrained assumptions is their disconnection from real-world constraints, such as actuator saturation or state bounds in safety-critical domains like aerospace or automotive control, where violation can lead to unsafe operation despite the design's theoretical soundness. Nonetheless, under the stated assumptions and provided the linear system is controllable, the LQR guarantees global optimality by minimizing the infinite-horizon quadratic cost functional and ensures asymptotic stability of the closed-loop system around the origin. This stability holds uniformly for all initial states, leveraging the positive definiteness of the cost matrices and the controllability of the system.

Handling State and Input Constraints

A common practical approach to handle input constraints in the linear-quadratic regulator (LQR) involves clipping the unconstrained input to respect limits, yielding the saturated control law u = \text{sat}(-K x), where K is the LQR gain and \text{sat}(\cdot) denotes the saturation function. This method is widely applied in semi-active control systems, such as vehicle suspensions, where the clipped input approximates the ideal LQR command while adhering to actuator bounds. However, simple clipping can lead to performance degradation or instability for large state deviations, as the saturated control no longer minimizes the quadratic cost exactly.

To ensure stability under input saturation, Lyapunov-based analysis employs sector bounds to model the nonlinearity introduced by the saturation function, treating the difference between saturated and unsaturated controls as a bounded perturbation. For instance, Fu (2001) derives an optimal sector bound \rho that characterizes this mismatch and constructs a Lyapunov function V(x) = x^T P_\rho x, where P_\rho solves a modified Riccati equation, guaranteeing asymptotic stability within nested invariant ellipsoidal sets. This approach enlarges the region of attraction compared to naive clipping while preserving near-optimal performance for small states.

State constraints, such as bounds on position or velocity, require reformulating the LQR problem as a quadratic program (QP) solved at each time step, minimizing the quadratic cost subject to linear inequalities on states and inputs. Scokaert and Rawlings (1998) show that for the infinite-horizon discrete-time case, this constrained LQR yields a stabilizing feedback policy by iteratively solving finite-horizon QPs, with the solution converging to a unique steady-state gain that respects constraints without tuning parameters like control horizons. This QP-based method ensures feasibility and optimality within the constrained set but shifts away from the closed-form Riccati solution of unconstrained LQR.

For mild constraints, alternative strategies include gain scheduling, where the LQR gain K is adapted dynamically based on the state magnitude to preempt saturation, or explicit multiparametric quadratic programming that precomputes a piecewise affine control law over polyhedral regions. Komaee (2024) proposes dynamic gain adaptation in LQR to balance high performance with saturation avoidance, using a state-dependent scaling that maintains stability via Lyapunov analysis. Similarly, Bemporad et al. (2002) derive an explicit solution for constrained LQR as a continuous piecewise affine function, enabling fast online evaluation akin to model predictive control (MPC) predictions over short horizons. These methods suit systems with known bound structures, such as robotic manipulators.

Incorporating constraints into LQR often results in a loss of global optimality, as the saturated or QP-adjusted inputs deviate from the unconstrained minimum-cost trajectory, potentially reducing the region of attraction or requiring conservative designs. Computationally, QP solving or explicit partitioning increases online demands compared to the matrix multiplications of standard LQR, with complexity scaling cubically in horizon length for dense formulations. In robotics applications, such as locomotion control for legged platforms, constrained iterative LQR variants address joint limits and contact forces but rely on approximations like active-set methods, leading to suboptimality and numerical challenges with high-degree constraints.
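The simplest of these strategies, clipping the LQR input, can be sketched as follows; the model, weights, and the actuator limit u_max are assumed purely for illustration, and the loop applies u = sat(-Kx) at each step:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Illustrative clipped-LQR loop; the model, weights, and actuator limit are assumed.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.array([[0.1]])
u_max = 0.5   # assumed actuator bound

P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

x = np.array([2.0, 0.0])
for k in range(100):
    u = np.clip(-K @ x, -u_max, u_max)   # u = sat(-K x): saturate the unconstrained input
    x = A @ x + B @ u
print("final state:", x)
```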

Linear-Quadratic-Gaussian Regulator

The linear-quadratic-Gaussian (LQG) regulator addresses the optimal control of linear systems in the presence of stochastic disturbances, integrating the linear-quadratic regulator (LQR) for deterministic control with a Kalman filter for state estimation under Gaussian noise assumptions. This framework applies to discrete-time systems described by the state evolution x_{k+1} = A x_k + B u_k + w_k and noisy measurements y_k = C x_k + v_k, where w_k and v_k are zero-mean Gaussian process and measurement noises with covariances Q_n and R_n, respectively. The LQG approach minimizes the expected value of a quadratic cost functional similar to the LQR, but accounting for estimation uncertainty.

Central to the LQG regulator is the separation principle, which establishes that the optimal control and estimation problems can be solved independently, with the overall controller applying the LQR feedback gain to the state estimate from the Kalman filter. Specifically, the control law takes the form u_k = -K \hat{x}_{k|k}, where K is the optimal LQR gain matrix and \hat{x}_{k|k} is the filtered state estimate. This independence simplifies design: the LQR gain K is computed based on the system matrices A, B, and the cost weighting matrices Q and R as in the deterministic case, while the estimator parameters depend solely on the noise covariances Q_n and R_n. The separation principle holds under the Gaussian noise assumption, ensuring the combined controller achieves the minimal expected cost.

The Kalman filter provides the state estimate \hat{x}_{k|k} through a two-step prediction-correction process. In the prediction step, the a priori estimate is propagated forward using the system model: \hat{x}_{k|k-1} = A \hat{x}_{k-1|k-1} + B u_{k-1}, with the associated error covariance propagated forward while incorporating Q_n. The correction step then incorporates the new measurement y_k to form the a posteriori estimate \hat{x}_{k|k} = \hat{x}_{k|k-1} + L_k (y_k - C \hat{x}_{k|k-1}), where the Kalman gain L_k is derived from the predicted error covariance, the measurement matrix C, and the noise covariances Q_n and R_n to minimize the estimation error variance. For time-invariant systems, steady-state gains K and L are obtained by solving algebraic Riccati equations.

The LQG framework embodies the certainty equivalence principle, whereby the optimal control policy treats the state estimate \hat{x}_{k|k} as the true state, ignoring the uncertainty in the estimate for the purpose of feedback computation. This equivalence arises directly from the separation principle and the quadratic-Gaussian structure, ensuring that the certainty-equivalent controller coincides with the fully optimal stochastic controller. While this yields the minimal expected quadratic cost, it does not explicitly hedge against estimation errors, a limitation addressed in robust extensions.
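A compact sketch of the separation principle in code, assuming illustrative system matrices, noise covariances, and weights: the LQR gain and the steady-state Kalman gain are designed independently (the estimator gain via the dual discrete ARE), then combined in a certainty-equivalent loop:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

rng = np.random.default_rng(0)

# Assumed system, noise covariances, and weights (illustrative only).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
C = np.array([[1.0, 0.0]])
Qn, Rn = 1e-4 * np.eye(2), np.array([[1e-2]])   # process / measurement noise covariances
Q, R = np.eye(2), np.array([[0.1]])             # LQR weights

# Separation principle: design the controller and estimator gains independently.
P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)     # LQR gain
Pe = solve_discrete_are(A.T, C.T, Qn, Rn)             # estimation Riccati (dual problem)
L = Pe @ C.T @ np.linalg.inv(C @ Pe @ C.T + Rn)       # steady-state Kalman gain

x, x_hat = np.array([1.0, 0.0]), np.zeros(2)          # true state and a priori estimate
for k in range(200):
    y = C @ x + rng.multivariate_normal(np.zeros(1), Rn)          # noisy measurement
    x_hat = x_hat + L @ (y - C @ x_hat)                            # correction step
    u = -K @ x_hat                                                 # certainty-equivalent control
    x = A @ x + B @ u + rng.multivariate_normal(np.zeros(2), Qn)   # true plant with process noise
    x_hat = A @ x_hat + B @ u                                      # prediction step
print("state:", x, "estimate:", x_hat)
```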

Model Predictive Control

Model Predictive Control (MPC) is an optimization-based feedback control technique that employs a dynamic model to forecast the system's future states over a finite prediction horizon. At each sampling instant, an optimization problem is solved to compute a sequence of future control inputs that minimize a cost function, subject to the predicted system dynamics and any operational constraints. Only the initial control action from this sequence is implemented, after which the horizon recedes forward, and the optimization is repeated using updated measurements, enabling closed-loop feedback and robustness to disturbances or model mismatches. This receding-horizon strategy originated in process industries in the late 1970s and has since become a standard for multivariable constrained systems due to its ability to handle complex interactions explicitly.

In its linear form with quadratic costs, MPC bears a direct relation to the linear-quadratic regulator (LQR), extending it to incorporate constraints and finite horizons. For unconstrained linear systems with quadratic costs, the MPC problem reduces to the finite-horizon LQR optimization, where the solution yields a time-varying feedback law; as the horizon approaches infinity, this converges to the steady-state infinite-horizon LQR gain derived from the algebraic Riccati equation. When constraints are introduced, such as bounds on states x_k \in \mathcal{X} or inputs u_k \in \mathcal{U}, the problem transforms into a quadratic program (QP), solved online at each step to enforce feasibility while minimizing the cost. This constrained formulation results in a piecewise affine or nonlinear control law, contrasting with LQR's linear structure.

The primary advantages of MPC over traditional LQR stem from its explicit treatment of constraints and predictive nature. LQR assumes unbounded states and inputs, potentially leading to infeasible or saturating controls in real systems, whereas MPC systematically incorporates constraints via the optimization, ensuring safe operation in applications like chemical processes or robotics where actuator limits and safety bounds are paramount. Furthermore, the online re-optimization in MPC provides superior disturbance rejection and tracking compared to LQR's static gain, as it anticipates future behavior and adjusts dynamically; however, this comes at the cost of higher computational demand, often mitigated by explicit solutions precomputing the control map offline. While linear MPC focuses on quadratic costs for tractability, it can extend to nonlinear variants for broader applicability, though stability guarantees require careful terminal cost and terminal set design akin to LQR.

The standard discrete-time linear MPC formulation for a system x_{k+1} = A x_k + B u_k is given by: \begin{align*} \min_{u_{0|k}, \dots, u_{N-1|k}} &\quad \sum_{i=0}^{N-1} \left( x_{i|k}^\top Q x_{i|k} + u_{i|k}^\top R u_{i|k} \right) + x_{N|k}^\top P x_{N|k} \\ \text{subject to} &\quad x_{i+1|k} = A x_{i|k} + B u_{i|k}, \quad x_{0|k} = x_k, \\ &\quad x_{i|k} \in \mathcal{X}, \quad u_{i|k} \in \mathcal{U}, \quad i = 0, \dots, N-1, \\ &\quad x_{N|k} \in \mathcal{X}_f, \end{align*} where x_{i|k} and u_{i|k} denote the predicted state and input at time k+i based on the current state x_k, Q \succeq 0 and R \succ 0 are weighting matrices penalizing state deviations and control effort, P \succeq 0 is the terminal cost matrix (often the LQR Riccati solution), \mathcal{X} and \mathcal{U} are sets defining state and input constraints, and \mathcal{X}_f is a terminal set ensuring recursive feasibility and stability. The optimal input sequence U_k^* = \{u_{0|k}^*, \dots, u_{N-1|k}^*\} yields the applied control u_k = u_{0|k}^*, with the process repeating at k+1.
This setup mirrors the finite-horizon LQR cost but adds the inequality constraints, enabling MPC's practical utility in constrained environments.
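The formulation above maps directly onto a convex-optimization model. A minimal sketch using the CVXPY modeling library is shown below; the system matrices, horizon, box constraints, and the terminal weight are assumed for illustration (the LQR Riccati solution is a common choice for P, but Q is used here for brevity):

```python
import numpy as np
import cvxpy as cp

# Assumed model, weights, horizon, and box constraints (illustrative values).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q, R = np.eye(2), np.array([[0.1]])
P = Q                    # terminal weight; the LQR Riccati solution is a common choice
N = 20
x_max, u_max = 5.0, 0.5
x0 = np.array([2.0, 0.0])

x = cp.Variable((2, N + 1))
u = cp.Variable((1, N))
cost, constraints = 0, [x[:, 0] == x0]
for i in range(N):
    cost += cp.quad_form(x[:, i], Q) + cp.quad_form(u[:, i], R)
    constraints += [x[:, i + 1] == A @ x[:, i] + B @ u[:, i],   # predicted dynamics
                    cp.norm(x[:, i], "inf") <= x_max,           # state box constraint
                    cp.abs(u[:, i]) <= u_max]                   # input box constraint
cost += cp.quad_form(x[:, N], P)                                # terminal cost

cp.Problem(cp.Minimize(cost), constraints).solve()
print("u_0 =", u.value[:, 0])   # only the first input of the optimal sequence is applied
```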

Other Variants

The quadratic-quadratic regulator variant of the linear-quadratic regulator augments the standard quadratic penalties with a term on the rate of change of the control input, promoting smoother controller behavior while still minimizing state deviations. The associated cost function is J = \int_0^\infty \left( x^\top Q x + u^\top R u + \dot{u}^\top S \dot{u} \right) \, dt, where Q \geq 0 weights the states, and R > 0 and S \geq 0 weight the control input and its rate of change, respectively. This formulation addresses limitations of standard LQR in systems requiring bounded control rates, such as flight control, by augmenting the state vector with the control inputs so that their derivatives act as the new inputs.

The polynomial-quadratic regulator extends LQR to approximate nonlinear dynamics through higher-order terms in the system model, enabling better handling of systems with mild nonlinearities. For instance, including cubic terms in the states allows the controller to capture asymmetries or higher-fidelity behaviors without full nonlinear optimization. Seminal work on approximating solutions for such problems uses structured linear-algebra methods to compute feedback gains, often yielding low-degree controllers for tractability in high-dimensional settings. Recent advancements further explore polynomial-quadratic costs for broader nonlinear approximations, maintaining quadratic-like solvability via sum-of-squares techniques.

Risk-sensitive LQR addresses uncertainty in stochastic environments by replacing the expected cost with an exponential utility function, emphasizing worst-case scenarios for risk-averse design. The cost becomes J = \frac{2}{\theta} \log \mathbb{E} \left[ \exp \left( \frac{\theta}{2} \int_0^\infty (x^\top Q x + u^\top R u) \, dt \right) \right], where \theta > 0 tunes risk sensitivity; as \theta increases, the controller approaches the disturbance-rejection behavior of H-infinity control, providing robustness against disturbances. This variant, originally formulated for linear-quadratic-Gaussian systems, provides certainty-equivalent solutions via Riccati equations with adjusted parameters.

In the 2020s, robust LQR variants have seen integration with adaptive algorithms and AI-driven learning to enable online identification and adaptation of system parameters in uncertain or evolving environments, such as autonomous systems and robotics. These approaches achieve sublinear regret bounds while ensuring stability, bridging classical optimal control with data-driven methods for real-world deployment.