State-space representation is a mathematical framework in control theory and systems engineering for modeling the dynamics of physical systems using a set of state variables that capture the system's internal condition, along with input and output variables related through first-order differential equations.[1] This approach converts an nth-order differential equation describing the system into an equivalent first-order matrix differential equation, enabling a compact and versatile description of system behavior over time.[2]
The standard form for linear time-invariant (LTI) systems in continuous time is given by the state equation \dot{x}(t) = Ax(t) + Bu(t) and the output equation y(t) = Cx(t) + Du(t), where x(t) is the state vector of dimension n, u(t) is the input vector, y(t) is the output vector, and A, B, C, D are constant matrices of appropriate dimensions representing the system dynamics, input coupling, output coupling, and direct feedthrough, respectively.[3] For discrete-time systems, the equations become x[k+1] = Ax[k] + Bu[k] and y[k] = Cx[k] + Du[k].[4] The choice of state variables is not unique and can be selected to reflect physically meaningful quantities, such as position and velocity in mechanical systems, allowing the model to fully determine future states from initial conditions and inputs.[1]
Unlike transfer function representations, which describe linear time-invariant systems in the frequency domain, state-space models naturally accommodate multi-input multi-output (MIMO) systems, nonlinearities, time-varying parameters, and scenarios where inputs are computed in real time during simulation or control design.[5] This makes them particularly suitable for modern control techniques, such as state feedback, observer design, and optimal control via methods like linear quadratic regulators (LQR).[2]
The state-space approach emerged in the late 1950s and early 1960s as part of the shift to modern control theory, largely through the contributions of Rudolf E.
Kalman, who formalized its use for system realization, filtering, and estimation in his seminal 1960 paper on continuous-time problems.[6] Kalman's work built on earlier ideas from differential equations and linear algebra, integrating concepts like controllability and observability to analyze system properties, and it played a pivotal role in applications ranging from aerospace guidance to signal processing and robotics.[7] Today, state-space representations underpin computational tools in software like MATLAB and are essential for simulating and controlling complex engineered systems.[4]
Fundamentals
Definition and Purpose
State-space representation, developed in the early 1960s by Rudolf E. Kalman and collaborators, arose as a response to the limitations of classical control methods, such as transfer functions, which were primarily suited for single-input single-output (SISO) systems and struggled with the complexities of multi-input multi-output (MIMO) configurations prevalent in modern engineering applications.[8] Kalman's foundational work formalized a time-domain approach that unified the analysis and synthesis of linear dynamical systems, enabling the treatment of internal system behavior alongside input-output relations.[9]
At its core, state-space representation models a dynamic system using a set of first-order differential equations that describe the evolution of the state vector \mathbf{x}(t) \in \mathbb{R}^n, which captures the system's internal condition, in response to the input vector \mathbf{u}(t) \in \mathbb{R}^m, with the output vector \mathbf{y}(t) \in \mathbb{R}^p derived therefrom. The general nonlinear form is given by:
\dot{\mathbf{x}}(t) = f(\mathbf{x}(t), \mathbf{u}(t), t)
\mathbf{y}(t) = h(\mathbf{x}(t), \mathbf{u}(t), t)
where f: \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^n and h: \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^p are smooth functions defining the system dynamics and output mapping, respectively.
This framework serves to predict future system behavior from current states and inputs, facilitating simulation, control design, and estimation tasks in fields like aerospace, robotics, and process control.[2]
Compared to precursors like ordinary differential equations, which describe systems through higher-order scalar relations, state-space representation reduces them to an equivalent vector form of first-order equations, promoting computational efficiency and multivariable handling.[2] Transfer functions, derived via Laplace transforms for linear time-invariant systems, emphasize input-output relations but overlook initial conditions, internal dynamics, and MIMO interactions, rendering them inadequate for comprehensive analysis.[10] Key advantages of state-space models include their explicit inclusion of initial states for transient response prediction, natural accommodation of time-varying parameters, and ability to reveal unmeasurable internal dynamics essential for advanced control strategies like state feedback.[10][11]
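The general nonlinear form can be simulated directly once f and h are written as functions. The sketch below is a minimal illustration, not a reference implementation: the pendulum model, the constant `G_OVER_L`, the fixed-step fourth-order Runge-Kutta integrator, and all function names are assumptions chosen for the example.

```python
import numpy as np

def simulate(f, h, x0, u, t0, t1, steps):
    """Integrate x' = f(x, u(t), t) with fixed-step RK4; return times, states, outputs."""
    dt = (t1 - t0) / steps
    t = t0
    x = np.asarray(x0, dtype=float)
    times, states, outputs = [t], [x], [h(x, u(t), t)]
    for _ in range(steps):
        k1 = f(x, u(t), t)
        k2 = f(x + 0.5 * dt * k1, u(t + 0.5 * dt), t + 0.5 * dt)
        k3 = f(x + 0.5 * dt * k2, u(t + 0.5 * dt), t + 0.5 * dt)
        k4 = f(x + dt * k3, u(t + dt), t + dt)
        x = x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        t = t + dt
        times.append(t)
        states.append(x)
        outputs.append(h(x, u(t), t))
    return np.array(times), np.array(states), np.array(outputs)

# Hypothetical pendulum: state x = [angle, angular velocity], scalar input u.
G_OVER_L = 9.81  # assumed value of g / length, in 1/s^2

def pendulum_f(x, u, t):
    # State equation: angle' = velocity, velocity' = -(g/l) sin(angle) + u.
    return np.array([x[1], -G_OVER_L * np.sin(x[0]) + u])

def pendulum_h(x, u, t):
    # Output equation: only the angle is measured.
    return x[0]

times, states, outputs = simulate(
    pendulum_f, pendulum_h, x0=[0.1, 0.0], u=lambda t: 0.0, t0=0.0, t1=2.0, steps=2000
)
```

The initial state and the input signal are the only data the integrator needs, which is the sense in which the state summarizes the system's internal condition.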
State Variables and State Space
State variables form the core of the state-space representation, defined as the minimal set of variables whose values at an initial time t_0 uniquely determine the future behavior of a dynamical system given any specified input sequence.[3] This set captures the complete internal condition of the system, enabling prediction of all subsequent states and outputs without redundancy.[1] The concept, foundational to modern control theory, was formalized by Rudolf E. Kalman in his seminal 1960 paper on linear filtering, where state variables provide a vector description sufficient for recursive estimation and prediction.[12]
The state space is the abstract mathematical structure encompassing all possible values of the state variables, typically modeled as the Euclidean space \mathbb{R}^n, where n denotes the number of state variables.[13] In this n-dimensional vector space, each point represents a unique system state, with axes aligned to the individual state variables, allowing geometric interpretation of dynamics such as trajectories or equilibria.[14] This representation facilitates analysis in multivariable systems by treating states as coordinates in a linear algebra framework.
A classic example arises in the mass-spring-damper system, a second-order mechanical oscillator governed by Newton's second law. Here, the state variables are commonly chosen as the position x of the mass and its velocity \dot{x}, which together fully specify the motion from any initial conditions under applied forces.[5] These variables transform the original second-order differential equation into an equivalent first-order vector form, illustrating how physical quantities like displacement and velocity serve as states.
While the choice of state variables is not unique (different selections can yield equivalent descriptions of the same dynamics), linear transformations ensure consistency.
Specifically, a new state vector x' related to the original x by an invertible matrix T (i.e., x' = T x) preserves the input-output behavior and system properties.[15] This equivalence underscores the flexibility in state selection, often guided by physical insight or computational convenience.
The dimension n of the state space directly corresponds to the order of the system, defined as the highest derivative order in its governing differential equation. For an n-th order equation, exactly n state variables are required to achieve a minimal representation, avoiding overparameterization.[1] This matching ensures the state-space model captures the essential degrees of freedom without excess.[16]
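The mass-spring-damper example and the change of state x' = T x can be checked numerically. The following sketch assumes the standard model m \ddot{x} + c \dot{x} + k x = F with hypothetical parameter values, and picks T so that the new state is [position, momentum]; the helper `transfer` is introduced only for this illustration.

```python
import numpy as np

m, c, k = 2.0, 0.5, 8.0  # hypothetical mass, damping, and stiffness values

# States [position, velocity]: m*x'' + c*x' + k*x = F rewritten as first-order equations.
A = np.array([[0.0, 1.0], [-k / m, -c / m]])
B = np.array([[0.0], [1.0 / m]])
C = np.array([[1.0, 0.0]])  # position is the output

# New state x' = T x = [position, momentum], since momentum = m * velocity.
T = np.array([[1.0, 0.0], [0.0, m]])
T_inv = np.linalg.inv(T)
A2 = T @ A @ T_inv
B2 = T @ B
C2 = C @ T_inv

def transfer(A, B, C, s):
    """Evaluate C (sI - A)^{-1} B at a complex frequency s (D = 0 here)."""
    n = A.shape[0]
    return (C @ np.linalg.solve(s * np.eye(n) - A, B))[0, 0]

s_test = 1.0 + 2.0j
g_original = transfer(A, B, C, s_test)
g_transformed = transfer(A2, B2, C2, s_test)
```

Both descriptions have the same trace, determinant, and input-output map, so the choice between velocity and momentum as the second state is a matter of convenience.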
Linear Systems
Continuous-Time Representation
The continuous-time state-space representation provides a mathematical framework for modeling linear time-invariant (LTI) dynamical systems using first-order differential equations. It expresses the evolution of the system's state and output as functions of the current state and input, capturing multi-input multi-output (MIMO) behaviors in a compact matrix form. This representation is foundational in control theory for analyzing and designing systems, as it directly incorporates the system's internal dynamics and external influences.[3]
The standard form for an LTI continuous-time system is given by
\dot{x}(t) = A x(t) + B u(t)
y(t) = C x(t) + D u(t),
where x(t) \in \mathbb{R}^n denotes the state vector, u(t) \in \mathbb{R}^m the input vector, and y(t) \in \mathbb{R}^p the output vector, with A \in \mathbb{R}^{n \times n}, B \in \mathbb{R}^{n \times m}, C \in \mathbb{R}^{p \times n}, and D \in \mathbb{R}^{p \times m} being constant matrices that define the system parameters.[3][17] The model assumes linearity, meaning superposition holds for states, inputs, and outputs, and time-invariance, implying the matrices do not vary with time; it also relies on an initial state x(0) to fully specify the system's behavior from t = 0.[17][1]
In this formulation, the matrix A governs the free response of the system by describing how the states evolve in the absence of inputs, B captures the coupling between inputs and state derivatives, C maps the states to the outputs, and D accounts for any direct feedthrough from inputs to outputs without state involvement.[3][1] The solution to the state equation, which combines the free response to the initial state x(0) with the forced response to the input, is obtained using the matrix exponential:
x(t) = e^{A t} x(0) + \int_0^t e^{A (t - \tau)} B u(\tau) \, d\tau,
which integrates the effects of past inputs weighted by the system's dynamics.[3][17]
A simple illustrative example is the single integrator, modeling position as the integral of velocity input, with \dot{x}(t) = u(t) and
y(t) = x(t), corresponding to A = 0, B = 1, C = 1, and D = 0.[1] This scalar case is the simplest instance of the state-space form; higher-order systems reduce in the same way to coupled first-order equations, facilitating computational analysis.[1]
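The matrix-exponential solution can be evaluated numerically as a sanity check. This sketch assumes SciPy's `scipy.linalg.expm` for the matrix exponential and approximates the convolution integral with a composite trapezoidal rule; the function name `state_at`, the step count, and the test inputs are illustrative choices.

```python
import numpy as np
from scipy.linalg import expm

def state_at(A, B, x0, u_func, t, steps=400):
    """Evaluate x(t) = e^{At} x0 + int_0^t e^{A(t - tau)} B u(tau) dtau numerically."""
    taus = np.linspace(0.0, t, steps + 1)
    # Integrand e^{A(t - tau)} B u(tau) sampled on the tau grid.
    vals = np.array([expm(A * (t - tau)) @ B @ u_func(tau) for tau in taus])
    dt = t / steps
    # Composite trapezoidal rule along the tau axis.
    forced = dt * (vals[0] / 2 + vals[1:-1].sum(axis=0) + vals[-1] / 2)
    free = expm(A * t) @ x0
    return free + forced

# Single integrator from the text: A = 0, B = 1, with x(0) = 2 and u(t) = 1.
A_int = np.array([[0.0]])
B_int = np.array([[1.0]])
x_int = state_at(A_int, B_int, np.array([2.0]), lambda tau: np.array([1.0]), t=3.0)

# First-order lag x' = -x + u with the same data; the exact answer is 1 + e^{-t}.
A_lag = np.array([[-1.0]])
x_lag = state_at(A_lag, B_int, np.array([2.0]), lambda tau: np.array([1.0]), t=1.0)
```

For the integrator the free response is the constant x(0) and the forced response is the accumulated input, so the state after three seconds is x(0) + 3.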
Discrete-Time Representation
In discrete-time systems, the state-space representation models linear time-invariant dynamics through difference equations that describe the evolution of the state vector at discrete sampling instants. The standard form is given by
x_{k+1} = A x_k + B u_k,
y_k = C x_k + D u_k,
where x_k \in \mathbb{R}^n is the state vector, u_k \in \mathbb{R}^m is the input vector, y_k \in \mathbb{R}^p is the output vector, and k denotes the sampling instant; the matrices A \in \mathbb{R}^{n \times n}, B \in \mathbb{R}^{n \times m}, C \in \mathbb{R}^{p \times n}, and D \in \mathbb{R}^{p \times m} capture the system dynamics, input influence, output mapping, and direct feedthrough, respectively.[18] This formulation contrasts with continuous-time models by replacing differential equations with recursive updates, making it ideal for implementation on digital computers.[19]
Discrete-time state-space models are often derived from continuous-time counterparts via discretization, assuming a zero-order hold on the input signal over each sampling interval.
For a continuous-time system \dot{x} = A_c x + B_c u with sampling period T, the discrete matrices are
A_d = e^{A_c T},
B_d = \int_0^T e^{A_c \tau} \, d\tau \, B_c,
yielding the discrete form x_{k+1} = A_d x_k + B_d u_k.[20] Because each eigenvalue \lambda of A_c maps to e^{\lambda T}, this method preserves the stability properties of the original system for any sampling period T > 0, enabling the analysis of sampled-data control systems.[20]
The explicit solution to the homogeneous discrete-time state equation (with zero input) is x_k = A^k x_0, while the full solution for nonzero inputs is
x_k = A^k x_0 + \sum_{i=0}^{k-1} A^{k-1-i} B u_i.
This closed-form solution facilitates simulation and prediction in digital environments.[19] In applications such as digital control and signal processing, these models support the design of controllers for systems like sampled robotic actuators or digital filters; for instance, a simple discrete accumulator, modeling cumulative input effects, takes the form x_{k+1} = x_k + u_k, y_k = x_k, where the state integrates the input sequence.[21]
Frequency-domain analysis of discrete-time state-space models draws an analogy to the continuous case via the z-transform, where the system's transfer function G(z) = C (zI - A)^{-1} B + D enables evaluation of the frequency response by substituting z = e^{j \omega T} along the unit circle, revealing gain and phase characteristics for stability and performance assessment in digital filters and controllers.[22]
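The zero-order-hold formulas and the two ways of computing x_k can be compared in a few lines. This is a sketch under stated assumptions: it uses SciPy's `scipy.linalg.expm` and the common augmented-matrix identity to obtain A_d and B_d from a single exponential, and the double-integrator plant, sampling period, and function names are chosen only for illustration.

```python
import numpy as np
from scipy.linalg import expm

def zoh_discretize(Ac, Bc, T):
    """Return (A_d, B_d) for a zero-order hold with sampling period T."""
    n, m = Bc.shape
    # expm([[Ac, Bc], [0, 0]] * T) = [[A_d, B_d], [0, I]].
    M = np.zeros((n + m, n + m))
    M[:n, :n] = Ac
    M[:n, n:] = Bc
    Phi = expm(M * T)
    return Phi[:n, :n], Phi[:n, n:]

def simulate(Ad, Bd, x0, inputs):
    """Run the recursion x_{k+1} = A_d x_k + B_d u_k and return all states."""
    x = np.asarray(x0, dtype=float)
    traj = [x]
    for u in inputs:
        x = Ad @ x + Bd @ u
        traj.append(x)
    return np.array(traj)

def closed_form(Ad, Bd, x0, inputs):
    """Evaluate x_k = A^k x_0 + sum_i A^{k-1-i} B u_i for k = len(inputs)."""
    k = len(inputs)
    x = np.linalg.matrix_power(Ad, k) @ np.asarray(x0, dtype=float)
    for i, u in enumerate(inputs):
        x = x + np.linalg.matrix_power(Ad, k - 1 - i) @ Bd @ u
    return x

# Double integrator (position, velocity) driven by unit acceleration for 1 s.
Ac = np.array([[0.0, 1.0], [0.0, 0.0]])
Bc = np.array([[0.0], [1.0]])
T = 0.1
Ad, Bd = zoh_discretize(Ac, Bc, T)
inputs = [np.array([1.0])] * 10
traj = simulate(Ad, Bd, [0.0, 0.0], inputs)
x_final = closed_form(Ad, Bd, [0.0, 0.0], inputs)
```

Because the input really is constant over each interval in this example, the sampled states match the continuous-time motion exactly: after one second the position is 0.5 and the velocity is 1.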
Controllability
In the context of linear state-space representations, controllability describes the capability to transfer the system's state vector from any arbitrary initial state x(0) = x_1 to any desired final state x(t_f) = x_2 in finite time t_f > 0 by applying an appropriate input signal u(t).[23] This property ensures that the inputs can influence all degrees of freedom in the state space, forming a cornerstone of modern control theory as established by Rudolf E. Kalman.[6]
For continuous-time linear time-invariant (LTI) systems described by \dot{x}(t) = Ax(t) + Bu(t), where x \in \mathbb{R}^n and u \in \mathbb{R}^m, controllability is determined by the rank of the controllability matrix \mathcal{C} = [B, AB, A^2B, \dots, A^{n-1}B]. The system is controllable if and only if \mathcal{C} has full rank n, meaning its column space spans the entire \mathbb{R}^n.[24] This algebraic criterion, known as Kalman's rank condition, provides a straightforward test independent of time and can be computed directly from the system matrices A and B.[25]
The same controllability matrix and rank condition apply to discrete-time LTI systems of the form x(k+1) = Ax(k) + Bu(k), where the state transitions occur at discrete steps.
A system is controllable if \mathcal{C} has rank n, allowing steering from any x(0) = x_1 to x(N) = x_2 in finite steps N.[23] This equivalence between continuous and discrete cases highlights the robustness of the framework across time domains.
An alternative characterization uses the controllability Gramian W_c(t) = \int_0^t e^{A\tau} B B^T e^{A^T \tau} \, d\tau, which is positive definite for some finite t > 0 if and only if the system is controllable.[26] The Gramian quantifies the "energy" required to reach states and is particularly useful for time-varying systems or minimum-energy control problems, as its positive definiteness ensures the existence of a steering input with finite norm.[27]
A simple example is the double integrator system modeling position and velocity, given by \dot{x} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} x + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u, where x = [\text{position}, \text{velocity}]^T and u is the acceleration input. The controllability matrix \mathcal{C} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} has full rank 2, confirming controllability and allowing arbitrary positioning and velocity adjustments via input.[24]
Controllability is essential for control design techniques such as pole placement, where state feedback u = -Kx can arbitrarily assign the closed-loop eigenvalues if and only if the pair (A, B) is controllable, enabling desired dynamic response shaping.[28]
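Kalman's rank condition is straightforward to check numerically. The sketch below builds the controllability matrix with NumPy for the double integrator from the text and for a second, hypothetical system whose input cannot reach one of its modes; the function names are illustrative.

```python
import numpy as np

def controllability_matrix(A, B):
    """Stack [B, AB, ..., A^{n-1} B] column-wise."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_controllable(A, B):
    """Kalman rank test: controllable iff the controllability matrix has rank n."""
    return np.linalg.matrix_rank(controllability_matrix(A, B)) == A.shape[0]

# Double integrator from the text.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
ctrb = controllability_matrix(A, B)

# Hypothetical counterexample: two decoupled modes, input drives only the first.
A_diag = np.array([[-1.0, 0.0], [0.0, -2.0]])
B_diag = np.array([[1.0], [0.0]])
```

Rank decisions on floating-point matrices depend on a tolerance, so for poorly scaled systems the Gramian or a staircase decomposition is often a more reliable test than the raw rank.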
Observability
In state-space representation, observability refers to the ability to determine the initial state x(0) of a linear system uniquely from the knowledge of the input u(t) and output y(t) over a finite time interval.[29] This property ensures that the internal dynamics can be reconstructed from external measurements, which is fundamental for system identification and state estimation in control theory.[29]
The standard test for observability in linear time-invariant systems, both continuous- and discrete-time, involves the observability matrix \mathcal{O}, defined as
\mathcal{O} = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix},
where A is the n \times n system matrix, C is the output matrix, and n is the system order. The system is observable if and only if \mathcal{O} has full rank n, meaning its rows span the entire state space.[29] This rank condition guarantees that the mapping from the initial state to the output sequence is injective under zero input.
An alternative characterization uses the observability Gramian W_o(t), given by
W_o(t) = \int_0^t e^{A^T \tau} C^T C e^{A \tau} \, d\tau
for continuous-time systems (with a similar sum for discrete-time). The system is observable if W_o(t) is positive definite for some finite t > 0, as this implies the state can be recovered without singularity in the reconstruction process.[30]
Observability exhibits a duality with controllability: a pair (A, C) is observable if and only if the transposed pair (A^T, C^T) is controllable, highlighting the symmetric roles of input and output matrices in state-space analysis.[29]
Consider the inverted pendulum on a cart, a classic example where the state vector includes cart position x, cart velocity \dot{x}, pendulum angle \theta, and angular velocity \dot{\theta}.
If only the cart position is measured (C = [1 \, 0 \, 0 \, 0]), the observability matrix has full rank 4, confirming that all four states, including both velocities and the pendulum angle, can be inferred from position outputs alone.[31]
Analysis of the zero-input response further elucidates observability: with u(t) = 0, the output y(t) = C e^{At} x(0) must uniquely identify x(0), which holds if the columns of C e^{At} are linearly independent as functions of t on 0 \leq t \leq T (for some T > 0). This time-domain perspective aligns with the matrix rank test and underscores observability as a structural property independent of specific inputs.[29]
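The rank test and the duality with controllability can be illustrated on a small system. Since the text gives no numerical parameters for the cart-pendulum, this sketch substitutes the double integrator (position, velocity) as an assumed stand-in; the function names are illustrative.

```python
import numpy as np

def observability_matrix(A, C):
    """Stack [C; CA; ...; CA^{n-1}] row-wise."""
    n = A.shape[0]
    rows = [C]
    for _ in range(n - 1):
        rows.append(rows[-1] @ A)
    return np.vstack(rows)

def is_observable(A, C):
    """Rank test: observable iff the observability matrix has rank n."""
    return np.linalg.matrix_rank(observability_matrix(A, C)) == A.shape[0]

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator
C_position = np.array([[1.0, 0.0]])      # measure position only
C_velocity = np.array([[0.0, 1.0]])      # measure velocity only

obs_position = observability_matrix(A, C_position)

# Duality: the controllability matrix of (A^T, C^T) is the transpose of O.
dual_ctrb = np.hstack([C_position.T, A.T @ C_position.T])
```

Measuring position makes the system observable because velocity shows up as the rate of change of the output, whereas measuring velocity alone leaves the initial position undetermined.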
System Analysis
Transfer Functions
Transfer functions provide a frequency-domain representation of the input-output behavior of linear time-invariant (LTI) systems described by state-space models, obtained by applying the Laplace or Z-transform to the state equations under zero initial conditions.[32]
For continuous-time LTI systems governed by \dot{x}(t) = Ax(t) + Bu(t) and y(t) = Cx(t) + Du(t), the transfer function matrix G(s) is derived as
G(s) = C(sI - A)^{-1}B + D,
where s is the complex frequency variable.[33] This expression results from taking the Laplace transform of the state equations, solving for X(s), and substituting into the output equation, yielding the forced response relating Y(s) to U(s).[34] The transfer function is proper if the degree of the denominator polynomial \det(sI - A), which equals the system order n, is at least the degree of each numerator polynomial in G(s), which always holds for a state-space model; it is strictly proper when D = 0.[35]
In the discrete-time case, for systems x(k+1) = Ax(k) + Bu(k) and y(k) = Cx(k) + Du(k), the transfer function is similarly
G(z) = C(zI - A)^{-1}B + D,
obtained via the Z-transform, bridging the time-domain state evolution to the frequency-domain input-output map.[36]
A state-space realization is minimal if and only if it is both controllable and observable, ensuring the transfer function is irreducible with no pole-zero cancellations; the minimal order equals the degree of the transfer function.[37] For multi-input multi-output (MIMO) systems, G(s) or G(z) forms a transfer matrix, and the McMillan degree (the minimal state dimension) matches the order n of a controllable and observable realization.[38]
Consider a series RLC circuit with input voltage v_s(t) and output capacitor voltage v_c(t), using states x_1 = i_L (inductor current) and x_2 = v_c (capacitor voltage). In the entries of A and B below, R, L, and C denote the resistance, inductance, and capacitance rather than the output matrix. The state-space model is
A = \begin{bmatrix} -\frac{R}{L} & -\frac{1}{L} \\ \frac{1}{C} & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} \frac{1}{L} \\ 0 \end{bmatrix}, \quad
C = \begin{bmatrix} 0 & 1 \end{bmatrix}, \quad D = 0.
The resulting single-input single-output (SISO) transfer function is
G(s) = \frac{V_c(s)}{V_s(s)} = \frac{1}{LC s^2 + RC s + 1},
a second-order low-pass filter with poles determined by the circuit parameters.[5]
Transfer functions capture only the zero-initial-condition response and omit unobservable or uncontrollable modes, which do not affect the input-output relation but may influence internal dynamics.[39]
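The formula G(s) = C(sI - A)^{-1}B + D can be evaluated pointwise and compared with the closed-form RLC expression. The component values below are hypothetical, the capacitance is named `Cap` to avoid a clash with the output matrix `C`, and `transfer_function` is a helper introduced for this sketch.

```python
import numpy as np

R, L, Cap = 1.0, 0.5, 0.2  # hypothetical resistance, inductance, capacitance

# States: x1 = inductor current, x2 = capacitor voltage.
A = np.array([[-R / L, -1.0 / L], [1.0 / Cap, 0.0]])
B = np.array([[1.0 / L], [0.0]])
C = np.array([[0.0, 1.0]])
D = np.array([[0.0]])

def transfer_function(A, B, C, D, s):
    """Evaluate G(s) = C (sI - A)^{-1} B + D at one complex frequency."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B) + D

s = 2.0j  # evaluate on the imaginary axis at 2 rad/s
G_state_space = transfer_function(A, B, C, D, s)[0, 0]
G_formula = 1.0 / (L * Cap * s**2 + R * Cap * s + 1.0)

# The DC gain should be 1: the capacitor voltage settles to the source voltage.
G_dc = transfer_function(A, B, C, D, 0.0)[0, 0]
```

Solving the linear system (sI - A) X = B instead of forming the inverse explicitly is the usual numerical choice and gives the same value.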
Canonical Realizations
Canonical realizations provide standardized state-space representations for linear systems, particularly minimal realizations of transfer functions, facilitating analysis, simulation, and control design by imposing specific structures on the system matrices. These forms ensure uniqueness up to similarity transformations and highlight properties like controllability and observability. For single-input single-output (SISO) systems, common canonical forms include the controller and observer forms, while the Jordan form addresses modal structure. In multi-input multi-output (MIMO) systems, the Brunovsky form serves a similar role for controllable realizations.
The controller canonical form, also known as the controllable canonical form, represents a SISO system with a transfer function having a monic denominator polynomial. The system matrix A is a companion matrix constructed from the coefficients of the denominator, ensuring the realization is controllable. Specifically, for a transfer function G(s) = \frac{b(s)}{a(s)} where a(s) = s^n + a_{n-1} s^{n-1} + \cdots + a_0 is monic and b(s) = b_{n-1} s^{n-1} + \cdots + b_0, the matrices are:
A = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-a_0 & -a_1 & -a_2 & \cdots & -a_{n-1}
\end{bmatrix}, \quad
B = \begin{bmatrix}
0 \\
0 \\
\vdots \\
0 \\
1
\end{bmatrix},
C = \begin{bmatrix} b_0 & b_1 & \cdots & b_{n-2} & b_{n-1} \end{bmatrix}, \quad D = 0
for strictly proper systems. This structure places the dynamics in phase-variable coordinates, simplifying pole placement via state feedback.[40]
As an example, consider the SISO transfer function G(s) = \frac{1}{s^2 + 3s + 2}, with denominator coefficients a_1 = 3, a_0 = 2, and numerator coefficients b_0 = 1, b_1 = 0. The controller canonical form is:
A = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad D = 0.
This realization is minimal and controllable, with states given by the input filtered through 1/a(s) and the successive derivatives of that filtered signal.
The observer canonical form is the dual of the controller form, obtained by transposing the matrices and interchanging the roles of inputs and outputs. Here, A is the transpose of the companion matrix (a left companion form), B^T = C from the controller form, and C^T = B, ensuring observability. For the same transfer function G(s) = \frac{1}{s^2 + 3s + 2}, the matrices become:
A = \begin{bmatrix} 0 & -2 \\ 1 & -3 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad C = \begin{bmatrix} 0 & 1 \end{bmatrix}, \quad D = 0.
This form is useful for observer design, as the structure guarantees that all states are observable.[41]
The Jordan canonical form transforms the system matrix A into a block-diagonal structure consisting of Jordan blocks, each corresponding to an eigenvalue of A. If A is diagonalizable, the form is fully diagonal, with entries as the system's poles, directly revealing the natural modes and facilitating modal analysis. For non-diagonalizable cases, off-diagonal ones in the blocks account for repeated eigenvalues and generalized eigenvectors. The transformation matrix P satisfies J = P^{-1} A P, where J is the Jordan form, and the input/output matrices adjust as \tilde{B} = P^{-1} B, \tilde{C} = C P.
This form is particularly valuable in control systems for decomposing dynamics into independent modal subsystems.[42]
For MIMO systems, the Brunovsky canonical form provides a controllable realization, decomposing the system into chains of integrators based on controllability indices, which are invariants under state feedback and coordinate changes. Each chain corresponds to a Brunovsky block, with the number and lengths determined by the ranks of controllability matrices. This form is unique up to permutation of blocks and is essential for feedback linearization and decoupling in multivariable control.[43]
Similarity transformations preserve the input-output behavior, including the transfer function, of a state-space realization. Given a nonsingular matrix T, the transformed state \tilde{x} = T^{-1} x yields \tilde{A} = T^{-1} A T, \tilde{B} = T^{-1} B, \tilde{C} = C T, \tilde{D} = D, ensuring the transfer function C(sI - A)^{-1} B + D = \tilde{C} (sI - \tilde{A})^{-1} \tilde{B} + \tilde{D}. This allows conversion between different canonical forms without altering system dynamics.[44]
For non-proper transfer functions, where the degree of the numerator exceeds that of the denominator (resulting in a polynomial part), standard state-space realizations must be extended. One option splits the transfer function into a proper part, realized as C(sI - A)^{-1}B + D, and a polynomial matrix N(s) acting directly on the input; another uses descriptor (generalized state-space) forms E \dot{x} = A x + B u, y = C x + D u with singular E, which can represent the polynomial part while maintaining minimality. This approach ensures complete representation of high-frequency behavior in polynomial-compensated systems.[45][46]
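The controller and observer canonical forms can be generated mechanically from the polynomial coefficients. The sketch below assumes coefficients are passed in ascending order (a_0 first, b_0 first) with the monic leading coefficient implied; `controller_canonical` and `evaluate` are names introduced for this illustration.

```python
import numpy as np

def controller_canonical(num, den):
    """Build (A, B, C, D) from num = [b_0..b_{n-1}] and den = [a_0..a_{n-1}].

    The denominator is s^n + a_{n-1} s^{n-1} + ... + a_0 (monic, strictly proper case).
    """
    n = len(den)
    A = np.zeros((n, n))
    A[:-1, 1:] = np.eye(n - 1)          # superdiagonal of ones
    A[-1, :] = -np.asarray(den, dtype=float)  # last row: -a_0 ... -a_{n-1}
    B = np.zeros((n, 1))
    B[-1, 0] = 1.0
    C = np.asarray(num, dtype=float).reshape(1, n)
    D = np.zeros((1, 1))
    return A, B, C, D

def evaluate(A, B, C, D, s):
    """Evaluate C (sI - A)^{-1} B + D at a complex frequency s."""
    n = A.shape[0]
    return (C @ np.linalg.solve(s * np.eye(n) - A, B) + D)[0, 0]

# G(s) = 1 / (s^2 + 3 s + 2): b_0 = 1, b_1 = 0, a_0 = 2, a_1 = 3.
A_c, B_c, C_c, D_c = controller_canonical([1.0, 0.0], [2.0, 3.0])

# Observer canonical form by duality: transpose A and swap B with C.
A_o, B_o, C_o, D_o = A_c.T, C_c.T, B_c.T, D_c

s = 1.0j
G_expected = 1.0 / (s**2 + 3.0 * s + 2.0)
G_controller = evaluate(A_c, B_c, C_c, D_c, s)
G_observer = evaluate(A_o, B_o, C_o, D_o, s)
```

Both realizations reproduce the same transfer function, which is the numerical counterpart of the statement that similarity and duality leave a scalar input-output map unchanged.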
Stability and Poles
In state-space representations of linear systems, internal stability is determined by the eigenvalues of the system matrix A. For continuous-time systems described by \dot{x}(t) = Ax(t), the system is asymptotically stable if all eigenvalues of A have negative real parts, ensuring that the state trajectories converge to the origin from any initial condition.[47][48] For discrete-time systems given by x(k+1) = Ax(k), asymptotic stability requires all eigenvalues of A to lie strictly inside the unit circle in the complex plane.[49][50]
The poles of a state-space system are defined as the eigenvalues of the matrix A, which govern the natural modes of the system's response.[51][33] While the poles of the corresponding transfer function coincide with the observable and controllable eigenvalues of A, the full set of poles in the state-space model also includes uncontrollable or unobservable modes that do not appear in input-output descriptions but still affect internal dynamics.[52][53] A matrix A is termed Hurwitz if all its eigenvalues have negative real parts, directly implying asymptotic stability for the continuous-time system.[54]
Lyapunov theory provides an alternative characterization of stability without explicitly computing eigenvalues. For the continuous-time system, if there exist positive definite matrices P and Q satisfying the Lyapunov equation A^T P + P A = -Q, then the system is asymptotically stable, as the quadratic form V(x) = x^T P x serves as a Lyapunov function whose derivative is negative definite along trajectories.[55][56] This approach is particularly useful for verifying stability in systems where eigenvalue computation is numerically challenging.
The Routh-Hurwitz criterion offers a method to assess stability by examining the characteristic polynomial \det(sI - A) without finding its roots explicitly.
For a polynomial p(s) = s^n + a_{n-1}s^{n-1} + \cdots + a_0, the criterion constructs a Routh array from the coefficients; the system is stable if all elements in the first column of the array are positive, indicating no roots in the right-half complex plane.[57][58] For example, consider a second-order system with characteristic polynomial s^2 + 3s + 2 = 0; the Routh array is
\begin{array}{ccc}
s^2 & 1 & 2 \\
s^1 & 3 & 0 \\
s^0 & 2 & 0
\end{array}
with all first-column entries positive, confirming stability corresponding to eigenvalues at -1 and -2.[59]
Detectability extends observability to stability contexts: a state-space system (A, C) is detectable if all unobservable modes (eigenvalues of A associated with the unobservable subspace) are stable, meaning negative real parts for continuous time or inside the unit circle for discrete time.[60][61] This ensures that, even if some states are unobservable, their dynamics do not lead to instability, so that an observer can still estimate the full state asymptotically.[62]
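The eigenvalue tests and the Lyapunov equation can both be checked numerically for the second-order example with characteristic polynomial s^2 + 3s + 2. This sketch solves A^T P + P A = -Q by rewriting it as a linear system with Kronecker products, a simple approach that is adequate for small n; the companion-form matrix A and the function names are assumptions made for the illustration.

```python
import numpy as np

def is_hurwitz(A):
    """Continuous-time test: all eigenvalues have negative real part."""
    return bool(np.all(np.linalg.eigvals(A).real < 0))

def is_schur(A):
    """Discrete-time test: all eigenvalues lie strictly inside the unit circle."""
    return bool(np.all(np.abs(np.linalg.eigvals(A)) < 1))

def solve_lyapunov(A, Q):
    """Solve A^T P + P A = -Q for P via a Kronecker-product linear system."""
    n = A.shape[0]
    I = np.eye(n)
    # vec(A^T P) + vec(P A) = (A^T kron I + I kron A^T) vec(P).
    M = np.kron(A.T, I) + np.kron(I, A.T)
    p = np.linalg.solve(M, -Q.flatten())
    return p.reshape(n, n)

# Companion matrix with characteristic polynomial s^2 + 3 s + 2 (poles -1, -2).
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
Q = np.eye(2)
P = solve_lyapunov(A, Q)
P_eigenvalues = np.linalg.eigvalsh(P)  # positive if P is positive definite

# The double integrator has a repeated eigenvalue at 0, so it is not Hurwitz.
A_double_integrator = np.array([[0.0, 1.0], [0.0, 0.0]])
```

A positive definite P for a positive definite Q certifies asymptotic stability without referring to the eigenvalues of A, matching the Lyapunov argument in the text.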
Control Design
State Feedback
State feedback is a fundamental technique in control design for linear systems, where the control input u is constructed as a linear function of the full state vector x, given by u = -K x + r. Here, K is the state feedback gain matrix, and r represents a reference or setpoint input. This form assumes direct access to all state measurements. Substituting into the system dynamics \dot{x} = A x + B u yields the closed-loop system \dot{x} = (A - B K) x + B r, where the matrix A - B K governs the resulting behavior.
For controllable systems, the eigenvalues of the closed-loop matrix A - B K can be assigned arbitrarily by appropriate choice of K, enabling designers to achieve desired performance characteristics such as stability, response speed, and damping. This pole placement capability stems from the ability to shape the closed-loop characteristic polynomial through feedback, provided the pair (A, B) satisfies the controllability condition.
In single-input systems, Ackermann's formula provides a direct computational method for determining K. For a controllable single-input system, the gain is K = [0 \ \dots \ 0 \ 1] \mathcal{C}^{-1} \phi(A), where \mathcal{C} is the controllability matrix and \phi(A) is the desired closed-loop characteristic polynomial \phi(\lambda) evaluated at the matrix A. This formula simplifies pole assignment by leveraging the structure of the controllability matrix.
An optimal approach to state feedback design is the linear quadratic regulator (LQR), which minimizes the infinite-horizon cost functional J = \int_0^\infty (x^T Q x + u^T R u) \, dt, where Q \geq 0 and R > 0 are weighting matrices penalizing state deviations and control effort, respectively. The optimal gain K = R^{-1} B^T P is obtained by solving the algebraic Riccati equation A^T P + P A - P B R^{-1} B^T P + Q = 0 for the positive semidefinite P.
This method balances performance and control effort while guaranteeing closed-loop stability when (A, B) is controllable and (A, Q^{1/2}) is observable.[63]
Consider the double integrator system, a classic example modeling position and velocity dynamics: \dot{x} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} x + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u, with state x = \begin{bmatrix} \text{position} \\ \text{velocity} \end{bmatrix}. To stabilize it by placing closed-loop poles at -1 (double root), yielding the characteristic polynomial (s + 1)^2 = s^2 + 2s + 1, the feedback gain is K = [1 \ 2], resulting in u = -[1 \ 2] x. The closed-loop matrix becomes A - B K = \begin{bmatrix} 0 & 1 \\ -1 & -2 \end{bmatrix}, confirming the desired eigenvalues. This example illustrates how state feedback can transform an unstable open-loop system into a stable one with specified dynamics.
For multi-input multi-output (MIMO) systems, pole placement via state feedback extends naturally to controllable pairs (A, B), but the solution is not unique when the number of inputs exceeds one. With m inputs and state dimension n, there are m n elements in K but only n poles to assign, providing (m-1) n degrees of freedom for additional objectives, such as minimizing sensitivity or optimizing transient response. Seminal results establish that arbitrary pole assignment is achievable if and only if the system is controllable.
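Ackermann's formula and the Riccati equation can both be exercised on the double integrator. In this sketch `ackermann` is a small helper written for the example, not a library routine, and the LQR part assumes Q = I and R = 1 together with a hand-derived candidate solution P, which the code verifies by substituting it into the Riccati equation rather than calling a solver.

```python
import numpy as np

def ackermann(A, B, poles):
    """Single-input pole placement: K = [0 ... 0 1] Ctrb^{-1} phi(A)."""
    n = A.shape[0]
    ctrb = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])
    coeffs = np.poly(poles)  # [1, c_{n-1}, ..., c_0] of the desired polynomial
    phi_A = sum(c * np.linalg.matrix_power(A, n - i) for i, c in enumerate(coeffs))
    e_last = np.zeros((1, n))
    e_last[0, -1] = 1.0
    return np.real(e_last @ np.linalg.solve(ctrb, phi_A))

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])

# Pole placement: double pole at -1, as in the text.
K = ackermann(A, B, [-1.0, -1.0])
A_closed = A - B @ K

# LQR with the assumed weights Q = I, R = 1.
Q = np.eye(2)
R = np.array([[1.0]])
P = np.array([[np.sqrt(3.0), 1.0], [1.0, np.sqrt(3.0)]])  # candidate solution
riccati_residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
K_lqr = np.linalg.solve(R, B.T @ P)
lqr_poles = np.linalg.eigvals(A - B @ K_lqr)
```

Pole placement fixes the closed-loop eigenvalues directly, while LQR chooses them indirectly through the weights; for these weights the optimal gain is [1, sqrt(3)], slightly less damped than the double pole at -1.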
Observer Design
In observable linear systems, state observers are designed to estimate the internal state vector from measured outputs and known inputs, enabling the implementation of state feedback when full state measurement is impractical. The foundational approach, known as the Luenberger observer, reconstructs the state by augmenting the system dynamics with a correction term based on the output prediction error.[64]
The dynamics of the Luenberger observer for a continuous-time system \dot{x} = A x + B u, y = C x are given by
\dot{\hat{x}} = A \hat{x} + B u + L (y - C \hat{x}),
where \hat{x} is the estimated state and L is the observer gain matrix. The estimation error e = x - \hat{x} evolves according to \dot{e} = (A - L C) e, so the eigenvalues of A - L C can be arbitrarily placed (provided the pair (A, C) is observable) to ensure rapid convergence of the error to zero, typically faster than the system dynamics.[64][65]
Observer design exhibits duality with state feedback control: just as the feedback gain K in u = -K x places poles of A - B K, the observer gain L places poles of A - L C. Because A - L C and A^T - C^T L^T have the same eigenvalues, L can be obtained as the transpose of a state feedback gain designed for the dual pair (A^T, C^T), whether by pole placement or by solving a dual Riccati equation.[65]
For systems with process and measurement noise, the Kalman filter extends the Luenberger observer by incorporating stochastic models, yielding the estimator
\dot{\hat{x}} = A \hat{x} + B u + K (y - C \hat{x}),
where the steady-state gain K minimizes the error covariance via the algebraic Riccati equation involving process noise covariance Q and measurement noise covariance R.
A classic example is velocity estimation in a second-order system where only position is measured, such as a mass-spring-damper with state vector [position, velocity]^T.
The system matrices are A = \begin{bmatrix} 0 & 1 \\ -k/m & -c/m \end{bmatrix}, B = \begin{bmatrix} 0 \\ 1/m \end{bmatrix}, C = [1 \, 0]; designing L to place observer poles at, say, -10 \pm 10j yields accurate velocity reconstruction from position data within transients.[65]By the separation principle, the observer and state feedback designs can be performed independently for linear systems, with the combined output feedback controller u = -K \hat{x} having closed-loop poles as the union of those from A - B K and A - L C, provided both are stable.[65]
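The mass-spring-damper observer can be sketched as follows. The numerical values of m, k, and c are hypothetical (the text gives none), and the gain is obtained by matching the characteristic polynomial of A - L C to s^2 + 20 s + 200, which has roots -10 \pm 10j. The plant and observer are then simulated together with zero input to show the estimation error decaying.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical parameters (not from the text)
m, k, c = 1.0, 2.0, 0.5

A = np.array([[0.0, 1.0], [-k / m, -c / m]])
B = np.array([[0.0], [1.0 / m]])
C = np.array([[1.0, 0.0]])

# det(sI - (A - L C)) = s^2 + (l1 + c/m) s + (l1*c/m + k/m + l2)
# Match against s^2 + a1 s + a0 for observer poles at -10 +/- 10j.
a1, a0 = 20.0, 200.0
l1 = a1 - c / m
l2 = a0 - k / m - l1 * c / m
L = np.array([[l1], [l2]])
observer_poles = np.linalg.eigvals(A - L @ C)

def plant_and_observer(t, z):
    """Stacked dynamics: true state x (first two entries), estimate xhat (last two)."""
    x, xhat = z[:2], z[2:]
    u = 0.0                      # unforced response, for simplicity
    y = C @ x                    # only position is measured
    dx = A @ x + B[:, 0] * u
    dxhat = A @ xhat + B[:, 0] * u + L[:, 0] * (y - C @ xhat)
    return np.concatenate([dx, dxhat])

z0 = [1.0, 0.0, 0.0, 0.0]        # true state [1, 0], observer starts at the origin
sol = solve_ivp(plant_and_observer, (0.0, 2.0), z0, rtol=1e-9, atol=1e-12)

initial_error = np.linalg.norm(sol.y[:2, 0] - sol.y[2:, 0])
final_error = np.linalg.norm(sol.y[:2, -1] - sol.y[2:, -1])
print("L =", L.ravel())
print("observer poles:", observer_poles)
print("error norm at t=0 and t=2:", initial_error, final_error)
```

For higher-order systems the coefficient matching becomes tedious; a pole-placement routine applied to the dual pair (A^T, C^T) is the usual alternative.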
Setpoint Tracking
In state-space control, setpoint tracking refers to designing controllers that enable the system output to follow a desired reference signal r, such as a constant or slowly varying setpoint, while minimizing steady-state error. For a linear time-invariant system described by \dot{x} = A x + B u and y = C x, a basic state feedback law u = -K x + N r stabilizes the closed loop but leaves a nonzero steady-state error for nonzero r unless N is appropriately scaled. To achieve zero steady-state error for constant references, the scaling matrix N is selected such that the equilibrium output matches r. Assuming the closed-loop system is stable, setting \dot{x} = 0 in \dot{x} = (A - B K) x + B N r gives the steady-state state x_{ss} = -(A - B K)^{-1} B N r; requiring y_{ss} = C x_{ss} = r then yields N = -(C (A - B K)^{-1} B)^{-1} for single-input single-output systems. This feedforward scaling ensures exact tracking at steady state without altering the feedback gains K.[66]

However, this approach assumes perfect model knowledge and does not inherently reject constant disturbances or model mismatches, which can still cause offset. To address these issues and guarantee zero steady-state error robustly, integral control augments the state vector with an integrator on the tracking error. The augmented state is \tilde{x} = \begin{bmatrix} x \\ \xi \end{bmatrix}, where \xi evolves as \dot{\xi} = r - y = r - C x, leading to the extended dynamics:

\dot{\tilde{x}} = \begin{bmatrix} A & 0 \\ -C & 0 \end{bmatrix} \tilde{x} + \begin{bmatrix} B \\ 0 \end{bmatrix} u + \begin{bmatrix} 0 \\ 1 \end{bmatrix} r.

A stabilizing feedback law u = -\tilde{K} \tilde{x} is then applied, where \tilde{K} = [K_I \quad k_I], with K_I acting on the original state x and k_I on the integrator state \xi, is chosen to place the poles of the augmented system for desired performance. This structure embeds integral action: at any equilibrium \dot{\xi} = 0, which forces y = r, so constant setpoints are tracked asymptotically despite constant disturbances or parameter variations, as long as the closed loop remains stable. The cost is an increased system order and potentially reduced stability margins.[66][67]

A representative example is position control of a DC motor, modeled in state space with states for angular position \theta, velocity \omega, and an integral error state \xi to eliminate steady-state offset for step references. With u the armature voltage and the armature inductance neglected, the motor dynamics are \dot{\theta} = \omega, \dot{\omega} = -\left( \frac{b}{J} + \frac{K_t K_b}{J R} \right) \omega + \frac{K_t}{J R} u, and \dot{\xi} = r - \theta, where parameters include inertia J, damping b, torque constant K_t, back-emf constant K_b, and resistance R. Augmenting with the integrator yields a third-order system, and full-state feedback gains are designed via pole placement to achieve fast settling (e.g., poles at -5, -10, -15) with zero steady-state error to a unit step, as verified in simulation, where the position settles at r = 1 radian without offset once the transients have decayed. This demonstrates how integral augmentation enables precise setpoint following in electromechanical systems.[68]

For time-varying references in servo systems, such as ramp or sinusoidal commands, a precompensator extends the basic tracking design by shaping the reference before it enters the feedback loop. In state space, the control law becomes u = -K x + M v, where v is a filtered version of r generated by a precompensator model \dot{z} = F z + G r and v = H z + J r, chosen to match the command's dynamics (e.g., following the internal model principle for specific signals). This two-degree-of-freedom structure decouples tracking from regulation, allowing optimization of reference following independently of disturbance rejection, as shown in servo applications where it improves bandwidth and reduces lag for dynamic setpoints.[69]

The integral action in state-space formulations embeds the I-term of classical PID control, providing a unified framework where proportional and derivative effects arise from state feedback on position and velocity, while the integrator eliminates the steady-state error. This equivalence allows state-space methods to realize PID-like performance with full state utilization, extending to multivariable systems where traditional PID tuning is challenging.[70]
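The difference between static feedforward scaling and integral augmentation can be illustrated on the DC motor model. This is a sketch under stated assumptions: the lumped parameters `a` and `b_u` are hypothetical values standing in for b/J + K_t K_b/(J R) and K_t/(J R), the constant input disturbance `d` is introduced purely for illustration, and `scipy.signal.place_poles` is used for the gain computation.

```python
import numpy as np
from scipy.signal import place_poles

# Hypothetical lumped motor parameters (not from the text)
a = 2.0      # stands in for b/J + K_t*K_b/(J*R)
b_u = 4.0    # stands in for K_t/(J*R)

A = np.array([[0.0, 1.0], [0.0, -a]])
B = np.array([[0.0], [b_u]])
C = np.array([[1.0, 0.0]])
r, d = 1.0, 0.5   # unit step reference; assumed constant input disturbance

# 1) State feedback with static feedforward gain N
K = place_poles(A, B, [-5.0, -10.0]).gain_matrix
A_cl = A - B @ K
N = -1.0 / (C @ np.linalg.solve(A_cl, B)).item()
# Steady state of x' = A_cl x + B (N r + d): x_ss = -A_cl^{-1} B (N r + d)
y_ss_feedforward = (-C @ np.linalg.solve(A_cl, B) * (N * r + d)).item()

# 2) Integral augmentation: xi' = r - C x
A_aug = np.block([[A, np.zeros((2, 1))], [-C, np.zeros((1, 1))]])
B_aug = np.vstack([B, [[0.0]]])
E_aug = np.array([[0.0], [0.0], [1.0]])
K_aug = place_poles(A_aug, B_aug, [-5.0, -10.0, -15.0]).gain_matrix
A_aug_cl = A_aug - B_aug @ K_aug
aug_poles = np.linalg.eigvals(A_aug_cl)
# Steady state of x~' = A_aug_cl x~ + E r + B_aug d
x_ss_aug = -np.linalg.solve(A_aug_cl, E_aug * r + B_aug * d)
y_ss_integral = (np.hstack([C, [[0.0]]]) @ x_ss_aug).item()

print("N =", N)
print("steady-state output, feedforward only:", y_ss_feedforward)
print("steady-state output, integral action: ", y_ss_integral)
```

With the disturbance present, the feedforward-only design settles away from the reference, while the augmented design returns the output to r because the integrator state can only come to rest when r - y = 0.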
Nonlinear Systems
General Representation
The state-space representation for nonlinear dynamic systems provides a framework to model the evolution of internal states without assuming linearity in the relationships between states, inputs, and outputs. This approach is essential for capturing phenomena in physical systems where interactions lead to behaviors such as bifurcations, chaos, or saturation effects that linear models cannot represent.[71]

In continuous time, the general form of a nonlinear state-space model is expressed as

\dot{x}(t) = f(x(t), u(t), t),
y(t) = h(x(t), u(t), t),

where x(t) \in \mathbb{R}^n denotes the state vector, u(t) \in \mathbb{R}^m the input vector, y(t) \in \mathbb{R}^p the output vector, and f: \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^n and h: \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^p are nonlinear functions that may explicitly depend on time t.[71] This formulation arises naturally from first-principles modeling of systems like mechanical or electrical networks, where the rate of change of states depends nonlinearly on current states and external influences.[72] For discrete-time systems, sampled or inherently discrete processes are modeled as

x_{k+1} = f(x_k, u_k),
y_k = h(x_k, u_k),

with the time index k indicating discrete steps, and f, h similarly nonlinear.[73]

Under suitable conditions, solutions to these equations exist and are unique. Specifically, for the continuous-time case, the Picard-Lindelöf theorem guarantees local existence and uniqueness of solutions if f is continuous in t and locally Lipschitz continuous in x and u.[74] This theorem relies on fixed-point iteration in a Banach space, ensuring that the initial-value problem yields a well-defined trajectory from any initial state x(0), at least over some time interval.[74]

The state space, or phase space, visualizes the system's behavior as trajectories (curves traced by the state vector over time) within the n-dimensional space of possible states.[75] Equilibrium points, where the system remains stationary under constant input, satisfy f(x_e, u_e) = 0 for a time-invariant system, marking fixed points in this space around which trajectories may converge, diverge, or orbit depending on the nonlinearity.[76]

A representative example is a simple nonlinear oscillator whose restoring force includes a cubic term, leading to state equations like \dot{x}_1 = x_2 and \dot{x}_2 = -x_1 + \alpha x_1^3 - \delta x_2 for states x_1 (position) and x_2 (velocity), where \alpha sets the strength of the cubic stiffness term and \delta the damping. This Duffing-type system exhibits an amplitude-dependent oscillation frequency and, for \alpha > 0, additional equilibria at x_1 = \pm 1/\sqrt{\alpha}, neither of which occurs in linear harmonic oscillators.[77]

Analysis of nonlinear state-space models presents challenges beyond linear systems, primarily because the superposition principle does not hold: the response to combined inputs is not the sum of individual responses, complicating prediction and design.[71] Additionally, the absence of closed-form solutions and the potential for multiple equilibria or sensitive dependence on initial conditions demand specialized tools like Lyapunov functions or numerical simulation for stability assessment.[71]
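The failure of superposition can be checked numerically on the cubic oscillator above. The sketch below uses assumed parameter values (`delta = 0.2`, `alpha = 0.1`) and tests one consequence of linearity, namely that doubling the initial state should double the response. It also solves f(x_e, 0) = 0 for the equilibrium positions.

```python
import numpy as np
from scipy.integrate import solve_ivp

delta = 0.2   # assumed damping coefficient

def make_rhs(alpha):
    """State equations x1' = x2, x2' = -x1 + alpha*x1^3 - delta*x2."""
    def rhs(t, x):
        return [x[1], -x[0] + alpha * x[0] ** 3 - delta * x[1]]
    return rhs

def homogeneity_gap(alpha, x0, t_eval):
    """Largest |x1(t; 2*x0) - 2*x1(t; x0)|; zero for a linear system."""
    opts = dict(t_eval=t_eval, rtol=1e-9, atol=1e-12)
    span = (t_eval[0], t_eval[-1])
    base = solve_ivp(make_rhs(alpha), span, x0, **opts)
    doubled = solve_ivp(make_rhs(alpha), span, 2 * np.asarray(x0), **opts)
    return np.max(np.abs(doubled.y[0] - 2 * base.y[0]))

t_eval = np.linspace(0.0, 10.0, 501)
x0 = [0.5, 0.0]

gap_linear = homogeneity_gap(0.0, x0, t_eval)      # alpha = 0: linear oscillator
gap_nonlinear = homogeneity_gap(0.1, x0, t_eval)   # alpha = 0.1: cubic term active

# Equilibria with u = 0: x2 = 0 and alpha*x1^3 - x1 = 0
alpha = 0.1
equilibria = np.sort(np.roots([alpha, 0.0, -1.0, 0.0]).real)

print("scaling gap, linear case:   ", gap_linear)
print("scaling gap, nonlinear case:", gap_nonlinear)
print("equilibrium positions x1:", equilibria)
```

The linear case scales to within integration tolerance, while the nonlinear case shows a visible mismatch because the oscillation frequency depends on amplitude.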
Linearization Techniques
Linearization techniques approximate nonlinear state-space models by linear ones, enabling the application of linear analysis tools to study local behavior around specific operating points. These methods rely on the first-order Taylor series expansion of the nonlinear dynamics, which provides a valid approximation for small perturbations from the operating point. For a system described by \dot{x} = f(x, u) and y = h(x, u), where f and h are nonlinear functions, the result is a linear model that captures the system's response to small deviations \delta x = x - x_e and \delta u = u - u_e from an equilibrium or operating point (x_e, u_e), with the corresponding output deviation \delta y = y - h(x_e, u_e).[78]

The core of Jacobian linearization involves computing the partial derivatives of f and h at the operating point to form the state-space matrices. For an equilibrium point where f(x_e, u_e) = 0, the linearized equations become:

\begin{aligned}
\dot{\delta x} &= \left. \frac{\partial f}{\partial x} \right|_{x_e, u_e} \delta x + \left. \frac{\partial f}{\partial u} \right|_{x_e, u_e} \delta u, \\
\delta y &= \left. \frac{\partial h}{\partial x} \right|_{x_e, u_e} \delta x + \left. \frac{\partial h}{\partial u} \right|_{x_e, u_e} \delta u,
\end{aligned}

where \left. \frac{\partial f}{\partial x} \right|_{x_e, u_e} is the Jacobian matrix A, and the remaining three Jacobians play the roles of B, C, and D, respectively. This yields a linear time-invariant (LTI) model if the operating point is fixed; otherwise, if the point varies with time, the result is a linear time-varying (LTV) system. The approximation arises from truncating the higher-order terms in the Taylor expansion, quadratic and beyond, which become negligible for small signals, that is, when \|\delta x\| and \|\delta u\| are sufficiently small. This small-signal analysis ensures the linear model accurately represents the nonlinear system's local dynamics, as described in standard control theory texts.[79][80]

Consider a generic nonlinear system \dot{x} = -x + 0.5 x^2 + u with equilibrium at (x_e, u_e) = (0, 0). The Jacobian matrices are A = \left. \frac{\partial}{\partial x}(-x + 0.5 x^2 + u) \right|_{0,0} = -1 and B = \left. \frac{\partial}{\partial u}(-x + 0.5 x^2 + u) \right|_{0,0} = 1, leading to the linearized model \dot{\delta x} = - \delta x + \delta u. For small deviations, this LTI approximation holds, illustrating how linearization simplifies analysis without capturing global nonlinear effects. While higher-order Taylor terms like \frac{1}{2} \frac{\partial^2 f}{\partial x^2} (\delta x)^2 exist (here exactly 0.5 (\delta x)^2), they are omitted in first-order linearization to prioritize computational tractability and focus on dominant linear behavior.[78]

In applications, such as stability assessment, the eigenvalues of the Jacobian matrix A determine local stability: if all have negative real parts, the equilibrium is locally asymptotically stable, and if any has a positive real part, it is unstable. This conclusion is rigorously justified by the Hartman-Grobman theorem, which states that near a hyperbolic equilibrium (where no eigenvalue has zero real part), the nonlinear flow is topologically conjugate to its linearization, preserving qualitative behavior like stability. If some eigenvalue has zero real part, the equilibrium is non-hyperbolic and the linearization alone is inconclusive. Extensions of the theorem to control systems confirm that linear approximations suffice for local analysis under generic conditions, though full topological linearization may require non-pointwise transformations.[81][80]
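When analytical derivatives are inconvenient, the Jacobians can be approximated numerically. The sketch below uses central finite differences; the helper name `linearize` and the step size `eps` are choices made for this illustration. It is applied to the scalar example \dot{x} = -x + 0.5 x^2 + u at the equilibrium (0, 0) and, as an additional check, at the second equilibrium (2, 0), where f(2, 0) = -2 + 2 = 0.

```python
import numpy as np

def linearize(f, x_e, u_e, eps=1e-6):
    """Approximate A = df/dx and B = df/du at (x_e, u_e) by central differences."""
    x_e = np.atleast_1d(np.asarray(x_e, dtype=float))
    u_e = np.atleast_1d(np.asarray(u_e, dtype=float))
    n, m = x_e.size, u_e.size
    A = np.zeros((n, n))
    B = np.zeros((n, m))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        A[:, j] = (f(x_e + dx, u_e) - f(x_e - dx, u_e)) / (2 * eps)
    for j in range(m):
        du = np.zeros(m)
        du[j] = eps
        B[:, j] = (f(x_e, u_e + du) - f(x_e, u_e - du)) / (2 * eps)
    return A, B

def f(x, u):
    """Scalar example from the text: x' = -x + 0.5 x^2 + u."""
    return np.array([-x[0] + 0.5 * x[0] ** 2 + u[0]])

A0, B0 = linearize(f, [0.0], [0.0])   # equilibrium at the origin
A2, B2 = linearize(f, [2.0], [0.0])   # second equilibrium at x = 2

print("at (0, 0): A =", A0, "B =", B0)
print("at (2, 0): A =", A2, "B =", B2)
```

The two results differ in sign (A = -1 at the origin, A = +1 at x = 2), which shows that a linearization describes only the neighborhood of the point at which it is taken.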
Pendulum Example
The simple pendulum serves as a canonical example for demonstrating the state-space representation of nonlinear dynamical systems in control theory. The governing equation of motion, derived from Newton's laws or Lagrangian mechanics, is the second-order nonlinear differential equation

\ddot{\theta} + \frac{g}{l} \sin \theta = u,

where \theta denotes the angular displacement from the downward vertical, g is the acceleration due to gravity, l is the pendulum length, and u represents the applied torque input, normalized by the moment of inertia m l^2 of a point mass m.[82] To express this in state-space form, define the state vector as x = [x_1, x_2]^T with x_1 = \theta and x_2 = \dot{\theta}, yielding the nonlinear state equations

\dot{x}_1 = x_2,
\dot{x}_2 = -\frac{g}{l} \sin x_1 + u,

along with the output equation y = x_1 = \theta.[83] These equations capture the essential nonlinearity through the \sin x_1 term, which prevents superposition and leads to behaviors like amplitude-dependent periods.

Equilibrium points occur where \dot{x} = 0 and u = 0, so x_2 = 0 and \sin x_1 = 0, giving x_1 = n\pi for integer n. The downward position (\theta = 0) is stable, while the upward position (\theta = \pi) is unstable.[84]

Linearization around the downward equilibrium x_e = [0, 0]^T involves computing the Jacobian of the dynamics, where \partial(\sin x_1)/\partial x_1 = \cos(0) = 1, resulting in the linear state-space matrices

A = \begin{bmatrix} 0 & 1 \\ -\frac{g}{l} & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix},

with C = [1, 0]. The eigenvalues of A are purely imaginary (\pm j \sqrt{g/l}), indicating sustained, undamped oscillation (marginal stability) at the natural frequency \sqrt{g/l}.[83] In contrast, linearization at the upward equilibrium x_e = [\pi, 0]^T (shifting states by \tilde{x}_1 = x_1 - \pi, with \cos(\pi) = -1) yields

A = \begin{bmatrix} 0 & 1 \\ \frac{g}{l} & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix},

with eigenvalues \pm \sqrt{g/l} (one positive real), confirming instability.[83] This distinction highlights the need for nonlinear control strategies, such as energy-based methods, to swing the pendulum up to the unstable equilibrium, followed by linear feedback for stabilization.

Simulations reveal key differences between the nonlinear and linear models: the linear approximation accurately predicts small-angle oscillations with constant period 2\pi \sqrt{l/g}, but deviates for larger angles, where the nonlinear model exhibits amplitude-dependent periods (longer for bigger swings) and enables full rotations beyond \pm \pi, behaviors absent in the linear case.
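These observations can be reproduced with a short simulation. The sketch below assumes g = 9.81 m/s^2 and l = 1 m (values not given in the text), sets u = 0, and estimates the nonlinear period by timing the first downward zero crossing after release from rest, which by symmetry takes a quarter period.

```python
import numpy as np
from scipy.integrate import solve_ivp

g, l = 9.81, 1.0              # assumed values
w0 = np.sqrt(g / l)           # small-angle natural frequency

# Linearizations at the downward and upward equilibria
A_down = np.array([[0.0, 1.0], [-g / l, 0.0]])
A_up = np.array([[0.0, 1.0], [g / l, 0.0]])
eig_down = np.linalg.eigvals(A_down)
eig_up = np.linalg.eigvals(A_up)

def pendulum(t, x):
    """Unforced nonlinear pendulum: x1' = x2, x2' = -(g/l) sin(x1)."""
    return [x[1], -(g / l) * np.sin(x[0])]

def crossing(t, x):
    """Event function: zero when the angle passes through the downward position."""
    return x[0]

crossing.terminal = True      # stop at the first crossing
crossing.direction = -1       # only count crossings from positive to negative

def nonlinear_period(theta0):
    """Period for release from rest at angle theta0 (0 < theta0 < pi)."""
    sol = solve_ivp(pendulum, (0.0, 20.0), [theta0, 0.0],
                    events=crossing, rtol=1e-10, atol=1e-12)
    return 4.0 * sol.t_events[0][0]

period_linear = 2.0 * np.pi / w0
period_small = nonlinear_period(0.05)   # small swing
period_large = nonlinear_period(2.0)    # large swing

print("eigenvalues, downward:", eig_down)
print("eigenvalues, upward:  ", eig_up)
print("linear period:        ", period_linear)
print("nonlinear, 0.05 rad:  ", period_small)
print("nonlinear, 2.0 rad:   ", period_large)
```

The small-swing period agrees closely with 2\pi \sqrt{l/g}, while the large-swing period is noticeably longer, consistent with the amplitude dependence described above.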