Overdetermined system
In linear algebra, an overdetermined system is a collection of linear equations where the number of equations exceeds the number of unknowns, typically denoted as m > n for an m \times n coefficient matrix.[1] Such systems are often inconsistent, meaning no exact solution satisfies all equations simultaneously, due to the additional constraints imposed by the extra equations.[2] However, if the equations are consistent, which occurs when the right-hand side vector lies in the column space of the coefficient matrix, a solution exists, though it may not be unique if the matrix is rank-deficient.[3] When exact solutions are unavailable, overdetermined systems are commonly addressed through the method of least squares, which seeks to minimize the sum of the squared residuals between the observed equations and the model's predictions.[4] This approach yields the optimal approximate solution \hat{x} by solving the normal equations A^T A \hat{x} = A^T b, where A is the coefficient matrix and b is the vector of constants, assuming A has full column rank. The least squares solution is particularly valuable in applications involving noisy data, as it provides a robust approximation that balances the influence of all equations.[5]

Overdetermined systems arise frequently in practical scenarios, such as linear regression for fitting models to experimental data, where the number of data points (equations) surpasses the number of parameters (unknowns).[4] They also appear in signal processing tasks like linear prediction, smoothing, and deconvolution, enabling the extraction of underlying patterns from overabundant measurements.[6] In numerical computations, techniques like the singular value decomposition (SVD) further enhance the stability of least squares solutions for ill-conditioned overdetermined systems.[7] These methods underscore the role of overdetermined systems in bridging theoretical mathematics with data-driven analysis across engineering and the sciences.[8]

Fundamentals
Definition
In mathematics, an overdetermined system is a set of equations in which the number of equations exceeds the number of unknowns. Formally, consider m equations in n unknowns, where m > n, expressed as f_i(\mathbf{x}) = 0 for i = 1, \dots, m, with \mathbf{x} \in \mathbb{R}^n.[9] This setup applies to both linear and nonlinear equations, though the linear case is more commonly analyzed in foundational texts. A key property of overdetermined systems is that they are often inconsistent, meaning no exact solution \mathbf{x} satisfies all equations simultaneously due to the excess constraints.[10] In such cases, approximate solutions are sought to minimize the violation of the equations, typically via methods that balance the constraints.[9]

The concept emerged in the context of linear algebra during the 19th century, with foundational contributions from Carl Friedrich Gauss, who developed the least squares method around 1795 to address overdetermined systems arising in astronomical observations.[11] Gauss's work, published in 1809, provided a rigorous justification for approximating solutions in inconsistent systems.[12]

Comparison to Square and Underdetermined Systems
In linear algebra, systems of equations are classified based on the relationship between the number of equations m and the number of unknowns n. Square systems occur when m = n, meaning the number of equations matches the number of variables. If the coefficient matrix is full rank (invertible), such a system has a unique solution, as the matrix can be inverted to solve for the variables directly. This balance ensures that, under the full-rank condition, the system is well-posed with no redundancy or deficiency in information.

Underdetermined systems, by contrast, feature fewer equations than unknowns, so m < n. These systems are underconstrained, often leading to infinitely many solutions if consistent, as the equations do not fully specify a unique point in the solution space; instead, solutions form an affine subspace of dimension n - r, where r is the rank of the coefficient matrix. Consistency is more likely in underdetermined cases because there are fewer constraints relative to the degrees of freedom, allowing flexibility in satisfying all equations simultaneously.

Overdetermined systems, with m > n, present the opposite scenario: more equations than unknowns, making them typically overconstrained. Such systems may have no exact solution if the equations are inconsistent, though a solution exists if they are consistent, and it is unique if the coefficient matrix has full column rank (rank equal to the number of unknowns). The excess equations introduce potential conflicts, reducing the likelihood of exact consistency compared to square or underdetermined systems. This overconstraint often necessitates approximate methods, such as least squares, to find a best-fit solution.

To illustrate these distinctions, the following table compares key properties across the three system types, assuming real-valued linear equations:

| System Type | Equations vs. Unknowns | Degrees of Freedom (if consistent) | Likelihood of Consistency | Solution Uniqueness |
|---|---|---|---|---|
| Square | m = n | 0 (unique point) | High (if full rank) | Unique (if full rank) |
| Underdetermined | m < n | n - m > 0 (affine subspace) | High | Infinitely many |
| Overdetermined | m > n | 0 (full column rank) or n - r (rank-deficient) | Low | Unique (if consistent and full column rank); infinitely many (if consistent and rank-deficient); none otherwise |
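The classification above can be checked numerically. The following minimal sketch, which assumes NumPy and uses small illustrative matrices (the helper name classify is not standard), compares rank(A) with rank([A | b]) to report shape, consistency, and uniqueness for each case.

```python
import numpy as np

def classify(A, b):
    """Classify a linear system Ax = b by shape, consistency, and uniqueness."""
    m, n = A.shape
    rank_A = np.linalg.matrix_rank(A)
    rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
    shape = "square" if m == n else ("underdetermined" if m < n else "overdetermined")
    consistent = rank_A == rank_Ab            # Rouche-Capelli: b lies in col(A)
    unique = consistent and rank_A == n       # full column rank -> at most one solution
    return shape, consistent, unique

# Square and invertible: unique solution.
print(classify(np.array([[1., 2.], [3., 4.]]), np.array([1., 1.])))
# Underdetermined (1 equation, 2 unknowns): consistent, infinitely many solutions.
print(classify(np.array([[1., 2.]]), np.array([3.])))
# Overdetermined (3 equations, 2 unknowns) and inconsistent.
print(classify(np.array([[1., 0.], [0., 1.], [1., 1.]]), np.array([1., 1., 3.])))
```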
Linear Overdetermined Systems
Geometric Interpretation
In the context of linear overdetermined systems, each linear equation can be geometrically interpreted as defining a hyperplane in the n-dimensional real vector space \mathbb{R}^n. A hyperplane is an (n-1)-dimensional affine subspace, representing all points x that satisfy the equation a_i \cdot x = b_i, where a_i is the normal vector and b_i is a scalar offset.[14][15] The solution to the system is the intersection of all m such hyperplanes, but when m > n, the excess constraints make a common intersection point unlikely unless the equations are specially arranged.[14][16]

To illustrate in two dimensions (n = 2, m = 3), consider three lines in the plane, each corresponding to one equation. Generally, these lines will not concur at a single point; instead, they may form a triangle or fail to intersect pairwise, resulting in no solution.[14] For instance, if two lines are parallel (inconsistent with the third), the system has no solution, as the hyperplanes do not overlap in the solution space.[14] Conversely, if the lines are concurrent (meaning the third is dependent on the first two), a unique intersection point exists, though this is a rare configuration in overdetermined cases.[14] This low-dimensional analogy extends to higher dimensions, where m > n hyperplanes in \mathbb{R}^n rarely share a common point due to the increased constraints.[16]

When an exact intersection is absent, the geometric intuition for approximation involves finding a point in \mathbb{R}^n that minimizes the collective distance to all hyperplanes, often in the least squares sense. When each equation is scaled so that its normal vector a_i has unit length, the residual of the i-th equation equals the perpendicular distance to the i-th hyperplane, and the least squares point minimizes the sum of these squared distances, providing the best "compromise" solution within the space.[16][17] Equivalently, the right-hand side is projected orthogonally onto the set of values the left-hand sides can attain, so the remaining error is orthogonal to that set.[17][18]
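As a concrete instance of this compromise point, the following minimal NumPy sketch (with illustrative coefficients) computes the least squares point for three non-concurrent lines in the plane; as noted above, the residuals coincide with perpendicular distances only when each row of A has unit length.

```python
import numpy as np

# Three lines a_i . x = b_i in R^2 with no common intersection point.
A = np.array([[1.0,  1.0],   # x + y = 2
              [1.0, -1.0],   # x - y = 0
              [2.0,  1.0]])  # 2x + y = 4
b = np.array([2.0, 0.0, 4.0])

# Least squares "compromise" point: minimizes the sum of squared residuals.
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
print("compromise point:", x_hat)
print("residual of each line:", A @ x_hat - b)
```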
Matrix Formulation
A linear overdetermined system is commonly formulated in matrix notation as Ax = b, where A is an m \times n matrix with m > n (more equations than unknowns), the unknown vector x belongs to \mathbb{R}^n, and the right-hand side vector b belongs to \mathbb{R}^m. This setup arises in applications such as data fitting and parameter estimation, where the additional equations provide overconstraint but typically preclude an exact solution unless specific conditions hold.[4] The matrix A is said to have full column rank if \operatorname{rank}(A) = n, a condition that ensures the columns of A are linearly independent and guarantees the existence of a unique approximate solution in the least squares sense. Without full column rank, the solution space may become non-unique, complicating the analysis.[19]

An exact solution to the system exists if and only if b lies in the column space of A, denoted b \in \operatorname{col}(A), meaning b can be expressed as a linear combination of the columns of A. In such cases, the system is consistent, and the residual vector vanishes.[19] The residual vector is defined as r = b - Ax, which quantifies the degree of inconsistency between the observed b and the predicted Ax; its norm \|r\| is minimized in approximate solutions to capture the best fit within the constraints imposed by the overdetermined nature of the system.[4]
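A brief NumPy sketch of this formulation, with an illustrative A and b: it checks full column rank, computes the x minimizing \|r\|_2 with a library least squares routine, and forms the residual vector r = b - Ax.

```python
import numpy as np

A = np.array([[1., 0.],
              [1., 1.],
              [1., 2.]])   # m = 3 equations, n = 2 unknowns
b = np.array([6., 0., 0.])

print("full column rank:", np.linalg.matrix_rank(A) == A.shape[1])   # True

x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)   # minimizes ||b - Ax||_2
r = b - A @ x_hat                               # residual vector
print("x_hat:", x_hat, " ||r||:", np.linalg.norm(r))
print("b in col(A)?", bool(np.isclose(np.linalg.norm(r), 0.0)))      # False here
```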
Homogeneous Case
A homogeneous linear overdetermined system is given by the equation Ax = 0, where A is an m \times n matrix with m > n and x \in \mathbb{R}^n.[20] This setup arises when there are more equations than unknowns, all with zero right-hand side. The system always admits the trivial solution x = 0.[21] Nontrivial solutions exist if and only if the rank of A is less than n, meaning the columns of A do not span a full-dimensional space. In such cases, the solution set forms the kernel (or null space) of A, which is a vector subspace of \mathbb{R}^n with dimension n - \operatorname{rank}(A).[22] The existence of nontrivial solutions is equivalent to the linear dependence of the columns of A.[23]

Geometrically, solving Ax = 0 corresponds to finding the intersection of m hyperplanes in \mathbb{R}^n, each passing through the origin.[24] This intersection is precisely the kernel of A, representing the common directions satisfying all equations simultaneously. For instance, consider the 3 \times 2 matrix A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{pmatrix}, which has rank 1. The kernel is the one-dimensional subspace spanned by \begin{pmatrix} -1 \\ 1 \end{pmatrix}, as any scalar multiple of this vector satisfies Ax = 0. This illustrates how a reduced rank leads to a nontrivial solution space despite the overdetermined nature of the system.
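The kernel in this example can be computed numerically from the singular value decomposition; in the minimal sketch below (assuming NumPy), the rows of V^T beyond the numerical rank span the null space.

```python
import numpy as np

A = np.ones((3, 2))                    # the rank-1 matrix from the example above
U, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-12))
kernel = Vt[rank:].T                   # rows of V^T beyond the rank span ker(A)
print("rank:", rank)                   # 1
print("kernel basis:\n", kernel)       # proportional to (-1, 1), up to sign/scale
print("A @ kernel:\n", A @ kernel)     # ~ 0
```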
Inhomogeneous Case
In the inhomogeneous case, an overdetermined linear system is given by Ax = b, where A is an m \times n matrix with m > n, x \in \mathbb{R}^n, and b \in \mathbb{R}^m with b \neq 0. Such systems typically lack an exact solution because b does not belong to the column space of A.[25] This contrasts with the homogeneous case (b = 0), which always admits the trivial solution x = 0, whereas the inhomogeneous system offers no such guarantee, resulting in nonzero residuals b - Ax for all x.[1] If no exact solution exists, the vector in the column space of A closest to b (in the Euclidean norm) is the orthogonal projection of b onto that subspace.[25]

For example, consider the inconsistent system with three equations and two unknowns: \begin{cases} 5x + 2y = 18 \\ 3x - 4y = 16 \\ 2x + 3y = 9 \end{cases} The subsystem formed by the first two equations has coefficient matrix \begin{pmatrix} 5 & 2 \\ 3 & -4 \end{pmatrix} with determinant 5(-4) - 2(3) = -26 \neq 0, yielding the unique solution x = 4, y = -1. Substituting into the third equation gives 2(4) + 3(-1) = 5 \neq 9, demonstrating inconsistency.[26] Consistency can be tested by solving square subsystems of n equations, ensuring the coefficient matrix has full rank (nonzero determinant), obtaining a candidate solution, and verifying whether it satisfies the remaining equations; agreement across multiple such subsystems indicates an exact solution exists.[1]
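The worked example above can be verified numerically; the following minimal NumPy sketch solves the square subsystem, substitutes the result into the third equation, and applies the equivalent rank test for inconsistency.

```python
import numpy as np

A = np.array([[5.,  2.],
              [3., -4.],
              [2.,  3.]])
b = np.array([18., 16., 9.])

# Solve the square subsystem formed by the first two equations.
x_sub = np.linalg.solve(A[:2], b[:2])
print("subsystem solution:", x_sub)            # [ 4. -1.]
print("third equation gives:", A[2] @ x_sub)   # 5.0, but the right-hand side is 9

# Equivalent rank test: rank([A | b]) > rank(A) signals inconsistency.
print(np.linalg.matrix_rank(np.column_stack([A, b])) > np.linalg.matrix_rank(A))  # True
```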
Exact and Approximate Solutions for Linear Systems
Conditions for Exact Solutions
In an overdetermined linear system Ax = b, where A is an m \times n matrix with m > n and x \in \mathbb{R}^n, b \in \mathbb{R}^m, an exact solution exists if and only if the vector b lies in the column space of A.[3] This condition ensures that b can be expressed as a linear combination of the columns of A, allowing for an x that satisfies the equation precisely.[27] Equivalently, the system has an exact solution when the rank of the augmented matrix [A \mid b] equals the rank of A, denoted \operatorname{rank}([A \mid b]) = \operatorname{rank}(A).[28] In this scenario, the redundancy introduced by the extra equations does not lead to inconsistency, as the additional constraints are satisfied by the same x that solves a minimal consistent subsystem. The equations are consistent precisely when any linear dependencies among the rows of A extend compatibly to the components of b.[29] A standard algorithmic method to verify this consistency is Gaussian elimination on the augmented matrix [A \mid b], which transforms it into row echelon form; the system admits an exact solution if no row emerges with all zeros in the coefficient part but a nonzero entry in the augmented column.[30] In practical settings, such as those arising from experimental data in the inhomogeneous case, exact solutions are uncommon because measurement errors or noise perturb b away from the column space of A.[31]
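A small NumPy illustration of the rank criterion, using an illustrative consistent overdetermined system in which the third equation is the sum of the first two:

```python
import numpy as np

A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
b = np.array([1., 2., 3.])   # third equation is the sum of the first two

rank_A = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
print("consistent:", rank_A == rank_Ab)        # True: b lies in col(A)

x, *_ = np.linalg.lstsq(A, b, rcond=None)
print("solution:", x, "residual norm:", np.linalg.norm(A @ x - b))   # x = (1, 2), ~0
```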
Least Squares Method
The least squares method addresses inconsistent overdetermined linear systems Ax = b, where A is an m \times n matrix with m > n, by seeking the vector x \in \mathbb{R}^n that minimizes the squared Euclidean norm of the residual, \|Ax - b\|_2^2 = \sum_{i=1}^m (a_i^T x - b_i)^2, where a_i^T and b_i denote the i-th row of A and the i-th entry of b, respectively. This objective function, known as the sum of squared residuals, quantifies the overall deviation between the predicted values Ax and the observed data b. The method originates from astronomical applications, with Adrien-Marie Legendre first publishing it in 1805 for orbit determination, framing it as a minimization of squared errors in observational data.[32] Carl Friedrich Gauss later provided a rigorous justification in 1809, linking the approach to probabilistic error assumptions under the Gaussian error model.[33]

To derive the solution, consider the objective S(x) = \|Ax - b\|_2^2 = (Ax - b)^T (Ax - b). Differentiating with respect to x and setting the gradient to zero yields 2A^T (Ax - b) = 0, which simplifies to the normal equations: A^T A x = A^T b. Geometrically, this corresponds to the orthogonal projection of b onto the column space of A, where the residual Ax - b is perpendicular to every column of A, ensuring the minimal distance in the least squares sense. If A has full column rank (rank n), then A^T A is positive definite and invertible, yielding a unique solution x = (A^T A)^{-1} A^T b. In cases of heteroscedastic errors, where observations have unequal variances, weighted least squares extends the method by minimizing \|W^{1/2} (Ax - b)\|_2^2 with a positive definite diagonal weight matrix W (typically inverse variances), leading to the weighted normal equations A^T W A x = A^T W b.
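The normal equations and their weighted variant can be written out directly, as in the following minimal NumPy sketch with synthetic noisy data; forming A^T A explicitly is shown only for illustration, since (as discussed in the next subsection) it can degrade accuracy for ill-conditioned A.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 3))                   # m = 20 equations, n = 3 unknowns
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true + 0.05 * rng.normal(size=20)    # noisy right-hand side

# Normal equations A^T A x = A^T b (explicit formation for illustration only).
x_ne = np.linalg.solve(A.T @ A, A.T @ b)

# Library least squares solver for comparison.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x_ne, x_ls)                              # nearly identical here

# Weighted least squares: scale each row by sqrt(w_i), with w_i = 1/sigma_i^2.
w = np.full(20, 4.0)                           # illustrative equal weights
sw = np.sqrt(w)
x_wls, *_ = np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)
print(x_wls)                                   # equal weights reproduce x_ls
```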
Numerical Solution Techniques
Numerical solution techniques for linear overdetermined systems primarily revolve around computing the least squares solution while ensuring numerical stability, especially for ill-conditioned matrices. The QR factorization method decomposes the matrix A \in \mathbb{R}^{m \times n} (with m > n) into an orthogonal matrix Q and an upper triangular matrix R, such that A = QR, allowing the least squares problem to be reduced to solving the triangular system R x = Q^T b. This approach is numerically stable because the orthogonality of Q preserves the condition number of the original system, avoiding amplification of rounding errors.[34][35]

The singular value decomposition (SVD) provides a more general framework, factoring A = U \Sigma V^T, where U and V are orthogonal, and \Sigma is diagonal with singular values. The least squares solution is then given by the pseudoinverse: x = V \Sigma^+ U^T b, where \Sigma^+ inverts the non-zero singular values. SVD excels in handling rank-deficient or ill-conditioned systems by naturally identifying and truncating small singular values, thus providing robust solutions even when the matrix is not full rank.[36][37] In comparison, QR factorization is preferred for full-rank, well-conditioned problems due to its lower computational cost (approximately 2mn^2 flops versus SVD's 6mn^2 to 12mn^2 flops) and simplicity in implementation. However, SVD is essential for general cases, including rank deficiency or severe ill-conditioning, as it offers better insight into the system's structure through singular values.[38][35]

For large-scale sparse systems, direct methods like QR or SVD become impractical due to high memory and time requirements; instead, iterative methods such as the conjugate gradient (CG) algorithm are employed. CG solves the normal equations indirectly by minimizing the quadratic form associated with the least squares objective, converging rapidly for systems with favorable eigenvalue distributions and requiring only matrix-vector multiplications per iteration. Variants like LSQR extend CG to non-symmetric cases, making them suitable for sparse overdetermined least squares.[39][40]

A key computational consideration is avoiding the direct formation of the normal equations A^T A x = A^T b, as this squares the condition number of A (from \kappa(A) to approximately \kappa(A)^2), exacerbating ill-conditioning and leading to significant loss of accuracy in finite precision arithmetic. Orthogonal decompositions like QR and SVD circumvent this issue by working directly with A, maintaining stability even for matrices with \kappa(A) up to around 10^{12} in double precision.[38][41]
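The QR, SVD, and iterative routes can be compared on a small dense example; the minimal sketch below assumes NumPy and SciPy (scipy.sparse.linalg.lsqr) and uses random, well-conditioned data, so all three solutions should agree.

```python
import numpy as np
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(1)
A = rng.normal(size=(100, 5))
b = rng.normal(size=100)

# QR: reduce to a 5x5 triangular solve (stable for full-rank, well-conditioned A).
Q, R = np.linalg.qr(A)                 # reduced QR: Q is 100x5, R is 5x5
x_qr = np.linalg.solve(R, Q.T @ b)

# SVD / pseudoinverse: truncates tiny singular values, so it also handles rank deficiency.
x_svd = np.linalg.pinv(A) @ b

# Iterative solver (LSQR): needs only matrix-vector products; suited to large sparse A.
x_lsqr = lsqr(A, b)[0]

print(np.allclose(x_qr, x_svd), np.allclose(x_qr, x_lsqr, atol=1e-6))
```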
Nonlinear Overdetermined Systems
Definition and Formulation
In mathematics, a nonlinear overdetermined system consists of m nonlinear equations in n variables, where m > n. These equations are typically expressed as f_i(\mathbf{x}) = 0 for i = 1, \dots, m, with \mathbf{x} \in \mathbb{R}^n. Such systems arise when attempting to satisfy more constraints than degrees of freedom allow, often in modeling or optimization contexts.[9] A compact formulation is F(\mathbf{x}) = \mathbf{0}, where F: \mathbb{R}^n \to \mathbb{R}^m is a nonlinear mapping, and the system is overdetermined precisely when m > n. Unlike linear overdetermined systems, which can be represented via matrices, the nonlinear nature precludes a global matrix form but permits local approximations. The Jacobian matrix DF(\mathbf{x}), an m \times n matrix of first partial derivatives, enables linearization of F near a point \mathbf{x}_0 via the approximation F(\mathbf{x}_0 + \mathbf{h}) \approx F(\mathbf{x}_0) + DF(\mathbf{x}_0) \mathbf{h} = \mathbf{0}, reducing the problem locally to a linear overdetermined system.[9][42]

A representative example is circle fitting to m > 3 data points (\xi_i, \eta_i) in the plane, yielding the nonlinear system (\xi_i - a)^2 + (\eta_i - b)^2 - r^2 = 0, \quad i = 1, \dots, m, where (a, b, r) are the circle's center coordinates and radius; this forms an overdetermined set of quadratic equations in three unknowns.[43] Due to overdetermination and nonlinearity, these systems generally lack an exact root, meaning no \mathbf{x} satisfies F(\mathbf{x}) = \mathbf{0} precisely, necessitating approximate solutions that minimize residuals.[9]
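For the circle-fitting example, the residual map F and its Jacobian can be coded directly. The minimal sketch below (NumPy, with synthetic noisy points; the helper names F and jacobian are illustrative) evaluates the residuals at the true parameters, which are small but generally nonzero, reflecting the absence of an exact root.

```python
import numpy as np

def F(p, xi, eta):
    """Residuals (xi_i - a)^2 + (eta_i - b)^2 - r^2 for parameters p = (a, b, r)."""
    a, b, r = p
    return (xi - a) ** 2 + (eta - b) ** 2 - r ** 2

def jacobian(p, xi, eta):
    """m x 3 Jacobian DF(p): rows are gradients of each residual in (a, b, r)."""
    a, b, r = p
    return np.column_stack([-2 * (xi - a), -2 * (eta - b), np.full_like(xi, -2 * r)])

# Five noisy points near the circle of radius 2 centred at (1, 1): m = 5 > n = 3.
rng = np.random.default_rng(2)
t = np.linspace(0, 2 * np.pi, 5, endpoint=False)
xi = 1 + 2 * np.cos(t) + 0.05 * rng.normal(size=5)
eta = 1 + 2 * np.sin(t) + 0.05 * rng.normal(size=5)
print(F(np.array([1.0, 1.0, 2.0]), xi, eta))              # small, nonzero residuals
print(jacobian(np.array([1.0, 1.0, 2.0]), xi, eta).shape)  # (5, 3)
```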
Challenges in Solving
Unlike linear overdetermined systems, which often admit closed-form solutions under certain rank conditions, nonlinear overdetermined systems generally lack such explicit solutions and must be addressed through iterative optimization of the objective function \|F(x)\|^2, where F: \mathbb{R}^n \to \mathbb{R}^m with m > n represents the residual vector. This objective can exhibit multiple local minima due to the inherent nonconvexity of nonlinear mappings, leading to solutions that depend critically on the problem's structure and potentially trapping algorithms in suboptimal points.[44]

A key challenge is the ill-posedness of these systems, where small perturbations in the input data or equations can cause drastic changes in the computed solution, amplifying noise and undermining stability, particularly in applications like parameter estimation with imperfect observations. This sensitivity arises because the solution does not depend continuously on the data, often requiring regularization techniques to mitigate instability, though these add further complexity.[45]

Iterative methods for solving nonlinear overdetermined systems are highly sensitive to the initial guess x_0, as convergence to different local minima can occur based on the starting point, with poor choices leading to failure or slow progress. For instance, in nonlinear regression problems involving exponential fitting, such as approximating data with models like y(t) = a e^{bt}, the objective landscape can feature multiple critical points and non-unique minima, sometimes forming continua of solutions along bifurcation-like curves where parameters satisfy specific relations, resulting in ambiguous approximations.[46][47] In contrast to linear systems, where the solution is globally unique if the matrix is full rank, local linearizations using the Jacobian matrix provide only approximate behavior near a point but fail to capture global nonlinear effects, such as bifurcations or distant minima, exacerbating the challenges in achieving reliable solutions.[44]
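Sensitivity to the starting point can be probed by running the same exponential-fitting problem from different initial guesses, as in the following sketch using scipy.optimize.least_squares on synthetic data (the helper name resid and the starting points are illustrative); whether the runs end in the same basin depends on the data and on the starts chosen.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(3)
t = np.linspace(0, 4, 12)
y = 2.0 * np.exp(-1.3 * t) + 0.02 * rng.normal(size=t.size)   # synthetic decay data

def resid(p):
    a, b = p
    return a * np.exp(b * t) - y

# The same problem started from two different points; the fitted parameters and
# final cost can differ if a poor start leads the iteration into another basin.
for x0 in (np.array([1.0, -1.0]), np.array([0.1, 2.0])):
    sol = least_squares(resid, x0)
    print("start", x0, "->", sol.x, "cost", sol.cost)
```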
Solution Strategies
Nonlinear overdetermined systems, formulated as finding x \in \mathbb{R}^n that approximately solves F(x) = 0 where F: \mathbb{R}^n \to \mathbb{R}^m with m > n, are typically addressed by minimizing the nonlinear least squares objective \| F(x) \|^2_2.[9] This approach seeks to reduce the residual norm through iterative optimization, as exact solutions rarely exist.[48]

The Gauss-Newton method is a foundational iterative technique for this minimization, relying on local linearization of F at the current iterate x_k. It approximates the Hessian of the objective via J_k^T J_k, where J_k is the Jacobian of F at x_k, and solves the linear least squares system J_k^T J_k \Delta x = -J_k^T F(x_k) for the step \Delta x, updating x_{k+1} = x_k + \Delta x.[9] The method converges quadratically near a local minimum when the residual there vanishes, and rapidly when it is small, provided the Jacobian is well-conditioned; it can fail for ill-conditioned problems or poor initial guesses.[49] To enhance robustness, the Levenberg-Marquardt algorithm introduces damping to the Gauss-Newton step, modifying the normal equations to (J_k^T J_k + \lambda_k I) \Delta x = -J_k^T F(x_k), where \lambda_k > 0 is adjusted dynamically to balance trust-region-like behavior with gradient descent.[46] This damping improves global convergence properties, particularly for problems with singular Jacobians or when starting far from a solution, at the cost of potentially slower local convergence compared to pure Gauss-Newton.[46]

For multimodal objective landscapes where local methods may converge to suboptimal minima, global optimization techniques such as genetic algorithms or simulated annealing provide alternatives. Genetic algorithms evolve a population of candidate solutions through selection, crossover, and mutation, effectively searching the parameter space for low-residual configurations.[50] Simulated annealing, inspired by metallurgical processes, allows probabilistic acceptance of worse solutions to escape local minima, gradually cooling to converge toward a global optimum. These stochastic methods are computationally intensive but valuable for non-convex problems lacking reliable initial points. Local linearization underpins the iterative approaches, with successive Newton-like steps on linearized overdetermined systems refining the approximation progressively.[9] Software implementations, such as MATLAB's lsqnonlin function, automate these strategies, supporting user-defined Jacobians and handling large-scale problems via trust-region or Levenberg-Marquardt variants.[51]
Convergence is typically assessed by tolerances on the residual norm \| F(x) \|_2 < \epsilon or the gradient norm \| \nabla \| F(x) \|^2 \|_2 < \delta, ensuring the algorithm halts when improvements are negligible.[42] These criteria balance computational efficiency with solution accuracy in practice.[52]
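A minimal damped Gauss-Newton iteration of the kind described above, with gradient and step tolerances as stopping criteria, might look as follows. The function name gauss_newton_lm, the damping update factors, and the circle data are illustrative; production code would normally rely on a library routine such as scipy.optimize.least_squares or MATLAB's lsqnonlin.

```python
import numpy as np

def gauss_newton_lm(F, J, x0, lam=1e-3, tol=1e-10, max_iter=100):
    """Damped Gauss-Newton (Levenberg-Marquardt-style) iteration for min ||F(x)||^2.

    A bare-bones sketch: the damping lam is reduced after an accepted step and
    increased after a rejected one; iteration stops when the gradient J^T F or
    the proposed step falls below tol.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r, Jx = F(x), J(x)
        g = Jx.T @ r                                   # gradient of (1/2)||F(x)||^2
        if np.linalg.norm(g) < tol:
            break
        step = np.linalg.solve(Jx.T @ Jx + lam * np.eye(x.size), -g)
        if np.linalg.norm(F(x + step)) < np.linalg.norm(r):
            x, lam = x + step, lam * 0.5               # accept step, reduce damping
        else:
            lam *= 10.0                                # reject step, increase damping
        if np.linalg.norm(step) < tol:
            break
    return x

# Circle fit (cf. the formulation above), parameters p = (a, b, r).
xi = np.array([2.9, 1.0, -0.9, 1.1, 3.0])
eta = np.array([1.1, 3.0, 0.9, -1.0, 0.8])
F = lambda p: (xi - p[0]) ** 2 + (eta - p[1]) ** 2 - p[2] ** 2
J = lambda p: np.column_stack([-2 * (xi - p[0]), -2 * (eta - p[1]),
                               np.full_like(xi, -2.0 * p[2])])
print(gauss_newton_lm(F, J, x0=[0.0, 0.0, 1.0]))       # roughly (1, 1, 2)
```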