
Projection matrix

In linear algebra, a projection matrix is a square matrix P that represents a linear transformation projecting vectors from a vector space onto a subspace, satisfying the idempotence property P^2 = P. Such matrices arise in the context of decomposing a vector space V as a direct sum V = X \oplus Y, where P maps any vector v = x + y (with x \in X and y \in Y) to its component x \in X, ensuring the image of P is X and the kernel is Y. For orthogonal projections, which are the most common type, P is symmetric and projects onto a subspace W (e.g., the column space of a matrix A) such that the error vector is perpendicular to W; the explicit formula is P = A (A^T A)^{-1} A^T, assuming A has full column rank and its columns form a basis for W. Projection matrices are fundamental in applications like regression, where Pb is the closest approximation to a vector b within the subspace spanned by A's columns, minimizing the residual norm \| b - w \| over all vectors w in that subspace. They also appear in coordinate projections, in computer graphics and signal processing, and in numerical methods for solving overdetermined systems. Key properties include eigenvalues restricted to 0 and 1, the fact that I - P is also a projection (onto the orthogonal complement in the orthogonal case), and invariance under certain linear operators if the subspaces are invariant.

Fundamentals

Definition

In linear algebra, a projection matrix P is defined as a square matrix that satisfies the idempotence condition P^2 = P. This property characterizes projections onto a subspace, where applying the operator twice yields the same result as applying it once. For orthogonal projections, which are the most common in applications involving Euclidean spaces, the matrix is additionally symmetric, satisfying P^T = P. This symmetry ensures that the projection is perpendicular to the complementary subspace. Such matrices project vectors from \mathbb{R}^n onto a lower-dimensional subspace while preserving angles and lengths within that subspace.

In statistical linear models, the projection matrix takes the specific form of the hat matrix H = X(X^T X)^{-1} X^T, where X is the n \times p design matrix with full column rank. This matrix projects the observed response vector y onto the column space of X, producing the vector of fitted values \hat{y} = H y. The hat matrix inherits the idempotence and symmetry properties, making it an orthogonal projection operator in this context. The diagonal elements h_{ii} of the hat matrix, known as leverages, quantify the influence of the i-th response value y_i on the corresponding fitted value \hat{y}_i. These leverages satisfy 0 \leq h_{ii} \leq 1 for each i, with their sum equal to the number of parameters p in the model. This formulation assumes familiarity with basic concepts of vectors, matrices, and linear subspaces.
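
As a quick numerical illustration of these properties, the following sketch (assuming NumPy; the small design matrix and its values are arbitrary, chosen only for demonstration) checks idempotence, symmetry, the leverage bounds, and the fact that the leverages sum to p:

```python
import numpy as np

# Illustrative design matrix X (n = 5 observations, p = 2 parameters: intercept and one predictor).
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])

# Hat matrix H = X (X^T X)^{-1} X^T.
H = X @ np.linalg.inv(X.T @ X) @ X.T

# Idempotence and symmetry of the orthogonal projection.
print(np.allclose(H @ H, H))   # True: H^2 = H
print(np.allclose(H.T, H))     # True: H^T = H

# Leverages: diagonal entries lie in [0, 1] and sum to p.
leverages = np.diag(H)
print(np.all((leverages >= 0) & (leverages <= 1)))  # True
print(np.isclose(leverages.sum(), X.shape[1]))      # True: sum of leverages = p = 2
```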

Geometric Interpretation

In linear algebra, a projection matrix P provides an orthogonal mapping of a vector \mathbf{v} in a vector space onto a subspace spanned by the columns of a matrix A, such that the projected vector P\mathbf{v} lies within the subspace and the residual vector \mathbf{v} - P\mathbf{v} is orthogonal to every vector in that subspace. This orthogonality ensures that the projection minimizes the distance between \mathbf{v} and the subspace, representing the "closest point" approximation in the geometric sense. Consider an example in \mathbb{R}^2, where the subspace is the x-axis, spanned by the standard basis vector \mathbf{e}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. The corresponding orthogonal projection matrix is P = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, which maps any vector \mathbf{v} = \begin{pmatrix} x \\ y \end{pmatrix} to P\mathbf{v} = \begin{pmatrix} x \\ 0 \end{pmatrix}, effectively dropping the y-component while preserving the x-coordinate. Geometrically, this can be visualized as the vector \mathbf{v} being "dropped" perpendicularly from its tip to the x-axis line, with the residual \begin{pmatrix} 0 \\ y \end{pmatrix} meeting the axis at a right angle at the projected point; such diagrams often illustrate the decomposition \mathbf{v} = P\mathbf{v} + (\mathbf{v} - P\mathbf{v}), with the residual orthogonal to the subspace. This focus on orthogonality distinguishes orthogonal projections from oblique projections, where the residuals are not perpendicular to the subspace, leading to a slanted "drop" rather than a right-angled one; in both cases, however, applying P repeatedly to a vector in the subspace yields the same result, keeping it fixed within the subspace. In higher dimensions, such as projecting onto a plane in \mathbb{R}^3, the visualization extends to the residual being perpendicular to the entire plane, akin to a shadow cast by perpendicular light rays onto a flat surface.
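
The x-axis example above, and its extension to a plane in \mathbb{R}^3, can be checked directly; this is a minimal sketch assuming NumPy, with the plane-spanning matrix A chosen arbitrarily for illustration:

```python
import numpy as np

# Projection onto the x-axis in R^2: P = [[1, 0], [0, 0]].
P = np.array([[1.0, 0.0],
              [0.0, 0.0]])

v = np.array([3.0, 4.0])
proj = P @ v            # [3, 0]: the perpendicular "drop" onto the x-axis
residual = v - proj     # [0, 4]: points straight up from the axis

# The residual is orthogonal to every vector in the subspace (here, to e1).
print(proj, residual, np.dot(residual, np.array([1.0, 0.0])))  # [3. 0.] [0. 4.] 0.0

# Same idea one dimension up: orthogonal projection onto a plane in R^3
# spanned by the columns of A, with the residual perpendicular to that plane.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])            # illustrative basis for a plane in R^3
P_plane = A @ np.linalg.inv(A.T @ A) @ A.T
w = np.array([1.0, 2.0, 5.0])
r = w - P_plane @ w
print(np.allclose(A.T @ r, 0))        # True: residual is orthogonal to both spanning vectors
```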

Properties

Algebraic Properties

A projection matrix P in linear algebra is characterized by its idempotence, meaning P^2 = P. This property implies that P acts as a projection operator onto an invariant subspace (its range), where applying P multiple times yields the same result as applying it once, leaving vectors in the range of P unchanged while mapping all other vectors into that range. The eigenvalue structure reflects this: idempotence forces every eigenvalue \lambda of P to satisfy \lambda^2 = \lambda, so the eigenvalues are either 0 or 1, and for a diagonalizable P with eigenvalues 0 and 1, squaring leaves each eigenvalue unchanged, consistent with P^2 = P.

For orthogonal projections, P is also symmetric, satisfying P^T = P. This symmetry ensures that the projection is self-adjoint, preserving inner products in the sense that \langle Pv, w \rangle = \langle v, Pw \rangle for all vectors v, w. Consequently, P coincides with its own Moore-Penrose pseudoinverse in the context of orthogonal projections onto the column space, as P^+ = P. The rank of P equals the dimension of its range, which is the subspace onto which it projects, and this rank is also equal to the trace of P. Since the nonzero eigenvalues of P are all 1 and their count matches the rank, the trace, as the sum of eigenvalues, directly gives \operatorname{trace}(P) = \operatorname{rank}(P).

Regarding null spaces, the kernel of I - P is precisely the range of P, while the kernel of P is the range of I - P. This decomposition highlights how P and I - P partition the space into the projected subspace and, for orthogonal projections, its orthogonal complement. A standard formula for the orthogonal projection matrix onto the column space of a full column rank matrix A \in \mathbb{R}^{m \times n} (with m \geq n) is P = A A^+, where A^+ is the Moore-Penrose pseudoinverse of A. For full column rank, A^+ = (A^T A)^{-1} A^T, so P = A (A^T A)^{-1} A^T. To derive this, note that P must satisfy P^2 = P and project onto \operatorname{range}(A). Substituting yields P^2 = A (A^T A)^{-1} A^T A (A^T A)^{-1} A^T = A (A^T A)^{-1} A^T = P, confirming idempotence, and the columns of A lie in the range of P since PA = A. Symmetry follows from the transpose: P^T = A (A^T A)^{-1} A^T = P. This form generalizes via the pseudoinverse for rank-deficient cases, where A A^+ remains the orthogonal projection onto \operatorname{range}(A).
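
A short sketch (assuming NumPy and an arbitrary random full-column-rank matrix A) illustrating the pseudoinverse construction P = A A^+, the equality of trace and rank, and the complementary roles of P and I - P:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))       # illustrative full-column-rank matrix

# Orthogonal projection onto range(A) via the pseudoinverse: P = A A^+.
P = A @ np.linalg.pinv(A)

print(np.allclose(P @ P, P))                               # idempotence
print(np.allclose(P.T, P))                                 # symmetry
print(np.isclose(np.trace(P), np.linalg.matrix_rank(P)))   # trace = rank = 3

# P and I - P partition the space: P fixes the columns of A,
# and (I - P) annihilates anything already in range(A).
I = np.eye(6)
print(np.allclose(P @ A, A))
print(np.allclose((I - P) @ (P @ rng.standard_normal(6)), 0))

# Equivalent explicit formula for full column rank: P = A (A^T A)^{-1} A^T.
P_explicit = A @ np.linalg.inv(A.T @ A) @ A.T
print(np.allclose(P, P_explicit))
```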

Trace and Rank Interpretations

In the context of linear regression, the trace of a projection matrix P equals its rank, which corresponds to the number of columns in the design matrix X, or equivalently, the number of parameters in the model, assuming X has full column rank. This holds because P is idempotent and symmetric, projecting onto the column space of X. As established in the algebraic properties of projection matrices, \operatorname{trace}(P) = \operatorname{rank}(P). The trace of P provides a measure of model complexity in regression, representing the degrees of freedom associated with the fitted values. Specifically, \operatorname{trace}(P) = p, where p is the number of parameters, while the trace of the residual projection matrix I - P equals n - p, indicating the degrees of freedom for the residuals, with n denoting the number of observations. This interpretation links the projection's dimensionality directly to statistical inference, such as in estimating variance or testing hypotheses. The diagonal elements of P, known as leverage values, quantify the influence of each observation on the fitted values; their sum equals \operatorname{trace}(P) = p, yielding an average leverage of p/n. This sum underscores the overall "pull" of the model on the data, with high-leverage points potentially affecting fit stability. For instance, in simple linear regression with an intercept and slope, the projection matrix has trace 2, reflecting the two parameters and the rank of the two-column design matrix.
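
For the simple-regression example just mentioned, the following sketch (assuming NumPy; the predictor values are illustrative) confirms that the trace, the residual degrees of freedom, and the leverage sum behave as described:

```python
import numpy as np

# Simple linear regression with intercept and slope: design matrix has p = 2 columns.
n = 8
x = np.arange(n, dtype=float)               # illustrative predictor values
X = np.column_stack([np.ones(n), x])

P = X @ np.linalg.inv(X.T @ X) @ X.T        # hat matrix
I = np.eye(n)

print(np.isclose(np.trace(P), 2))           # trace(P) = p = 2 (model degrees of freedom)
print(np.isclose(np.trace(I - P), n - 2))   # trace(I - P) = n - p (residual degrees of freedom)

leverages = np.diag(P)
print(np.isclose(leverages.sum(), 2))       # sum of leverages = p
print(leverages.mean(), 2 / n)              # average leverage = p / n
```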

Applications in Statistics

Ordinary Least Squares

In the ordinary least squares (OLS) framework, the linear regression model is expressed in matrix form as \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}, where \mathbf{y} is an n \times 1 vector of observations, \mathbf{X} is an n \times p design matrix with full column rank, \boldsymbol{\beta} is a p \times 1 vector of unknown parameters, and \boldsymbol{\epsilon} is an n \times 1 vector of errors. The OLS estimator minimizes the residual sum of squares \|\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}\|^2 and is given by \hat{\boldsymbol{\beta}} = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{y}. This estimator is unbiased and has minimum variance among linear unbiased estimators under the model's assumptions. The fitted values are \hat{\mathbf{y}} = \mathbf{X} \hat{\boldsymbol{\beta}} = \mathbf{X} (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{y}, which can be written compactly as \hat{\mathbf{y}} = \mathbf{P} \mathbf{y}, where \mathbf{P} = \mathbf{X} (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top is the projection matrix, also known as the hat matrix. Geometrically, \mathbf{P} orthogonally projects the response \mathbf{y} onto the column space of \mathbf{X}, ensuring that \hat{\mathbf{y}} lies in this subspace and minimizes the Euclidean distance to \mathbf{y}.

The residuals are defined as \mathbf{e} = \mathbf{y} - \hat{\mathbf{y}} = (\mathbf{I} - \mathbf{P}) \mathbf{y}, where \mathbf{I} is the n \times n identity matrix. These residuals are orthogonal to the columns of \mathbf{X}, satisfying \mathbf{X}^\top \mathbf{e} = \mathbf{0}, which implies that the unexplained variation is perpendicular to the fitted subspace spanned by the predictors. This property decomposes \mathbf{y} into fitted and residual components that are orthogonal to each other. The projection matrix \mathbf{P} in OLS arises under the assumptions of a linear model, errors with zero conditional mean E[\boldsymbol{\epsilon} \mid \mathbf{X}] = \mathbf{0}, homoscedasticity \text{Var}(\boldsymbol{\epsilon} \mid \mathbf{X}) = \sigma^2 \mathbf{I}, and uncorrelated errors. These conditions ensure that \mathbf{P} represents an orthogonal projection onto \text{Col}(\mathbf{X}), with \mathbf{P} being symmetric and idempotent.

For a univariate example with an intercept, consider the simple linear model y_i = \beta_0 + \beta_1 x_i + \epsilon_i for i = 1, \dots, n, where the design matrix \mathbf{X} has a first column of ones and second column \mathbf{x} = (x_1, \dots, x_n)^\top. The projection matrix \mathbf{P} then projects \mathbf{y} onto the span of \{\mathbf{1}, \mathbf{x}\}, yielding fitted values that represent the best linear approximation in the least-squares sense; for instance, with centered predictors, the estimate \hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} aligns with this projection.
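
The sketch below (assuming NumPy, with simulated data whose coefficients are arbitrary illustrations) ties these pieces together: the hat matrix reproduces X\hat{\beta}, the residuals are orthogonal to the columns of X, and the centered-formula slope matches the projection-based estimate:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data for y_i = beta0 + beta1 * x_i + eps_i (coefficient values are illustrative).
n = 50
x = rng.uniform(0, 10, n)
y = 2.0 + 0.5 * x + rng.standard_normal(n)

X = np.column_stack([np.ones(n), x])            # design: column of ones and x
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)    # OLS estimator (X^T X)^{-1} X^T y

P = X @ np.linalg.inv(X.T @ X) @ X.T            # hat matrix
y_fit = P @ y                                   # fitted values P y
e = y - y_fit                                   # residuals (I - P) y

print(np.allclose(y_fit, X @ beta_hat))         # P y equals X beta_hat
print(np.allclose(X.T @ e, 0))                  # residuals orthogonal to the columns of X

# Slope from the centered formula matches the projection-based estimate.
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
print(np.isclose(slope, beta_hat[1]))
```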

Generalized Least Squares

In the context of linear models where the errors exhibit heteroscedasticity or autocorrelation, the generalized least squares (GLS) method employs a projection matrix adapted to the error structure. Consider the model y = X \beta + \epsilon, where \mathbb{E}(\epsilon) = 0 and \mathrm{Var}(\epsilon) = \Sigma, with \Sigma a positive definite covariance matrix. The GLS estimator of the parameter vector \beta is given by \hat{\beta} = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} y. The fitted values are then \hat{y} = H y, where the projection matrix H = X (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} projects the response vector onto the column space of X with respect to the inner product induced by \Sigma^{-1}. This formulation generalizes the ordinary least squares case, which arises as a special instance when \Sigma = \sigma^2 I.

A key special case of GLS is weighted least squares (WLS), which applies when the errors are uncorrelated but have unequal variances, so \Sigma is diagonal with entries \sigma_i^2. In this scenario, the weights are w_i = 1 / \sigma_i^2, and the projection matrix becomes H = X (X^T W X)^{-1} X^T W, where W = \Sigma^{-1} is the diagonal weight matrix. This weighting ensures that observations with smaller error variances contribute more to the estimation, yielding more efficient parameter estimates under the specified error structure.

The projection matrix H in GLS retains the idempotence property, satisfying H^2 = H, which confirms its role as a projection. However, unlike the orthogonal projection in ordinary least squares, H is generally not symmetric (H^T \neq H) unless \Sigma is a scalar multiple of the identity matrix, reflecting the oblique nature of the projection in the presence of correlated or heteroscedastic errors. For the residuals e = y - \hat{y} = (I - H) y, the covariance matrix is \mathrm{Var}(e) = (I - H) \Sigma (I - H)^T, demonstrating how the projection accounts for the underlying error structure to produce unbiased residuals with adjusted variance.
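
A minimal WLS sketch (assuming NumPy; the design and the unequal variances are arbitrary illustrative values) shows that the GLS projection H is idempotent but, unlike the OLS hat matrix, not symmetric, and constructs the residual covariance (I - H)\Sigma(I - H)^T:

```python
import numpy as np

rng = np.random.default_rng(2)

n, p = 20, 3
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])

# WLS special case: diagonal Sigma with unequal variances (values illustrative).
sigma2 = rng.uniform(0.5, 3.0, n)
Sigma = np.diag(sigma2)
W = np.diag(1.0 / sigma2)                       # weight matrix W = Sigma^{-1}

# GLS/WLS projection matrix H = X (X^T W X)^{-1} X^T W.
H = X @ np.linalg.inv(X.T @ W @ X) @ X.T @ W

print(np.allclose(H @ H, H))                    # True: idempotent, still a projection
print(np.allclose(H.T, H))                      # False in general: oblique, not orthogonal

# Residual covariance Var(e) = (I - H) Sigma (I - H)^T for e = (I - H) y.
I = np.eye(n)
var_e = (I - H) @ Sigma @ (I - H).T
print(var_e.shape)                              # (20, 20)
```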

Advanced Formulations

Oblique Projections

Oblique projections extend the concept of projections beyond the orthogonal case by allowing the direction of projection to be non-perpendicular to the target subspace. A matrix P represents an oblique projection if it is idempotent, satisfying P^2 = P, but not symmetric, so P^T \neq P; consequently, the projected vector Px and the residual vector (I - P)x are not orthogonal under the standard inner product. This distinguishes oblique projections from orthogonal ones, where symmetry holds and orthogonality is preserved. A common formula for an oblique projection onto the column space of a matrix A is P = A (A^T B^{-1} A)^{-1} A^T B^{-1}, where B is positive definite (and hence invertible) and A^T B^{-1} A has full rank to ensure well-definedness; the direction of projection, i.e., the null space of P, is determined by B rather than by the Euclidean orthogonal complement of the column space of A. This expression captures projections in spaces equipped with a non-standard metric induced by B^{-1}, where the projection is orthogonal with respect to the geometry defined by B^{-1} but oblique in the standard metric.

Representative examples of oblique projections appear in coordinate transformations and non-Euclidean metrics, such as affine mappings or engineering applications requiring angled views. For instance, in a homogeneous-coordinate graphics setting, the matrix P = \begin{bmatrix} 1 & 0 & -\cot \theta & 0 \\ 0 & 1 & -\cot \phi & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} implements an oblique projection onto the z = 0 plane along directions specified by angles \theta and \phi, preserving certain lengths while shearing others. This matrix satisfies P^2 = P but lacks symmetry unless \theta = \phi = 90^\circ. Oblique projections also arise naturally when working in weighted spaces, such as those with a non-identity covariance structure \Sigma \neq I, where the projection becomes oblique in the Euclidean metric but orthogonal under the weighted inner product \langle x, y \rangle = x^T \Sigma^{-1} y. In this context, setting B = \Sigma in the formula above yields the GLS projection matrix described in the preceding section.
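
The sketch below (assuming NumPy, with A and the positive definite matrix B generated randomly for illustration) builds P = A (A^T B^{-1} A)^{-1} A^T B^{-1} and checks that it is idempotent, non-symmetric, and orthogonal only under the B^{-1} inner product:

```python
import numpy as np

rng = np.random.default_rng(3)

A = rng.standard_normal((5, 2))                 # columns span the target subspace
M = rng.standard_normal((5, 5))
B = M @ M.T + 5 * np.eye(5)                     # illustrative positive definite "metric" matrix

B_inv = np.linalg.inv(B)
P = A @ np.linalg.inv(A.T @ B_inv @ A) @ A.T @ B_inv

print(np.allclose(P @ P, P))                    # True: idempotent
print(np.allclose(P.T, P))                      # False: oblique in the Euclidean metric

# The residual is orthogonal to range(A) under the B^{-1} inner product <x, y> = x^T B^{-1} y,
# even though it is generally not orthogonal in the ordinary Euclidean sense.
x = rng.standard_normal(5)
r = x - P @ x
print(np.allclose(A.T @ B_inv @ r, 0))          # True: B^{-1}-orthogonality
print(np.allclose(A.T @ r, 0))                  # typically False
```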

Blockwise Formulas

Blockwise formulas for projection matrices arise in the context of partitioned design matrices, particularly when the column space is decomposed into nested or augmented subspaces. Consider a design matrix X partitioned as X = [A \, B], where the columns of A span a subspace \mathcal{A} and the columns of B augment it, so that the column space of X is \mathcal{X} = \mathcal{A} + \operatorname{span}(B). The orthogonal projection matrix onto \mathcal{X}, denoted P_X, can be expressed in terms of the projection P_A onto \mathcal{A} and the contribution from B orthogonal to \mathcal{A}: P_X = P_A + (I - P_A) B \left[ B^\top (I - P_A) B \right]^{-1} B^\top (I - P_A), assuming B^\top (I - P_A) B is invertible, which holds if the columns of B span a subspace with full rank relative to the orthogonal complement of \mathcal{A}. This decomposition highlights the incremental nature of the projection, where the first term projects onto the initial subspace, and the second term adds the projection onto the component of B orthogonal to \mathcal{A}.

The second term, (I - P_A) B \left[ B^\top (I - P_A) B \right]^{-1} B^\top (I - P_A), represents the incremental projection contributed by the added variables in B. This structure is particularly useful in sequential model building, as it allows updating the projection matrix without recomputing the full inverse of X^\top X. In the context of the hat matrix in regression, this incremental term captures how additional predictors modify the fitted values and associated diagnostics. For example, in regression, adding a single covariate corresponding to a column vector b (so B = b) updates the leverages, which are the diagonal elements of the hat matrix. The new leverage for observation i becomes h_{ii}^{(new)} = h_{ii}^{(old)} + \frac{[(I - P_A) b]_i^2}{[(I - P_A) b]^\top (I - P_A) b}, reflecting the increased influence of that observation if it has high leverage in the orthogonalized direction. This update formula facilitates efficient computation in stepwise model-selection procedures.

Regarding ranks, the decomposition preserves rank additivity in the projected subspaces: \operatorname{rank}(P_X) = \operatorname{rank}(P_A) + \operatorname{rank}\left( (I - P_A) B \left[ B^\top (I - P_A) B \right]^{-1} B^\top (I - P_A) \right), where the second rank term measures the dimension added by the orthogonal component of B. This follows from the orthogonal decomposition \mathcal{X} = \mathcal{A} \oplus (\mathcal{A}^\perp \cap \mathcal{X}), ensuring the total dimension is the sum of the dimensions of the nested and complementary subspaces. The symmetry and idempotence of P_X are preserved through this block structure, consistent with the general algebraic properties of projections.
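
A small sketch (assuming NumPy; the design matrix and added covariate are random illustrative data) verifying that the blockwise formula and the leverage update reproduce the directly computed projection for X = [A \, b]:

```python
import numpy as np

rng = np.random.default_rng(4)

n = 12
A = np.column_stack([np.ones(n), rng.standard_normal(n)])   # initial design (2 columns)
b = rng.standard_normal((n, 1))                             # one added covariate, B = b

def proj(M):
    """Orthogonal projection onto the column space of M."""
    return M @ np.linalg.inv(M.T @ M) @ M.T

P_A = proj(A)
I = np.eye(n)

# Incremental (blockwise) formula: P_X = P_A + projection onto the part of b orthogonal to A.
b_perp = (I - P_A) @ b
increment = b_perp @ np.linalg.inv(b_perp.T @ b_perp) @ b_perp.T
P_X_block = P_A + increment

# Direct computation on the full partitioned design X = [A  b].
P_X_direct = proj(np.hstack([A, b]))
print(np.allclose(P_X_block, P_X_direct))       # True

# Leverage update: new h_ii = old h_ii + [(I - P_A) b]_i^2 / ||(I - P_A) b||^2.
denom = float((b_perp ** 2).sum())
h_new = np.diag(P_A) + (b_perp.ravel() ** 2) / denom
print(np.allclose(h_new, np.diag(P_X_direct)))  # True
```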

Computation and Extensions

Numerical Aspects

Computing the projection matrix P = X (X^T X)^{-1} X^T directly via the normal equations is numerically unstable, particularly when the design matrix X exhibits multicollinearity or near-collinearity, as the condition number of X^T X is the square of that of X, amplifying errors. Instead, a stable approach involves the QR decomposition X = Q R, where Q has orthonormal columns, yielding P = Q Q^T; this avoids explicit inversion and preserves numerical accuracy even for ill-conditioned X. High multicollinearity, indicated by a large condition number (commonly flagged when \kappa(X) > 30), inflates the variance of estimated coefficients and can lead to extreme values on the diagonal of P, where leverages h_{ii} = x_i^T (X^T X)^{-1} x_i measure an observation's potential influence and may exceed typical bounds (e.g., h_{ii} > 2p/n for p predictors and n observations).

For rank-deficient cases where \operatorname{rank}(X) = r < \min(m,n), the singular value decomposition (SVD) X = U \Sigma V^T provides a robust alternative, with the projection onto the column space given by P = U_r U_r^T, where U_r comprises the first r left singular vectors corresponding to nonzero singular values; this handles numerical rank deficiency effectively. Orthogonal projections are typically computed via QR factorization algorithms, such as Householder reflections, which introduce zeros column-by-column through orthogonal transformations and ensure backward stability with O(m n^2) cost for an m \times n matrix, or the modified Gram-Schmidt process, which orthogonalizes columns sequentially while mitigating loss-of-orthogonality issues present in the classical version. In practice, blockwise updates of existing decompositions can enhance efficiency for large-scale computations, though numerical stability remains paramount.

Implementations in statistical software often compute projections implicitly for efficiency and stability; for instance, the R function lm() employs a QR decomposition internally to solve least-squares problems, with leverages accessible via hatvalues() on the fitted model object. Similarly, Python's numpy.linalg.lstsq relies on an SVD-based LAPACK routine, while scipy.linalg.lstsq also offers QR-based drivers, allowing the solver choice to match the conditioning of the problem when handling projections in linear models.
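
The following sketch (assuming NumPy, with a deliberately near-collinear design constructed for illustration) contrasts the QR- and SVD-based routes to the projection and computes leverages without explicitly forming P through the normal equations:

```python
import numpy as np

rng = np.random.default_rng(5)

# Nearly collinear design: the third column is almost a copy of the second.
n = 100
x = rng.standard_normal(n)
X = np.column_stack([np.ones(n), x, x + 1e-6 * rng.standard_normal(n)])
print(np.linalg.cond(X))                  # large condition number; cond(X^T X) is its square

# QR route: P = Q Q^T, avoiding explicit inversion of X^T X.
Q, _ = np.linalg.qr(X)
P_qr = Q @ Q.T

# SVD route (also usable when X is rank deficient): P = U_r U_r^T.
U, s, _ = np.linalg.svd(X, full_matrices=False)
tol = s[0] * max(X.shape) * np.finfo(float).eps
r = int(np.sum(s > tol))                  # numerical rank
P_svd = U[:, :r] @ U[:, :r].T

print(r, np.allclose(P_qr, P_svd))        # rank 3 here; the two stable routes agree

# Leverages without forming P explicitly: h_ii is the squared norm of row i of Q.
leverages = np.sum(Q ** 2, axis=1)
print(np.isclose(leverages.sum(), X.shape[1]))   # sum of leverages = number of columns
```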

Applications Beyond Regression

In principal component analysis (PCA), a projection matrix is formed by the leading eigenvectors of the data covariance matrix, projecting high-dimensional observations onto a lower-dimensional subspace that maximizes variance retention for tasks like dimensionality reduction and feature extraction. This matrix, typically denoted as V_k V_k^T where V_k contains the top k eigenvectors, ensures the projected data preserves essential structural information while discarding noise-dominated components. Such projections are idempotent, allowing repeated applications without altering the result, which aligns with the algebraic properties of projection matrices.

In computer graphics, matrices referred to as projection matrices transform 3D world coordinates into 2D screen space during rendering pipelines, with perspective projections incorporating depth scaling to simulate realistic foreshortening and orthographic projections maintaining uniform scaling for technical visualizations like blueprints. Note that these are transformation matrices analogous to projections but do not generally satisfy the idempotence property (P^2 = P) of linear algebra projection matrices. The graphics projection matrix, often defined with parameters for near and far clipping planes, maps vertices such that the homogeneous coordinate derived from z governs the division of x and y, enabling efficient GPU processing in rendering frameworks such as OpenGL and WebGL.

Projection matrices play a key role in state estimation applications, such as the Kalman filter, where they facilitate state estimation by projecting noisy measurements onto the predicted state subspace, particularly in extended Kalman filters to eliminate unknown inputs and refine observation equations. In adaptive beamforming, these matrices enhance robustness by projecting the presumed steering vector onto the signal-plus-interference subspace, countering mismatches from array imperfections and steering errors to improve directional signal enhancement in antenna arrays. This technique, as in projection-based adaptive beamformers, reduces interference leakage and maintains array gain under limited snapshot conditions.

In machine learning, kernel projections in support vector machines (SVMs) enable nonlinear classification by implicitly mapping data to higher-dimensional spaces through kernel functions, equivalent to a projection onto the feature space spanned by support vectors without computing the full mapping. For linear SVMs, the explicit weight vector defines the separating hyperplane's normal, optimizing margins in the input space. In deep learning, projection layers (linear matrix multiplications that adjust feature dimensions between modules) enable efficient dimensionality adaptation, as seen in attention mechanisms where query-key projections compute scaled dot-products. These layers support on-device models by combining learned projections with random projections to compress parameters while preserving accuracy.

A representative example is the use of the singular value decomposition (SVD) for low-rank approximation, where the projection matrix P = U_k U_k^T (with U_k the leading left singular vectors) projects onto the optimal rank-k column subspace, minimizing the Frobenius-norm error for data compression and denoising in large-scale matrices: A \approx P A = U_k U_k^T A. This yields the rank-k approximation U_k \Sigma_k V_k^T, widely applied in recommender systems and image compression for scalable computations.
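
As a sketch of the SVD-based low-rank projection (assuming NumPy; the synthetic data and the chosen rank k are illustrative), the following verifies that P = U_k U_k^T reproduces the rank-k approximation and attains the Frobenius error given by the discarded singular values, and shows the related PCA projection onto the top principal directions:

```python
import numpy as np

rng = np.random.default_rng(6)

# Illustrative data: 60 observations in 5 dimensions, with most variance in 2 latent directions.
n, d, k = 60, 5, 2
A = rng.standard_normal((n, k)) @ rng.standard_normal((k, d)) + 0.05 * rng.standard_normal((n, d))
A = A - A.mean(axis=0)                         # center the columns, as in PCA

U, s, Vt = np.linalg.svd(A, full_matrices=False)
Uk = U[:, :k]                                  # leading left singular vectors
P = Uk @ Uk.T                                  # n x n projection onto the top-k column subspace

A_k = P @ A                                    # rank-k approximation U_k Sigma_k V_k^T
print(np.allclose(A_k, Uk @ np.diag(s[:k]) @ Vt[:k]))   # same thing written via the SVD factors
print(np.allclose(P @ P, P))                   # idempotent, like any orthogonal projection

# Frobenius error of the rank-k approximation equals the energy in the discarded singular values.
err = np.linalg.norm(A - A_k, 'fro')
print(np.isclose(err, np.sqrt(np.sum(s[k:] ** 2))))

# PCA view: the top-k right singular vectors V_k are the principal directions of the centered data,
# and V_k V_k^T is the corresponding d x d projection in feature space.
scores = A @ Vt[:k].T                          # projected coordinates (principal component scores)
print(scores.shape)                            # (60, 2)
```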

Historical Development

Origins

The origins of the projection matrix trace back to foundational developments in 18th- and 19th-century mathematics, where geometric and algebraic concepts of projecting points onto subspaces laid the groundwork for later formalizations in linear algebra. The 19th century saw significant advancements through the emergence of linear algebra, with Hermann Grassmann's Die lineale Ausdehnungslehre (1844) introducing the theory of linear extensions and subspaces, enabling the conceptual framework for projecting vectors onto lower-dimensional flats or subspaces in multilinear spaces. Grassmann's extension algebra provided tools for describing decompositions of space into direct sums, a core idea in projection theory, though without matrix notation. Complementing this, Arthur Cayley's development of matrix theory in the 1850s, particularly in his seminal paper "A Memoir on the Theory of Matrices" (1858), established matrices as representations of linear transformations, including those that map vectors to their projections onto invariant subspaces. Cayley's work formalized compositions of such transformations, setting the stage for idempotent matrices as projection operators.

A pivotal formalization came with Camille Jordan's Traité des substitutions et des équations algébriques (1870), where he developed canonical forms for linear transformations, highlighting idempotent operators (matrices P satisfying P^2 = P) as those diagonalizable into forms with eigenvalues 0 and 1, corresponding to projections onto eigenspaces. Jordan's analysis of substitution groups and their matrix representations explicitly treated such operators in the context of reducing transformations to simplest forms, bridging group theory and linear algebra. Prior to 1900, these ideas found application in differential geometry and early tensor analysis, where projections decomposed manifolds and tensors into tangential and normal components without statistical interpretation. For instance, Bernhard Riemann's 1854 habilitation lecture on the foundations of geometry implicitly used projection-like decompositions for metrics on curved spaces, while Gregorio Ricci-Curbastro's foundational absolute differential calculus (1880s) employed analogous operators to project multivectors onto coordinate subspaces, influencing subsequent tensor analysis.

Key Contributions

The concept of the projection matrix gained prominence in statistical analysis during the mid-20th century, particularly through its role in least squares estimation. Although the underlying formula traces back to Carl Friedrich Gauss's work on least squares in 1809, the modern interpretation as the "hat matrix" in regression diagnostics was introduced by John W. Tukey around 1972, with David C. Hoaglin and Roy E. Welsch formalizing its properties in their 1978 paper, where they described the matrix that maps observed responses to fitted values and aids in identifying influential points. This naming and application built on earlier informal uses, including Tukey's introduction of the technique for residual analysis in exploratory data analysis.

In the realm of multivariate analysis, Calyampudi R. Rao significantly expanded the theory of oblique projections during the 1960s and early 1970s, leveraging generalized matrix inverses to handle non-orthogonal projections in estimation problems. Rao's seminal 1971 book with Sujit K. Mitra detailed how these generalized inverses enable oblique projectors, which are essential for resolving systems where subspaces are not perpendicular, thus broadening applications in linear model estimation and multivariate hypothesis testing. This work formalized properties like idempotence and range-null space relations for oblique cases, influencing subsequent statistical methodologies. George A. F. Seber's 1977 textbook Linear Regression Analysis provided a comprehensive synthesis of hat matrix properties, including leverage diagnostics and variance interpretations, establishing it as a foundational reference for understanding the matrix's role in model fitting and diagnostics. Seber emphasized geometric interpretations, such as the hat matrix's decomposition into orthogonal components, which clarified its use in detecting outliers and leverage effects.

Post-2000 advancements addressed computational challenges in high-dimensional settings. The randomized singular value decomposition (SVD) framework of Nathan Halko, Per-Gunnar Martinsson, and Joel A. Tropp in 2011 introduced probabilistic methods for approximating matrices via low-rank decompositions, achieving accuracy and efficiency for matrices with millions of rows by reducing the effective dimension through random sampling, demonstrating significant speedups over deterministic SVD in empirical tests on large datasets. In machine learning, particularly in the 2020s, projection matrices have been pivotal in efficient model adaptation; for instance, the LoRA (Low-Rank Adaptation) framework by Edward J. Hu et al. in 2021 employs low-rank projections to fine-tune large language models, reducing trainable parameters by over 10,000 times while preserving performance on benchmarks, thus enabling scalable deployment in resource-constrained environments. These contributions, from diagnostic tools to high-dimensional algorithms, have solidified the projection matrix's centrality in statistical and computational practice, with brief extensions to blockwise formulations in partitioned models further supporting modular analysis in complex systems.
