
Design matrix

In statistics, a design matrix, also known as a model matrix or regressor matrix and often denoted by X, is a matrix that organizes the values of explanatory variables (predictors) across multiple observations for use in linear models such as regression analysis or analysis of variance (ANOVA). It forms the core of the linear model equation Y = X\beta + \epsilon, where Y is the vector of response variables, \beta is the vector of unknown parameters (coefficients), and \epsilon represents the random error term with mean zero. The design matrix enables efficient matrix-based computations for parameter estimation, hypothesis testing, and prediction, making it fundamental to quantitative data analysis in fields like genomics, economics, and engineering.

The construction of a design matrix depends on the nature of the predictors: for continuous variables, it includes the raw values alongside a column of ones for the intercept; for categorical factors, it employs dummy (indicator) variables coded as 0s and 1s to represent group memberships, with the number of columns equal to the number of levels minus one in a reference-level parameterization to avoid perfect collinearity. For example, in a simple linear regression of weight on height for n individuals, X is an n \times 2 matrix with the first column filled with 1s and the second containing height measurements; in a one-way ANOVA comparing means across k groups, X becomes an n \times k matrix of indicators specifying treatment assignments for each observation. Key properties include its dimensions (n rows for observations and p columns for parameters) and its rank, which must typically be full (equal to p) for unique parameter estimates, as a lower rank signals linear dependence among predictors that can inflate variance and hinder inference.

Beyond regression, design matrices play a crucial role in experimental design, particularly in factorial experiments, where they specify the combinations of factor levels (often coded as -1 for low and +1 for high) across experimental runs to assess main effects and interactions orthogonally. For a two-level full factorial with three factors, the 8 \times 3 design matrix lists all 2^3 treatment combinations in standard order, facilitating balanced and efficient estimation of effects without confounding. This versatility extends to generalized linear models and beyond, underscoring the design matrix's importance in ensuring model interpretability and statistical validity across diverse applications.

Fundamentals

Definition

In linear statistical modeling, the design matrix, commonly denoted as X, is an n \times p matrix, where n denotes the number of observations and p the number of predictor variables or factors, with each row corresponding to an individual observation and each column to a specific predictor. This structure organizes the explanatory data to facilitate parameter estimation in regression analyses. The design matrix underpins the general linear model, expressed mathematically as \mathbf{Y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\varepsilon}, where \mathbf{Y} is the n \times 1 response vector containing the observed outcomes, \boldsymbol{\beta} is the p \times 1 vector of unknown parameters to be estimated, and \boldsymbol{\varepsilon} is the n \times 1 error vector. The error term \boldsymbol{\varepsilon} is assumed to consist of independent components with zero mean and constant variance across observations, embodying the principles of independence and homoscedasticity essential for valid inference in the model.

The use of matrices in statistical modeling, including design matrices, developed during the twentieth century, building on early work in experimental design such as that of Ronald Fisher in agricultural applications. Unlike a covariance matrix, which quantifies the variances and correlations among random variables in the data, or a general data matrix that simply stores raw observations, the design matrix specifically encodes the structural relationships between the predictors and the response within the linear model framework.

Dimensions and Notation

The design matrix, conventionally denoted as X, is an n \times p matrix, where n represents the number of observations or samples in the dataset, and p denotes the number of predictors or parameters to be estimated, including the intercept term when it is part of the model. Each row of X corresponds to a single observation, typically represented as the row vector \mathbf{x}_i for the i-th observation, which captures the predictor values associated with that sample. In standard formulations that include an intercept, the first column of X is a vector of ones, enabling the model to incorporate a constant term across all observations. This column of ones multiplies the intercept parameter in the linear combination, shifting the fitted hyperplane to allow for non-zero predictions when all predictors are zero. For models excluding the intercept, known as regression through the origin, the design matrix omits the column of ones, resulting in dimensions n \times (p-1), where p-1 is the number of non-intercept predictors.
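As a minimal illustration of these conventions, the following NumPy sketch (using made-up predictor values) builds the design matrix for a single predictor with and without the intercept column; the variable names are illustrative, not part of any standard API.

```python
import numpy as np

# Hypothetical data: n = 5 observations of one continuous predictor.
x = np.array([1.2, 2.4, 3.1, 4.8, 5.0])
n = x.shape[0]

# Design matrix with an intercept: a column of ones followed by x.
X_with_intercept = np.column_stack([np.ones(n), x])   # shape (5, 2)

# Regression through the origin: the column of ones is omitted.
X_through_origin = x.reshape(-1, 1)                   # shape (5, 1)

print(X_with_intercept.shape, X_through_origin.shape)
```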

Construction

For Continuous Predictors

In the construction of a design matrix for models with continuous predictors, each column beyond the intercept corresponds to one such predictor, where the entry in row i and column j is the observed value x_{ij} of the j-th predictor for the i-th observation. The design matrix X thus takes the form of an n \times (p+1) array, with n rows for observations and p+1 columns including the intercept, ensuring the model captures the linear relationship between the response and these numerical explanatory variables. For a simple case involving one continuous predictor x and an intercept, the design matrix is constructed as X = [ \mathbf{1} \mid \mathbf{x} ], where \mathbf{1} is an n \times 1 column of ones and \mathbf{x} is the n \times 1 vector of predictor values; this structure supports estimation via ordinary least squares in the general linear model framework.

To enhance numerical stability and coefficient interpretability, continuous predictors are often centered by subtracting their sample means and scaled by dividing by their standard deviations, transforming each column j to entries (x_{ij} - \bar{x}_j)/s_j, where \bar{x}_j is the mean and s_j the standard deviation of the j-th predictor. This mitigates issues from differing scales among predictors, reduces collinearity in models with interaction terms, and facilitates comparison of effect sizes across variables.

For modeling nonlinear relationships, higher-order terms are incorporated by adding columns for powers of the predictors, such as x^2 for quadratic effects or x^3 for cubic effects, effectively expanding the design matrix to include these derived features while maintaining the linear-in-parameters form. Interactions between continuous predictors, like the product x_1 x_2, are similarly added as separate columns to capture interaction effects, as in second-order models where the mean function includes terms up to degree two in multiple variables. Orthogonal polynomials may be used for these terms to minimize numerical instability from high correlations among powers of the same predictor.
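The sketch below illustrates this construction on synthetic data: two standardized continuous predictors, a quadratic term, and a two-way interaction are assembled column by column into a single design matrix. All variable names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(10.0, 2.0, n)   # synthetic continuous predictors
x2 = rng.normal(5.0, 1.0, n)

# Center and scale each predictor: (x - mean) / standard deviation.
z1 = (x1 - x1.mean()) / x1.std(ddof=1)
z2 = (x2 - x2.mean()) / x2.std(ddof=1)

# Design matrix: intercept, linear terms, a quadratic term, and an interaction.
X = np.column_stack([
    np.ones(n),   # intercept column
    z1,           # first standardized predictor
    z2,           # second standardized predictor
    z1 ** 2,      # quadratic term for the first predictor
    z1 * z2,      # two-way interaction term
])
print(X.shape)    # (50, 5): one row per observation, one column per parameter
```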

For Categorical Predictors

Categorical predictors, which represent qualitative or discrete factors, must be encoded numerically to be incorporated into the design matrix for linear models. The most common approach is dummy coding, where for a factor with k levels, k-1 binary indicator columns are created in the design matrix, each corresponding to one level excluding a chosen reference level. This encoding ensures that the columns are mutually exclusive and exhaustive, allowing the model to estimate separate effects for each non-reference level relative to the reference. In dummy coding, the entry D_{i,j} in the design matrix is 1 if the i-th observation belongs to the (j+1)-th category (for j = 1, \dots, k-1), and 0 otherwise, with the k-th category serving as the reference level (all zeros in those columns). For example, consider a factor with levels A (reference), B, and C; the design matrix includes two columns: one for B (1 if level B, 0 otherwise) and one for C (1 if level C, 0 otherwise). This setup avoids perfect multicollinearity by preventing linear dependence among the columns, as including all k indicators would make their sum equal to the intercept column of ones, rendering the design matrix singular. The reference level is often selected based on interpretability, such as the most frequent or control category.

An alternative to dummy coding is effect coding, which constructs k-1 columns such that the values across all levels sum to zero for each column, facilitating interpretations centered on deviations from the grand mean rather than from a specific reference level. In effect coding, non-reference levels are typically coded as 1 and the reference level as -1, with adjustments made for balance (e.g., using -1/(k-1) for the reference level in some implementations to ensure the sum-to-zero property). This is particularly useful in balanced designs, where the intercept estimates the overall grand mean and coefficients represent average deviations for each level from that mean, aiding in the analysis of main effects without biasing toward a reference category.

For unordered (nominal) categories, dummy or effect coding is essential to capture qualitative distinctions without implying any ordering. In contrast, ordered (ordinal) categories may sometimes be treated as continuous predictors by assigning numeric scores to levels, or encoded using orthogonal polynomials to model trends, though dummy coding remains applicable when the ordering is not central to the analysis.
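As a concrete sketch (synthetic data, illustrative names only), the following code dummy-codes a three-level factor with reference level A and assembles the resulting indicator columns into a design matrix with an intercept.

```python
import numpy as np

# Hypothetical factor with levels A (reference), B, C observed on n = 6 units.
levels = np.array(["A", "B", "B", "C", "A", "C"])
n = levels.shape[0]

# Dummy coding: k - 1 = 2 indicator columns; the reference level A is all zeros.
d_B = (levels == "B").astype(float)
d_C = (levels == "C").astype(float)

# Design matrix: intercept column plus the two indicators.
X = np.column_stack([np.ones(n), d_B, d_C])
print(X)

# Adding a third indicator for A would make the indicator columns sum to the
# intercept column, producing a singular (rank-deficient) design matrix.
```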

Properties

Full Rank Conditions

In linear models, the design matrix X of dimensions n \times p, where n is the number of observations and p the number of parameters, possesses full column rank if \operatorname{rank}(X) = p, signifying that its columns are linearly independent. This property guarantees unique ordinary least squares estimates for the model parameters \beta. Achieving full rank necessitates the absence of perfect collinearity among the predictor columns. Practitioners can verify this condition computationally by confirming that the determinant of X^T X exceeds zero, thereby establishing the positive definiteness and invertibility of X^T X, or by applying singular value decomposition to X = U \Sigma V^T, where full rank holds if all singular values in \Sigma are strictly positive with no zeros.

Rank deficiency arises when \operatorname{rank}(X) < p, rendering X^T X singular and non-invertible, which precludes a unique solution to the normal equations and yields infinitely many parameter estimates consistent with the data. Addressing this typically involves employing a generalized inverse of X^T X to compute a minimum-norm solution or simplifying the model by eliminating linearly dependent predictors. In overparameterized designs, such deficiency manifests as aliasing, where distinct parameter configurations produce indistinguishable fitted values, complicating interpretation. The normal equations underpinning least squares estimation are formulated as X^T X \beta = X^T y, which demand full column rank for a unique solution; in general, \operatorname{rank}(X) \leq \min(n, p).
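A brief numerical check of these conditions, using synthetic data and standard NumPy routines, might look as follows; the tolerance used to declare a singular value "zero" is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])  # synthetic design

# Full column rank iff the smallest singular value is bounded away from zero.
s = np.linalg.svd(X, compute_uv=False)
full_rank = s.min() > 1e-10 * s.max()
print(np.linalg.matrix_rank(X), full_rank)        # 3, True

# Rank deficiency: duplicating a column creates perfect collinearity.
X_deficient = np.column_stack([X, X[:, 1]])
print(np.linalg.matrix_rank(X_deficient))         # 3, less than the 4 columns
```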

Orthogonality and Efficiency

An orthogonal design matrix X in linear regression satisfies X^T X = cI, where c is a scalar and I is the identity matrix, implying that the columns of X are pairwise orthogonal and each has equal norm \sqrt{c}. This property ensures that the ordinary least squares estimator simplifies to \hat{\beta} = \frac{1}{c} X^T Y, as the inverse (X^T X)^{-1} becomes diagonal and straightforward to compute, decoupling the estimates of individual parameters. Under the assumptions of the Gauss-Markov theorem—linearity, zero-mean errors, homoscedasticity, and no serial correlation—the ordinary least squares estimator is the best linear unbiased estimator (BLUE), with covariance matrix \operatorname{Var}(\hat{\beta}) = \sigma^2 (X^T X)^{-1}. For orthogonal designs, this covariance matrix diagonalizes, yielding uncorrelated parameter estimates with minimal variances \sigma^2 / c for each component, thereby enhancing estimation efficiency by avoiding variance inflation from collinearity.

Examples of orthogonal designs include balanced 2^k factorial experiments, where factors are coded as \pm 1 and the resulting columns of X are orthogonal, allowing independent assessment of main effects and interactions. Similarly, Helmert contrast matrices in balanced one-way ANOVA produce orthogonal columns by comparing each group mean to the average of subsequent groups, partitioning the sum of squares into independent components for hypothesis testing.

In non-orthogonal designs, collinearity inflates the variances of \hat{\beta}, as measured by the condition number \kappa(X) = \sigma_{\max} / \sigma_{\min} from the singular value decomposition of X, where a large \kappa(X) (e.g., >30) signals numerical ill-conditioning and amplified estimation errors. This degradation contrasts with orthogonal cases, where \kappa(X) = 1, ensuring optimal stability and precision.
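The following sketch verifies these properties numerically for a balanced 2^3 factorial with \pm 1 coding and an added intercept column; the construction is synthetic and intended only to show that X^T X = cI and \kappa(X) = 1 for such a design.

```python
import numpy as np
from itertools import product

# All 2^3 factor-level combinations, coded -1 (low) / +1 (high).
runs = np.array(list(product([-1.0, 1.0], repeat=3)))   # 8 x 3
X = np.column_stack([np.ones(8), runs])                  # prepend intercept

# Orthogonality: X^T X equals 8 * I, so (X^T X)^{-1} is diagonal.
print(X.T @ X)

# Condition number from the singular values equals 1 for this design.
s = np.linalg.svd(X, compute_uv=False)
print(s.max() / s.min())
```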

Examples

Arithmetic Mean Estimation

The simplest application of the design matrix arises in estimating the population mean from a sample of observations. Consider a sample of n observations Y_i for i = 1, \dots, n, modeled as Y_i = \mu + \varepsilon_i, where \mu is the unknown population mean and \varepsilon_i are error terms with mean zero. In matrix form, this is expressed as \mathbf{Y} = X \boldsymbol{\beta} + \boldsymbol{\varepsilon}, where \mathbf{Y} is the n \times 1 vector of observations, \boldsymbol{\beta} = \mu is the scalar parameter, and \boldsymbol{\varepsilon} is the n \times 1 vector of errors. The design matrix X in this intercept-only model is an n \times 1 column vector of ones, denoted \mathbf{1}_n.

The ordinary least squares (OLS) estimator for \boldsymbol{\beta} is given by \hat{\boldsymbol{\beta}} = (X^T X)^{-1} X^T \mathbf{Y}. Substituting X = \mathbf{1}_n, this simplifies to \hat{\beta} = (\mathbf{1}_n^T \mathbf{1}_n)^{-1} \mathbf{1}_n^T \mathbf{Y} = n^{-1} \sum_{i=1}^n Y_i = \bar{y}, the sample arithmetic mean. Thus, the least squares estimate coincides with the familiar sample mean, providing an unbiased estimator of \mu under the model assumptions. The parameter \mu represents the grand mean of the population, serving as the constant level around which the observations fluctuate. The variance of the estimator \hat{\beta} is \sigma^2 / n, where \sigma^2 is the error variance, reflecting that precision improves with larger sample sizes.

This formulation has historical roots in Carl Friedrich Gauss's development of the least squares method in the early 19th century, where he applied it to constant models in astronomical observations, demonstrating that the arithmetic mean minimizes the sum of squared deviations under normality assumptions. Gauss's work, detailed in his 1809 publication Theoria Motus Corporum Coelestium, established the statistical foundation for such estimation.
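A short numerical confirmation of this equivalence, on synthetic data, is sketched below.

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(loc=3.0, scale=1.5, size=20)   # synthetic observations

# Intercept-only model: the design matrix is a single column of ones.
X = np.ones((y.size, 1))

# The OLS estimate (X^T X)^{-1} X^T y reduces to the sample mean.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat[0], y.mean())   # equal up to floating-point error
```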

Simple Linear Regression

In simple linear regression, the model posits a linear relationship between a response Y and a single continuous predictor x, expressed as Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i for i = 1, \dots, n, where \beta_0 is the intercept, \beta_1 is the slope, and \varepsilon_i are errors typically assumed to follow a normal distribution with mean zero and constant variance \sigma^2. The design matrix \mathbf{X} for this model is an n \times 2 matrix that facilitates the matrix formulation of the model, with the first column consisting of ones to account for the intercept term and the second column containing the observed values of the predictor x. This structure is denoted as \mathbf{X} = [\mathbf{1}_n \mid \mathbf{x}], where \mathbf{1}_n is an n \times 1 vector of ones and \mathbf{x} is the n \times 1 vector of predictor values.

The ordinary least squares (OLS) estimator for the parameters is given by \hat{\boldsymbol{\beta}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{Y}, where \mathbf{Y} is the n \times 1 response vector and \boldsymbol{\beta} = [\beta_0, \beta_1]^T. This yields the explicit formulas \hat{\beta}_1 = \frac{\text{Cov}(x, y)}{\text{Var}(x)} = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2} and \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}, where \bar{x} and \bar{y} are the sample means of x and y, respectively. Geometrically, the columns of \mathbf{X} span a two-dimensional subspace of \mathbb{R}^n, and the OLS fit projects the response \mathbf{Y} onto this subspace, minimizing the residual sum of squares to obtain the fitted values \hat{\mathbf{Y}} = \mathbf{X} \hat{\boldsymbol{\beta}}.

For the design matrix to be of full rank and the OLS estimator to be well-defined, the columns must be linearly independent, which holds as long as the predictor x is not constant across all observations (i.e., \text{Var}(x) > 0). If x is constant, the second column becomes a scalar multiple of the first, rendering \mathbf{X}^T \mathbf{X} singular and the model reducible to an intercept-only model. This setup assumes the predictor is continuous, as constructed in standard matrix form for such variables.
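The sketch below fits this model on synthetic data in two equivalent ways: by solving the normal equations with the design matrix, and by the closed-form slope and intercept formulas.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40
x = rng.uniform(0, 10, n)                     # synthetic predictor
y = 2.0 + 0.5 * x + rng.normal(0, 1, n)       # synthetic response

# Design matrix [1 | x] and OLS via the normal equations.
X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# The same estimates from the explicit slope and intercept formulas.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
print(beta_hat, (b0, b1))
```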

Multiple Linear Regression

In multiple linear regression, the model extends the simple linear case to incorporate several continuous predictor variables, allowing for the joint estimation of their effects on the response. The general form is given by Y_i = \beta_0 + \sum_{j=1}^{p-1} \beta_j x_{ij} + \epsilon_i for i = 1, \dots, n, where Y_i is the response, x_{ij} are the continuous predictors, \beta_0 is the intercept, \beta_j are the partial regression coefficients representing the change in Y per unit change in x_j holding the other predictors fixed, and \epsilon_i are independent errors with mean zero and variance \sigma^2. In matrix notation, this becomes \mathbf{Y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon}, where \mathbf{Y} is the n \times 1 response vector, \boldsymbol{\beta} is the p \times 1 parameter vector (with p = k+1 for k predictors), and \boldsymbol{\epsilon} is the error vector. The design matrix \mathbf{X} is n \times p, constructed as \mathbf{X} = [\mathbf{1}_n \mid \mathbf{x}_1 \mid \dots \mid \mathbf{x}_{p-1}], with \mathbf{1}_n a column of ones for the intercept and each \mathbf{x}_j the n \times 1 vector of observations for the j-th predictor.

The ordinary least squares (OLS) estimator for \boldsymbol{\beta} is \hat{\boldsymbol{\beta}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{Y}, which minimizes the sum of squared residuals and provides unbiased estimates under the model assumptions, provided \mathbf{X} has full column rank. The partial coefficients \hat{\beta}_j in \hat{\boldsymbol{\beta}} quantify the unique contribution of each predictor, adjusted for the others, enabling assessment of each predictor's effect in the presence of correlated covariates.

To model interactions among continuous predictors, additional columns are appended to \mathbf{X} consisting of products of the predictor vectors, such as \mathbf{x}_1 \odot \mathbf{x}_2 (element-wise product) for a two-way interaction term \beta_p (x_{i1} x_{i2}). This expands the model to Y_i = \beta_0 + \sum_{j=1}^{p-1} \beta_j x_{ij} + \sum_{m} \beta_m z_{im} + \epsilon_i, where z_{im} are the interaction terms, allowing the effect of one predictor to vary with the level of another. The OLS estimation applies unchanged to the augmented \mathbf{X}.

A key property arises when predictors are highly correlated, leading to multicollinearity: the matrix \mathbf{X}^T \mathbf{X} becomes ill-conditioned and near-singular, causing the inverse (\mathbf{X}^T \mathbf{X})^{-1} to have large elements and inflating the variances of the coefficient estimates, as \text{Var}(\hat{\boldsymbol{\beta}}) = \sigma^2 (\mathbf{X}^T \mathbf{X})^{-1}. This instability can make individual \hat{\beta}_j unreliable, even though the overall model fit may remain adequate; orthogonal predictors mitigate such issues by reducing \mathbf{X}^T \mathbf{X} to a diagonal matrix.
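The following synthetic example builds a design matrix with an intercept, two deliberately correlated predictors, and their interaction, then inspects the diagonal of (X^T X)^{-1}, whose large entries reflect the variance inflation described above.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
x1 = rng.normal(size=n)                       # synthetic predictor
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)      # strongly correlated with x1
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 0.5 * x1 * x2 + rng.normal(size=n)

# Design matrix: intercept, both predictors, and their interaction.
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Coefficient variances scale with the diagonal of (X^T X)^{-1};
# the correlation between x1 and x2 inflates the corresponding entries.
XtX_inv = np.linalg.inv(X.T @ X)
print(beta_hat)
print(np.diag(XtX_inv))
```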

One-Way ANOVA Models

In the one-way analysis of variance (ANOVA) model, the design matrix X facilitates the estimation of group means through the linear model Y = X\beta + \epsilon, where Y is the response vector, \beta contains the parameters of interest, and \epsilon represents the errors, assumed to be normally distributed with mean zero and constant variance \sigma^2. Two primary parameterizations are used: the cell means model and the reference group model. These approaches differ in how X is constructed and how the parameters \beta relate to the group-specific means \mu_j for k groups.

The cell means model directly parameterizes each \beta_j = \mu_j, the mean of group j. Here, X is an n \times k matrix with k columns, each serving as an indicator for membership in a specific group; the entry x_{ij} = 1 if observation i belongs to group j, and 0 otherwise, with no intercept column included. This construction ensures X has full column rank k, as the columns have disjoint supports corresponding to the groups, avoiding linear dependence. In balanced designs, where each group has an equal number of observations n_j = n/k, the columns each contain the same number of 1s, leading to orthogonal columns and simplified computations. In unbalanced designs, with unequal n_j, the columns have varying numbers of 1s, but X remains full rank, though parameter estimates and inferences adjust for the differing sample sizes.

In contrast, the reference group model uses an intercept and k-1 dummy variables to achieve a full-rank parameterization. The design matrix X is n \times k, with the first column of all 1s for the intercept and subsequent columns as indicators for groups 1 through k-1, omitting the reference group (typically group k). Here, \beta_0 = \mu_k, the mean of the reference group, and \beta_j = \mu_j - \mu_k for j = 1, \dots, k-1, representing deviations from the reference mean. As in the cell means model, balanced designs feature uniform replication across columns, while unbalanced designs result in unequal numbers of 1s per column, influencing the least squares estimates \hat{\beta} = (X'X)^{-1}X'Y.

The F-test for equality of group means in one-way ANOVA relies on the design matrix to partition the total sum of squares into between-group and within-group components via orthogonal projections. The between-group sum of squares (SSB) is computed as Y'(H - \frac{1}{n}J)Y, where H = X(X'X)^{-1}X' is the hat matrix projecting onto the column space of X, and J is the n \times n all-ones matrix; this quantifies variation explained by the group effects under either parameterization. The within-group sum of squares (SSW) is Y'(I_n - H)Y, and the test statistic is F = \frac{\text{SSB}/(k-1)}{\text{SSW}/(n-k)}, testing the null hypothesis that all \mu_j are equal, with the same result across balanced and unbalanced designs when appropriate projections are used.
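The following sketch (synthetic, unbalanced data) constructs both parameterizations and computes the F-statistic from the projection matrices; because the two design matrices span the same column space, either yields the same hat matrix and hence the same test.

```python
import numpy as np

rng = np.random.default_rng(5)
groups = np.repeat([0, 1, 2], [8, 10, 7])                     # unbalanced groups
n, k = groups.size, 3
y = np.array([0.0, 1.0, 2.5])[groups] + rng.normal(0, 1, n)   # synthetic responses

# Cell means parameterization: one indicator column per group, no intercept.
X_cell = (groups[:, None] == np.arange(k)[None, :]).astype(float)

# Reference group parameterization: intercept plus indicators for groups 1..k-1.
X_ref = np.column_stack([np.ones(n), X_cell[:, 1:]])

# F-test via projections (identical under either parameterization).
H = X_cell @ np.linalg.solve(X_cell.T @ X_cell, X_cell.T)   # hat matrix
J = np.ones((n, n)) / n
ssb = y @ (H - J) @ y
ssw = y @ (np.eye(n) - H) @ y
F = (ssb / (k - 1)) / (ssw / (n - k))
print(F)
```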

Extensions

In Generalized Linear Models

In generalized linear models (GLMs), the design matrix X retains its fundamental role as the matrix encoding the predictor variables, much as in ordinary linear regression, but the linear predictor \eta = X \beta is connected to the expected response \mu through a monotonic link function g, yielding g(\mu) = X \beta, where \beta is the parameter vector. This framework accommodates non-normal response distributions from the exponential family, such as the binomial, Poisson, or gamma, allowing GLMs to model diverse data types including binary outcomes and counts. The canonical link functions vary by distribution; for example, the logit link g(\mu) = \log\left(\frac{\mu}{1-\mu}\right) is standard for binary responses in logistic regression.

Parameter estimation in GLMs proceeds via maximum likelihood, which lacks a closed-form solution analogous to the ordinary least squares estimator \hat{\beta} = (X^T X)^{-1} X^T y of linear models, necessitating iterative numerical methods. The iteratively reweighted least squares (IRLS) algorithm is the primary approach, transforming the problem into a sequence of weighted linear regressions. Starting with an initial guess for \beta, IRLS constructs a working response vector z from a linearization of the link function around the current \eta, along with a diagonal weight matrix W derived from the variance function and link derivative, then solves \hat{\beta}^{(k+1)} = (X^T W^{(k)} X)^{-1} X^T W^{(k)} z^{(k)} until convergence. Throughout, the design matrix X remains fixed, while W updates to reflect the nonlinearity of the model.

In logistic regression, a canonical GLM example, the design matrix X is constructed identically to the linear case—for instance, with an intercept column and columns for continuous or categorical predictors—but \beta is estimated by maximizing the binomial likelihood rather than minimizing squared residuals. The logit link ensures predicted probabilities lie between 0 and 1, and IRLS iteratively refines \beta using weights W_{ii} = \mu_i (1 - \mu_i) based on the current fitted values. This adaptation highlights how the design matrix's structure supports additive predictor effects on the link scale, enabling interpretation of coefficients as log-odds changes, without altering X itself from its linear regression form.
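A compact IRLS sketch for logistic regression is shown below on synthetic data; the function name and stopping rule are illustrative, not drawn from any particular library.

```python
import numpy as np

def irls_logistic(X, y, n_iter=25, tol=1e-8):
    """Fit logistic regression by iteratively reweighted least squares.

    X : (n, p) design matrix, built exactly as in linear regression.
    y : (n,) binary responses coded 0/1.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                      # linear predictor
        mu = 1.0 / (1.0 + np.exp(-eta))     # inverse logit link
        w = mu * (1.0 - mu)                 # diagonal IRLS weights
        z = eta + (y - mu) / w              # working response
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

# Synthetic example: intercept plus one continuous predictor.
rng = np.random.default_rng(6)
x = rng.normal(size=200)
X = np.column_stack([np.ones(200), x])
p_true = 1.0 / (1.0 + np.exp(-(-0.5 + 1.2 * x)))
y = rng.binomial(1, p_true)
print(irls_logistic(X, y))   # estimates roughly near (-0.5, 1.2)
```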

In Experimental Design

In experimental design, the design matrix \mathbf{X} plays a central role in planning controlled experiments to optimize inference about factor effects. Optimal design criteria guide the selection of factor levels and run combinations to enhance the precision of estimates in the model \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}. One widely used criterion is D-optimality, which maximizes the determinant of the information matrix, \det(\mathbf{X}^T \mathbf{X}), thereby minimizing the generalized variance of the least-squares estimates \hat{\boldsymbol{\beta}}. This approach selects subsets of points from a candidate set of factor levels, ensuring efficient use of limited experimental resources while reducing the volume of the confidence ellipsoid for \boldsymbol{\beta}. Another criterion, A-optimality, minimizes the trace of the variance-covariance matrix, \operatorname{tr}\left((\mathbf{X}^T \mathbf{X})^{-1}\right), which lowers the average variance of the estimates and improves overall prediction accuracy in regression models. These criteria prioritize designs that yield precise inferences and are often computed algorithmically for complex design spaces.

Factorial designs, particularly full 2^k setups, construct the design matrix \mathbf{X} with rows representing all 2^k combinations of k factors at two levels each, typically coded as -1 (low) and +1 (high). Each factor column contains an equal number of -1s and +1s, facilitating orthogonal contrasts for estimating main effects and interactions. The standard order arranges columns such that the first alternates -1 and +1, while the i-th column repeats blocks of 2^{i-1} identical values before switching signs, ensuring balanced representation across levels. For resource-constrained scenarios, fractional factorial designs select a fraction of these rows (e.g., 2^{k-p} runs), preserving the estimability of main effects while aliasing higher-order interactions, with the matrix still using \pm 1 coding. These constructions often yield orthogonal columns, enhancing estimation efficiency as discussed in the properties of the design matrix.

Blocking and randomization further refine the design matrix to control extraneous variability. In randomized complete block designs, block effects are incorporated as additional columns in \mathbf{X}, typically using indicator variables for each block level, allowing the model to account for nuisance factors like batch or operator differences without confounding the treatment effects. The model becomes \mathbf{y} = \mathbf{X}_T \boldsymbol{\beta}_T + \mathbf{X}_B \boldsymbol{\beta}_B + \boldsymbol{\epsilon}, where \mathbf{X}_T and \mathbf{X}_B denote the design matrices for treatments and blocks, respectively, partitioning total variability into treatment, block, and error components via ANOVA. Randomization within blocks assigns treatment levels to experimental units at random, ensuring unbiased estimates and mitigating systematic errors, with the full design matrix reflecting these assignments for subsequent analysis.

Software tools facilitate the generation of these design matrices. In R, the AlgDesign package computes exact D-, A-, and I-optimal designs, including blocked variants, from user-specified candidate sets and models, with version 1.2.1.2 released in April 2025. Similarly, the skpr package supports optimal design creation for D-, A-, and other criteria, handling split-plot and blocked structures while evaluating power, updated to version 1.9.2 in September 2025.
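As a small, self-contained illustration of the factorial construction (independent of the R packages mentioned above), the following sketch generates a 2^3 full factorial design matrix in standard order and evaluates the D-criterion \det(\mathbf{X}^T \mathbf{X}) after adding an intercept column.

```python
import numpy as np

def full_factorial_two_level(k):
    """2^k full factorial design matrix in standard (Yates) order,
    coded -1 (low) / +1 (high); column i switches sign every 2^i runs."""
    runs = 2 ** k
    X = np.empty((runs, k))
    for i in range(k):
        block = 2 ** i
        pattern = np.repeat([-1.0, 1.0], block)
        X[:, i] = np.tile(pattern, runs // (2 * block))
    return X

X = full_factorial_two_level(3)          # 8 x 3 matrix of factor levels
print(X)

# With an intercept column, the information matrix is 8 * I, so the
# D-criterion det(X^T X) attains its maximum (8^4 = 4096) for this run size.
Xm = np.column_stack([np.ones(8), X])
print(np.linalg.det(Xm.T @ Xm))
```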
