Vector autoregression
Vector autoregression (VAR) is a multivariate time series modeling framework that extends univariate autoregressive models to capture the dynamic interdependencies among multiple endogenous variables. Each variable is expressed as a linear function of its own past values and the past values of all other variables in the system, plus a vector of white noise errors.[1] In its basic form, a VAR(p) model for an n-dimensional time series Y_t is specified as Y_t = c + \Pi_1 Y_{t-1} + \cdots + \Pi_p Y_{t-p} + \varepsilon_t, where c is a constant vector, the \Pi_i are n × n coefficient matrices, p is the lag order, and \varepsilon_t is a multivariate white noise process with covariance matrix \Sigma.[1] This structure assumes covariance stationarity and allows for the inclusion of deterministic trends or exogenous variables as needed.[1]

Developed in the late 1970s and early 1980s, VAR models were pioneered by the economist Christopher A. Sims in response to perceived shortcomings of traditional large-scale macroeconometric models, which relied on "incredible" identifying assumptions, such as strict exogeneity and exclusion restrictions, that often lacked empirical justification and led to unreliable policy inferences.[2] In his seminal 1980 paper, "Macroeconomics and Reality," Sims argued that these conventional approaches imposed too many theoretical priors, stifling data-driven analysis, and proposed VARs as a more atheoretical, reduced-form alternative that lets the data speak with minimal restrictions.[3] This innovation marked a paradigm shift in empirical macroeconomics, earning Sims the 2011 Nobel Prize in Economic Sciences (shared with Thomas Sargent) for advancing time series analysis in economic policy evaluation.[2]

VAR models are estimated equation by equation using ordinary least squares (OLS), treating the system as a set of seemingly unrelated regressions; because every equation contains the same regressors, equation-by-equation OLS coincides with generalized least squares on the full system and yields consistent and efficient estimates under standard assumptions.[1] Lag length selection is typically guided by information criteria such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC), or the Hannan-Quinn criterion (HQ), which balance model fit against parsimony.[1] Once estimated, VARs facilitate a range of analyses: multi-step forecasting by iterating the model forward, impulse response functions that trace the effects of shocks on variables over time, variance decompositions that attribute forecast error variance to individual shocks, and Granger causality tests that evaluate predictive relationships among variables.[4]

Beyond economics, VARs have found applications in finance for modeling asset returns and volatility spillovers,[4] in neuroscience for brain network analysis,[5] and in political science for studying event interdependencies.[6] Extensions such as Bayesian VARs address overfitting in high-dimensional settings by incorporating priors, while structural VARs impose economic identifying restrictions so that shocks can be interpreted more causally.[7] Despite their flexibility, VARs require stationary data (or differencing of integrated series) and can suffer from the curse of dimensionality as the number of variables grows, prompting ongoing refinements such as factor-augmented and sparse VARs.[4]
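As a concrete sketch of this workflow, the following Python snippet fits a VAR with the statsmodels library and runs the standard post-estimation analyses. The data are simulated from a hypothetical stationary bivariate VAR(1) (the coefficients, covariance, and variable names y1 and y2 are illustrative stand-ins, not taken from any cited study):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

# Simulate a stationary bivariate VAR(1) to stand in for real data.
rng = np.random.default_rng(0)
T = 300
c = np.array([0.5, 1.0])
A1 = np.array([[0.6, 0.3], [0.2, 0.7]])             # eigenvalues 0.4 and 0.9
eps = rng.multivariate_normal([0, 0], [[1.0, 0.5], [0.5, 1.0]], size=T)
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = c + A1 @ y[t - 1] + eps[t]
df = pd.DataFrame(y, columns=["y1", "y2"])

results = VAR(df).fit(maxlags=8, ic="aic")          # lag order chosen by AIC
p = results.k_ar                                    # selected lag length

fc = results.forecast(df.values[-p:], steps=4)      # iterated 4-step forecast
irf = results.irf(10)                               # impulse responses, horizons 0-10
fevd = results.fevd(10)                             # forecast error variance decomposition
gc = results.test_causality("y1", ["y2"], kind="f") # does y2 Granger-cause y1?
```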
Introduction
Definition and Motivation
Vector autoregression (VAR) is a multivariate time series model that captures the linear interdependencies among multiple variables evolving over time. In this framework, each variable in the system is modeled as a linear function of its own past values, the past values of all other variables in the system, and a contemporaneous error term. This approach treats all variables symmetrically as endogenous, allowing the model to flexibly represent how shocks to one variable propagate through the system to affect others without imposing restrictive a priori assumptions about causality or exogeneity.[8][1]

The motivation for VAR models stems from the need to generalize univariate autoregressive processes to multivariate contexts, where economic or financial variables influence each other dynamically over time. Unlike single-variable AR models, VARs enable the examination of feedback loops and joint dynamics, such as how changes in interest rates might impact output and inflation simultaneously in macroeconomic analysis. In particular, they provide a practical tool for policy evaluation by simulating the effects of interventions without relying on overly structured theoretical models that may fail empirical tests.[8][4]

VAR models were introduced by Christopher A. Sims in 1980 as an alternative to the large-scale macroeconometric models prevalent at the time, which Sims critiqued for their implausible identifying restrictions and poor forecasting performance. By emphasizing unrestricted reduced-form representations, VARs allow researchers to let the data reveal empirical regularities in variable interactions, supporting applications in forecasting economic time series and assessing policy impacts.[3]
Historical Development
The origins of vector autoregression (VAR) models trace back to advancements in multivariate time series analysis during the 1970s, building on univariate autoregressive techniques to capture interdependencies among multiple economic variables.[1] These early efforts laid the groundwork for handling dynamic relationships in economic data without rigid theoretical restrictions.

The formal introduction of VAR models to econometrics occurred in 1980, when Christopher Sims published "Macroeconomics and Reality" in Econometrica, proposing VARs as a flexible, data-driven alternative to traditional structural econometric models.[3] Sims' work represented a significant critique of the Cowles Commission approach, which relied on simultaneous equation systems with strong identifying assumptions that often lacked empirical support and failed to account for rational expectations.[3] Alongside Thomas Sargent, Sims emphasized VARs' ability to assess macroeconomic relationships empirically without imposing "incredible" a priori restrictions, influencing a shift toward more agnostic empirical methods in macroeconomics.[9] Mark Watson further advanced the framework through collaborative research on VAR estimation and inference, solidifying its role in policy analysis and forecasting.[10]

In the late 1980s and 1990s, VAR models evolved from purely reduced-form specifications to structural interpretations, addressing identification challenges in Sims' original atheoretical setup. A key milestone was Olivier Blanchard and Danny Quah's 1989 paper in the American Economic Review, which introduced long-run restrictions to decompose VAR innovations into demand and supply shocks, enabling clearer causal inference about macroeconomic disturbances. By the 1990s, structural VAR (SVAR) models had gained prominence for policy counterfactuals, though debates over the validity of identifying assumptions persisted.[10]

Post-2000, integration with Bayesian methods addressed overfitting in high-dimensional VARs; the Minnesota prior, originally developed by Thomas Doan, Robert Litterman, and Sims in 1984, was refined for shrinkage estimation, improving forecasting accuracy and parameter stability in large systems. This Bayesian approach has since become standard for incorporating prior economic beliefs while maintaining empirical flexibility.[11]
Model Specification
Core Components and Notation
A vector autoregression (VAR) of order p, denoted VAR(p), models the dynamic interdependencies among a set of n endogenous time series variables as a system of linear equations. The general form is given by Y_t = A_1 Y_{t-1} + A_2 Y_{t-2} + \cdots + A_p Y_{t-p} + \epsilon_t, where Y_t is an n \times 1 vector of endogenous variables at time t, each A_i (for i = 1, \dots, p) is an n \times n matrix of coefficients capturing the influence of the i-th lag of the system on the current values, p is the lag order selected on the basis of information criteria or theoretical considerations, and \epsilon_t is an n \times 1 vector of error terms. The endogenous variables in Y_t are treated as jointly determined within the system, allowing each to respond to lagged values of all variables, including itself, which distinguishes VAR models from univariate autoregressions.

Deterministic terms, such as a constant intercept vector or linear trends, are often incorporated to account for non-stochastic components. For instance, the model can be extended to Y_t = c + A_1 Y_{t-1} + \cdots + A_p Y_{t-p} + \epsilon_t, where c is an n \times 1 vector of intercepts; under stationarity the implied unconditional mean of the process is \mu = (I_n - A_1 - \cdots - A_p)^{-1} c. A deterministic trend term d t, with d an n \times 1 vector of trend coefficients, may also be included when the series exhibit trending behavior.

The error terms \epsilon_t are assumed to be white noise, satisfying E(\epsilon_t) = 0, E(\epsilon_t \epsilon_t') = \Sigma, where \Sigma is a positive definite n \times n covariance matrix, and E(\epsilon_t \epsilon_s') = 0 for t \neq s, ruling out serial correlation across time periods. These assumptions imply that innovations may be contemporaneously correlated but are unpredictable from past information.
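A minimal Python sketch of this specification simulates a hypothetical bivariate VAR(2) with an intercept; the coefficient matrices are illustrative values chosen to satisfy the stability condition discussed in the next subsection:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, T = 2, 2, 200                            # 2 variables, 2 lags, 200 periods

c = np.array([0.5, 1.0])                       # intercept vector
A = [np.array([[0.5, 0.1], [0.1, 0.4]]),       # A_1 (illustrative values)
     np.array([[0.2, 0.0], [0.0, 0.2]])]       # A_2
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])     # innovation covariance matrix

eps = rng.multivariate_normal(np.zeros(n), Sigma, size=T)  # white noise draws
Y = np.zeros((T, n))
for t in range(p, T):
    Y[t] = c + sum(A[i] @ Y[t - 1 - i] for i in range(p)) + eps[t]

# Unconditional mean implied by stationarity: (I - A_1 - ... - A_p)^{-1} c
mu = np.linalg.solve(np.eye(n) - sum(A), c)
```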
Stationarity and Order of Integration
In vector autoregression (VAR) models, stationarity is defined in a multivariate context as covariance stationarity, which requires that the mean vector of the process remain constant over time, that the covariance matrix be time-invariant, and that the autocovariance matrices depend solely on the time lag between observations. This condition ensures that the statistical properties of the multivariate time series do not change systematically with time, allowing reliable inference and forecasting within the VAR framework. Violations of covariance stationarity can lead to unstable parameter estimates and misleading interpretations of dynamic relationships among variables.

The order of integration classifies the stationarity properties of the variables in a VAR model. A process is integrated of order zero, denoted I(0), if it is covariance stationary. In contrast, a process is integrated of order one, I(1), if it must be differenced once to become stationary, meaning the levels exhibit a unit root and non-constant variance over time. In multivariate systems, variables may share common stochastic trends, leading to cointegration, where linear combinations of I(1) variables are I(0), capturing long-run equilibrium relationships despite individual non-stationarity.

Prior to estimating a VAR model, pre-estimation tests for stationarity are essential to determine the order of integration. Unit root tests, such as the augmented Dickey-Fuller (ADF) test, are commonly applied to each variable in the system, with non-stationarity as the null hypothesis. The ADF test augments the basic Dickey-Fuller regression with lagged differences to account for serial correlation and relies on specialized critical values, since standard t-statistics do not apply under the unit root null. If variables are found to be I(1), differencing is used to induce stationarity, transforming the model to operate on \Delta Y_t = Y_t - Y_{t-1}, where Y_t is the vector of variables.

Fitting a VAR model to non-stationary data without addressing integration issues invites spurious regressions, in which apparent statistical significance arises from shared trends rather than genuine economic relationships. Granger and Newbold demonstrated this phenomenon through simulations of independent random walks, showing inflated R-squared values and invalid t-statistics in regressions of non-stationary series. For cointegrated systems, a VAR in levels may still be valid, but it is typically reparameterized as a vector error correction model (VECM) that explicitly incorporates the error correction mechanism enforcing long-run equilibria while modeling short-run dynamics.
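As a sketch of this pre-estimation step, the following snippet applies the ADF test from statsmodels to a simulated random walk, which is I(1) by construction, and then to its first difference (the series here is hypothetical):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
y = np.cumsum(rng.normal(size=300))   # a random walk: I(1) by construction

stat, pvalue = adfuller(y, autolag="AIC")[:2]
print(f"levels:      ADF stat = {stat:.2f}, p = {pvalue:.3f}")
# typically fails to reject the unit root null

stat_d, pvalue_d = adfuller(np.diff(y), autolag="AIC")[:2]
print(f"differences: ADF stat = {stat_d:.2f}, p = {pvalue_d:.3f}")
# typically rejects: the first difference is I(0)
```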
Companion Matrix Representation
A vector autoregression of order p, denoted VAR(p), with n endogenous variables can be reformulated into an equivalent VAR(1) model using the companion matrix representation. This involves stacking the current and lagged values of the vector Y_t into a state vector Z_t = (Y_t', Y_{t-1}', \dots, Y_{t-p+1}')', which is an np \times 1 vector. The model then becomes Z_t = F Z_{t-1} + u_t, where F is the np \times np companion matrix composed of blocks from the original coefficient matrices A_i (i=1,\dots,p), and u_t = (\epsilon_t', 0_{n \times 1}', \dots, 0_{n \times 1}')' is the error vector with the innovation \epsilon_t in the first block and zeros elsewhere.

The companion matrix F has a specific block structure that embeds the dynamics of the higher-order lags. Its first block row consists of the coefficient matrices A_1, A_2, \dots, A_p, each of size n \times n. The subsequent block rows feature identity matrices I_n along the subdiagonal to shift the lags, with zeros elsewhere:

F = \begin{pmatrix} A_1 & A_2 & \cdots & A_{p-1} & A_p \\ I_n & 0 & \cdots & 0 & 0 \\ 0 & I_n & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & I_n & 0 \end{pmatrix}.

This structure ensures that the reformulation preserves the original VAR(p) dynamics while reducing the model to first-order form. The companion matrix representation offers several analytical advantages. It facilitates eigenvalue analysis of F: the process is stable if all eigenvalues of F have modulus less than 1, which is equivalent to the roots of the characteristic polynomial lying outside the unit circle.
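A short sketch of this construction in Python builds F from the coefficient matrices and checks the stability condition numerically; the A_1 and A_2 values are the same illustrative matrices used in the simulation above:

```python
import numpy as np

def companion(A_list):
    """Stack VAR(p) coefficient matrices into the np x np companion matrix F."""
    n = A_list[0].shape[0]
    p = len(A_list)
    F = np.zeros((n * p, n * p))
    F[:n, :] = np.hstack(A_list)        # first block row: A_1, ..., A_p
    F[n:, :-n] = np.eye(n * (p - 1))    # identity blocks on the subdiagonal
    return F

A1 = np.array([[0.5, 0.1], [0.1, 0.4]])
A2 = np.array([[0.2, 0.0], [0.0, 0.2]])
F = companion([A1, A2])

moduli = np.abs(np.linalg.eigvals(F))
print(moduli.max() < 1)   # True: all eigenvalues inside the unit circle, VAR stable
```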
Illustrative Example
To illustrate the specification and computation of a vector autoregression (VAR) model, consider a simple bivariate VAR(1) with two variables: the GDP growth rate y_{1t} and the inflation rate y_{2t}, both in percentage points. This setup, with n=2 variables and lag order p=1, captures the interdependence between the two series through their lagged values, assuming the process is stationary. The system of equations is specified as

\begin{align} y_{1t} &= \nu_1 + a_{11} y_{1,t-1} + a_{12} y_{2,t-1} + \epsilon_{1t}, \\ y_{2t} &= \nu_2 + a_{21} y_{1,t-1} + a_{22} y_{2,t-1} + \epsilon_{2t}, \end{align}

where \boldsymbol{\nu} = \begin{pmatrix} \nu_1 \\ \nu_2 \end{pmatrix} is the intercept vector, \mathbf{A}_1 = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} is the coefficient matrix for lag 1, and \boldsymbol{\epsilon}_t = \begin{pmatrix} \epsilon_{1t} \\ \epsilon_{2t} \end{pmatrix} \sim N(\mathbf{0}, \boldsymbol{\Sigma}) with covariance matrix \boldsymbol{\Sigma} = \begin{pmatrix} 1 & 0.5 \\ 0.5 & 1 \end{pmatrix}.

For this hypothetical example, inspired by standard simulated bivariate VAR(1) models, set \nu_1 = 0.5, \nu_2 = 1.0, a_{11} = 0.6, a_{12} = 0.3, a_{21} = 0.2, and a_{22} = 0.7. These coefficients ensure stationarity, as the eigenvalues of \mathbf{A}_1 (0.4 and 0.9) lie inside the unit circle. The off-diagonal elements a_{12} and a_{21} capture cross-variable dynamics: a shock to GDP growth influences future inflation, and vice versa. The population mean is \boldsymbol{\mu} = (I - \mathbf{A}_1)^{-1} \boldsymbol{\nu} \approx (7.50, 8.33)^\top.

To demonstrate the computation, simulate a short sample path starting from initial values y_{1,0} = 0 and y_{2,0} = 0, with specific shocks chosen to highlight propagation: \boldsymbol{\epsilon}_1 = (0.5, 0)^\top, \boldsymbol{\epsilon}_2 = (0, 0.3)^\top, \boldsymbol{\epsilon}_3 = (-0.2, 0.1)^\top, and \boldsymbol{\epsilon}_t = (0, 0)^\top for t \geq 4. These fixed values stand in for draws from the specified normal distribution. The first few periods are computed step by step (and reproduced programmatically after the table below):
- At t=1: y_{11} = 0.5 + 0.6(0) + 0.3(0) + 0.5 = 1.00, y_{21} = 1.0 + 0.2(0) + 0.7(0) + 0 = 1.00.
- At t=2: y_{12} = 0.5 + 0.6(1.0) + 0.3(1.0) + 0 = 1.40, y_{22} = 1.0 + 0.2(1.0) + 0.7(1.0) + 0.3 = 2.20.
- At t=3: y_{13} = 0.5 + 0.6(1.4) + 0.3(2.2) - 0.2 = 1.80, y_{23} = 1.0 + 0.2(1.4) + 0.7(2.2) + 0.1 = 2.92.
- At t=4: y_{14} = 0.5 + 0.6(1.8) + 0.3(2.92) + 0 = 2.46, y_{24} = 1.0 + 0.2(1.8) + 0.7(2.92) + 0 = 3.40.
| Time t | GDP growth y_{1t} (pp) | Inflation y_{2t} (pp) | Shocks (\epsilon_{1t}, \epsilon_{2t}) |
|---|---|---|---|
| 0 | 0.00 | 0.00 | - |
| 1 | 1.00 | 1.00 | (0.5, 0) |
| 2 | 1.40 | 2.20 | (0, 0.3) |
| 3 | 1.80 | 2.92 | (-0.2, 0.1) |
| 4 | 2.46 | 3.40 | (0, 0) |
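The hand computation above can be reproduced with a minimal Python sketch, using exactly the intercepts, coefficients, and shocks specified in this example:

```python
import numpy as np

nu = np.array([0.5, 1.0])                          # intercept vector
A1 = np.array([[0.6, 0.3], [0.2, 0.7]])            # lag-1 coefficient matrix
shocks = {1: (0.5, 0.0), 2: (0.0, 0.3), 3: (-0.2, 0.1)}  # eps_t; zero for t >= 4

y = np.zeros(2)                                    # y_0 = (0, 0)'
for t in range(1, 5):
    eps = np.array(shocks.get(t, (0.0, 0.0)))
    y = nu + A1 @ y + eps
    print(f"t={t}: y1={y[0]:.2f}, y2={y[1]:.2f}")
# Output matches the table:
# t=1: y1=1.00, y2=1.00
# t=2: y1=1.40, y2=2.20
# t=3: y1=1.80, y2=2.92
# t=4: y1=2.46, y2=3.40
```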