
Johansen test

The Johansen test is a maximum likelihood-based procedure for testing the presence and determining the number of cointegrating relations among a set of non-stationary, integrated time series variables in a vector autoregressive (VAR) model. Developed by Danish econometrician Søren Johansen in 1988, the test addresses cointegration, a concept where linear combinations of individually non-stationary series can form stationary processes, allowing for long-run equilibrium relationships despite short-run deviations. It is particularly suited for multivariate systems and extends earlier univariate approaches by providing a framework to estimate cointegration vectors and test hypotheses about their structure. The underlying model is a Gaussian VAR process of order k, expressed in error correction form as \Delta X_t = \sum_{i=1}^{k-1} \Gamma_i \Delta X_{t-i} + \Pi X_{t-k} + \mu + \epsilon_t, where X_t is a p \times 1 vector of variables integrated of order 1 (I(1)), \Pi = \alpha \beta' with \alpha and \beta being p \times r matrices of adjustment coefficients and cointegrating vectors (rank r < p), \mu accounts for deterministic terms like constants or linear trends, and \epsilon_t are i.i.d. Gaussian errors. Under the null hypothesis of no cointegration, \Pi = 0, implying the variables are I(1) without long-run relations; otherwise, r > 0 indicates cointegration. The maximum likelihood estimator for the cointegration space spanned by \beta is obtained from the eigenvectors corresponding to the largest canonical correlations between \Delta X_t and the lagged levels X_{t-k}, adjusted for lagged differences. Central to the test are two likelihood ratio statistics for the rank r: the trace test, -T \sum_{i=r+1}^p \ln(1 - \hat{\lambda}_i), which tests the null of at most r cointegrating relations against the alternative of more than r; and the maximum eigenvalue test, -T \ln(1 - \hat{\lambda}_{r+1}), which tests exactly r against r+1. Here, T is the sample size, and \hat{\lambda}_i are the estimated eigenvalues. These statistics' asymptotic distributions are non-standard, depending on the inclusion of deterministic components, and are tabulated as functionals of Brownian motions, such as the trace of squared integrals of these processes. The sequential testing procedure starts with r=0 and increases r until the null is not rejected, providing the cointegration rank. Widely applied in econometrics for analyzing economic relationships like purchasing power parity or money demand, the Johansen test has influenced extensions for small samples, structural breaks, and non-Gaussian errors, maintaining its status as a standard tool for multivariate cointegration analysis.

Introduction

Definition and Purpose

The Johansen test is a statistical procedure developed by Søren Johansen for detecting cointegration among multiple time series that are integrated of order one, denoted as I(1). It specifically tests the rank r of the cointegration matrix, which represents the number of linearly independent long-run equilibrium relationships among the variables, allowing researchers to identify stationary linear combinations despite the individual non-stationarity of the series. Cointegration occurs when non-stationary I(1) time series share a stable long-run relationship such that certain linear combinations of them are stationary, or I(0), reflecting economic equilibria like those between prices and exchange rates. This contrasts with unit root tests, such as the Augmented Dickey-Fuller (ADF) test, which focus on assessing stationarity in a single time series or residuals from pairwise regressions, whereas the Johansen approach handles multivariate systems directly to uncover multiple cointegrating relations. The primary purpose of the Johansen test is to guide the specification of models that account for these long-run dependencies, enabling accurate inference in non-stationary data without the pitfalls of spurious correlations. In economics, it plays a vital role in analyzing long-run relationships, such as between money demand and income, by avoiding invalid regressions that arise when non-cointegrated I(1) variables are analyzed together, thus ensuring reliable estimates of equilibrium dynamics. The test underpins the vector error correction model framework for capturing both short-term adjustments and persistent equilibria.

Historical Development

The concept of cointegration emerged from early work on error correction models and vector autoregression (VAR) frameworks in the early 1980s. Clive Granger introduced the foundational idea of cointegrated variables and their connection to error-correcting mechanisms in his 1981 paper, emphasizing how non-stationary series could maintain long-run equilibrium relationships. This built upon Christopher Sims' 1980 advocacy for unrestricted VAR models as a flexible tool for analyzing multivariate time series without imposing rigid structural assumptions. In 1987, Robert Engle and Granger formalized a two-step procedure for testing and estimating cointegration in bivariate systems, involving a residual-based regression followed by unit root tests on the residuals, which demonstrated superconsistent estimation properties but was limited to single-equation settings. Søren Johansen extended these ideas to a full multivariate likelihood-based framework starting in 1988, deriving maximum likelihood estimators and likelihood ratio tests for cointegration vectors within Gaussian VAR processes integrated of order one. His 1991 paper in Econometrica provided a comprehensive procedure for estimating the cointegration rank and testing restrictions on cointegrating relations, enabling simultaneous inference across multiple variables and overcoming the limitations of the Engle-Granger approach by incorporating the full system likelihood. This likelihood ratio method, often referred to as the Johansen test, marked a shift toward simultaneous, system-wide analysis of cointegration. Following its introduction, the Johansen test gained widespread adoption in econometric research during the 1990s, becoming a cornerstone for analyzing long-run relationships in macroeconomic data. Refinements addressed finite-sample biases, with Johansen proposing a Bartlett correction factor for tests on the cointegrating relations in 2000 and a small sample correction of the rank test in 2002. As of 2025, the test remains a standard feature in major software packages, such as R's urca library and Python's statsmodels, while ongoing extensions incorporate structural breaks in deterministic trends to handle regime shifts, as detailed in Johansen, Mosconi, and Nielsen's 2000 framework.

Theoretical Background

Cointegration Concept

Cointegration describes a situation in which two or more time series, each of which is non-stationary, possess a stable long-run equilibrium relationship such that a linear combination of them is stationary. This concept is particularly relevant for integrated processes of order one, denoted I(1), where individual series require first differencing to become stationary, but the combination achieves stationarity without differencing. The intuition behind cointegration is that while the series may drift apart in the short run due to stochastic trends, they are tethered by an underlying equilibrium, with any deviations tending to correct themselves over time. For instance, aggregate consumption and disposable income in an economy often wander individually but maintain a proportional long-run relationship, reflecting economic theory. A time series is said to be integrated of order d, or I(d), if it must be differenced d times to induce stationarity; cointegration among I(1) series implies that the system's effective order of integration is reduced to zero for the equilibrium error. The Engle-Granger representation theorem establishes that if variables are cointegrated, they can be modeled in an error correction form, which links short-run changes to adjustments toward the long-run equilibrium. This differs fundamentally from spurious correlation, where regressing unrelated I(1) series produces high R-squared values and significant coefficients by chance, without implying any true relationship, as highlighted in early critiques of such regressions. The vector error correction model serves as the primary framework for operationalizing cointegration in multivariate settings.
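This intuition can be demonstrated with a short simulation. The following sketch, a minimal illustration assuming numpy and statsmodels are available, builds two I(1) series around a shared stochastic trend so that each series alone is non-stationary while the combination y - 0.5x is stationary; the coefficient 0.5 and the ADF test choices are illustrative assumptions:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
T = 500
trend = rng.standard_normal(T).cumsum()    # common stochastic trend, an I(1) random walk
x = trend + rng.standard_normal(T)         # both series load on the same trend ...
y = 0.5 * trend + rng.standard_normal(T)   # ... so y - 0.5*x eliminates it

for name, s in [("x", x), ("y", y), ("y - 0.5*x", y - 0.5 * x)]:
    # Large p-value: unit root not rejected (non-stationary); small p-value: stationary.
    print(name, "ADF p-value:", round(adfuller(s)[1], 3))
```

Running this, the individual series fail to reject the unit root null while the linear combination rejects it, which is exactly the cointegration pattern described above.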

Vector Error Correction Model

The Vector Error Correction Model (VECM) serves as the foundational framework for analyzing multivariate cointegrated time series, reformulating a vector autoregressive (VAR) process to explicitly account for both short-run fluctuations and long-run equilibrium constraints. In this model, integrated variables of order one (I(1)) are modeled such that deviations from their long-run relationships trigger corrective adjustments, ensuring the process remains stationary in differences while respecting the cointegration structure. This representation is essential for systems where individual series exhibit unit roots but linear combinations do not, allowing researchers to capture the mechanics of economic or financial equilibria that persist over time. The VECM arises as a reparameterization of a standard VAR model specified in levels for I(1) series, transforming the unrestricted differenced form into one that includes lagged error correction terms derived from the cointegrating relations. This equivalence ensures that the VECM imposes no additional restrictions beyond those implied by cointegration, preserving the full information content of the underlying VAR while highlighting the equilibrium-correcting dynamics. Key elements include the adjustment matrix, typically denoted α, which quantifies how each variable responds to disequilibria in the long-run relations, and the cointegrating matrix β, which defines the long-run relationships among the variables. The rank r, representing the dimension of the space spanned by β, determines the number of such equilibria. The VECM relies on several core assumptions to facilitate estimation and inference. Errors are assumed to be identically and independently distributed as Gaussian with zero mean and a constant covariance matrix, ensuring no serial correlation and enabling maximum likelihood estimation. In certain applications, weak exogeneity is imposed on subsets of variables, meaning their marginal processes do not depend on the cointegrating errors, which simplifies conditional modeling without loss of efficiency for parameters of interest. These assumptions underpin the model's suitability for likelihood-based procedures. This structure makes the VECM particularly amenable to testing for cointegration, as it supports direct inference on the rank of the cointegration space through maximum likelihood methods, forming the basis for procedures like the Johansen test. By embedding the long-run relations within a dynamic error-correcting framework, the VECM provides a unified approach to both estimation and hypothesis testing in non-stationary systems.

Mathematical Formulation

VECM Representation

The vector error correction model (VECM) provides the foundational representation for analyzing cointegrated time series within the Johansen test framework. For a p \times 1 vector of integrated of order one, I(1), variables Y_t, the VECM is derived from a VAR model of order k and takes the form \Delta Y_t = \sum_{i=1}^{k-1} \Gamma_i \Delta Y_{t-i} + \Pi Y_{t-k} + \varepsilon_t, where \Delta Y_t = Y_t - Y_{t-1} is the first difference, \Pi is a p \times p matrix capturing the long-run equilibrium relationships, \Gamma_i are p \times p matrices representing short-run dynamics for i = 1, \dots, k-1, and \varepsilon_t is a p \times 1 vector of white noise errors assumed to be i.i.d. Gaussian with mean zero and covariance matrix \Omega. Under the cointegration hypothesis, the matrix \Pi admits a reduced rank factorization \Pi = \alpha \beta', where \alpha is a p \times r matrix of adjustment speeds (indicating the rate at which the system corrects deviations from equilibrium) and \beta is a p \times r matrix whose columns are the r cointegrating vectors (defining the stationary linear combinations of the variables), with 0 < r < p. The reduced rank r of \Pi implies that there are r linearly independent long-run equilibrium relations among the p variables, ensuring that while individual series may be non-stationary, certain combinations \beta' Y_t are stationary. The \Gamma_i matrices capture the short-run dynamics through the lagged differences, distinct from the equilibrium adjustments governed by \alpha. The lag length k-1 in the VECM corresponds to the order k of the underlying VAR model in levels, which is typically selected using information criteria such as the Akaike information criterion (AIC) or the Bayesian information criterion (BIC) to balance model fit and parsimony while ensuring residuals are approximately white noise. For identification of the cointegrating relations, the matrix \beta is normalized such that one element in each column is set to unity (often the coefficient on a variable treated as numeraire), ensuring uniqueness up to rotations within the column space spanned by \beta. This normalization facilitates interpretation of the equilibrium relations while the rank r can be tested via the eigenvalues of \Pi, as detailed in subsequent sections on test statistics.
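In statsmodels, the VECM class estimates this representation directly once a rank is chosen. The following minimal sketch uses simulated data with one common trend (so the true rank is r = 2 for p = 3); the lag choice, deterministic specification, and simulated design are illustrative assumptions, not a canonical workflow:

```python
import numpy as np
from statsmodels.tsa.vector_ar.vecm import VECM

rng = np.random.default_rng(1)
T = 400
trend = rng.standard_normal(T).cumsum()
data = np.column_stack([
    trend + rng.standard_normal(T),
    0.5 * trend + rng.standard_normal(T),
    -trend + rng.standard_normal(T),
])  # p = 3 series driven by one common trend, so the true rank is r = 2

res = VECM(data, k_ar_diff=1, coint_rank=2, deterministic="n").fit()
print("alpha (p x r):", res.alpha.shape)  # adjustment speeds toward equilibrium
print("beta  (p x r):", res.beta.shape)   # cointegrating vectors
print("gamma        :", res.gamma.shape)  # short-run coefficients on lagged differences
```

The shapes of alpha and beta make the reduced rank factorization \Pi = \alpha \beta' concrete: both are p \times r, so their product is a p \times p matrix of rank r.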

Cointegration Relations and Rank

In the vector error correction model (VECM), the cointegration matrix \Pi captures the long-run equilibrium relationships among the variables, expressed as \Pi = \alpha \beta', where \alpha and \beta are p \times r matrices of full column rank, and r denotes the cointegration rank, representing the number of linearly independent cointegrating relations. The rank r of \Pi is reduced (with 0 < r < p) when the variables are integrated of order 1 but share stationary linear combinations, ensuring that the process is not explosive and allows for error correction dynamics. The columns of \beta form the cointegrating vectors, spanning the space of stationary linear combinations \beta' Y_t, where each such combination is integrated of order 0 despite the individual variables Y_t being I(1). These vectors define the long-run equilibria, while \alpha represents the adjustment speeds toward those equilibria. To characterize the non-stationary components, the orthogonal complements \alpha_\perp and \beta_\perp, which are p \times (p - r) matrices satisfying \alpha' \alpha_\perp = 0 and \beta' \beta_\perp = 0, span the directions of the I(1) common trends driving the system. The determination of r involves a sequential hypothesis testing procedure, starting with the null H_0: r \leq 0 against H_1: r > 0, and proceeding to test H_0: r \leq m against H_1: r > m (trace test) or H_0: r = m against H_1: r = m+1 (maximum eigenvalue test) for m = 1, \dots, p-1, until the null is not rejected; the estimated rank is the smallest m for which the null r \leq m is not rejected. This approach ensures identification of the dimension of the cointegrating space. Interpretation of r varies by boundary: if r = 0, no cointegration exists, implying the system follows a pure vector autoregression in first differences without long-run relations; if r = p, all variables are stationary, reducing the model to a VAR in levels.
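The sequential logic translates directly into code. A minimal sketch using statsmodels' coint_johansen follows; the helper function select_rank and the simulated data are illustrative assumptions, not part of the library:

```python
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

def select_rank(data, det_order=0, k_ar_diff=1):
    """Sequentially test H0: rank <= r with the trace statistic at the 95% level."""
    res = coint_johansen(data, det_order, k_ar_diff)
    for r in range(len(res.lr1)):
        if res.lr1[r] < res.cvt[r, 1]:  # column 1 holds the 95% critical values
            return r                    # first non-rejected null gives the rank
    return len(res.lr1)                 # every null rejected: system looks stationary

# Simulated p = 3 system with one common trend, so the true rank is 2.
rng = np.random.default_rng(0)
trend = rng.standard_normal(500).cumsum()
data = np.column_stack([trend, 0.5 * trend, -trend]) + rng.standard_normal((500, 3))
print("estimated cointegration rank:", select_rank(data))
```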

Test Statistics

Trace Test

The trace test, one of the two main likelihood ratio tests in the Johansen framework, examines the null hypothesis that the rank of the matrix \Pi in the vector error correction model (VECM) is at most r, denoted H_0: \operatorname{rank}(\Pi) \leq r, against the alternative that the rank exceeds r, H_1: \operatorname{rank}(\Pi) > r. The test statistic is given by \lambda_{\text{trace}}(r) = -T \sum_{i=r+1}^{p} \log(1 - \hat{\lambda}_i), where T is the sample size, p is the number of variables in the system, and \hat{\lambda}_i (with i = 1, \dots, p and \hat{\lambda}_1 \geq \cdots \geq \hat{\lambda}_p) are the estimated eigenvalues obtained from the reduced rank regression of the VECM. This statistic arises from comparing the likelihood of the unrestricted VECM to the restricted model under the null, capturing the joint contribution of the smaller eigenvalues to deviations from the null. To determine the cointegration rank, the trace test is applied sequentially starting from r = 0: the null H_0: \operatorname{rank}(\Pi) \leq 0 (no cointegration) is tested against H_1: \operatorname{rank}(\Pi) > 0; if rejected, the process advances to r = 1 (H_0: \operatorname{rank}(\Pi) \leq 1 vs. H_1: \operatorname{rank}(\Pi) > 1), continuing until the null is not rejected at a chosen significance level; the smallest r at which the null is not rejected is the estimated cointegration rank. This procedure leverages the nested nature of the hypotheses, providing a systematic way to identify the dimension of the cointegrating space without testing all possible ranks simultaneously. A key advantage of the trace test is its ability to assess the presence of multiple cointegrating vectors in a single joint test against alternatives where the true rank may substantially exceed the hypothesized value, making it suitable for systems with potentially rich long-run relationships. Regarding power properties, the trace test generally demonstrates superior performance compared to the maximum eigenvalue test when the true cointegration rank is distant from the null (e.g., higher ranks or stronger deviations), as it aggregates information across multiple eigenvalues, though it may exhibit size distortions in finite samples. For instance, in an applied analysis of p=3 economic time series, if the trace statistic rejects the null for r=0 (indicating at least one cointegrating relation) but fails to reject for r=1, the procedure concludes that the cointegration rank is exactly 1, implying a single long-run equilibrium among the variables. The trace test is often used complementarily with the maximum eigenvalue test to cross-validate rank estimates.
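The statistic is a simple function of the estimated eigenvalues. The sketch below uses hypothetical eigenvalues and a hypothetical sample size purely to illustrate the formula:

```python
import numpy as np

# Hypothetical eigenvalues from a p = 3 system (descending) and effective sample size.
lam = np.array([0.25, 0.08, 0.02])
T = 200

for r in range(len(lam)):
    trace = -T * np.log(1.0 - lam[r:]).sum()  # sums the p - r smallest eigenvalues
    print(f"H0: rank <= {r}: trace statistic = {trace:.2f}")
# Each value would be compared against the tabulated critical value for the
# relevant dimension p - r and deterministic case.
```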

Maximum Eigenvalue Test

The maximum eigenvalue test, also known as the maximal eigenvalue test, is one of the two primary likelihood ratio statistics proposed by Johansen for determining the cointegration rank in vector autoregressive models. It specifically tests the null hypothesis that the rank of the cointegration matrix Π is equal to r (i.e., there are exactly r cointegrating relations) against the alternative that the rank is r+1, by examining the largest remaining eigenvalue after accounting for the first r relations. The test statistic is given by \lambda_{\max}(r) = -T \log(1 - \hat{\lambda}_{r+1}), where T is the sample size and \hat{\lambda}_{r+1} is the (r+1)-th largest estimated eigenvalue of the matrix associated with the long-run equilibrium parameters. The procedure involves sequential testing, starting with r=0 and increasing r until the null hypothesis is not rejected, similar to the trace test but with a narrower focus on each incremental change in rank. This approach provides greater precision in identifying the exact point of transition from non-cointegration to cointegration, making it particularly useful for pinpointing the precise number of stable relations in the system. Compared to the trace test, which jointly assesses all eigenvalues beyond the hypothesized rank, the maximum eigenvalue test is sharper for determining the exact cointegration rank, as it isolates the contribution of the next largest eigenvalue. It exhibits advantages in terms of better size control and power, especially for alternatives close to the null and in finite samples, where the trace test may suffer from greater distortions. In practice, rejection of the null at a given r indicates the presence of at least r+1 cointegrating relations, prompting continuation of the sequential process; non-rejection at that r establishes it as the estimated cointegration rank, beyond which no further relations exist. This test's emphasis on individual eigenvalue significance enhances its utility for applications requiring accurate rank estimation in economic analysis.
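Using the same hypothetical eigenvalues as in the trace test sketch, the maximum eigenvalue statistic isolates a single eigenvalue per hypothesis:

```python
import numpy as np

# Same hypothetical eigenvalues (descending) and sample size as in the trace sketch.
lam = np.array([0.25, 0.08, 0.02])
T = 200

for r in range(len(lam)):
    lam_max = -T * np.log(1.0 - lam[r])  # uses only the (r+1)-th largest eigenvalue
    print(f"H0: rank = {r} vs H1: rank = {r + 1}: lambda_max = {lam_max:.2f}")
# Each value would be compared against the corresponding max-eigenvalue critical value.
```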

Asymptotic Properties

Null Hypothesis Distribution

Under the null hypothesis of at most r cointegrating relations in a vector error correction model (VECM), the Johansen test statistics do not follow a standard chi-squared distribution due to the non-stationarity induced by unit roots in the underlying vector autoregressive process. Instead, they exhibit non-standard asymptotic distributions that are functionals of multidimensional stochastic processes, specifically involving Brownian motions, which arise from the integration and cointegration properties of the system. Johansen (1991) establishes that, under the null hypothesis, both the trace test and maximum eigenvalue test statistics converge in distribution to expressions comprising sums of functions derived from Brownian motions, potentially with drift terms depending on the model specification. These limiting distributions account for the spurious regression phenomena inherent in non-stationary systems, ensuring that the tests maintain proper size asymptotically without relying on nuisance parameter adjustments beyond the deterministic components. The form of these asymptotic distributions is influenced by the treatment of deterministic trends in the VECM, leading to distinct cases: Case I assumes no constant term; Case II incorporates a restricted constant within the cointegrating space; and Case III allows an unrestricted constant in the model. Each case modifies the Brownian motion processes, such as introducing location shifts or drifts, resulting in different limiting distributions for the test statistics. Critical values for these distributions are obtained through simulations tailored to specific model dimensions, such as the number of variables p (often denoted as k), and the chosen case, providing tabulated quantiles that approximate the non-standard limits for practical inference. In the special case where the hypothesized cointegration rank r equals the system dimension p (i.e., the case with no unit roots), the test statistics become asymptotically chi-squared distributed, aligning with standard likelihood ratio testing principles under full-rank, stationary conditions.
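Because the limiting distribution is non-standard, critical values are typically obtained by simulation. The minimal Monte Carlo sketch below generates independent random walks, so the null of no cointegration holds by construction; the replication count, sample length, and deterministic case are arbitrary illustrative choices. The empirical 95% quantile of the trace statistic should land close to the tabulated value reported by statsmodels:

```python
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

rng = np.random.default_rng(42)
reps, T, p = 1000, 250, 2
stats = np.empty(reps)
for j in range(reps):
    # Independent I(1) series: no cointegration, so H0: rank <= 0 is true.
    walks = rng.standard_normal((T, p)).cumsum(axis=0)
    res = coint_johansen(walks, det_order=0, k_ar_diff=1)
    stats[j] = res.lr1[0]  # trace statistic for H0: rank <= 0

print("empirical 95% quantile:     ", round(np.quantile(stats, 0.95), 2))
print("tabulated 95% critical value:", res.cvt[0, 1])  # should be close
```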

Critical Values and Corrections

The Johansen test accommodates different specifications for deterministic terms in the underlying vector error correction model (VECM), which influence the distribution of the test statistics and thus the appropriate critical values. These specifications, often denoted as cases I through V, account for the presence and location of intercepts and trends. Case I assumes no deterministic terms (no intercept or trend) in either the cointegrating equations or the model. Case II includes a restricted intercept in the cointegrating equations but no trend. Case III features an unrestricted intercept in the model but none in the cointegrating equations, with no trend. Cases IV and V incorporate trends: Case IV adds a linear trend restricted to the cointegrating equations alongside an unrestricted intercept, while Case V allows both an unrestricted intercept and an unrestricted linear trend in the model. Critical values for the trace and maximum eigenvalue test statistics under these cases are derived from the asymptotic distributions involving Brownian motion integrals and are tabulated in seminal works. Johansen and Juselius (1990) provide initial tables for the trace and maximum eigenvalue statistics across various cases, based on simulations with 6,000 replications, covering dimensions up to four variables and significance levels of 90%, 95%, and 99%. Osterwald-Lenum (1992) extends these by offering more precise quantiles of the asymptotic distributions, computed from more extensive simulations and covering higher dimensions, which are widely implemented in econometric software for direct access during testing. These tables ensure reliable inference by accounting for the non-standard distributions under the null of no cointegration or reduced rank. In finite samples, the asymptotic critical values often lead to oversized tests, particularly for the trace statistic, prompting the development of small-sample corrections. A simple degrees-of-freedom adjustment, associated with Reinsel and Ahn, scales the trace statistic by the factor \frac{T - dk}{T}, where T is the sample size, d is the number of variables, and k is the lag length; this adjustment reduces size distortion. Johansen's Bartlett-type correction, derived from higher-order asymptotic expansions, provides a more refined finite-sample adjustment, improving the approximation to the asymptotic distribution under the null and enhancing performance in small samples of around 50-100 observations. These corrections are particularly effective when the true cointegrating rank is low and are routinely applied in empirical analyses to mitigate bias. As alternatives to analytical corrections, bootstrap methods offer improved finite-sample accuracy by resampling the empirical distribution of the data. Parametric bootstraps simulate VECM residuals under the null to generate empirical critical values, while non-parametric versions resample the actual residuals to preserve dependence structure; both approaches reduce size distortions more effectively than asymptotic values in samples under 200 observations, especially in cases with near-unit roots or structural breaks. Studies show bootstrapped p-values yield rejection rates closer to nominal levels across various cases, making them suitable for robust inference when model misspecification is suspected. The choice of case significantly affects test power and inference validity, as misspecification of deterministic terms can inflate or deflate rejection probabilities. Guidelines recommend starting with Case III for economic series lacking clear trends, escalating to Case IV or V if unit root tests on residuals indicate non-stationarity after de-trending; visual inspection of data plots and auxiliary tests for trends in the levels versus differences help select the parsimonious specification that aligns with economic theory and avoids over-parameterization.
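A minimal sketch of the degrees-of-freedom scaling described above; the simulated data, the deterministic case, and the use of the statsmodels trace statistics are illustrative assumptions rather than a prescribed workflow:

```python
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

rng = np.random.default_rng(7)
T, d, k = 80, 3, 2  # small sample, d = 3 variables, VAR lag order k = 2 in levels
trend = rng.standard_normal(T).cumsum()
data = np.column_stack([trend, 0.4 * trend, -0.7 * trend]) + rng.standard_normal((T, d))

res = coint_johansen(data, det_order=0, k_ar_diff=k - 1)
factor = (T - d * k) / T                   # degrees-of-freedom scaling factor
print("asymptotic trace statistics:", res.lr1.round(2))
print("corrected trace statistics: ", (factor * res.lr1).round(2))
# The corrected values are compared against the same tabulated critical values (res.cvt).
```

Because the factor is below one, the correction shrinks the statistics, making rejection harder and counteracting the oversizing of the asymptotic test in small samples.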

Implementation

Estimation Procedure

The estimation procedure for the Johansen test begins with pre-testing the time series to confirm they are integrated of order one, I(1), which is a prerequisite for cointegration analysis. This involves applying unit root tests, such as the Augmented Dickey-Fuller (ADF) test, to the levels of the series to check for non-stationarity and to the first differences to verify stationarity. If the series are not I(1), the Johansen framework does not apply, as cointegration requires non-stationary but linearly combinable series. Next, specify the lag order k for the underlying vector autoregressive (VAR) model in levels by estimating unrestricted VAR models of varying lags and selecting the order that minimizes information criteria such as the Akaike Information Criterion (AIC), Schwarz Information Criterion (SIC), or Hannan-Quinn Information Criterion (HQIC). This step ensures the model captures the short-run dynamics adequately without overfitting. With the lag order determined, reparameterize the VAR into its vector error correction model (VECM) form and estimate the parameters using maximum likelihood estimation (MLE) under the reduced rank restriction on the long-run matrix \Pi = \alpha \beta'. This involves a reduced rank regression approach: first, obtain residuals from regressing the differenced variables \Delta x_t and the lagged levels x_{t-1} on the lagged differences to concentrate the likelihood, then solve for the cointegrating parameters via the eigenvalues of the resulting cross-product matrices. Ordinary least squares (OLS) can be applied to the differenced equations for initial unrestricted estimation, but MLE is essential for imposing the rank restriction and joint estimation of the adjustment speeds \alpha and cointegrating vectors \beta. The eigenvalues \hat{\lambda}_i are then computed from the eigenvalue decomposition of the matrix derived from these cross-products, ordered in descending magnitude, to quantify the strength of the cointegrating relations. The number of significant eigenvalues informs the rank r. To determine r, apply sequential testing using either the trace test or the maximum eigenvalue test, starting from the null of rank zero and proceeding upward until the test fails to reject, under the appropriate deterministic case (e.g., no intercept, intercept in cointegrating relations, or linear trend). This yields the estimated rank r. Post-estimation, normalize the cointegrating vectors \beta by setting one coefficient to unity (typically on a variable of interest) to identify the relations economically, and if theoretical restrictions are imposed on \beta, test them using likelihood ratio statistics to assess overidentifying constraints.
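The full sequence can be sketched end to end in Python. The data here are simulated, and choices such as maxlags=8, the AIC criterion, and det_order=0 are illustrative rather than prescriptive:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.api import VAR
from statsmodels.tsa.vector_ar.vecm import coint_johansen

# Simulated I(1) system sharing one common trend (true cointegration rank = 1 for p = 2).
rng = np.random.default_rng(3)
trend = rng.standard_normal(400).cumsum()
data = pd.DataFrame({
    "x": trend + rng.standard_normal(400),
    "y": 0.5 * trend + rng.standard_normal(400),
})

# Step 1: pre-test integration order (levels non-stationary, differences stationary).
for name, s in data.items():
    print(name, "level ADF p:", round(adfuller(s)[1], 3),
          "| diff ADF p:", round(adfuller(s.diff().dropna())[1], 3))

# Step 2: select the VAR lag order k in levels by information criteria.
k = VAR(data).select_order(maxlags=8).selected_orders["aic"]

# Step 3: run the Johansen procedure on the VECM with k - 1 lagged differences.
res = coint_johansen(data, det_order=0, k_ar_diff=max(k - 1, 1))
print("trace statistics:    ", res.lr1.round(2))
print("95% critical values: ", res.cvt[:, 1])
```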

Software Tools

The Johansen test is implemented in various econometric software packages, enabling researchers to perform cointegration analysis on multivariate time series data. These tools typically support both trace and maximum eigenvalue statistics, with options for specifying lag lengths, deterministic trends, and model cases (e.g., no intercept, intercept in cointegrating equations). In R, the urca package provides comprehensive functionality for the Johansen procedure through the ca.jo() function, which conducts the test on a VAR model and reports trace or maximum eigenvalue statistics along with eigenvectors and loading factors; it supports all standard cases including unrestricted intercepts and trends. The package also includes cajorls() for estimating the VECM based on the test results, offering restricted least-squares estimation and diagnostic outputs. As of version 1.3-4 (released May 2024), urca remains a standard open-source option for reproducible analysis in R. Python's statsmodels implements the Johansen test via the coint_johansen() function in the statsmodels.tsa.vector_ar.vecm module, which returns a JohansenTestResult object containing test statistics and critical values for rank determination. This integrates seamlessly with pandas for data input and manipulation, facilitating preprocessing steps like differencing or lag selection. Version 0.14.4 (released July 2025) includes enhancements to the time series modules, improving performance for large datasets. MATLAB's Econometrics Toolbox features the jcitest() function for the Johansen cointegration test, which tests multiple ranks sequentially and outputs decisions, statistics, and maximum likelihood estimates for VECM parameters. It allows customization of lags (e.g., via the Lags parameter) and deterministic components (e.g., via model specifications such as H1*), supporting table or timetable inputs for flexible data handling. Commercial packages such as EViews and Stata offer built-in commands tailored for econometric workflows. In EViews 14, the Johansen test is accessed through the cointegration estimation menu, with enhancements allowing finer control over deterministic trends and exogenous variables in long-run relations, such as restricting them to cointegrating vectors only. Stata's vecrank command performs the Johansen rank test using trace and maximum eigenvalue statistics, with options for lag specification, trend inclusion, and information criteria for model selection; it integrates with vec for subsequent VECM estimation. OxMetrics (via its PcGive module) supports the full Johansen procedure for I(1) and I(2) testing in VAR models, including instability and normality diagnostics. Gretl, an open-source alternative, provides the johansen_test() function for rank determination, available in both GUI and script modes, with support for small-sample corrections via add-on packages like johansensmall. These tools emphasize reproducibility, with gretl's scripting enabling automated simulations. Recent updates across these packages as of November 2025, such as those in statsmodels 0.14.4 (July 2025) and EViews 14 (2025), focus on expanded model specifications and computational efficiency, though no widespread adoption of dedicated acceleration for Johansen simulations has been documented as of late 2025. Users should consult package changelogs for compatibility with large-scale applications.
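For the statsmodels route mentioned above, the JohansenTestResult object exposes the key quantities as attributes. A minimal sketch with simulated data follows; the attribute names are as documented by statsmodels, while the data are an illustrative assumption:

```python
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

rng = np.random.default_rng(5)
data = rng.standard_normal((300, 2)).cumsum(axis=0)  # two independent random walks

res = coint_johansen(data, det_order=0, k_ar_diff=1)
print(res.eig)   # estimated eigenvalues, in descending order
print(res.lr1)   # trace statistics for r = 0, ..., p-1
print(res.lr2)   # maximum eigenvalue statistics
print(res.cvt)   # trace critical values (90%, 95%, 99% columns)
print(res.cvm)   # max-eigenvalue critical values
print(res.evec)  # unnormalized cointegrating vectors (eigenvectors)
```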

Applications

Economic and Financial Uses

In economics, the Johansen test is widely applied to examine long-run equilibrium relationships, such as testing for purchasing power parity (PPP) across currencies, where it assesses whether deviations from parity between exchange rates and price levels are stationary. For instance, seminal analyses have used the test to evaluate PPP between the Australian and U.S. dollars, revealing I(2) cointegration processes that support long-run parity under specific trend assumptions. Similarly, the test is employed to investigate long-run money demand functions, confirming stable relationships between money supply, income, and interest rates even amid high inflation episodes in countries like Greece and Japan. In finance, the Johansen test facilitates pairs trading strategies by identifying cointegrated stock pairs, enabling traders to exploit temporary divergences from their long-run equilibrium for mean-reversion profits. This approach is particularly valuable in constructing market-neutral portfolios, where cointegrated assets reduce directional exposure, as demonstrated in empirical studies on trading pairs using multivariate cointegrating vectors. A higher rank r from the test signals stronger interdependencies among assets, informing the selection of robust pairs for such strategies. In exchange rate analysis, the test models long-run relationships among currencies, often revealing equilibria influenced by macroeconomic fundamentals like GDP and interest rates. It is also used to analyze term structure relationships, where yields across maturities exhibit equilibrium ties, aiding in forecasting yield dynamics. In public finance, the Johansen test evaluates fiscal sustainability by testing cointegration between government debt (or expenditures) and GDP (or revenues), with evidence from OECD countries supporting sustainability under cointegrated budget constraints. Empirically, the test enhances vector autoregression (VAR) forecasting by incorporating cointegration, improving accuracy in economic projections through error-correction mechanisms derived from estimated cointegrating vectors.

Practical Example

To illustrate the application of the Johansen test, consider quarterly U.S. macroeconomic data on real gross domestic product (GDP), real personal consumption expenditures, and real gross private domestic investment, sourced from the Federal Reserve Economic Data (FRED) database. These variables are typically expressed in logarithmic form to capture growth rates and potential long-run equilibria implied by economic theory, such as balanced growth paths where consumption and investment adjust to GDP fluctuations. The analysis begins with lag selection for the underlying vector autoregressive (VAR) model, often using information criteria like the Akaike Information Criterion (AIC). The test is conducted under a specification allowing an unrestricted intercept in the cointegrating relation to account for drifts in the levels, as is common for macroeconomic series. The Johansen procedure then estimates the rank of the long-run matrix Π via maximum likelihood. In such analyses, the trace and maximum eigenvalue tests are used to determine the cointegration rank r. A finding of r=1 would indicate a single long-run equilibrium among the variables, where deviations from it are mean-reverting. The estimated cointegrating vector might imply an equilibrium relation consistent with growth models where consumption and investment shares stabilize relative to GDP over time. Adjustment coefficients reveal the speeds of reversion for each variable, with some components typically correcting faster than others. Visualizations aid interpretation: time-series plots of the raw log variables show non-stationary trends with common movements, suggesting potential cointegration. The derived cointegrating residual appears stationary, as confirmed by unit root tests like the Augmented Dickey-Fuller test. Adjustment dynamics can be depicted via impulse response functions from the vector error correction model (VECM), showing shocks to the residual decaying over several quarters.
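A runnable sketch of this example follows, using the quarterly macro series bundled with statsmodels as a stand-in for a FRED download; the lag choice (k_ar_diff=3), the deterministic specification, and the assumed rank of 1 are illustrative:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.vector_ar.vecm import VECM, coint_johansen

# Quarterly U.S. real GDP, consumption, and investment (statsmodels' bundled dataset).
macro = sm.datasets.macrodata.load_pandas().data
data = np.log(macro[["realgdp", "realcons", "realinv"]])

# Rank determination with an intercept and three lagged differences (illustrative).
res = coint_johansen(data, det_order=0, k_ar_diff=3)
print("trace statistics:   ", res.lr1.round(2))
print("95% critical values:", res.cvt[:, 1])

# Fit the VECM under an assumed rank of 1 and inspect the long-run relation.
vecm = VECM(data, k_ar_diff=3, coint_rank=1, deterministic="co").fit()
print("cointegrating vector beta:", vecm.beta.ravel().round(3))
print("adjustment speeds alpha:  ", vecm.alpha.ravel().round(3))
```

The signs and magnitudes of alpha indicate which variables do the error correcting, matching the interpretation of adjustment speeds discussed above.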

References

1. Johansen, S. (1991). "Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models." Econometrica 59(6): 1551-1580.
2. Johansen, S. (1988). "Statistical Analysis of Cointegration Vectors." Journal of Economic Dynamics and Control 12(2-3): 231-254.
3. "Cointegration" (PDF lecture notes describing Johansen's sequential testing procedure).
4. Granger, C. W. J. (1981). "Some Properties of Time Series Data and Their Use in Econometric Model Specification." Journal of Econometrics 16(1): 121-130.
5. Sims, C. A. (1980). "Macroeconomics and Reality." Econometrica 48(1): 1-48.
6. Engle, R. F., and Granger, C. W. J. (1987). "Co-Integration and Error Correction: Representation, Estimation, and Testing." Econometrica 55(2): 251-276.
7. Johansen, S. "Cointegration: Overview and Development."
8. Johansen, S. (2002). "A Small Sample Correction of the Test for Cointegrating Rank in the Vector Autoregressive Model." Econometrica 70(5): 1929-1961.
9. Johansen, S., Mosconi, R., and Nielsen, B. (2000). "Cointegration Analysis in the Presence of Structural Breaks in the Deterministic Trend." Econometrics Journal 3(2): 216-249.
10. Lütkepohl, H. (2005). New Introduction to Multiple Time Series Analysis. Springer.
11. Johansen, S. (1995). Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford University Press.
12. "Maximum Eigenvalue versus Trace Tests for the Cointegrating Rank of a VAR Process" (PDF).
13. Johansen, S., and Juselius, K. (1990). "Maximum Likelihood Estimation and Inference on Cointegration, with Applications to the Demand for Money." Oxford Bulletin of Economics and Statistics 52(2): 169-210.
14. "A Small Sample Correction for the Test of Cointegrating Rank in the Vector Autoregressive Model" (working paper, PDF).
15. "jcitest: Johansen cointegration test." MATLAB Econometrics Toolbox documentation, MathWorks.
16. "urca: Unit Root and Cointegration Tests for Time Series Data." R package, CRAN.
17. "urca: Unit Root and Cointegration Tests for Time Series Data." R package reference manual (PDF), CRAN.
18. "statsmodels 0.14.4: Johansen cointegration test." statsmodels documentation.
19. "EViews 13 New Econometrics and Statistics: Testing and Diagnostics." EViews documentation.
20. "vecrank: Estimate the cointegrating rank of a VEC model." Stata reference manual (PDF).
21. Doornik, J. A. PcGive 16, Volume II (OxMetrics 9) (PDF).
22. "Johansen cointegration test." gretl documentation.
23. Gretl User's Guide (PDF).
24. "An I(2) Cointegration Analysis of the Purchasing Power Parity between Australia and the United States" (PDF).
25. "Cointegration Tests of Purchasing Power Parity." ScienceDirect.
26. "High Inflation Rates and the Long-Run Money Demand Function."
27. "Testing for Long Run Money Demand Functions in Greece Using …" (applies the cointegration methodology of Johansen (1988)).
28. "Pairs Trading: A Cointegration Approach."
29. "Evaluation of Dynamic Cointegration-Based Pairs Trading Strategy …" (September 22, 2021; cryptocurrency markets).
30. "Johansen Cointegration Test: Learn How to Implement It in Python" (December 11, 2023).
31. "Cointegration between Macroeconomic Factors and the Exchange Rate" (January 31, 2019).
32. "A Cointegration Test of the Impact of Foreign Exchange Rates on U.S. Stock Market Prices."
33. "Panel Cointegration and Structural Breaks in OECD Countries" (PDF).
34. "Fiscal Sustainability of the German Laender: Time Series Evidence" (PDF).
35. "Testing for Cointegration Using the Johansen Methodology when Variables are Near-Integrated" (PDF).
36. "Cointegration Tests and the Classical Dichotomy" (PDF).