
Autoregressive integrated moving average

The autoregressive integrated moving average (ARIMA) model is a statistical technique for understanding and predicting time series data, particularly non-stationary series that display trends or other patterns requiring transformation to stationarity. It combines three core components, autoregression (AR), integration (I), and moving average (MA), to capture linear dependencies in data over time, enabling forecasts based on historical patterns. The model is denoted as ARIMA(p, d, q), where p represents the number of lagged observations included in the autoregressive term, d indicates the number of differencing steps applied to remove trends and achieve stationarity, and q specifies the order of the moving average term, which models the influence of past forecast errors. Developed and popularized by statisticians George E. P. Box and Gwilym M. Jenkins, ARIMA emerged as a cornerstone of time series analysis through their seminal 1970 work, which emphasized iterative model identification, estimation, and diagnostic checking for practical forecasting applications. ARIMA models assume the underlying time series can be represented as a linear function of its own past values, differenced values, and past errors, making them suitable for univariate forecasting in fields such as economics, finance, and engineering. The autoregressive part (AR(p)) posits that the current value depends linearly on previous values plus an error term, expressed as y_t = c + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \epsilon_t, where \phi_i are parameters and \epsilon_t is white noise. The moving average component (MA(q)) incorporates lagged forecast errors to account for short-term shocks, given by y_t = c + \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}. Differencing (the "I" in ARIMA) transforms non-stationary data by subtracting consecutive observations d times, often once for series with constant trends, to stabilize the mean and variance before fitting an ARMA model to the resulting series. While effective for short-term predictions, ARIMA's performance relies on proper parameter selection via methods like autocorrelation function (ACF) and partial autocorrelation function (PACF) plots, and it may require extensions like seasonal ARIMA (SARIMA) for periodic data.

Model Fundamentals

Core Components

The autoregressive integrated moving average (ARIMA) model derives its name from the combination of three core statistical components: autoregressive (AR), integrated (I), and moving average (MA), forming a framework designed for univariate time series forecasting. The acronym encapsulates a methodology that extends traditional ARMA models to handle trends and non-stationarity inherent in many real-world datasets, such as economic indicators or inventory levels. The autoregressive (AR) component captures the linear dependency of the current value in a time series on its own lagged values, essentially modeling how past observations influence the present one within a fixed number of lags. For instance, in an AR process, the value at time t is expressed as a weighted sum of previous values plus random noise, allowing the model to account for persistence in the data. The integrated (I) component addresses non-stationarity by applying successive differencing operations to the original time series, transforming it into a stationary series where statistical properties like mean and variance remain constant over time. This differencing removes trends or drifts, making the data suitable for subsequent AR and MA modeling, with the order of integration indicating the number of differencing steps required. The moving average (MA) component models the impact of past forecast errors or shocks on the current observation, representing the series as a linear combination of current and previous error terms from the model's predictions. It helps capture short-term fluctuations and smooth out irregularities that the AR component might not fully explain. Collectively, ARIMA integrates these elements to provide a flexible approach for modeling non-stationary time series: the I part achieves stationarity through differencing, after which the AR and MA components jointly describe the underlying dynamics of the transformed series for accurate forecasting. This combination enables ARIMA to handle a wide range of temporal patterns without assuming external regressors.
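The interplay of the three components can be seen in a minimal simulation sketch. The following Python snippet, with illustrative coefficient values rather than estimates from real data, generates the stationary ARMA part of an ARIMA(1,1,1) process and then integrates it by cumulative summation; differencing the result recovers the ARMA series.

```python
import numpy as np

# Minimal sketch: simulate an ARIMA(1,1,1) process to illustrate the AR, I, and MA parts.
# phi (AR coefficient) and theta (MA coefficient) are illustrative values.
rng = np.random.default_rng(42)
n, phi, theta = 200, 0.6, 0.4

eps = rng.normal(0, 1, n)          # white-noise shocks
z = np.zeros(n)                    # stationary ARMA(1,1) part (the differenced series)
for t in range(1, n):
    z[t] = phi * z[t - 1] + eps[t] + theta * eps[t - 1]   # AR and MA components

y = np.cumsum(z)                   # "integration": cumulative summation undoes one difference

# First-differencing the integrated series recovers the stationary ARMA part.
assert np.allclose(np.diff(y), z[1:])
```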

Historical Background

The foundations of autoregressive integrated moving average (ARIMA) models trace back to early 20th-century developments in time series analysis. Autoregressive (AR) models were first introduced by George Udny Yule in 1927, who applied them to analyze periodicities in sunspot data, demonstrating how current values could depend linearly on past values to capture apparent cycles in seemingly random series. This work laid the groundwork for modeling serial correlation. In 1931, Gilbert Thomas Walker extended Yule's ideas by exploring correlations across related time series, further refining the theoretical framework for AR processes and introducing methods to estimate parameters via what became known as the Yule-Walker equations. Moving average (MA) models emerged shortly thereafter, with Herman Wold's 1938 study of stationary time series providing a rigorous decomposition theorem that represented any stationary process as an infinite moving average of uncorrelated innovations. Wold's contribution established the moving average representation as a fundamental building block for capturing the effects of past shocks on current observations. The AR and MA components were combined into autoregressive moving average (ARMA) models for stationary series, first described by Peter Whittle in 1951. The formalization of ARIMA occurred in 1970 with George E. P. Box and Gwilym M. Jenkins' seminal book, Time Series Analysis: Forecasting and Control, which integrated autoregression, moving averages, and differencing (the "I" in ARIMA) into a unified methodology for identifying, estimating, and forecasting with time series models. This Box-Jenkins approach revolutionized practical time series analysis by emphasizing iterative model building and diagnostic checks. Following its publication, the Box-Jenkins methodology gained widespread adoption in the 1970s and 1980s, as empirical studies demonstrated ARIMA's superior forecasting accuracy over large-scale econometric models for short-term univariate predictions. By the 1980s, ARIMA had become a standard tool in econometric and statistical software, such as the Time Series Processor (TSP) and the SAS/ETS procedures introduced around that period, enabling broader application in economics, finance, and industry.

Mathematical Foundations

General Formulation

The autoregressive integrated moving average (ARIMA) model of order (p, d, q) provides a framework for modeling time series data that exhibit non-stationarity, which is addressed through differencing. The general formulation is given by the equation \phi(B) (1 - B)^d y_t = \theta(B) \epsilon_t, where B denotes the backshift operator such that B y_t = y_{t-1}, \phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p is the autoregressive operator polynomial of degree p, \theta(B) = 1 + \theta_1 B + \cdots + \theta_q B^q is the moving average operator polynomial of degree q, and \{\epsilon_t\} is a white noise process with mean zero and constant variance \sigma^2. This equation originates from the foundational work on time series modeling, where the integration component (1 - B)^d transforms a non-stationary series into a stationary one suitable for ARMA analysis. In this notation, the parameter p specifies the number of lagged observations included in the autoregressive component, d indicates the degree of non-seasonal differencing required to achieve stationarity, and q represents the number of lagged forecast errors incorporated in the moving average component. The model assumes linearity in the relationships, that the d-th differenced series \nabla^d y_t = (1 - B)^d y_t is stationary (with constant mean and an autocovariance structure depending only on lag), and that the moving average polynomial \theta(B) is invertible, meaning its roots lie outside the unit circle in the complex plane, to ensure the model can be expressed in an infinite AR form. Additionally, the autoregressive polynomial \phi(B) must have roots outside the unit circle to guarantee stationarity of the differenced process. The formulation extends the autoregressive moving average (ARMA) model by incorporating the integration step: an ARMA(p, q) model \phi(B) z_t = \theta(B) \epsilon_t is applied to the differenced series z_t = (1 - B)^d y_t, yielding the full ARIMA structure for handling trends and non-stationarity without separate deterministic components. For series with periodic patterns, a seasonal ARIMA (SARIMA) variant is denoted as ARIMA(p, d, q)(P, D, Q)_s, where lowercase parameters apply to the non-seasonal components, uppercase (P, D, Q) to the seasonal autoregressive, differencing, and moving average orders, and s is the length of the seasonal cycle (e.g., 12 for monthly data).
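As a minimal sketch of this equivalence, assuming a simulated integrated series and the statsmodels library, fitting ARIMA(1, 1, 1) to the levels yields essentially the same AR and MA coefficients as fitting an ARMA(1, 1) to the first-differenced series z_t = (1 - B) y_t.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=300))          # toy non-stationary (integrated) series

direct = ARIMA(y, order=(1, 1, 1)).fit()                        # ARIMA applied to the levels
on_diff = ARIMA(np.diff(y), order=(1, 0, 1), trend="n").fit()   # ARMA on the differenced series

print(direct.arparams, direct.maparams)      # phi and theta from the integrated model
print(on_diff.arparams, on_diff.maparams)    # comparable estimates from the differenced series
```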

Stationarity and Integration

In time series analysis, weak stationarity refers to a stochastic process where the mean, variance, and autocovariance are constant over time, meaning E(y_t) = \mu for all t, \text{Var}(y_t) = \sigma^2 for all t, and \text{Cov}(y_t, y_{t+k}) = \gamma_k independent of t for any lag k. This condition ensures that the statistical properties of the series do not change systematically, allowing for reliable modeling of dependencies. To assess stationarity, statistical tests such as the augmented Dickey-Fuller (ADF) test are commonly employed. The ADF test extends the original Dickey-Fuller test by including lagged difference terms to account for higher-order autoregressive processes, testing the null hypothesis of a unit root (non-stationarity) against the alternative of stationarity. The test statistic is compared to critical values; if it is more negative than the critical value (e.g., at 5% significance), the null is rejected, indicating the series is stationary. Conversely, failure to reject suggests non-stationarity, often requiring transformation. The integrated component in ARIMA models addresses non-stationarity by applying differencing to make the series stationary. A series is integrated of order d, denoted I(d), if it becomes stationary after d differences; the parameter d specifies the number of differencing operations needed. First-order differencing, the most common case, is mathematically represented as \Delta y_t = y_t - y_{t-1}, which removes a linear trend by subtracting consecutive observations. Higher-order differencing applies this recursively, such as second-order \Delta^2 y_t = \Delta(\Delta y_t) = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}), until stationarity is achieved. Over-differencing, where d exceeds the true order, introduces unnecessary noise and a non-invertible moving average component of order 1, leading to inefficient parameter estimates and inflated forecast variances, though the model remains identifiable. Under-differencing, applying insufficient differences, leaves residual non-stationarity, resulting in invalid inferences, spurious autocorrelations, and biased forecasts that fail to capture the true dynamics. Proper selection of d is thus essential for model validity and predictive accuracy.
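The following minimal sketch, using statsmodels' adfuller on a simulated random walk (an I(1) series by construction), illustrates the typical workflow: test the levels, difference once if the unit root cannot be rejected, and re-test.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
y = pd.Series(np.cumsum(rng.normal(size=500)))    # simulated random walk, I(1) by construction

stat, pvalue, *_ = adfuller(y)
print(f"levels:      ADF = {stat:.2f}, p = {pvalue:.3f}")   # large p-value: cannot reject unit root

dy = y.diff().dropna()                            # first difference, Delta y_t = y_t - y_{t-1}
stat, pvalue, *_ = adfuller(dy)
print(f"differences: ADF = {stat:.2f}, p = {pvalue:.3f}")   # small p-value: stationary, so d = 1
```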

Model Building Process

Order Selection

Order selection in ARIMA models involves determining the appropriate values for the autoregressive order p, the degree of differencing d, and the moving average order q, guided by the Box-Jenkins methodology. This iterative process begins with model identification, where the analyst examines the series to assess stationarity and tentative orders, followed by parameter estimation and diagnostic checking to refine the choice. The methodology emphasizes empirical examination of the data's structure through graphical tools and statistical tests, ensuring the selected model captures the underlying patterns without unnecessary complexity. To select the integration order d, unit root tests play a crucial role in determining the number of differences needed to achieve stationarity. The augmented Dickey-Fuller (ADF) test, for instance, evaluates the null hypothesis of a unit root against the alternative of stationarity by estimating an augmented autoregression and testing the significance of the coefficient on the lagged dependent variable. If the test rejects the null, no differencing (d=0) is required; otherwise, differencing is applied until stationarity is confirmed, typically with d=1 or d=2 sufficing for most series. Once stationarity is established, the autoregressive order p and moving average order q are identified using autocorrelation function (ACF) and partial autocorrelation function (PACF) plots of the differenced series. For a pure AR(p) process, the ACF decays gradually or shows a sinusoidal pattern, while the PACF cuts off abruptly after lag p, indicating significant partial correlations up to p and insignificance thereafter. Conversely, for a pure MA(q) process, the PACF decays gradually, but the ACF cuts off after lag q. Mixed ARMA models exhibit more complex behavior, with both the ACF and PACF decaying beyond the respective orders, requiring tentative fitting and comparison. To objectively compare candidate models suggested by ACF and PACF, information criteria such as the Akaike information criterion (AIC) and Bayesian information criterion (BIC) are employed. The AIC is calculated as \text{AIC} = -2 \ln L + 2k, where L is the maximized likelihood of the model and k is the number of estimated parameters, balancing goodness-of-fit with model complexity by penalizing excessive parameters. The BIC, given by \text{BIC} = -2 \ln L + k \ln n with n as the sample size, imposes a stronger penalty on complexity, favoring more parsimonious models especially in larger samples. Lower values indicate better models, with BIC often preferred for its consistency in selecting the true order asymptotically. Throughout order selection, the principle of parsimony is essential to mitigate overfitting risks, where overly complex models capture noise rather than signal, leading to poor out-of-sample forecasts. The Box-Jenkins approach advocates selecting the simplest model that adequately fits the data, as redundancy in AR and MA terms can cause parameter cancellation and instability; for example, higher-order terms may mimic lower-order ones, inflating variance without improving predictions. Overfitting is particularly evident when ACF/PACF suggest high p or q, but criteria like BIC select lower orders to ensure generalizability.
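A minimal order-selection sketch in Python, on a simulated trending series with illustrative candidate orders, combines ACF/PACF inspection of the differenced data with an AIC/BIC comparison of a few fitted models.

```python
import numpy as np
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
y = np.cumsum(0.5 + rng.normal(size=300))        # trending toy series

dy = np.diff(y)                                   # d = 1 assumed from a unit-root test
plot_acf(dy, lags=20)                             # MA order q suggested where the ACF cuts off
plot_pacf(dy, lags=20)                            # AR order p suggested where the PACF cuts off

# Compare a handful of candidate orders by information criteria (lower is better).
for order in [(0, 1, 1), (1, 1, 0), (1, 1, 1), (2, 1, 2)]:
    res = ARIMA(y, order=order).fit()
    print(order, f"AIC={res.aic:.1f}", f"BIC={res.bic:.1f}")
```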

Differencing Techniques

Differencing is a fundamental transformation in ARIMA modeling used to convert a non-stationary time series into a stationary one by eliminating trends and stabilizing the mean. The order of differencing, denoted by d in the ARIMA(p, d, q) framework, indicates the number of times the series must be differenced to achieve approximate stationarity. First-order differencing, the most common approach, computes the differences between consecutive observations and is defined as \Delta y_t = y_t - y_{t-1}. This effectively removes linear trends but may not suffice for series with higher-degree polynomial trends or seasonal patterns. Higher-order differencing extends this process iteratively to address more complex non-stationarities, such as quadratic trends. For second-order differencing, the transformation is applied to the first differences: \Delta^2 y_t = \Delta (y_t - y_{t-1}) = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}), which simplifies to y_t - 2y_{t-1} + y_{t-2}. Subsequent orders follow similarly, with the k-th difference expressed using binomial coefficients: \Delta^k y_t = \sum_{i=0}^k (-1)^i \binom{k}{i} y_{t-i}. While effective for trends, excessive differencing can amplify high-frequency noise and inflate variance, complicating model interpretability by shifting focus from original levels to changes in changes. Practitioners are advised to use the minimal order necessary to avoid over-differencing. Seasonal differencing targets periodic fluctuations by subtracting observations from the same season in prior periods, defined as \Delta_s y_t = y_t - y_{t-s}, where s is the seasonal period (e.g., 12 for monthly data). This can be combined with non-seasonal differencing, as in the SARIMA extension, to handle both trend and seasonality simultaneously. For instance, in the classic monthly international airline passenger series (1949–1960), first-order non-seasonal differencing removes the upward linear trend, resulting in a series with stable mean but persistent annual cycles; applying additional seasonal differencing with s=12 yields a nearly stationary series with constant variance and no evident seasonal patterns. Such transformations enhance forecast reliability but require careful assessment to preserve the underlying dynamics. The order of differencing is typically determined through iterative visual inspection of time series plots. Starting with the original series, one differences and replots repeatedly, stopping when the resulting series exhibits no visible trend, stable variance, and random fluctuations around a constant mean, the hallmarks of stationarity. This graphical approach, advocated in the foundational Box-Jenkins methodology, allows practitioners to gauge the minimal d empirically, often supplemented by brief checks using stationarity tests for confirmation. In the airline passenger example, iterative plotting reveals that one non-seasonal and one seasonal difference suffice, balancing transformation efficacy with minimal distortion to variance and interpretive clarity. Over-differencing risks introducing unnecessary variability, which can degrade forecast accuracy and obscure economic or practical insights from the model.
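A minimal sketch of combined non-seasonal and seasonal differencing with pandas follows, using a simulated monthly series with a linear trend and an annual cycle as a stand-in for data such as the airline passenger series.

```python
import numpy as np
import pandas as pd

t = np.arange(144)
rng = np.random.default_rng(3)
# Simulated monthly series: linear trend plus an annual (period-12) cycle plus noise.
y = pd.Series(10 + 0.5 * t + 5 * np.sin(2 * np.pi * t / 12) + rng.normal(size=144))

d1 = y.diff()              # first difference: removes the linear trend
d1_12 = d1.diff(12)        # additional seasonal difference with period s = 12
print(d1_12.dropna().head())
```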

Parameter Estimation

Estimation Methods

Parameter estimation in ARIMA models, after differencing to achieve stationarity, primarily involves fitting the underlying ARMA(p, q) parameters using maximum likelihood estimation (MLE) under the assumption of Gaussian-distributed errors. This approach maximizes the likelihood of the observed data given the model parameters, providing asymptotically efficient estimates. The log-likelihood is typically formulated via the innovations algorithm or a state-space representation evaluated with the Kalman filter to account for the dependence structure. Two common variants of likelihood-based estimation are conditional least squares and exact maximum likelihood. Conditional least squares approximates the likelihood by conditioning on initial values for the series and errors, often setting pre-sample errors to zero, which simplifies computation but introduces some bias, particularly for short series or high-order moving average components. In contrast, exact maximum likelihood incorporates the full likelihood by properly accounting for initial conditions, yielding more accurate estimates at the cost of greater computational intensity; it is preferred for precise inference. For moving average components, which depend on past errors, initial values are crucial and are often obtained via backcasting (back-forecasting). This technique involves reversing the time order of the series and fitting an ARMA model to generate estimates of pre-sample residuals, ensuring the likelihood computation starts from plausible values. Parameter optimization proceeds iteratively using numerical methods such as the Newton-Raphson algorithm, which updates estimates based on the gradient and Hessian of the log-likelihood until convergence criteria are met, typically defined by small changes in parameter values or likelihood (e.g., less than 0.1% relative change). When errors deviate from Gaussianity, such as in the presence of heavy tails or skewness, standard MLE may still be applied as quasi-maximum likelihood estimation, which remains consistent under mild misspecification but loses efficiency; alternatively, full MLE can incorporate specified non-Gaussian distributions like Student's t, though this increases computational complexity.
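As a minimal sketch of likelihood-based estimation with statsmodels, assuming a simulated integrated series, the default state-space method computes the exact Gaussian likelihood via the Kalman filter and returns parameter estimates, the maximized log-likelihood, and standard errors.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(4)
y = np.cumsum(rng.normal(size=400))              # toy integrated series

res = ARIMA(y, order=(1, 1, 1)).fit(method="statespace")   # exact Gaussian MLE via the Kalman filter
print(res.params)      # estimated phi, theta, and innovation variance sigma^2
print(res.llf)         # maximized log-likelihood, used by AIC/BIC
print(res.summary())   # standard errors and z-statistics for inference
```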

Model Diagnostics

Model diagnostics for autoregressive integrated moving average (ARIMA) models focus on validating the fitted model by scrutinizing the residuals to confirm that the model's assumptions are met and that no systematic patterns remain unexplained. These checks are essential after parameter estimation to ensure the model adequately captures the dynamics without leaving residual structure that could bias forecasts. Residual analysis forms the cornerstone of ARIMA diagnostics, aiming to verify that the residuals, defined as the differences between observed and fitted values, exhibit properties of white noise, including zero mean, constant variance, and lack of serial correlation. Visual inspection begins with plotting the residuals over time to detect any trends, heteroscedasticity, or outliers, followed by the autocorrelation function (ACF) plot of the residuals. An adequate model produces an ACF plot where all autocorrelations beyond lag zero fall within the confidence bands (typically ±1.96/√n), indicating no significant linear dependencies remain in the data. To quantitatively assess the absence of autocorrelation in residuals, the Ljung-Box test is widely applied. This portmanteau test evaluates the joint hypothesis that the first h autocorrelations of the residuals are zero. The test statistic is calculated as Q = n(n + 2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^2}{n - k}, where n is the effective sample size after differencing, h is the number of lags tested (often chosen as 10 or 20), and \hat{\rho}_k denotes the sample autocorrelation at lag k. For an ARIMA(p,d,q) model, Q asymptotically follows a chi-squared distribution with h - p - q degrees of freedom under the null hypothesis of uncorrelated residuals. A p-value greater than a chosen significance level (e.g., 0.05) fails to reject the null, supporting model adequacy; conversely, low p-values signal remaining autocorrelation, prompting model refinement. Normality of residuals is another key assumption for valid inference and forecasting in ARIMA models, particularly when using Gaussian maximum likelihood estimation. The Jarque-Bera test provides a formal evaluation by testing whether the sample skewness and kurtosis match those of a normal distribution (zero skewness and kurtosis of 3). The test statistic is JB = \frac{n}{6} \left( S^2 + \frac{(K - 3)^2}{4} \right), where n is the sample size, S is the sample skewness, and K is the sample kurtosis of the residuals. Under the null hypothesis of normality, JB follows a chi-squared distribution with 2 degrees of freedom. A non-significant p-value (e.g., >0.05) indicates that the residuals are consistent with normality; significant results may suggest the need for transformations or alternative models, though mild deviations are often tolerable for large samples. Overfitting in ARIMA models, where excessive parameters capture noise rather than signal, is detected through out-of-sample validation. This involves partitioning the data into training and validation sets, fitting the model on the training set, and assessing predictive accuracy on the unseen validation set using metrics such as mean absolute error or root mean squared error. If out-of-sample errors substantially exceed in-sample errors, or if simpler models perform comparably on validation data, overfitting is evident, and model orders should be reduced. In practice, diagnostic outputs from fitted ARIMA models integrate these checks for comprehensive interpretation. For instance, consider an ARIMA model fitted to monthly sales data: the Ljung-Box Q-statistic for 12 lags might yield Q = 14.3 with a p-value of 0.29, confirming no significant residual autocorrelation; the Jarque-Bera statistic could be JB = 1.8 with a p-value of 0.41, supporting normality; and an ACF plot of residuals showing all bars within confidence limits would visually affirm white noise properties. Such results collectively validate the model for reliable forecasting, whereas deviations (e.g., a significant Q statistic) would indicate inadequate specification.
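A minimal diagnostic sketch in Python, assuming an illustrative simulated series and an ARIMA(1,1,1) fit, applies the Ljung-Box and Jarque-Bera tests and plots the residual ACF.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.stats.stattools import jarque_bera
from statsmodels.graphics.tsaplots import plot_acf

rng = np.random.default_rng(6)
y = np.cumsum(rng.normal(size=300))
res = ARIMA(y, order=(1, 1, 1)).fit()

resid = res.resid[1:]                      # drop the first residual, distorted by differencing

# Ljung-Box test; model_df = p + q = 2 adjusts the degrees of freedom to h - p - q.
print(acorr_ljungbox(resid, lags=[10, 20], model_df=2))

# Jarque-Bera test returns (statistic, p-value, skewness, kurtosis).
print(jarque_bera(resid))

plot_acf(resid, lags=24)                   # bars inside the confidence band support adequacy
```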

Forecasting Applications

Forecast Computation

Forecasting with an ARIMA(p,d,q) model involves generating point estimates for future values of the series y_t. The procedure begins by applying the differencing operator d times to achieve stationarity, resulting in a differenced series z_t = \nabla^d y_t, which follows an ARMA(p,q) process. Forecasts for z_t are then computed, and the original series forecasts are obtained by reversing the differencing through cumulative summation. This approach ensures that the forecasts account for the integration order in the model. For the one-step-ahead forecast of the differenced series, denoted \hat{z}_{t+1|t}, the ARMA(p,q) model provides: \hat{z}_{t+1|t} = \sum_{i=1}^p \phi_i z_{t+1-i} + \sum_{j=1}^q \theta_j e_{t+1-j}, where \phi_i are the autoregressive parameters, \theta_j are the moving average parameters, z_{t+1-i} are observed past values of the differenced series, and e_{t+1-j} are past forecast errors (residuals) from the fitted model. If d=1, the one-step-ahead forecast for the original series is \hat{y}_{t+1|t} = y_t + \hat{z}_{t+1|t}. This formula originates from taking the conditional expectation in the ARMA representation, as detailed in the foundational work on Box-Jenkins models. Multi-step-ahead forecasts are generated recursively by applying the ARMA model iteratively, substituting previous forecasts for unavailable future observations. For the h-step-ahead forecast \hat{z}_{t+h|t}, the autoregressive terms use a combination of observed and previously forecasted differenced values, while the moving average terms incorporate past errors; however, the MA contribution drops to zero for horizons h > q, simplifying to a pure AR forecast beyond that point. For an integrated model with d=1, the h-step-ahead forecast is \hat{y}_{t+h|t} = y_t + \sum_{j=1}^h \hat{z}_{t+j|t}, cumulatively summing the differenced forecasts to reverse the differencing. This recursive method ensures consistency with the model's structure. The variance of forecast errors generally increases with the forecast horizon, reflecting accumulating uncertainty from model predictions and unobserved shocks. For ARIMA models, the one-step-ahead forecast error variance is approximately the innovation variance \sigma^2, but for longer horizons it grows due to the propagation of errors through the autoregressive structure and the integration, often approximately linearly for d=1 models. A practical example is forecasting quarterly gross domestic product (GDP) using an ARIMA(1,1,1) model, commonly applied to economic series exhibiting trends and mild autocorrelation. Fitting the model to historical quarterly GDP data, the forecasts incorporate recent differenced values and residuals, with multi-step forecasts showing the influence of the moving average term diminishing over time, integrated back to yield absolute GDP levels. Such applications demonstrate ARIMA's utility in macroeconomic forecasting.
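The recursion can be sketched directly in Python for an ARIMA(1,1,1) model fitted to a simulated series; the hand-rolled forecasts are compared with statsmodels' built-in forecast, with small discrepancies possible because the library uses exact-likelihood initial conditions.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(7)
y = np.cumsum(rng.normal(size=400))
res = ARIMA(y, order=(1, 1, 1)).fit()

phi, theta = res.arparams[0], res.maparams[0]
z = np.diff(y)                              # differenced series z_t
e_last = res.resid[-1]                      # last one-step-ahead forecast error

h, z_hat = 4, []
for step in range(1, h + 1):
    prev_z = z[-1] if step == 1 else z_hat[-1]       # substitute forecasts once data run out
    ma_term = theta * e_last if step == 1 else 0.0   # MA contribution vanishes for h > q
    z_hat.append(phi * prev_z + ma_term)

y_hat = y[-1] + np.cumsum(z_hat)            # undo d = 1 differencing by cumulative summation
print(y_hat)
print(res.forecast(steps=h))                # statsmodels forecasts for comparison
```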

Interval Estimation

Interval estimation in ARIMA models involves constructing prediction intervals around point forecasts to quantify the uncertainty associated with future predictions. These intervals provide a range within which the actual future values are expected to lie with a specified probability, typically assuming normality of the forecast errors. The width of these intervals increases with the forecast horizon, reflecting growing uncertainty over time. The forecast standard error for the h-step ahead prediction is derived from the variance of the forecast error, given by \sigma_h^2 = \sigma^2 \left(1 + \sum_{i=1}^{h-1} \psi_i^2 \right), where \sigma^2 is the innovation variance and \psi_i are the coefficients of the infinite moving average (MA(∞)) representation of the ARIMA model. The \psi_i weights capture the cumulative effect of past shocks on future values and are obtained by inverting the ARIMA model into its MA(∞) form. This formula accounts for the accumulation of uncertainty from unobserved future errors. Approximate prediction intervals are commonly constructed assuming the forecast errors follow a normal distribution. For a (1 - α) prediction interval, the interval is \hat{y}_{t+h|t} \pm z_{\alpha/2} \sigma_h, where \hat{y}_{t+h|t} is the point forecast and z_{\alpha/2} is the upper α/2 quantile of the standard normal distribution. This approach is straightforward and widely used, particularly for large samples or longer horizons where the central limit theorem justifies normality. For short forecast horizons, exact prediction intervals can be computed by directly deriving the distribution of the forecast errors, often using the model's state-space form or recursive methods, which avoid the normality assumption. However, for longer horizons, these exact methods become computationally intensive, and the normal approximation is preferred for practicality. Fan charts offer a graphical representation of these probabilistic forecasts, displaying a series of nested intervals that fan out over time to illustrate increasing uncertainty. Each shaded region corresponds to a cumulative probability level, ranging from 10% up to 90%, providing a comprehensive view of the forecast distribution rather than just point estimates and bounds. This is particularly effective for communicating forecast uncertainty in applications like monetary policy and economic forecasting. As an example, consider quarterly stock prices with an ARIMA(1,1,0) model fitted to historical data. The 95% prediction intervals for the next four quarters widen over time, highlighting how uncertainty accumulates and informs decision-making in financial applications.
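A minimal sketch of this computation follows, assuming a simulated series and an ARIMA(1,1,1) fit: the forecast standard error is built from the psi weights of the MA(∞) representation (with the differencing folded into the AR polynomial) and compared with statsmodels' built-in intervals.

```python
import numpy as np
from scipy.stats import norm
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import arma2ma

rng = np.random.default_rng(8)
y = np.cumsum(rng.normal(size=400))
res = ARIMA(y, order=(1, 1, 1)).fit()
phi, theta, sigma2 = res.arparams[0], res.maparams[0], res.params[-1]

h = 8
ar_poly = np.polymul([1.0, -phi], [1.0, -1.0])    # fold the (1 - B) differencing into the AR side
psi = arma2ma(ar_poly, [1.0, theta], lags=h)      # psi_0 = 1, psi_1, ..., psi_{h-1}
se_h = np.sqrt(sigma2 * np.cumsum(psi ** 2))      # sigma_h grows with the horizon

fc = res.get_forecast(steps=h)
z = norm.ppf(0.975)                               # multiplier for a 95% interval
print(np.column_stack([fc.predicted_mean - z * se_h, fc.predicted_mean + z * se_h]))
print(fc.conf_int(alpha=0.05))                    # built-in intervals for comparison
```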

Extensions and Implementations

Advanced Variations

The basic ARIMA model assumes stationarity after integer differencing and captures linear dependencies through autoregressive and moving average components, but it often fails to adequately model real-world series exhibiting seasonality, external influences, long-range dependencies, or abrupt changes. Advanced variations extend the framework to address these limitations, enhancing flexibility and accuracy for complex data patterns. These extensions maintain the core structure while incorporating additional parameters or mechanisms, as originally outlined in foundational methodologies. Seasonal ARIMA (SARIMA) models address the limitation of basic ARIMA in handling periodic patterns, such as monthly or quarterly cycles in economic or environmental data. The SARIMA(p,d,q)(P,D,Q)_s formulation combines non-seasonal components with seasonal autoregressive (P), differencing (D), and moving average (Q) terms, where s denotes the seasonal period (e.g., s=12 for monthly data). The general model is expressed as: \phi_p(B) \Phi_P(B^s) (1 - B)^d (1 - B^s)^D y_t = \theta_q(B) \Theta_Q(B^s) \epsilon_t. Here, \phi_p(B) and \theta_q(B) are the non-seasonal polynomials, \Phi_P(B^s) and \Theta_Q(B^s) are the seasonal counterparts evaluated at lag s, and \epsilon_t is white noise. This structure allows SARIMA to capture both short-term and seasonal autocorrelations, improving fits for data like airline passenger counts or retail sales. SARIMA was developed as part of the Box-Jenkins methodology to model multiplicative seasonal effects without assuming independence across cycles. ARIMAX extends ARIMA by incorporating exogenous variables, addressing cases where the series is influenced by external factors such as weather, policy changes, or marketing efforts, which basic ARIMA ignores. The model integrates a transfer function to link the input series x_t to the output y_t, typically formulated as an ARIMA process for the noise plus a filtered exogenous component: y_t = \nu(B) x_t + \frac{\theta_q(B)}{\phi_p(B)} a_t, where \nu(B) is the transfer function (often \nu(B) = \frac{\omega(B)}{\delta(B)} for numerator \omega and denominator \delta polynomials of orders r and s), and a_t is the white noise innovation. This allows quantification of causal impacts, such as how temperature affects electricity demand. The approach builds on the transfer function-noise models introduced in the Box-Jenkins methodology for dynamic systems. Fractionally integrated ARIMA (ARFIMA) tackles the limitation of integer-order differencing in basic ARIMA, which cannot model long-memory processes where shocks persist with hyperbolic rather than exponential decay. In ARFIMA(p,d,q), the parameter d is fractional (typically 0 < d < 0.5 for stationarity with long memory), generalizing the differencing operator to (1 - B)^d = \sum_{k=0}^\infty \binom{d}{k} (-1)^k B^k. The model becomes \phi_p(B) (1 - B)^d y_t = \theta_q(B) \epsilon_t, enabling capture of persistent autocorrelations in series like river flows or financial volatility. ARFIMA was introduced to formalize long-memory dynamics, distinguishing them from short-memory ARIMA processes. Intervention analysis extends ARIMA to detect and model structural breaks or external shocks, such as policy implementations or natural disasters, which basic ARIMA treats as unmodeled noise. It augments the model with an intervention term: y_t = \frac{\omega(B)}{\delta(B)} \xi_t + N_t, where \xi_t is a deterministic input (e.g., a step indicator I_t for permanent shifts or a pulse for temporary ones), \omega(B)/\delta(B) is the transfer function capturing the dynamic response, and N_t is the ARIMA noise process. This method estimates the magnitude and duration of impacts, as applied to economic interventions like tax changes. The technique was developed to assess point or gradual disruptions in time series.
These variations collectively overcome key shortcomings of standard ARIMA, such as the inability to handle seasonality (via SARIMA), external drivers (ARIMAX), slow decay in correlations (ARFIMA), and sudden changes (intervention analysis), thereby broadening applicability in fields like economics, finance, and environmental science.
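A minimal sketch of the seasonal extension, using statsmodels' SARIMAX class on a simulated monthly series with illustrative (1,1,1)(1,1,1,12) orders, follows.

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

t = np.arange(144)
rng = np.random.default_rng(5)
# Simulated monthly series with trend and annual seasonality.
y = 100 + 0.8 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(scale=3, size=144)

model = SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
res = model.fit(disp=False)
print(res.summary())
print(res.forecast(steps=12))     # one full seasonal cycle ahead
```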

Software Tools

Several software tools and libraries facilitate the implementation, estimation, and forecasting of ARIMA models across various programming languages, each offering distinct features for time series analysis. In R, the base stats package provides the arima() function for fitting ARIMA models to univariate time series data, supporting maximum likelihood estimation and handling differenced models with missing values. This function allows specification of the ARIMA order (p, d, q) and includes options for seasonal components via SARIMA extensions. For forecasting, the forecast() function from the companion forecast package wraps around arima() to generate predictions and confidence intervals, enabling drift terms not available in the base implementation. Python's statsmodels library implements ARIMA through the statsmodels.tsa.arima.model.ARIMA class, which serves as the primary interface for univariate models, including support for exogenous regressors and seasonal components. It performs estimation using methods like conditional sum of squares or maximum likelihood and provides methods for fitting, forecasting, and diagnostics. For automated order selection, the pmdarima library's auto_arima() function identifies optimal (p, d, q) parameters by minimizing information criteria such as AIC or BIC, returning a fitted model compatible with statsmodels. MATLAB offers the arima object in the Econometrics Toolbox for creating and estimating ARIMA(p, D, q) models, where users specify orders and parameters like non-zero means or seasonal lags. The estimate method fits the model to data using maximum likelihood, while forecast computes multi-step predictions with optional simulation for uncertainty quantification. Additional functions like infer for residuals and simulate for generating response paths support comprehensive model diagnostics and validation. In Julia, the StateSpaceModels.jl package supports ARIMA modeling within a state-space framework, allowing estimation of ARIMA(p, d, q) models via functions like fit_arima for parameter optimization and forecasting. It leverages Julia's speed for efficient handling of large datasets and includes tools for diagnostics and residual analysis, though it requires familiarity with state-space representations. Comparisons across these tools highlight differences in ease of use, performance, and visualization. MATLAB excels in ease of use for beginners due to its integrated environment and intuitive syntax for ARIMA specification, making it ideal for quick prototyping without extensive coding. R and Python offer comparable accessibility, with R's arima() providing straightforward integration with forecasting functions and Python's statsmodels benefiting from extensive documentation, though Python's auto_arima simplifies order selection more than R's manual approaches. Julia's StateSpaceModels.jl is less beginner-friendly due to its state-space formulation but offers superior performance for large-scale computations thanks to Julia's just-in-time compilation and parallelization capabilities. For visualization and diagnostics, R stands out with seamless integration with ggplot2 for plotting residuals, ACF/PACF, and forecast intervals, while Python relies on matplotlib or seaborn for similar tasks, often requiring additional setup; MATLAB provides built-in plotting functions, and Julia uses Plots.jl effectively but with a steeper learning curve.
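As a minimal sketch of the automated workflow mentioned above, pmdarima's auto_arima can search over candidate orders on a simulated monthly series; the search settings shown are illustrative, not recommendations.

```python
import numpy as np
import pmdarima as pm

t = np.arange(144)
rng = np.random.default_rng(9)
# Simulated monthly series with trend and annual seasonality.
y = 50 + 0.3 * t + 8 * np.sin(2 * np.pi * t / 12) + rng.normal(scale=2, size=144)

model = pm.auto_arima(
    y,
    seasonal=True, m=12,          # allow a seasonal component with period 12
    d=None, D=None,               # let unit-root and seasonal tests choose the differencing orders
    information_criterion="aic",  # rank candidate orders by AIC
    stepwise=True,                # stepwise search rather than an exhaustive grid
    suppress_warnings=True,
)
print(model.summary())
print(model.predict(n_periods=12))
```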
