Moving average
A moving average is a statistical technique used to analyze time series data by computing the average of a subset of consecutive data points, thereby smoothing out short-term fluctuations and highlighting longer-term trends or patterns.[1] This method involves sliding a fixed-size window over the dataset, recalculating the average at each step as the window advances, which makes it particularly useful for identifying underlying cycles in noisy data such as economic indicators or sales figures.[1]

There are several types of moving averages, each differing in how they weight the data points within the window. The simple moving average (SMA) calculates an equal-weighted arithmetic mean of the prices or values over a specified period, such as the average closing price of a stock over 50 days.[2] In contrast, the exponential moving average (EMA) assigns greater weight to more recent data points using a smoothing factor, typically computed as EMA = (current price × smoothing factor) + (previous EMA × (1 - smoothing factor)), where the smoothing factor is often 2/(n+1) for n periods; this responsiveness to new information makes EMA preferable for detecting rapid trend changes.[3] Another variant, the weighted moving average (WMA), applies linearly increasing weights to recent observations, providing a balance between simplicity and recency bias.[1]

In finance and trading, moving averages serve as key technical indicators for determining trend direction, support and resistance levels, and potential buy or sell signals; for instance, a short-term moving average crossing above a long-term one, known as a "golden cross," signals bullish momentum.[2] Popular periods include the 50-day and 200-day SMAs, which traders monitor to confirm uptrends (rising averages) or downtrends (declining averages).[3]

Beyond finance, moving averages are applied in statistics for forecasting and noise reduction, such as using a 7-day SMA to analyze daily retail sales and mitigate weekly variations.[1] However, all types exhibit a lag due to their reliance on historical data, which can lead to delayed signals in volatile or sideways markets.[2]

Fundamentals
Definition
A moving average is a statistical calculation used to analyze data points by creating a series of averages from different subsets of a full data set, typically applied to time series to smooth variations in sequential observations.[4] This technique computes the mean of successive smaller sets of data, advancing one period at a time, which helps in producing a smoothed representation of the underlying pattern.[5] In its simplest form, for a data sequence \{a_1, a_2, \dots, a_n\}, the moving average at time t with window size k is given by \frac{1}{k} \sum_{i=t-k+1}^{t} a_i, where the average is taken over the most recent k values up to time t.[5] This formulation assumes equal weighting for the simple case, focusing on arithmetic means of contiguous subsets.[6]

The primary purpose of a moving average is to reduce short-term noise and fluctuations in time series data, thereby highlighting longer-term trends or cycles for better pattern recognition and forecasting.[7] By smoothing out peaks and troughs, it provides a clearer view of the data's directional movement without altering the overall sequence.[4] It finds applications in fields such as finance for trend analysis and signal processing for noise reduction.[2] The concept of moving averages originated in the early 20th century within statistics, with early uses documented in economic data analysis around 1901 by R.H. Hooker, later termed "moving averages" by G. Udny Yule in 1927.[8]

Properties
Moving averages exhibit a smoothing effect by functioning as low-pass filters in signal processing, which attenuate high-frequency variations such as noise while preserving underlying low-frequency trends in data sequences.[9] This property arises because the filter's frequency response passes low frequencies with minimal amplitude reduction but severely attenuates higher frequencies, as seen in the amplitude response of a simple two-point moving average given by |H(\omega)| = |\cos(\omega/2)|, where low \omega values experience little damping compared to values near the Nyquist frequency.[9] Consequently, applying a moving average reduces jaggedness in time series data, leveling out short-term fluctuations without substantially altering long-term patterns.[10]

In statistical estimation, simple moving averages serve as unbiased estimators of the underlying signal mean when the data follows a constant trend plus white noise, meaning their expected value equals the true parameter under such assumptions.[11] However, their variance decreases inversely with the window size k, approximated as V[\hat{f}(x)] \approx \sigma^2 / (2k + 1) for a two-sided average with noise variance \sigma^2, leading to higher variability for smaller windows and smoother but potentially over-smoothed outputs for larger ones.[11] This creates a fundamental bias-variance trade-off: smaller windows minimize bias by closely tracking local changes but amplify variance due to noise sensitivity, whereas larger windows reduce variance through averaging but introduce bias by oversmoothing, particularly near peaks or troughs where the estimate flattens, with bias scaling as \frac{1}{6} f''(x) k (k + 1) for smooth functions f.[11][12]

Moving averages contribute to stationarity in non-stationary time series through differencing operations, where first-order differencing—equivalent to a moving average with kernel weights [1, -1]—stabilizes the mean by removing linear trends and level shifts.[13] In ARIMA modeling frameworks, such differencing transforms integrated processes into stationary ones, allowing subsequent moving average components to model the residuals effectively without time-varying statistical properties.[13] This approach ensures constant mean, variance, and autocovariance over time, a prerequisite for reliable time series analysis.[14]

Mathematically, moving averages can be represented as discrete convolutions of the input sequence with a kernel that defines the weights, such as a uniform kernel of ones divided by the window length for the simple moving average.[10] For a window of size M, the output at index i is y[i] = \frac{1}{M} \sum_{j=0}^{M-1} x[i-j], which corresponds to convolving the signal with a rectangular pulse kernel, enabling efficient computation via fast convolution algorithms and highlighting the filter's linear, time-invariant nature.[10] This convolution view also reveals the frequency-domain behavior, where the kernel's Fourier transform determines the low-pass characteristics.[10]

Edge effects arise in moving average computations near the boundaries of finite data sequences, where the sliding window cannot fully overlap due to insufficient preceding or following points, potentially leading to biased or incomplete estimates at the start and end.[15] Common handling strategies include using partial windows that average only available points within the boundary vicinity, or applying padding techniques such as zero-padding, edge replication, or reflection to extend the sequence artificially and maintain full window coverage.[15] These methods trade off between preserving data integrity and introducing minimal artifacts, with partial windows often preferred for avoiding artificial extensions in short series.[10]
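The convolution view can be sketched directly in code. The following illustrative example (assuming NumPy; the function name and the padding choice are illustrative, not taken from the cited sources) computes a centered simple moving average by convolving with a uniform kernel, either restricted to positions where the window fully overlaps the data or with reflection padding to handle the edges.

```python
import numpy as np

def sma_convolution(x, window, mode="reflect"):
    """Centered simple moving average via convolution with a uniform kernel.

    mode="valid"   -> keep only positions where the window fully overlaps the data
    mode="reflect" -> reflect the series at both ends so the output matches the
                      input length, one common way to reduce edge bias
    """
    kernel = np.ones(window) / window            # uniform weights summing to 1
    if mode == "valid":
        return np.convolve(x, kernel, mode="valid")
    pad = window // 2
    padded = np.pad(x, pad, mode="reflect")      # artificial extension at the boundaries
    return np.convolve(padded, kernel, mode="same")[pad:pad + len(x)]

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
print(sma_convolution(x, 3, mode="valid"))       # [2. 3. 4. 5.]
print(sma_convolution(x, 3, mode="reflect"))     # same length as x, edges use reflected values
```

Basic Types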
Simple Moving Average
The simple moving average (SMA) is a fundamental smoothing technique in time series analysis that computes the arithmetic mean of a fixed number of consecutive observations, assigning equal weight to each value within the specified window. This method applies uniform weights of \frac{1}{k} to the most recent k observations, where k is the window size, making it particularly suitable for identifying underlying trends by reducing short-term fluctuations in data.[4][16] The formula for the SMA at time t is given by \text{SMA}_t = \frac{1}{k} \sum_{i=1}^{k} a_{t-i+1}, where a_{t-i+1} represents the observation at the corresponding past time point. This rolling calculation updates as new data enters the window and the oldest observation exits, providing a sequence of averages that track changes over time.[4]

For illustration, consider a dataset of values [1, 2, 3, 4, 5] with k = 3. The first SMA is the average of 1, 2, and 3, yielding 2; the second is the average of 2, 3, and 4, yielding 3; and the third is the average of 3, 4, and 5, yielding 4. Thus, the SMA values are [2, 3, 4]. This example demonstrates how the SMA progressively incorporates newer data while maintaining a fixed window length.[4]

One key advantage of the SMA is its computational simplicity, requiring only basic addition and division, which makes it straightforward to implement and interpret even for large datasets. It also provides uniform smoothing that effectively highlights persistent trends by averaging out random noise, minimizing mean squared error in stationary data without trends.[17][16] However, the SMA has notable disadvantages, including a tendency to lag behind actual trends due to its equal weighting of all observations in the window, which delays responsiveness to recent changes. Additionally, it can be sensitive to outliers within the window, as each value influences the average equally, potentially distorting the smoothed result in volatile datasets.[17][16][18]

The selection of the window size k is crucial, as smaller values increase responsiveness to recent data but introduce more noise and variability, while larger values enhance smoothness and trend visibility at the cost of greater lag and reduced sensitivity to shifts. This trade-off must be balanced based on the data's characteristics and the desired level of smoothing versus timeliness.[4][17]
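A minimal sketch of this rolling calculation, in plain Python with an illustrative function name, reproduces the worked example above.

```python
def simple_moving_average(values, k):
    """Return the simple moving averages of `values` over sliding windows of size k."""
    if not 0 < k <= len(values):
        raise ValueError("window size k must satisfy 0 < k <= len(values)")
    return [sum(values[i - k + 1:i + 1]) / k for i in range(k - 1, len(values))]

print(simple_moving_average([1, 2, 3, 4, 5], 3))  # [2.0, 3.0, 4.0]
```

Cumulative Average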
The cumulative average, also referred to as the running average or expanding average, computes the mean of all data points from the start of a dataset up to the current observation, resulting in a progressively expanding window size with each new data point.[19][20] This approach accumulates historical information without discarding earlier values, making it suitable for scenarios where overall progress or long-term trends are prioritized over short-term fluctuations. The formula for the cumulative average at time t, denoted \text{CA}_t, for a sequence of observations a_1, a_2, \dots, a_t is \text{CA}_t = \frac{1}{t} \sum_{i=1}^t a_i.[19][20] For example, given the data sequence [1, 2, 3], the cumulative averages are \text{CA}_1 = 1, \text{CA}_2 = 1.5, and \text{CA}_3 = 2.[19]

As the number of observations t grows, \text{CA}_t converges to the overall mean of the full dataset, providing a stable estimate that becomes less sensitive to recent changes due to the increasing influence of accumulated prior data.[20] This contrasts with fixed-window averages by emphasizing historical accumulation rather than recency.

In applications such as quality control, the cumulative average monitors ongoing performance metrics, such as defect rates or measurement consistency, by tracking deviations within specified limits over time.[21] It is also widely used in learning curve analysis for production processes, where it models the average cost or time per unit as output accumulates, typically decreasing by a constant percentage with each doubling of quantity produced.[22]

For computational efficiency, the cumulative average supports incremental updates without recalculating the entire sum: \text{CA}_t = \text{CA}_{t-1} \cdot \frac{t-1}{t} + \frac{a_t}{t}, which facilitates real-time tracking in streaming data environments.[19]
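The incremental update lends itself to a short sketch (plain Python, illustrative naming) that tracks the expanding average one observation at a time.

```python
def cumulative_averages(values):
    """Cumulative (expanding) averages, updated incrementally without re-summing."""
    result, ca = [], 0.0
    for t, a in enumerate(values, start=1):
        ca = ca * (t - 1) / t + a / t   # CA_t = CA_{t-1} * (t-1)/t + a_t / t
        result.append(ca)
    return result

print(cumulative_averages([1, 2, 3]))  # [1.0, 1.5, 2.0]
```

Weighted Types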
Weighted Moving Average
A weighted moving average (WMA) assigns varying weights to the data points within a fixed-size window, allowing for greater emphasis on specific observations, such as more recent ones, compared to the uniform weighting in simple moving averages. This flexibility makes WMAs particularly useful in time series analysis for smoothing data while prioritizing relevant trends.[23] The general form of a WMA at time t for a window of size k is given by \text{WMA}_t = \frac{\sum_{i=1}^{k} w_i a_{t-i+1}}{\sum_{i=1}^{k} w_i}, where a_{t-i+1} are the observed values in the window, and w_i are the non-negative weights assigned to each position, with the denominator ensuring normalization so that the weights sum to 1 if desired for unbiased averaging.[23] Normalization is crucial to maintain the scale of the original data and prevent bias in the estimate, as the sum of weights acts as a scaling factor.[24]

Weights can be assigned in various ways depending on the application; for instance, linear weights increase progressively toward recent data (e.g., w_i = i for i = 1 to k, with the highest weight on the newest observation), or triangular weights peak in the middle for centered smoothing.[23] Such assignments allow customization to domain-specific needs, like emphasizing recency in financial forecasting or sales predictions where recent patterns are more indicative of future behavior.[23]

Compared to the simple moving average, the WMA offers advantages in responsiveness, as higher weights on recent data reduce the lag in detecting shifts or trends, leading to more timely signals in volatile series.[23] This can improve forecast accuracy in applications requiring quick adaptation, though it may amplify noise if weights overly favor outliers.[24]

For example, consider a time series with values a_1 = 1, a_2 = 2, a_3 = 3 and a window size k = 3 using linear weights w_1 = 1, w_2 = 2, w_3 = 3 (oldest to newest). The WMA is calculated as \frac{1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3}{1 + 2 + 3} = \frac{14}{6} = \frac{7}{3} \approx 2.333, which weights the latest value more heavily than the simple average of 2.[23]

Weight selection criteria typically rely on the problem's context, such as using higher weights for recent data in short-term forecasting to capture evolving patterns, while balancing smoothness and sensitivity through empirical testing or domain expertise.[23] Exponential moving averages represent a special case of weighted averages with geometrically decreasing weights, often extending beyond finite windows.[24]
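The worked example can be reproduced with a short sketch (plain Python, illustrative naming) in which the weights are listed oldest to newest and normalized by their sum.

```python
def weighted_moving_average(values, weights):
    """Weighted moving average over sliding windows the length of `weights`."""
    k, total = len(weights), sum(weights)
    return [
        sum(w * v for w, v in zip(weights, values[i - k + 1:i + 1])) / total
        for i in range(k - 1, len(values))
    ]

# Linear weights emphasizing the newest observation, as in the example above.
print(weighted_moving_average([1, 2, 3], [1, 2, 3]))  # [2.333...] == 7/3
```

Exponential Moving Average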
The exponential moving average (EMA) is a recursive method for estimating the local mean of a time series, assigning exponentially decaying weights to past observations to emphasize recent data. It is defined by the formula \text{EMA}_t = \alpha \, a_t + (1 - \alpha) \, \text{EMA}_{t-1}, where a_t is the new observation at time t, \alpha is the smoothing factor satisfying 0 < \alpha < 1, and \text{EMA}_{t-1} is the previous EMA value.[25][26] This recursive structure ensures that the EMA incorporates the entire history of data, with the weight on the i-th past observation given by the geometric sequence w_i = \alpha (1 - \alpha)^{i-1}, normalized to sum to 1.[25][27]

Initialization of the EMA typically sets \text{EMA}_0 to the first observation a_1, the mean of an initial set of observations, or a target value such as zero or the historical mean, depending on the context to avoid undue bias from arbitrary starting points.[25][26] The choice of initialization affects early estimates but has diminishing impact as more data accumulates due to the exponential decay.[27]

A key advantage of the EMA lies in its computational efficiency, requiring only the previous EMA value and the current observation for updates, thus using constant memory regardless of history length.[27] This recursive form enables rapid adaptation to shifts in the underlying process, outperforming fixed-window methods in responsiveness while still smoothing noise through the infinite but decaying influence of past data.[25][26] Unlike finite moving averages, it avoids abrupt resets from sliding windows, providing a continuous estimate suitable for online processing.[27]

The smoothing factor \alpha relates to the half-life n, the time span over which weights decay to half their initial value, via the formula \alpha = 1 - e^{-\ln 2 / n}. This interpretation allows practitioners to select \alpha based on desired memory length, where larger n corresponds to smaller \alpha and greater smoothing.[27] For example, with \alpha = 0.2 and initial \text{EMA}_0 = 0, the sequence begins as \text{EMA}_1 = 0.2 \times 10 + 0.8 \times 0 = 2 for a_1 = 10, and \text{EMA}_2 = 0.2 \times 20 + 0.8 \times 2 = 5.6 for a_2 = 20, illustrating the gradual incorporation of new values.[25]

Parameter selection for \alpha trades off between sensitivity and stability: values near 1 yield high responsiveness to recent changes, ideal for volatile series, whereas values near 0 emphasize smoothing and historical trends, reducing sensitivity to outliers.[25][26] Optimal \alpha is often determined by minimizing forecast error metrics like mean squared error on validation data.[25]
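The recursion is equally compact in code; the following sketch (plain Python, illustrative naming, using the same initialization \text{EMA}_0 = 0 as the example above) applies the update once per observation.

```python
def exponential_moving_average(values, alpha, ema0=0.0):
    """Recursive EMA: EMA_t = alpha * a_t + (1 - alpha) * EMA_{t-1}."""
    result, ema = [], ema0
    for a in values:
        ema = alpha * a + (1 - alpha) * ema   # constant memory: only the last EMA is kept
        result.append(ema)
    return result

print(exponential_moving_average([10, 20], alpha=0.2))  # [2.0, 5.6]
```

Other Weightings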
In addition to simple and exponential weightings, moving averages can employ specialized non-geometric weight functions tailored to domain-specific requirements, such as emphasizing central data points or adapting to signal characteristics. These approaches provide enhanced smoothing while mitigating issues like edge effects or sensitivity to noise variations.[28] Gaussian weighting applies a bell-shaped kernel to the data window, assigning higher weights to points near the center and tapering off symmetrically. The weights are defined by the Gaussian function w_i = e^{-(i - m)^2 / (2\sigma^2)}, where i is the position in the window, m is the center, and \sigma controls the spread. This method is particularly effective for preserving local features while reducing high-frequency noise, as implemented in signal processing toolboxes like MATLAB's smoothdata function, which uses a default window size of 4 elements unless specified otherwise.[29] In audio processing, Gaussian-weighted moving averages facilitate noise reduction by blurring impulsive disturbances without overly distorting the underlying waveform, as seen in applications for smoothing acoustic signals in real-time systems.[30]
Hann and Hamming windows, borrowed from signal processing, introduce tapered weighting to minimize boundary artifacts in the averaged output. The Hann window weights are given by w_i = 0.5 \left(1 - \cos\left(\frac{2\pi i}{k+1}\right)\right) for i = 0 to k, creating smooth transitions at the window edges that reduce sidelobe leakage compared to uniform weighting. The Hamming variant adjusts the constants to suppress the nearest sidelobe further: w_i = 0.54 - 0.46 \cos\left(\frac{2\pi i}{k}\right). These windows achieve sidelobe suppression of roughly -32 dB for Hann, substantially better than the roughly -13.5 dB of the uniform-weight simple moving average, making them suitable for cycle detection in oscillatory data.[28] In financial time series analysis, such tapered weights help in trend filtering by dampening abrupt changes at window boundaries, improving indicator stability during volatile periods.
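As an illustration of these kernel choices, the following sketch (assuming NumPy; the window length, spread, and function names are arbitrary illustrative choices rather than values from the cited sources) builds a normalized Gaussian, Hann, or Hamming kernel and applies it by convolution.

```python
import numpy as np

def kernel_moving_average(x, window, kind="gaussian", sigma=None):
    """Moving average with a Gaussian, Hann, or Hamming kernel, normalized to sum to 1."""
    if kind == "gaussian":
        i = np.arange(window)
        m = (window - 1) / 2                          # center of the window
        sigma = sigma or window / 6                   # illustrative default spread
        w = np.exp(-((i - m) ** 2) / (2 * sigma ** 2))
    elif kind == "hann":
        w = np.hanning(window)
    elif kind == "hamming":
        w = np.hamming(window)
    else:
        raise ValueError("kind must be 'gaussian', 'hann', or 'hamming'")
    w = w / w.sum()                                   # normalize so the average stays unbiased
    return np.convolve(x, w, mode="same")

noisy = np.sin(np.linspace(0, 12, 400)) + 0.3 * np.random.default_rng(0).standard_normal(400)
smooth_gauss = kernel_moving_average(noisy, 21, kind="gaussian")
smooth_hann = kernel_moving_average(noisy, 21, kind="hann")
```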
Adaptive weighting schemes dynamically adjust weights based on local data properties, such as volatility, to allocate higher emphasis to stable segments and lower to turbulent ones. Kaufman's Adaptive Moving Average (KAMA), for instance, computes a smoothing constant from the efficiency ratio—measuring directional movement relative to total variation—and applies it to recent observations, effectively increasing the weight on recent data when the series trends efficiently and decreasing it during choppy, volatile stretches.[31] This approach addresses volatility clustering in finance, where periods of high fluctuation follow each other, by customizing the moving average to track persistent trends more responsively without excessive lag.[31]
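A sketch of one common formulation of KAMA follows (plain Python; the period and smoothing-constant parameters 10, 2, and 30 are conventional defaults used here for illustration, not taken from the cited source).

```python
def kama(prices, er_period=10, fast=2, slow=30):
    """Kaufman's Adaptive Moving Average (one common formulation, for illustration)."""
    fast_sc = 2.0 / (fast + 1)                 # fastest allowed smoothing constant
    slow_sc = 2.0 / (slow + 1)                 # slowest allowed smoothing constant
    out = [prices[0]]                          # initialize with the first observation
    for t in range(1, len(prices)):
        start = max(0, t - er_period)
        change = abs(prices[t] - prices[start])
        volatility = sum(abs(prices[i] - prices[i - 1]) for i in range(start + 1, t + 1))
        er = change / volatility if volatility else 0.0   # efficiency ratio in [0, 1]
        sc = (er * (fast_sc - slow_sc) + slow_sc) ** 2    # adaptive smoothing constant
        out.append(out[-1] + sc * (prices[t] - out[-1]))  # EMA-style recursive update
    return out
```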
Compared to uniform weighting, these specialized schemes—Gaussian for central emphasis, windowed for edge tapering, and adaptive for volatility response—reduce artifacts like ringing or oversensitivity, though they may introduce minor phase distortion in transient signals. Gaussian and windowed methods yield smoother outputs with less spectral leakage, while adaptive variants excel in non-stationary environments by maintaining adaptability over fixed windows.[28]
Implementation requires normalizing weights so their sum equals 1 to ensure the average remains unbiased, often via division by the kernel integral or sum. These methods incur higher computational costs than simple averages due to per-point weight calculations—O(k) work per output point for a length-k kernel—but optimizations such as precomputed weight tables or recursive approximations mitigate this in real-time applications.[29]
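For the uniform-weight case, a recursive update of the kind alluded to above removes the per-window re-summation entirely; the sketch below (plain Python, illustrative naming) adds the incoming value and subtracts the one leaving the window, giving constant work per sample.

```python
from collections import deque

def running_sma(values, k):
    """Simple moving average maintained with a sliding sum (O(1) per new sample)."""
    window, total, out = deque(), 0.0, []
    for v in values:
        window.append(v)
        total += v
        if len(window) > k:
            total -= window.popleft()          # drop the value leaving the window
        if len(window) == k:
            out.append(total / k)
    return out

print(running_sma([1, 2, 3, 4, 5], 3))  # [2.0, 3.0, 4.0]
```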
Specialized Variants
Continuous Moving Average
The continuous moving average of a real-valued function f(t) over a time window of fixed length \tau > 0 ending at time t is defined as y(t) = \frac{1}{\tau} \int_{t-\tau}^{t} f(s) \, ds. This formulation provides a uniform weighting across the interval [t-\tau, t], smoothing the function by averaging its values continuously. It serves as the continuous-time counterpart to the discrete simple moving average, emerging as the limit when the discrete sampling interval approaches zero and the number of points increases proportionally to maintain the window length \tau.

A weighted variant analogous to the discrete exponential moving average arises in continuous time through the exponentially decaying kernel, yielding X(t) = \frac{1}{\tau} \int_{0}^{\infty} f(t - s) e^{-s / \tau} \, ds, where \tau > 0 determines the effective memory scale (with the normalization ensuring the weights integrate to 1).[32] This expression solves the first-order linear ordinary differential equation \frac{dX}{dt} = \frac{1}{\tau} \big( f(t) - X(t) \big), with initial condition X(t_0) = f(t_0) at some starting time t_0; to verify, differentiate the integral form using the Leibniz rule for parameter-dependent limits and the fundamental theorem of calculus, substitute, and simplify to obtain the differential equation.[32]

Continuous moving averages find applications in control theory for mitigating noise in precision timing and frequency systems, where the integral form filters high-frequency fluctuations while preserving low-frequency trends.[33] In physics, they enable baseline correction in signal processing for experimental setups, such as particle detectors, by averaging over short windows to subtract slow drifts from raw waveforms. These methods also approximate components in Kalman filtering for continuous-time stochastic processes, particularly self-similar ones like fractional Brownian motion, by representing moving average integrals as state updates in the filter equations.[34]

Specific properties distinguish continuous moving averages in analysis. If f(t) is differentiable, then y(t) is differentiable, with derivative y'(t) = \frac{1}{\tau} \big( f(t) - f(t - \tau) \big) obtained via the fundamental theorem of calculus applied to the integral bounds. For a constant function f(t) = c, the moving average remains y(t) = c, preserving the value exactly. For a linear trend f(t) = k t with k > 0, the moving average is y(t) = k \left( t - \frac{\tau}{2} \right); to derive this, compute the integral \int_{t-\tau}^{t} k s \, ds = k \left[ \frac{s^2}{2} \right]_{t-\tau}^{t} = k \left( \frac{t^2}{2} - \frac{(t - \tau)^2}{2} \right) = k \tau \left( t - \frac{\tau}{2} \right), then divide by \tau to yield the lagged form, introducing a phase delay of \tau / 2.
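The linear-trend result is easy to confirm numerically; the short check below (assuming NumPy, with arbitrary illustrative values of k, \tau, and t) approximates the defining integral and compares it with k(t - \tau/2).

```python
import numpy as np

# Numerical check that the continuous moving average of f(t) = k*t over a window
# of length tau equals k*(t - tau/2); the constants are arbitrary illustrative values.
k, tau, t = 2.0, 4.0, 10.0
s = np.linspace(t - tau, t, 10001)
y = np.trapz(k * s, s) / tau        # (1/tau) * integral of f(s) ds over [t - tau, t]
print(y, k * (t - tau / 2))         # both approximately 16.0
```

Moving Median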
The moving median is a robust statistical technique used for smoothing data in a time series or sequence by applying the median within a sliding window of fixed size k. At each position i, it computes the median of the k consecutive observations centered around or including i, providing a non-parametric measure of central tendency that slides across the data to produce a smoothed series.[35] To compute the moving median, the values within the window are sorted in ascending order; for odd k, the middle value (at position (k+1)/2) is selected as the median, while for even k, the average of the two central values (at positions k/2 and k/2 + 1) is taken. This process repeats for each overlapping window, typically requiring sorting at each step, which incurs a computational complexity of O(k \log k) per window in naive implementations.[35]

A primary advantage of the moving median is its insensitivity to outliers, with a breakdown point of 50%, meaning it remains reliable even if up to half the data in the window are contaminated, unlike the arithmetic mean's 0% breakdown point. This robustness makes it particularly effective for preserving sharp changes in the data while suppressing noise, as it relies on order statistics rather than summation.[35] However, the moving median's non-linearity complicates mathematical analysis, such as deriving closed-form properties or frequency responses, and its higher computational demands compared to moving averages can be a drawback for large datasets or real-time applications. Additionally, it may produce jagged smoothed curves and handle boundary points less effectively without specialized adjustments.[35]

For example, consider the data sequence [1, 10, 2, 3, 100] with window size k=3: the moving medians starting from the second position are 2 (median of 1, 10, 2), 3 (median of 10, 2, 3), and 3 (median of 2, 3, 100), effectively ignoring the outlier 100 and yielding a smoother trend of approximately [2, 3, 3].[35]

Variants include the weighted moving median, which assigns different weights to window elements before selecting the median (e.g., via weighted order statistics), and the running median in signal processing, optimized for efficient incremental updates in streaming data to reduce sorting overhead.[36][37]
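A naive sketch (plain Python, using the standard library's statistics.median) reproduces the example and makes the per-window sorting cost explicit.

```python
import statistics

def moving_median(values, k):
    """Median of each length-k sliding window (sorting each window: O(k log k) per step)."""
    return [statistics.median(values[i:i + k]) for i in range(len(values) - k + 1)]

# Reproduces the example above: the outlier 100 does not distort the smoothed values.
print(moving_median([1, 10, 2, 3, 100], 3))  # [2, 3, 3]
```

Applications in Modeling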
Time Series Smoothing
Moving averages serve as fundamental tools for smoothing time series data, effectively reducing short-term fluctuations and noise to reveal underlying structures such as trends and cycles. By averaging values over a sliding window, these filters decompose a series into a smoothed component—often interpreted as the trend—and a residual component capturing irregular variations. This approach is particularly valuable in fields like economics and meteorology, where raw data often includes random errors that obscure meaningful patterns.[25]

In trend estimation, moving averages act as low-pass filters to isolate the long-term trend from a time series, enabling the decomposition y_t = T_t + R_t, where T_t represents the trend estimated via the moving average and R_t is the residual. For instance, a simple moving average applied symmetrically around each point provides an estimate of the trend-cycle component, which can then be subtracted from the original series to obtain residuals for further analysis. This method assumes the trend evolves gradually, making it suitable for stationary or slowly varying processes.[38]

For seasonal adjustment, moving averages are combined with differencing techniques to remove periodic fluctuations, as exemplified in the X-11 method developed by the U.S. Census Bureau. The X-11 procedure employs a series of symmetric moving averages—such as 3x3, 3x5, and 3x9 filters for monthly data—to estimate the trend and seasonal components iteratively, followed by differencing to stabilize the series and refine adjustments. This approach has been a standard for official statistics, though it has been succeeded by X-12-ARIMA and the current X-13ARIMA-SEATS method, which incorporates ARIMA modeling for improved forecasting and adjustment, enhancing the interpretability of economic indicators like unemployment rates.[39][40]

In anomaly detection, deviations from a moving average baseline signal potential outliers or unusual events in the time series, as points significantly exceeding a threshold (e.g., two standard deviations) indicate breaks from the expected smoothed behavior. This technique is applied in monitoring systems to detect anomalies by establishing a normal profile with the moving average and flagging deviations in residuals.

A prominent application in finance involves the 50-day simple moving average (SMA) to gauge stock price trends, where sustained positions above this line suggest bullish momentum. Crossovers between short-term and long-term SMAs generate trading signals: a golden cross occurs when the 50-day SMA rises above the 200-day SMA, indicating potential upward trends, while a death cross—its inverse—signals bearish reversals, as observed in major indices like the S&P 500. These patterns aid investors in timing entries and exits, though empirical studies show mixed predictive power depending on market conditions.[41]

In signal processing, moving averages function as finite impulse response (FIR) filters to attenuate high-frequency noise while preserving lower-frequency components essential for analysis. A uniform-weight moving average of length N convolves the input signal with a rectangular kernel, effectively acting as a low-pass FIR filter with a frequency response that rolls off gradually, making it ideal for applications like audio denoising or sensor data cleaning.[10]

Despite their utility, moving averages have limitations, including over-smoothing that can obscure genuine short-term variations or structural breaks in the data.
The choice between simple and exponential types depends on data stationarity: simple averages suit stable series but lag in responsiveness, while exponential variants weight recent observations more heavily for non-stationary data, though they may amplify noise if the decay parameter is poorly tuned.[42]

Software implementations facilitate widespread use of moving averages for time series smoothing. In Python, the pandas library provides the rolling() method for efficient computation of simple or weighted averages on DataFrames. R's forecast package includes the ma() function for straightforward application to univariate series. MATLAB offers the movmean() function in its core toolbox for vectorized operations on numeric arrays.
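As a brief illustration of smoothing and residual-based anomaly flagging with the pandas rolling() method mentioned above, the sketch below uses synthetic data and an illustrative two-standard-deviation threshold; the variable names and parameters are arbitrary choices, not prescriptions from the cited sources.

```python
import numpy as np
import pandas as pd

# Synthetic series: a smooth cycle plus noise, with one injected outlier.
rng = np.random.default_rng(0)
series = pd.Series(np.sin(np.linspace(0, 12, 300)) + 0.2 * rng.standard_normal(300))
series.iloc[150] += 3.0

# Centered 21-point rolling mean as the smoothed (trend) component.
trend = series.rolling(window=21, center=True).mean()
residual = series - trend

# Flag points whose residual exceeds two rolling standard deviations.
threshold = 2 * residual.rolling(window=21, center=True).std()
anomalies = residual.abs() > threshold
print(series[anomalies])
```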