
Change detection

Change detection, also known as change point detection, is the process of identifying times when the statistical properties of a data-generating process, such as a time series or signal, undergo a significant alteration. This involves both determining whether a change has occurred and estimating its location and nature, often in the presence of noise or limited data. The field draws from statistics, signal processing, and machine learning, with methods categorized as online (real-time) or offline (post-hoc) analysis. Change detection has wide applications, including monitoring environmental changes via remote sensing, detecting anomalies in financial time series, tracking scene alterations in computer vision, and analyzing shifts in natural language data. It also extends to cognitive and psychological studies, where it examines perceptual limits in detecting visual changes. These approaches highlight the interdisciplinary nature of the topic, addressing challenges in dynamic systems across domains.

Fundamentals

Definition and scope

Change detection, also referred to as change point detection, is the statistical process of identifying points in time when the probability distribution or key properties—such as the mean, variance, or higher-order moments—of a data-generating process undergo abrupt or gradual shifts. This applies to sequential data streams, where the goal is to pinpoint transitions that may indicate underlying structural alterations in the system producing the observations. The approach is foundational in analyzing time series, enabling the segmentation of homogeneous periods separated by these transition points. The scope of change detection extends to both univariate cases, involving single-variable sequences, and multivariate scenarios, where multiple correlated variables are monitored simultaneously for coordinated shifts. It differs from related techniques like anomaly detection, which targets isolated deviations or outliers from a stable norm, whereas change detection emphasizes persistent, systemic modifications to the data's generative mechanism. For instance, in monitoring a signal with known statistical properties, a simple threshold-based method can flag a change when variability exceeds predefined bounds, signaling a transition to a new regime. Originating in statistical quality control during the early 20th century to address variability in production and manufacturing systems, change detection has evolved into an interdisciplinary tool. Today, it informs applications in machine learning for detecting concept drift in deployed models.
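The threshold-based idea mentioned above can be sketched in a few lines of code. The following Python example is an illustrative sketch only—the window length, baseline variance, and threshold factor are assumed values, not prescribed by any particular method—and simply raises an alarm once the variance of a trailing window exceeds a multiple of the known baseline variance.

```python
# Illustrative sketch only: window length, baseline variance (sigma0**2), and
# the threshold factor are assumed values, not part of a specific method.
import numpy as np

def variance_threshold_alarm(x, window=50, sigma0=1.0, factor=3.0):
    """Return the index at which the trailing-window variance first exceeds
    factor * sigma0**2, or None if no alarm is raised."""
    for t in range(window, len(x) + 1):
        if np.var(x[t - window:t]) > factor * sigma0**2:
            return t - 1
    return None

rng = np.random.default_rng(0)
signal = np.concatenate([rng.normal(0, 1.0, 300),    # pre-change regime
                         rng.normal(0, 2.5, 200)])   # variance increase at index 300
print(variance_threshold_alarm(signal))              # alarm shortly after index 300
```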

Historical development

The origins of change detection trace back to the early 20th century, when Walter A. Shewhart developed control charts in 1924 at Bell Telephone Laboratories to monitor variation in manufacturing processes and distinguish between common and special causes of defects. These charts laid foundational principles for detecting shifts in process means, influencing quality control practices worldwide. Efforts during World War II, including Abraham Wald's development of the sequential probability ratio test for decision-making in wartime applications, further advanced statistical approaches to monitoring changes. In the mid-20th century, sequential analysis advanced the field with E. S. Page's introduction of the cumulative sum (CUSUM) procedure in 1954, designed for efficient detection of small shifts in continuous production processes through cumulative deviations from a target value. During the 1960s, Albert N. Shiryaev pioneered Bayesian methods for quickest change-point detection, formulating rules under uncertainty in independent and identically distributed sequences, which extended sequential probability ratio tests to probabilistic frameworks. The 1980s and 1990s marked an expansion into signal processing and control systems, culminating in Michèle Basseville and Igor V. Nikiforov's 1993 book Detection of Abrupt Changes: Theory and Application, which unified detection algorithms, performance analysis, and applications across engineering domains such as fault diagnosis and condition monitoring. With the proliferation of high-velocity data streams in the 2000s, the field adapted to online environments, as seen in Charu C. Aggarwal and colleagues' 2004 meta-algorithm for identifying distributional shifts in streaming data, addressing scalability concerns in database and data mining systems. In parallel, machine learning integrations gained prominence, with hidden Markov models applied to model non-stationary processes and detect regime shifts in sequential data, building on Bayesian foundations for hidden state transitions. Post-2010, deep learning has driven further innovations, enabling end-to-end learning for change detection in complex, high-dimensional data such as images and time series, with convolutional and recurrent networks improving accuracy in non-i.i.d. settings.

Theoretical foundations

Statistical models

Statistical models in change detection provide probabilistic frameworks to characterize data distributions before and after a change, often assuming parametric forms such as Gaussian distributions for the pre-change and post-change regimes. These models typically posit that the data-generating process is piecewise stationary, meaning the statistical properties remain constant within segments separated by change points. For instance, in univariate settings, observations X_1, \dots, X_n are modeled as independent and identically distributed (i.i.d.) from a Gaussian distribution \mathcal{N}(\mu_0, \sigma_0^2) before the change and \mathcal{N}(\mu_1, \sigma_1^2) after, facilitating inference on shifts in mean or variance. Change types are distinguished as abrupt, involving a single change point \tau where the distribution shifts instantaneously, or gradual, encompassing multiple change points or trending shifts that evolve smoothly over time. The detection problem is commonly formulated as a hypothesis test: H_0 posits no change (a stationary process with parameters \theta), while H_1 assumes a change at \tau with distinct pre- and post-change parameters \theta_{\text{pre}} and \theta_{\text{post}}. A key statistic is the likelihood ratio for a candidate change point \tau, \Lambda(\tau) = \frac{\sup_{\theta_{\text{pre}}, \theta_{\text{post}}} L(\mathbf{X} \mid \theta_{\text{pre}}, \theta_{\text{post}}, \tau)}{\sup_{\theta} L(\mathbf{X} \mid \theta)}, where L denotes the likelihood function, and changes are detected by thresholding \Lambda(\tau) or its supremum over \tau. In Bayesian formulations, the posterior probability of a change at \tau is computed by integrating over parameter priors, such as P(\tau \mid \mathbf{X}) \propto P(\mathbf{X} \mid \tau) P(\tau), often using conjugate priors for Gaussian parameters to yield closed-form updates. Multivariate extensions incorporate dependencies among variables using vector autoregressive (VAR) models, where the data follow \mathbf{X}_t = \Phi \mathbf{X}_{t-1} + \boldsymbol{\epsilon}_t with \boldsymbol{\epsilon}_t \sim \mathcal{N}(\mathbf{0}, \Sigma) before the change, and an altered transition matrix \Phi or covariance \Sigma after. Low-rank or sparse structures on \Phi are assumed to handle high dimensionality, enabling detection of changes in temporal dependencies. These models rely on stationarity within the pre- and post-change segments, which simplifies likelihood computations but limits applicability to non-stationary or heavy-tailed data. Handling non-i.i.d. observations requires extensions such as autoregressive assumptions or kernel-based methods, though these increase computational demands and sensitivity to model misspecification.
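For the special case of a single mean shift in i.i.d. Gaussian data with known common variance, the likelihood ratio above has a closed form: 2 \log \Lambda(\tau) = \frac{\tau (n - \tau)}{n} \frac{(\bar{x}_{1:\tau} - \bar{x}_{\tau+1:n})^2}{\sigma^2}. The sketch below scans all candidate change points under these assumptions; it is a minimal illustration rather than a general-purpose implementation.

```python
# Minimal sketch under the stated assumptions (i.i.d. Gaussian, known common
# variance, at most one mean shift); not a general-purpose implementation.
import numpy as np

def gaussian_mean_shift_glr(x, sigma=1.0):
    """Scan all candidate change points and return (tau_hat, max statistic)."""
    n = len(x)
    stats = np.full(n, -np.inf)
    for tau in range(1, n):                    # split into x[:tau] and x[tau:]
        m1, m2 = x[:tau].mean(), x[tau:].mean()
        # 2 * log Lambda(tau) for a known-variance Gaussian mean shift
        stats[tau] = tau * (n - tau) / n * (m1 - m2) ** 2 / sigma**2
    tau_hat = int(np.argmax(stats))
    return tau_hat, stats[tau_hat]

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0.0, 1, 200), rng.normal(1.0, 1, 200)])
print(gaussian_mean_shift_glr(x))              # estimated change point near 200
```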

Performance evaluation

Performance evaluation of change detection techniques centers on metrics that balance timely detection of true changes against the risk of false alarms, ensuring reliability in diverse scenarios. A core metric is the average run length (ARL) to alarm, which quantifies both false alarm propensity and detection speed: the in-control ARL (ARL_0) represents the expected number of observations before a false alarm under stable conditions, ideally large to minimize Type I errors (false alarms), while the post-change ARL (ARL_1) measures the expected delay in detecting a true change, ideally small to maximize detection power by minimizing Type II errors (missed detections). These metrics originated in sequential analysis for statistical process control and are foundational for assessing algorithms like the cumulative sum (CUSUM) test, where ARL_0 is tuned to a target value (e.g., 370 for industrial standards) to control false alarms at approximately 1/370 per run. To facilitate threshold selection and performance comparison, evaluation frameworks utilize receiver operating characteristic (ROC) curves, which plot the true positive rate (sensitivity) against the false positive rate across varying thresholds, revealing inherent trade-offs in algorithm design. The area under the ROC curve (AUC) summarizes this trade-off in a single number, with AUC values exceeding 0.9 indicating strong performance in distinguishing changes from noise, as demonstrated in benchmarks of methods like relative unconstrained least-squares importance fitting (RuLSIF) achieving AUCs of 0.94–0.97 on simulated and real datasets. Simulation-based testing under known change scenarios, such as abrupt mean shifts in Gaussian processes or gradual drifts in time series, provides a controlled setting in which to compute ARL and ROC metrics, often using Monte Carlo methods to average over multiple realizations for statistical reliability. Standard benchmarks include abrupt shift simulations mirroring industrial monitoring, where offline methods process entire sequences post-hoc, while online variants adhere to sequential probability ratio test (SPRT) bounds for real-time applicability. Comparative criteria extend beyond error rates to include computational complexity and robustness, with efficient algorithms targeting linear time and space for streaming use versus O(n^2) for exhaustive offline scans, enabling deployment in large-scale applications. Robustness is gauged by degradation under noise addition or model misspecification, such as violating distributional assumptions in parametric tests, where nonparametric alternatives maintain stable ARL across perturbations. A persistent challenge lies in high-dimensional settings, where the curse of dimensionality amplifies false alarms, necessitating metrics that penalize over-sensitivity while preserving specificity, often addressed through dimensionality reduction or ensemble evaluations in simulation frameworks.
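When closed-form expressions are unavailable, ARL_0 and ARL_1 are often estimated by Monte Carlo simulation. The hedged sketch below estimates both for a one-sided CUSUM on unit-variance Gaussian data; the reference value, threshold, number of runs, and simulation horizon are illustrative assumptions chosen to keep the computation small.

```python
# Hedged sketch: Monte Carlo estimates of ARL_0 and ARL_1 for a one-sided
# CUSUM on unit-variance Gaussian data. kappa, h, runs, and horizon are
# illustrative assumptions kept small so the simulation stays cheap.
import numpy as np

def cusum_run_length(stream, kappa, h, mu0=0.0):
    s = 0.0
    for t, x in enumerate(stream, start=1):
        s = max(0.0, s + (x - mu0) - kappa)
        if s > h:
            return t              # alarm time
    return len(stream)            # censored: no alarm within the horizon

def estimate_arl(shift, kappa=0.5, h=5.0, runs=500, horizon=20000, seed=0):
    rng = np.random.default_rng(seed)
    lengths = [cusum_run_length(rng.normal(shift, 1.0, horizon), kappa, h)
               for _ in range(runs)]
    return float(np.mean(lengths))

print("ARL_0 (no change):  ", estimate_arl(shift=0.0))   # large is good
print("ARL_1 (shift = 1.0):", estimate_arl(shift=1.0))   # small is good
```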

Algorithms

Online change detection

Online change detection encompasses algorithms designed for processing sequential data streams in real time, where observations arrive incrementally and decisions must be made without revisiting prior data. This approach is grounded in sequential hypothesis testing, which balances the trade-off between detecting changes promptly and minimizing false alarms by accumulating evidence from partial observations. Unlike batch methods, it enables continuous monitoring by updating statistics on-the-fly, making it ideal for applications requiring immediate responses. A foundational algorithm is the Cumulative Sum (CUSUM) procedure, introduced by Page in 1954 for detecting shifts in process means. The CUSUM statistic is recursively computed as
S_t = \max\left(0, S_{t-1} + (x_t - \mu_0) - \kappa\right),
where x_t is the observation at time t, \mu_0 is the pre-change mean, and \kappa > 0 is a reference value defining the minimum detectable shift (indifference zone). A change is declared when S_t exceeds a predefined threshold h, signaling a deviation from the null hypothesis of no change. This method is minimax optimal for certain detection criteria, minimizing the worst-case detection delay under a constraint on the false alarm rate.
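A direct implementation of this recursion is straightforward. The following sketch assumes a known pre-change mean and user-chosen values of \kappa and h (both of which must be tuned, as discussed below); it is illustrative rather than a reference implementation.

```python
# Illustrative CUSUM sketch following the recursion above; the pre-change mean
# is assumed known, and kappa and h are user-chosen tuning parameters.
import numpy as np

def cusum_online(stream, mu0, kappa, h):
    """Yield (t, S_t, alarm) for each incoming observation."""
    s = 0.0
    for t, x in enumerate(stream):
        s = max(0.0, s + (x - mu0) - kappa)
        yield t, s, s > h

rng = np.random.default_rng(2)
data = np.concatenate([rng.normal(0.0, 1, 500),    # in-control segment
                       rng.normal(0.8, 1, 500)])   # mean shift at index 500
alarm_time = next((t for t, s, alarm in
                   cusum_online(data, mu0=0.0, kappa=0.4, h=8.0) if alarm), None)
print("alarm raised at index", alarm_time)         # shortly after the shift
```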
Another key procedure is the Shiryaev-Roberts (SR) algorithm, which extends the Bayesian quickest detection framework originally developed by Shiryaev in the 1960s. The SR statistic incorporates a prior probability on the change time, updating as R_t = (1 + R_{t-1}) \cdot \frac{f_1(x_t)}{f_0(x_t)}, where f_0 and f_1 are the pre- and post-change densities, starting from an initial R_0 > 0. It signals a change upon crossing a threshold and is particularly effective when the change time follows a geometric prior, offering improved performance over CUSUM in Bayesian settings by accounting for uncertainty in change onset. Extensions to these core methods address more complex scenarios. Adaptive thresholds adjust dynamically to varying change magnitudes or noise levels, improving robustness by estimating optimal boundaries from incoming data rather than using fixed values. For instance, procedures that recalibrate thresholds based on recent observations reduce sensitivity to initial assumptions. Kernel-based variants, such as the kernel CUSUM (KCUSUM), enable non-parametric detection by mapping observations into a reproducing kernel Hilbert space, allowing detection of arbitrary distributional shifts without assuming specific forms like normality; this has been demonstrated to achieve near-optimal detection delays in high-dimensional settings. The primary advantages of online change detection algorithms include low computational latency, as updates require only constant time per observation, and memory efficiency, storing only the current statistic rather than the entire history. These properties make them suitable for high-velocity streams, such as fraud detection in financial transactions, where rapid identification of anomalous patterns prevents losses—for example, scalable methods like MMDEW have been applied to detect shifts in streams with polylogarithmic complexity. Performance can be evaluated using metrics like average run length (ARL), with CUSUM and SR often achieving ARLs under 100 for moderate shifts while maintaining false alarm constraints. However, these methods are sensitive to parameter tuning, particularly the reference value \kappa in CUSUM, which defines the indifference region and can lead to delayed detection if misspecified for the expected change size, or excessive false alarms if set too small. Similarly, threshold selection in the SR procedure requires careful calibration to balance delay and error rates, often necessitating simulation-based design that increases implementation complexity.
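The Shiryaev-Roberts recursion can likewise be sketched for the simple case of a known Gaussian mean shift from N(0, 1) to N(\delta, 1); the detection threshold used here is an arbitrary illustrative value rather than one derived from a false-alarm constraint.

```python
# Sketch of the Shiryaev-Roberts recursion for a known shift from N(0, 1) to
# N(delta, 1); the threshold value is an arbitrary illustrative assumption.
import numpy as np
from scipy.stats import norm

def shiryaev_roberts(stream, delta, threshold, r0=0.0):
    r = r0
    for t, x in enumerate(stream):
        likelihood_ratio = norm.pdf(x, loc=delta) / norm.pdf(x, loc=0.0)
        r = (1.0 + r) * likelihood_ratio       # R_t = (1 + R_{t-1}) * f1/f0
        if r > threshold:
            return t                           # declare a change
    return None

rng = np.random.default_rng(3)
data = np.concatenate([rng.normal(0.0, 1, 400), rng.normal(1.0, 1, 400)])
print(shiryaev_roberts(data, delta=1.0, threshold=1e4))   # alarm soon after 400
```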

Offline change detection

Offline change detection involves retrospective analysis of complete datasets to identify change points where the underlying statistical properties of the data shift, allowing for global optimization over the entire sequence rather than sequential processing. Unlike online methods that provide initial alerts on streaming data, offline approaches leverage full data access to achieve precise localization of multiple change points through techniques such as dynamic programming or recursive segmentation. The core principles center on formulating the problem as a segmentation task that minimizes a cost function measuring segment homogeneity, often solved via exact or greedy search methods. For a sequence of length n, change point estimation typically minimizes a cost C(\tau) for a candidate change point \tau, defined as C(\tau) = \text{cost}(y_{1:\tau}) + \text{cost}(y_{\tau+1:n}), where \text{cost}(\cdot) quantifies deviation from a fitted model within each segment, such as the sum of squared errors for mean shifts. This extends to multiple change points by summing costs over all segments plus a penalty on the number of changes to balance fit and complexity. Non-parametric tests, like the Wilcoxon rank-sum test, are employed to detect distributional shifts without assuming normality, comparing ranks between potential pre- and post-change segments. Key algorithms include binary segmentation, a recursive approach that first identifies the most significant change point across the entire sequence by maximizing a test statistic (e.g., a likelihood ratio or the reduction in cost), then splits the data at that point and repeats on the subsegments until no further significant changes are found. This method is computationally efficient with O(n \log n) complexity but can miss closely spaced change points due to its greedy nature. Another prominent algorithm is PELT (Pruned Exact Linear Time), which uses dynamic programming to exactly solve for the optimal segmentation under a linear penalty, pruning unlikely partial solutions to achieve O(n) complexity in favorable settings, even with multiple change points. Offline methods offer higher accuracy for detecting multiple changes compared to online counterparts, as they avoid sequential decision errors and enable refined global estimates. For instance, in genomic sequence analysis, circular binary segmentation—a variant of binary segmentation—has been applied to array-based DNA copy number data to identify regions of aberrant copy number, demonstrating robust performance in segmenting cancer-related alterations. However, these approaches incur high computational costs for long sequences, particularly with exhaustive dynamic programming (O(n^2) without optimizations), though techniques like pruning in PELT mitigate this by discarding suboptimal paths early.
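A minimal greedy binary segmentation can be written directly from the cost formulation above. The sketch below uses a sum-of-squared-errors cost for mean shifts; the penalty value and minimum segment size are illustrative assumptions, and production implementations (such as PELT) use more principled penalties and pruning.

```python
# Greedy binary segmentation sketch with a sum-of-squared-errors cost for mean
# shifts; the penalty and minimum segment size are illustrative assumptions.
import numpy as np

def sse(segment):
    return float(np.sum((segment - segment.mean()) ** 2)) if len(segment) else 0.0

def binary_segmentation(x, pen=10.0, min_size=5):
    """Return sorted change point indices found by recursive splitting."""
    def split(lo, hi):
        base = sse(x[lo:hi])
        best_gain, best_tau = 0.0, None
        for tau in range(lo + min_size, hi - min_size):
            gain = base - sse(x[lo:tau]) - sse(x[tau:hi])   # cost reduction
            if gain > best_gain:
                best_gain, best_tau = gain, tau
        if best_tau is None or best_gain <= pen:            # stop splitting
            return []
        return split(lo, best_tau) + [best_tau] + split(best_tau, hi)
    return split(0, len(x))

rng = np.random.default_rng(4)
x = np.concatenate([rng.normal(m, 1, 150) for m in (0, 3, 0)])
print(binary_segmentation(x))        # approximately [150, 300]
```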

Machine learning and deep learning approaches

Recent advances as of 2025 have integrated machine learning (ML) and deep learning (DL) into change detection, particularly for handling high-dimensional, non-stationary data in complex scenarios. These methods often leverage neural networks to learn change patterns from data, improving performance over classical techniques in domains such as anomaly detection and forecasting. For online detection, approaches such as random Fourier features enable kernel-based methods with optimal detection properties in streaming settings, achieving low-latency performance without full data access. DL models, including recurrent neural networks (RNNs) and transformers, have been adapted for anomaly detection by predicting future states and flagging deviations, showing superior results in multivariate streams compared to classical statistical baselines. In offline settings, DL-based segmentation uses convolutional or graph neural networks to identify multiple change points by modeling temporal dependencies, often outperforming PELT in noisy, irregular data such as genomic sequences or climate records. For example, autoencoder-based methods detect distributional shifts by reconstructing segments and measuring reconstruction errors. These techniques address the limitations of parametric assumptions but require larger datasets and more computational resources.
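The autoencoder idea mentioned above can be illustrated with a small reconstruction-error detector. The PyTorch sketch below is a toy illustration rather than any published method: the window length, architecture, training schedule, and the mean-plus-three-standard-deviations alert rule are all assumptions.

```python
# Toy reconstruction-error detector (assumptions: window length, architecture,
# training schedule, and the mean + 3*std alert rule are all illustrative).
import numpy as np
import torch
from torch import nn

def make_windows(x, w):
    # Non-overlapping windows of length w as a float32 tensor.
    return torch.tensor(np.lib.stride_tricks.sliding_window_view(x, w)[::w],
                        dtype=torch.float32)

rng = np.random.default_rng(5)
reference = rng.normal(0, 1, 2000)                                    # pre-change regime
shifted = rng.normal(0, 1, 2000) + 3 * np.sin(np.arange(2000) / 20)   # drifted regime

w = 32
model = nn.Sequential(nn.Linear(w, 8), nn.ReLU(), nn.Linear(8, w))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
train = make_windows(reference, w)

for _ in range(200):                      # fit the autoencoder on reference data
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(train), train)
    loss.backward()
    opt.step()

with torch.no_grad():
    err = lambda batch: ((model(batch) - batch) ** 2).mean(dim=1)
    baseline = err(train)
    threshold = baseline.mean() + 3 * baseline.std()
    flagged = (err(make_windows(shifted, w)) > threshold).float().mean()
print(f"fraction of shifted windows flagged: {flagged:.2f}")
```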

Applications

Time series and signal processing

In time series analysis and signal processing, change detection involves identifying abrupt or gradual shifts in the statistical properties of sequential data, such as means, variances, or spectral characteristics, which is essential for monitoring dynamic systems like economic indicators or environmental signals. This domain emphasizes handling the inherent dependencies in sequential data, where observations are not independent but exhibit temporal correlations. Autoregressive integrated moving average (ARIMA) models are widely used to address autocorrelation by modeling the data as a combination of autoregressive (AR) components for past-value dependencies, differencing to achieve stationarity, and moving average (MA) terms for error correlations. Changes can manifest as shifts in trend (long-term direction), level (baseline mean), or seasonality (periodic patterns), often requiring intervention analysis within ARIMA frameworks to isolate and quantify these alterations post-detection. Key techniques for detecting localized changes include wavelet-based decomposition, which transforms the signal into time-frequency representations to isolate transient events without assuming global stationarity. For instance, wavelet footprints enable efficient identification of change points by capturing multi-scale features in noisy signals. Adaptations of the Fourier transform, such as the short-time Fourier transform (STFT), are employed to detect frequency shifts by providing a time-localized spectrum, allowing for the identification of evolving periodic components in non-stationary signals. These methods complement general algorithms like CUSUM or PELT by focusing on the spectral domain, enhancing sensitivity to subtle shifts in oscillatory patterns. Practical applications demonstrate the utility of these approaches in real-world scenarios. In seismic monitoring, the STFT is applied to detect earthquakes by analyzing spectral changes in waveform energy, where sudden increases in high-frequency components signal event onset, enabling rapid alerts for safety. Similarly, in financial time series, volatility shifts are identified to pinpoint stock market crashes; for example, abrupt increases in return variance can flag regime changes using change point detection on indices like the S&P 500. A notable case is the 2007–2009 global financial crisis, where change point analysis on banking sector data revealed structural breaks starting in 2007Q4, with peak impacts from 2009Q3 to 2010Q2, highlighting volatility surges tied to subprime mortgage failures. Significant challenges arise from non-stationarity, where statistical properties evolve over time, complicating model assumptions, and from high noise levels that mask true changes, often requiring robust preprocessing like detrending or filtering. State-space modeling with Kalman filters addresses these issues by providing recursive state estimation in dynamic systems, updating predictions to detect deviations indicative of changes while accounting for process and measurement noise. Post-detection outcomes include enhanced forecasting accuracy; for instance, segmenting a series at detected change points allows refitting of models like ARIMA to each regime.
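The STFT-based approach to spectral change detection can be illustrated with a short sketch: compute a time-frequency representation, track the power in a frequency band of interest, and flag frames where it jumps well above a quiet-period baseline. The synthetic signal, band limits, and threshold factor below are illustrative assumptions.

```python
# Illustrative sketch: track band-limited spectral power with the STFT and
# flag frames where it jumps above a baseline. The synthetic signal, band
# limits, and threshold factor are assumptions for demonstration only.
import numpy as np
from scipy.signal import stft

fs = 500.0
t = np.arange(0, 20, 1 / fs)
rng = np.random.default_rng(6)
x = rng.normal(0, 0.1, t.size)
x[t > 12] += np.sin(2 * np.pi * 60 * t[t > 12])    # 60 Hz component appears at t = 12 s

f, frames, Z = stft(x, fs=fs, nperseg=256)
band = (f > 40) & (f < 100)                        # frequency band of interest
band_power = (np.abs(Z[band]) ** 2).mean(axis=0)   # mean power per frame
baseline = band_power[frames < 10].mean()          # quiet period before the event
onset = frames[band_power > 10 * baseline][0]
print(f"spectral change detected near t = {onset:.1f} s")
```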

Computer vision and remote sensing

In computer vision and remote sensing, change detection involves analyzing bi-temporal or multi-temporal images to identify alterations in visual scenes, such as land cover shifts or structural modifications, often at the pixel level to capture fine-grained spatial variations. This process typically compares co-registered image pairs acquired at different times, where pixel-level comparison quantifies differences in intensity or features between the baseline image I_1 and the subsequent image I_2. Key challenges include handling illumination variations, which can introduce false positives unrelated to actual scene changes, and registration errors from geometric misalignment between images, necessitating preprocessing steps such as co-registration, affine transformations, and radiometric normalization. Traditional techniques for change detection in this domain emphasize simple algebraic operations on image pairs. Image differencing computes the difference map \Delta I = I_2 - I_1, followed by thresholding to classify changed pixels, a method effective for detecting abrupt changes but sensitive to radiometric inconsistencies. Change vector analysis extends this by projecting differences into a multi-dimensional feature space (e.g., spectral bands), where the magnitude and direction of the change vector indicate the type and extent of alterations, enabling differentiation between gradual and thematic changes. More recent advancements leverage deep learning, particularly Siamese convolutional neural networks (CNNs), which process paired images through shared-weight branches to extract hierarchical features and output semantic change maps, improving robustness to noise and enabling detection of complex, object-level transformations. Practical applications highlight the utility of these methods in environmental and urban monitoring. For instance, deforestation tracking in the Amazon has utilized Landsat satellite data, particularly in the post-2010 period, where change vector analysis on multi-temporal imagery revealed annual losses exceeding 6,000 km² in some years, aiding policy enforcement through the Brazilian PRODES program. Similarly, urban expansion detection employs object-based methods that segment images into superpixels before applying change metrics, as demonstrated in studies of rapidly growing cities, where such approaches identified a 20-30% increase in built-up areas over decadal scales by integrating spectral and textural features. Significant challenges persist in operationalizing these techniques, particularly seasonal variations that alter vegetation phenology and cloud cover that obscures up to 50% of tropical imagery, requiring multi-date compositing or radar data integration to mitigate gaps. Post-change classification further complicates analysis, as initial binary change maps must be labeled into land-use categories (e.g., forest converted to agricultural land) using supervised classifiers, often achieving accuracies around 80-85% but demanding ground-truth data for validation. Advancements in multi-sensor fusion address these limitations by incorporating LiDAR data for 3D change detection, where elevation-model differencing quantifies volumetric shifts in urban or forested environments, such as building height increases or canopy volume losses, with sub-meter precision in controlled studies. Validation of these methods commonly employs the kappa coefficient, which measures agreement between detected and reference changes beyond chance, typically targeting values above 0.7 for reliable assessments in operational applications.
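Image differencing with thresholding, the simplest of the techniques above, can be sketched as follows; the mean-plus-k-standard-deviations threshold on the absolute difference map is an illustrative choice, and real pipelines would first apply co-registration and radiometric normalization.

```python
# Minimal image-differencing sketch for co-registered grayscale images; the
# mean + k*std threshold on |I2 - I1| is an illustrative choice, and real
# pipelines would first apply co-registration and radiometric normalization.
import numpy as np

def difference_change_mask(img1, img2, k=3.0):
    """Return a boolean change mask from two co-registered grayscale images."""
    diff = np.abs(img2.astype(float) - img1.astype(float))
    threshold = diff.mean() + k * diff.std()
    return diff > threshold

rng = np.random.default_rng(7)
before = rng.normal(100, 5, (64, 64))
after = before + rng.normal(0, 5, (64, 64))   # radiometric noise between dates
after[20:30, 20:30] += 60                     # a localized change (e.g., new structure)
mask = difference_change_mask(before, after)
print("changed pixels detected:", int(mask.sum()))   # roughly the 10x10 patch
```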

Natural language processing

Change detection in natural language processing (NLP) focuses on identifying shifts in large text corpora, such as evolving patterns in language use or topic drifts across documents. These shifts can manifest in word distributions, syntactic structures, or sentiment polarities, reflecting societal, cultural, or temporal influences on communication. For instance, sudden increases in specific terminology may indicate emerging trends, while declines could signal obsolescence. Handling the inherent high dimensionality of text—often involving vocabularies exceeding tens of thousands of terms—relies on techniques like topic models to aggregate features and monitor changes effectively. Key techniques include the Kullback-Leibler (KL) divergence for quantifying shifts in word or feature distributions between sequential text segments. Defined as D(P \| Q) = \sum_{x} p(x) \log \left( \frac{p(x)}{q(x)} \right), this measure compares empirical probability distributions P (from reference data) and Q (from current data), often applied to binned representations of high-dimensional text features to detect concept drift in streams. For monitoring topic emergence, latent Dirichlet allocation (LDA) generates topic distributions from documents, aggregates them into time-series vectors per time slot, and employs change-point algorithms like PELT to pinpoint breakpoints where topics rise or fall in prominence, as demonstrated in analyses of historical parliamentary debates. Sentiment shifts are detected via diachronic models that infer moral or emotional changes from corpora, using biases learned in word embeddings to track evolving polarities without explicit labeling. Syntactic changes, such as alterations in dependency patterns, are identified using pre-trained transformers like BERT to compare parse trees across eras, revealing gradual evolutions in grammatical complexity. Case studies illustrate practical applications. On social media platforms like Twitter (now X), change detection tracks lexical evolution, such as semantic shifts in terms following post-2020 events; for example, analyses of online communities show that word meanings diverge faster in fragmented networks, correlating with subgroup dynamics and accelerating adaptation. In historical contexts, the Google Books Ngram corpus enables detection of linguistic shifts by aligning word embeddings across decades and applying clustering to the aligned vectors, identifying significant changes such as documented semantic broadenings around 1920 and of "plastic" in the 1950s, with FastText embeddings outperforming traditional methods in accuracy (Spearman ρ = 0.741). Challenges arise from polysemy, where words carry multiple related senses (e.g., "bank" as a financial entity or river feature), requiring contextual disambiguation to isolate true shifts from noise; advanced models like contextualized embeddings address this but demand large annotated corpora. Context dependency further complicates detection, as meanings vary by genre or region, necessitating normalization for historical texts and time-aware architectures like LSTMs for accurate tracking. Streaming detection in news feeds poses additional hurdles, involving real-time processing of unbounded data with techniques like incremental sentiment monitoring via MOA-TweetReader, which balances low-latency updates against false positives in volatile streams. Outcomes include applications in misinformation tracking, where distributional shifts in narratives—detected via supervised classifiers—flag emerging false claims by comparing topic or sentiment drifts against verified baselines, enhancing proactive moderation.
Evaluation metrics emphasize perplexity scores, computed as the exponentiated average negative log-likelihood of a language model on test text; a model trained on baseline corpora yields higher perplexity on shifted data, quantifying divergence and validating detection efficacy in drift-detection tasks.
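The KL-divergence comparison of word distributions described above can be sketched in plain Python; the toy corpora, the Laplace smoothing constant, and the idea of alerting on large divergence values are illustrative assumptions.

```python
# Toy sketch of the KL-divergence comparison: smoothed word distributions for a
# reference batch (P) and a current batch (Q) of documents. The corpora and the
# Laplace smoothing constant are illustrative assumptions.
from collections import Counter
import math

def word_distribution(texts, vocab, alpha=1.0):
    counts = Counter(w for doc in texts for w in doc.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}   # Laplace smoothing

def kl_divergence(p, q):
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

reference = ["markets steady as rates hold", "rates hold and markets climb"]
current = ["crash fears grip markets", "volatility spikes as markets crash"]
vocab = sorted({w for doc in reference + current for w in doc.lower().split()})

p = word_distribution(reference, vocab)
q = word_distribution(current, vocab)
print(f"D(P || Q) = {kl_divergence(p, q):.3f}")   # larger values suggest drift
```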

Cognitive and psychological studies

Change detection in cognitive and psychological studies examines the human ability to notice alterations in visual stimuli, highlighting limitations in attention, perception, and visual working memory. A key phenomenon is change blindness, where individuals fail to detect substantial changes in scenes, such as the disappearance or replacement of objects, especially when the change coincides with a visual disruption like a saccade, blink, or flicker. This was demonstrated in seminal experiments using the flicker paradigm, where alternating displays of original and modified scenes separated by brief blanks led to poor detection rates for even large changes, underscoring that focused attention is necessary for perceiving changes. For instance, participants often missed object substitutions in real-world scenarios, like a conversation partner being swapped during a brief interruption, revealing that visual representations are not as detailed or persistent as commonly assumed. The change detection paradigm has become a cornerstone for assessing visual working memory (VWM) capacity, typically involving brief presentations of item arrays followed by a test array where participants identify whether a change occurred. Introduced by Luck and Vogel, this method revealed that healthy adults can accurately retain information from about four visual objects, regardless of whether the task involves simple features like color or more complex conjunctions of features. Capacity is quantified using formulas like K = N \times (\text{hit rate} - \text{false alarm rate}), where N is the set size, showing consistent limits around 3-4 items, with performance declining sharply beyond this due to overload. These findings imply that VWM stores integrated object representations rather than isolated features, influencing models of how the brain maintains transient visual information for comparison across time. Psychological research also explores metacognitive aspects of change detection, such as overconfidence in one's ability to notice changes, known as change blindness blindness. Studies show that people systematically overestimate their detection performance, rating themselves as likely to spot changes that they later miss, which persists even after exposure to demonstrations. Additionally, training on change detection tasks can enhance precision in localizing changes within VWM arrays, improving reliability without necessarily expanding capacity, as measured by reduced trial numbers needed for stable estimates. Factors like prior knowledge of change probability modulate detection, with probable changes noticed more readily, linking to attentional prioritization in dynamic environments. Overall, these studies emphasize change detection as a probe for core cognitive mechanisms, with implications for understanding perceptual illusions in everyday vision.
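The capacity formula can be illustrated with a short worked example; the hit and false alarm rates below are hypothetical values, not data from any particular study.

```python
# Worked example of K = N * (hit rate - false alarm rate); the numbers are
# hypothetical, not data from any particular study.
def cowan_k(set_size, hit_rate, false_alarm_rate):
    return set_size * (hit_rate - false_alarm_rate)

# A participant tested with 6-item arrays who detects 80% of changes but false
# alarms on 20% of no-change trials has an estimated capacity of 3.6 items:
print(cowan_k(set_size=6, hit_rate=0.80, false_alarm_rate=0.20))   # -> 3.6
```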