
Fault detection and isolation

Fault detection and isolation (FDI) is a systematic approach used to identify the occurrence of faults (unintended deviations from normal system behavior) and to pinpoint their type, location, and timing within complex dynamic systems, thereby enabling timely corrective actions that maintain safety and reliability. This methodology is essential in mission-critical applications such as aerospace, industrial plants, and automotive systems, where undetected faults can lead to catastrophic failures, economic losses, or safety hazards.
FDI typically operates through two primary stages: fault detection, which monitors system performance to recognize anomalies as early as possible, and fault isolation, which determines the root cause by analyzing the affected components or subsystems. The core mechanism is residual generation, where discrepancies between observed and predicted system outputs (based on mathematical models or learned patterns) signal potential issues; these residuals are then evaluated to attribute them to specific faults. Robustness to noise, disturbances, and model uncertainties is a key challenge, addressed through techniques like thresholding and statistical analysis.
Broadly, FDI methods are categorized into model-based, model-free, and data-driven paradigms. Model-based approaches, such as state observers and Kalman filters, rely on analytical relations derived from system models to generate structured residuals for precise isolation. Model-free methods employ physical redundancy, like multiple sensors, to compare outputs and detect inconsistencies without explicit modeling. Data-driven techniques, including artificial neural networks (ANNs) and other machine learning algorithms, leverage historical data for pattern recognition, offering adaptability in nonlinear or uncertain environments.
In modern contexts, FDI extends to fault detection, isolation, identification, and recovery (FDIIR), particularly in autonomous systems like self-driving vehicles, where perception sensors (e.g., lidar, cameras) must be monitored for environmentally induced faults such as occlusion, with recovery strategies like software reconfiguration ensuring continued operation. Advances in intelligent algorithms, including binary integer programming for sensor selection, improve fault isolability and diagnostic efficiency in large-scale systems. Overall, FDI's evolution reflects increasing system complexity, with ongoing research emphasizing real-time implementation, integration with prognostics for fault prediction, and hybrid methods that combine multiple paradigms for better performance.

Overview

Definition and Objectives

Fault detection and isolation (FDI) is a subfield of control engineering focused on monitoring dynamic systems to identify anomalies, known as faults, and determine their specific locations or sources within the system. This process typically involves generating residuals (discrepancies between expected and observed system behaviors) to signal the presence of faults, followed by analytical techniques to pinpoint the affected components, such as sensors or actuators. Unlike fault identification, which aims to estimate the magnitude, type, or extent of a fault once detected, FDI emphasizes binary decisions on occurrence and localization to enable prompt intervention.
The primary objectives of FDI are to enable early detection of faults, thereby preventing catastrophic system failures and minimizing downtime in critical applications like aerospace and industrial processes. By isolating faults to specific subsystems, FDI facilitates targeted repairs or reconfigurations, reducing overall costs and improving operational efficiency. FDI also integrates with fault-tolerant control systems to maintain reliability, allowing designs that sustain performance even under degraded conditions.
Key performance metrics for evaluating FDI systems include detection time, which measures the delay from fault onset to alarm generation; false alarm rate, indicating the frequency of erroneous detections; isolation accuracy, assessing the precision in identifying fault locations; and sensitivity thresholds, which define the minimum detectable fault size. These metrics ensure that FDI schemes balance responsiveness with robustness against noise and modeling uncertainties.
In the conceptual framework of FDI within feedback control loops, faults disrupt the closed-loop dynamics, for example by altering sensor measurements or actuator responses, prompting residual-based diagnosis so that nominal behavior can be restored.
For instance, in a simple speed-control loop, a sensor fault might bias the speed measurement, leading to poor velocity tracking, while an actuator fault could reduce the delivered output, causing deviations from the setpoint; FDI isolates these cases by comparing loop outputs against model predictions.
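As an illustration of two of the metrics above, the following minimal sketch computes a false alarm rate and a detection delay from a recorded alarm sequence. The function name `fdi_metrics`, the sample-indexed alarm list, and the example data are assumptions made for this illustration, not part of any standard.

```python
def fdi_metrics(alarms, fault_onset):
    """Return (false_alarm_rate, detection_delay) for a 0/1 alarm sequence.

    alarms      -- list of 0/1 flags, one per sample (hypothetical log format)
    fault_onset -- index of the first sample at which the fault is present
    """
    # False alarms are alarms raised before the fault exists.
    pre_fault = alarms[:fault_onset]
    false_alarm_rate = sum(pre_fault) / len(pre_fault) if pre_fault else 0.0

    # Detection delay is the number of samples from onset to the first alarm.
    detection_delay = None  # None means the fault was never detected
    for k in range(fault_onset, len(alarms)):
        if alarms[k]:
            detection_delay = k - fault_onset
            break
    return false_alarm_rate, detection_delay


alarms = [0, 0, 1, 0, 0, 0, 0, 1, 1, 1]
far, delay = fdi_metrics(alarms, fault_onset=5)
print(far, delay)
```

In this example one of the five pre-fault samples raised an alarm, and the first post-onset alarm appears two samples after the fault begins.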

Historical Background

The field of fault detection and isolation (FDI) emerged in the early 1970s within control engineering, primarily driven by the need to enhance system reliability in aerospace and process industries. Richard V. Beard's 1971 dissertation introduced observer-based methods for failure accommodation in linear systems, laying foundational concepts for detecting and isolating faults through state estimation and self-reorganization. Complementing this, Howard L. Jones's 1973 thesis on failure detection in linear systems further developed the detection filter, a design now often called the Beard-Jones detection filter. These early contributions established FDI as a distinct subfield, focusing on analytical methods to monitor dynamic systems proactively.
The 1980s marked significant advancements, particularly with the formalization of model-based FDI. Edward Y. Chow and Alan S. Willsky's 1984 paper on analytical redundancy used mathematical models to generate parity-relation residuals for robust detection and isolation, decoupling fault signatures from system uncertainties. This work helped connect observer and parity approaches, establishing model-based FDI as a core paradigm and influencing subsequent designs for safety-critical applications. By the late 1980s, integration with robust design techniques addressed real-world uncertainties, as exemplified by Paul M. Frank's comprehensive survey in 1990, which reviewed analytical and knowledge-based redundancy methods while proposing solutions for fault isolation under disturbances.
The 1990s saw FDI expand amid growing computational capabilities, with a rise in data-driven methods alongside model-based ones; Frank's ongoing contributions emphasized robustness to uncertainties, enabling applications in automotive and other industrial sectors. The 2000s further consolidated the field through influential surveys, such as Rolf Isermann's 2006 book, which provided a systematic overview of fault diagnosis from detection to tolerance, highlighting process model-based estimation techniques.
From the 2010s onward, FDI shifted toward artificial intelligence integration, with machine learning methods after 2010 enabling pattern recognition in complex data, followed by deep learning applications like convolutional neural networks (CNNs) for fault pattern detection since around 2015. In the 2020s, emphasis has grown on FDI for cyber-physical systems, supported by recent IEEE standards such as IEEE 7009-2024 for fail-safe design in autonomous systems, which targets safety in interconnected environments.

Core Principles

Types of Faults

Faults in dynamic systems are broadly classified based on their form, temporal behavior, location, persistence, and severity, providing a foundational taxonomy for fault detection and isolation (FDI) strategies. This classification helps in understanding how anomalies deviate from nominal system behavior, influencing the design of diagnostic approaches. Seminal works in FDI, such as those by Isermann, emphasize these categories to distinguish between external disturbances and internal degradations, enabling targeted monitoring in aerospace, industrial, and automotive systems.
Additive faults introduce an external offset or bias to signals or states, typically appearing as superimposed disturbances that are independent of the signal magnitude. For instance, a constant bias in a sensor reading exemplifies an additive fault, where the fault adds a fixed value to the measured output regardless of the true signal magnitude. In contrast, multiplicative faults scale or alter the parameters proportionally to the operating conditions, such as gain degradation in an actuator or efficiency loss in a motor, which multiplies the nominal response by a factor deviating from unity. This distinction is critical in model-based FDI, as additive faults affect residuals linearly while multiplicative ones make the residual depend on the operating point.
Faults are further categorized by their temporal evolution. Abrupt faults occur suddenly as step-like changes, often due to instantaneous events like component breakage or electrical short circuits, leading to immediate and significant deviations from normal operation. Incipient faults develop gradually as drifting or ramp-like progressions, such as mechanical wear in bearings or slowly growing leaks in pipelines, and may remain subtle until they accumulate enough to affect performance. These gradual faults pose particular challenges for early detection, as their signatures are often masked by process noise or variability.
Component-specific faults are localized to particular elements within the system.
Sensor faults manifest as measurement inaccuracies, including bias, drift, or complete loss of signal, compromising the feedback loop in control systems. Actuator faults involve failures in delivering the control signal, such as partial blockage in a valve or loss of effectiveness in a servo motor, resulting in reduced or erroneous actuation. Process faults, also known as component or plant faults, arise from internal dynamic shifts, exemplified by sticking in mechanical components or parameter variations in chemical reactors, which alter the core system equations. These categories (sensor, actuator, and process) form the basis for structured residual generation in FDI schemes.
Regarding persistence, permanent faults endure until corrective intervention, causing sustained degradation like a fully broken wire leading to total signal loss. Intermittent faults, conversely, appear sporadically and self-resolve, often triggered by transient conditions such as loose connections, which complicates diagnosis because they are hard to reproduce. Environmental influences exacerbate these, with externally induced noise acting as an intermittent additive disturbance, while cyber-attacks in networked systems can induce both intermittent and permanent manipulations of sensor or actuator data.
Fault severity is assessed by the extent of system impact. Catastrophic faults precipitate immediate shutdown or system failure, with the attendant risk of total loss and safety hazards. Degradative faults, on the other hand, cause progressive performance loss without instant breakdown, like gradual insulation wear in electrical components leading to reduced efficiency over time. This severity spectrum guides prioritization in FDI, where high-severity events demand rapid response to avert disasters.
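The additive, multiplicative, abrupt, and incipient categories described above can be made concrete with a small fault-injection sketch. The helper `inject_fault` and its `kind` labels are hypothetical names chosen for this illustration; the numbers are arbitrary.

```python
def inject_fault(signal, kind, onset, size):
    """Return a copy of `signal` with a synthetic fault starting at index `onset`.

    kind -- "additive_abrupt":    constant bias `size` (step change)
            "additive_incipient": drift growing by `size` per sample (ramp)
            "multiplicative":     signal scaled by the factor `size`
    """
    faulty = []
    for k, y in enumerate(signal):
        if k < onset:
            faulty.append(y)                         # healthy portion
        elif kind == "additive_abrupt":
            faulty.append(y + size)                  # offset independent of y
        elif kind == "additive_incipient":
            faulty.append(y + size * (k - onset))    # slowly growing offset
        elif kind == "multiplicative":
            faulty.append(y * size)                  # effect scales with y
        else:
            raise ValueError(f"unknown fault kind: {kind}")
    return faulty


clean = [10.0] * 6
bias = inject_fault(clean, "additive_abrupt", onset=3, size=2.0)
drift = inject_fault(clean, "additive_incipient", onset=3, size=0.5)
gain_loss = inject_fault(clean, "multiplicative", onset=3, size=0.8)
print(bias, drift, gain_loss)
```

Note how the drift is zero at its onset sample, which is one reason incipient faults tend to be detected later than abrupt ones.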

Detection, Isolation, and Identification

Fault detection and isolation (FDI), together with fault identification, encompasses three sequential processes: detection, which identifies the presence of a fault; isolation, which localizes the fault to specific components; and identification, which characterizes the fault's nature. These steps form the core of diagnostic frameworks in dynamic systems, relying on discrepancies between observed and expected behaviors to ensure timely system supervision.
Detection involves monitoring residuals, defined as the differences between actual measurements and those predicted by a nominal model, to flag anomalies indicative of faults. Residuals are generated through analytical methods, such as state observers or parity equations, capturing deviations caused by faults in actuators, sensors, or processes. To distinguish faults from noise or modeling uncertainties, residuals are evaluated against predefined thresholds; for instance, a residual exceeding a threshold ε signals a fault occurrence, where ε is typically set from statistical bounds such as three standard deviations of the residual under fault-free conditions. This threshold-based approach provides robustness while limiting false alarms, as residuals remain close to zero in healthy operation but diverge significantly upon fault inception.
Isolation follows detection and aims to pinpoint the affected subsystem or component using structured fault signatures derived from residual patterns. Fault signatures represent unique combinations of residual responses to specific faults, often encoded in diagnostic matrices where rows correspond to residuals and columns to potential fault candidates; a '1' indicates sensitivity to a fault, while '0' denotes insensitivity. Decision logic, such as rule-based matching, compares observed residual vectors against these signatures to identify the fault location: for example, if only certain residuals deviate in a manner matching a predefined column, the corresponding component is declared faulty. This matrix-based method facilitates efficient isolation in multi-variable systems by leveraging redundancy in the measurements.
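The detection and isolation logic described above can be sketched as follows: a three-sigma threshold is derived from fault-free residuals, each residual is binarized, and the resulting pattern is matched against the columns of a signature matrix. The signature table `SIGNATURES`, the fault names, and the sample data are assumptions made for this illustration.

```python
import statistics

# Hypothetical signature matrix: one column (tuple) per fault candidate,
# one entry per residual; 1 = residual is sensitive to the fault.
SIGNATURES = {
    "f1": (1, 1, 0),
    "f2": (0, 1, 1),
    "f3": (1, 0, 1),
}


def threshold_from_healthy(residuals, n_sigma=3.0):
    """Threshold = n_sigma standard deviations of fault-free residuals."""
    return n_sigma * statistics.pstdev(residuals)


def isolate(residual_vector, thresholds):
    """Binarize residuals and match the pattern against known signatures."""
    pattern = tuple(
        int(abs(r) > eps) for r, eps in zip(residual_vector, thresholds)
    )
    if not any(pattern):
        return "no fault", pattern
    matches = [name for name, sig in SIGNATURES.items() if sig == pattern]
    # An unmatched pattern may indicate multiple faults or a modeling error.
    return (matches[0] if len(matches) == 1 else "unknown"), pattern


eps = threshold_from_healthy([0.1, -0.1, 0.1, -0.1])
decision, pattern = isolate([0.05, 0.9, -0.7], [eps, eps, eps])
print(eps, decision, pattern)
```

With these numbers the second and third residuals exceed the threshold, and the pattern matches the column of the (hypothetical) fault `f2`.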
Identification extends isolation by estimating the fault's quantitative attributes, including its magnitude, type (e.g., additive or multiplicative), and onset time. Techniques such as least-squares parameter estimation adapt models to fit faulty data, yielding fault estimates without requiring full model inversion; for instance, an actuator fault magnitude can be approximated by minimizing the error between predicted and measured outputs. This process often uses prior isolation results to focus estimation on candidate faults, providing actionable insights for maintenance or reconfiguration.
These processes are interdependent, with detection serving as a prerequisite for both isolation and identification, as undetected faults cannot be localized or characterized. In multi-fault scenarios, challenges arise from fault masking, where one fault's effects obscure another's, leading to ambiguous signatures and reduced isolability; simultaneous faults may produce composite residuals that mimic single-fault patterns, necessitating more advanced isolation strategies.
Evaluation of FDI performance hinges on criteria like fault detectability and isolability. Detectability concerns the minimum detectable fault size, defined as the smallest fault magnitude that produces a residual deviation exceeding the threshold despite disturbances; it can be quantified intrinsically, by the fault's effect on system trajectories, or in terms of performance, by detection delay metrics. Isolability evaluates the distinguishability of fault modes, requiring a unique signature for each fault to avoid ambiguity; for linear systems, this is ensured if the fault directions in residual space are linearly independent. These criteria guide system design so that faults are reliably addressed before they propagate.
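A minimal sketch of least-squares fault identification is given below, assuming the residual is proportional to a known signature signal, r(k) = f·s(k) + noise, so that the estimate is f̂ = Σ s(k) r(k) / Σ s(k)². The function name and the actuator example (a 30% loss of effectiveness on a unit-gain actuator) are assumptions chosen for illustration.

```python
def estimate_fault_magnitude(signature, residuals):
    """Least-squares estimate of f in r(k) = f * s(k) + noise."""
    num = sum(s * r for s, r in zip(signature, residuals))
    den = sum(s * s for s in signature)
    if den == 0:
        raise ValueError("signature carries no excitation; f is not identifiable")
    return num / den


# Hypothetical actuator with nominal unit gain that delivers only 70% of the command.
u = [1.0, 2.0, 3.0, 4.0]             # commanded input, also the fault signature
y_measured = [0.7 * uk for uk in u]  # faulty plant output
y_predicted = u                      # nominal model prediction
residuals = [ym - yp for ym, yp in zip(y_measured, y_predicted)]

f_hat = estimate_fault_magnitude(u, residuals)
print(f_hat)
```

The estimate recovers the change in actuator gain (close to -0.3 here). The check on the denominator reflects that a fault which is not excited by the input cannot be quantified from the data.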

Model-Based FDI

Analytical Redundancy Relations

Analytical redundancy refers to the use of mathematical models of a system to generate expected outputs from known inputs and compare them against actual measurements, thereby creating residuals that indicate discrepancies due to faults; this approach substitutes for physical redundancy by exploiting the inherent relationships within the model. In model-based fault detection and isolation (FDI), analytical redundancy enables the computation of parity relations (equations that must hold for fault-free operation), allowing faults to be detected when these relations are violated.
For linear time-invariant systems described by the state-space model \dot{x} = Ax + Bu + Ld, y = Cx + Du + Ff, where x is the state vector, u the input, y the output, d the disturbances, f the faults, L the disturbance distribution matrix, and F the fault distribution matrix, the parity vector is constructed to form residuals insensitive to inputs and disturbances but sensitive to faults. A basic residual is generated as r = y - \hat{y}, where \hat{y} = C\hat{x} + Du and \hat{x} is a state estimate derived from the model, often simplified in static cases to r = y - Cx - Du under full state knowledge, though practical implementations use past inputs and outputs to eliminate unmeasured states. The parity vector w satisfies w(s)(y(s) - G_u(s)u(s)) = 0 in the fault-free case, where G_u(s) is the input-output transfer function and s the Laplace variable, so that residuals r = w(s)(y(s) - G_u(s)u(s)) are decoupled from nominal behavior.
The fault signature matrix, also known as the fault direction matrix, organizes residuals for isolation: its rows correspond to independent residuals and its columns to potential faults, with entries indicating the effect of each fault on each residual (e.g., nonzero if the fault affects the residual). Structured residuals are designed such that each fault produces a unique pattern of nonzero residuals, enabling isolation; for instance, if a fault f_1 affects only r_1 (signature [1, 0]^T), while f_2 affects only r_2 ([0, 1]^T), the observed pattern uniquely identifies the fault.
Generation of parity relations can be direct (static), using algebraic elimination of states from the equations for instantaneous residuals, or dynamic, incorporating transfer functions or delay operators for time-series data to handle system dynamics. In the dynamic approach, W(s) is chosen as a left annihilator of the transfer matrix associated with the unknown variables, which ensures that residuals are zero when no fault is present.
A representative example is fault detection in an electric drive system, modeled as \dot{x} = Ax + Bu + Ef, y = Cx, with E the fault distribution matrix, where analytical redundancy relations (ARRs) like R_m i_m + L_m \frac{di_m}{dt} + \mu_m \omega = v (with R_m, L_m motor parameters, i_m current, \omega speed, v voltage) generate residuals sensitive to faults in the motor or the gear train; the fault signature matrix then isolates, e.g., motor faults from gear faults by their unique residual patterns.
Analytical redundancy offers the advantage of avoiding hardware duplication, relying instead on software-based model computations for cost-effective FDI, and provides explicit fault isolability through structured designs. However, it faces limitations in nonlinear systems, where deriving exact relations is challenging due to the lack of linear superposition, often requiring approximations or extensions such as linearized models, which may reduce robustness.
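The motor ARR above can be evaluated numerically as the residual r = v - R_m i_m - L_m di_m/dt - μ_m ω. The sketch below assumes sampled measurements and a backward-difference approximation of the derivative; the parameter values and the injected speed-sensor bias are illustrative assumptions only.

```python
def arr_residuals(v, i, omega, R, L, mu, dt):
    """Evaluate r(k) = v - R*i - L*di/dt - mu*omega on sampled signals.

    The current derivative is approximated by a backward difference,
    so the first sample has no residual.
    """
    residuals = []
    for k in range(1, len(v)):
        di_dt = (i[k] - i[k - 1]) / dt
        residuals.append(v[k] - R * i[k] - L * di_dt - mu * omega[k])
    return residuals


# Hypothetical steady-state operating point: 12 V = 2 ohm * 1 A + 0.1 * 100 rad/s
R_m, L_m, mu_m, dt = 2.0, 0.01, 0.1, 0.001
n = 5
v = [12.0] * n
i_m = [1.0] * n
omega_true = [100.0] * n
omega_biased = [110.0] * n   # speed sensor reads 10 rad/s too high

r_healthy = arr_residuals(v, i_m, omega_true, R_m, L_m, mu_m, dt)
r_faulty = arr_residuals(v, i_m, omega_biased, R_m, L_m, mu_m, dt)
print(r_healthy, r_faulty)
```

In the healthy case the relation is satisfied and the residual is zero up to rounding; the biased speed measurement violates it by about -1 V. In practice the numerical derivative amplifies measurement noise, so the residual is usually filtered before thresholding.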

Observer-Based Approaches

Observer-based approaches to fault detection and isolation (FDI) in model-based frameworks use state observers to estimate system states from measurable outputs, generating residuals that signal deviations due to faults. These methods rely on the principle of analytical redundancy, where discrepancies between predicted and actual outputs indicate anomalies. The core idea is to design an observer that asymptotically tracks the fault-free system, so that fault effects show up in the estimation error.
The foundational observer for linear time-invariant systems is the Luenberger observer, proposed for state estimation in deterministic systems described by \dot{x} = Ax + Bu, y = Cx. The observer dynamics are given by \dot{\hat{x}} = A\hat{x} + Bu + L(y - C\hat{x}), where \hat{x} is the estimated state and L is the observer gain chosen to stabilize the error dynamics. The residual is typically defined as r = y - C\hat{x}, which converges to zero in the fault-free case if the observer is stable.
For fault detection, the observer is extended to handle unknown inputs such as disturbances and faults through unknown input observers (UIOs). In UIO designs, the observer structure decouples the effects of unknown inputs from the residual, ensuring sensitivity to faults like sensor or actuator malfunctions while remaining robust to process disturbances. For instance, residual generation for process faults involves modifying the observer to treat faults as additive terms in the state equation, so that the residual r becomes non-zero only when faults occur, as follows from the error dynamics \dot{e} = (A - LC)e + E f, with e = x - \hat{x}, E the fault distribution matrix, and f the fault vector; stability is achieved by placing the eigenvalues of A - LC in the left half-plane via pole placement techniques for L. Seminal UIO formulations state existence conditions, such as rank constraints on the output and fault matrices, that must hold to enable disturbance decoupling.
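The following sketch simulates a Luenberger observer for a scalar plant and injects a sensor bias partway through. It assumes forward-Euler discretization and arbitrary illustrative values (a = -1, b = c = 1, observer gain 4, bias 0.5); none of these come from the text above.

```python
def simulate_observer_residual(steps=400, dt=0.01, fault_step=200, bias=0.5):
    """Scalar plant x' = a*x + b*u, y = c*x, with a Luenberger observer.

    A sensor bias is added to y from `fault_step` onward.
    Returns the residual sequence r(k) = y(k) - c * x_hat(k).
    """
    a, b, c = -1.0, 1.0, 1.0
    gain = 4.0                 # a - gain*c = -5 < 0, so the error dynamics are stable
    x, x_hat = 1.0, 0.0        # deliberately mismatched initial estimate
    u = 1.0                    # constant input
    residuals = []
    for k in range(steps):
        y = c * x + (bias if k >= fault_step else 0.0)
        r = y - c * x_hat
        residuals.append(r)
        # Forward-Euler integration of the plant and of the observer
        x += dt * (a * x + b * u)
        x_hat += dt * (a * x_hat + b * u + gain * r)
    return residuals


residuals = simulate_observer_residual()
print(residuals[199], residuals[200], residuals[-1])
```

Before the fault, the residual decays from its initial transient toward zero. At the fault instant it jumps by roughly the bias, then settles at a smaller nonzero value because the biased measurement also drives the observer correction. This partial masking is one motivation for the structured observer schemes used for isolation.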
Fault isolation in observer-based schemes employs structured configurations like the dedicated observer scheme (DOS), which uses a bank of observers, one dedicated to each potential fault hypothesis. In DOS, each observer is insensitive to all faults except the one it monitors, allowing isolation by identifying the unique residual that deviates from zero. Adaptive thresholds may be applied to the residuals to account for modeling uncertainties, reducing false alarms. The gain L for each observer is designed independently using LMI-based or pole placement methods to guarantee asymptotic stability and fault sensitivity.
Extensions to nonlinear systems incorporate sliding mode observers (SMOs) for enhanced robustness against nonlinearities and matched uncertainties. SMOs enforce a sliding surface on the output error, driving the output estimation error to zero in finite time and generating discontinuous injection signals whose filtered equivalent serves as a fault estimate. For example, for a system \dot{x} = f(x) + g(x)u + d(x) + E(x)f, the SMO adds a switching term \nu = -\rho \frac{r}{|r| + \delta} to the observer correction, where \rho is a gain that bounds the uncertainty and \delta > 0 is a small constant that smooths the switching, providing robust residual generation for fault estimation in applications like electric drives. These approaches retain the observer-based principles while addressing nonlinear fault propagation.

Data-Driven FDI

Signal Processing Techniques

Signal processing techniques form a cornerstone of data-driven fault detection and isolation (FDI) by transforming raw time-series data into forms that reveal fault-induced anomalies without relying on system models. These methods emphasize filtering, transformation, and feature extraction to identify patterns such as transients, distortions, or non-stationary behaviors in signals from sensors like accelerometers or current probes. Widely applied in rotating machinery and electrical systems, they enable early detection of faults like bearing wear or winding shorts by analyzing vibration or electrical signatures directly.
In the time domain, moving average filters smooth noisy signals to highlight gradual or transient faults by averaging consecutive samples, reducing high-frequency noise while preserving fault-related trends. For instance, in brushless DC drives, a moving average filter can process back-EMF signals to detect open-circuit faults by isolating deviations in the smoothed waveform. Wavelet transforms extend this capability to transient detection, decomposing signals into time-localized frequency components using wavelet basis functions; Daubechies wavelets, known for their compact support and smoothness, are well suited to capturing abrupt changes like faults on transmission lines or impacts in mechanical systems. These wavelets perform multi-resolution analysis, where higher-order Daubechies wavelets (e.g., db4) approximate sharp transients better than the simpler Haar wavelet, enabling isolation of fault events from healthy baselines.
Frequency-domain analysis employs the fast Fourier transform (FFT) to convert time signals into spectra, revealing harmonic shifts indicative of faults; in bearing diagnosis, the FFT identifies characteristic peaks at fault frequencies (e.g., ball pass frequencies) in vibration spectra, where inner-race defects produce sidebands around the carrier frequency.
This approach quantifies fault severity by measuring increases in specific harmonics, as in rolling element bearings where outer-race faults manifest as elevated energy at the outer-race ball pass frequency, which is proportional to the shaft rotation rate. Such spectral peaks allow isolation by comparison against healthy spectra, though the FFT assumes stationarity and may smear transient events.
For non-stationary signals, time-frequency methods like the short-time Fourier transform (STFT) and continuous wavelet transform (CWT) provide joint representations, balancing time and frequency resolution. The STFT segments the signal into overlapping windows and applies the FFT to each, producing spectrograms that track evolving fault frequencies in varying-speed machinery; however, its fixed window limits resolution for wideband transients. The CWT overcomes this with scalable wavelets, offering variable resolution suited to non-stationary vibrations, such as in internal combustion engines, where it localizes fault impulses in both time and scale for precise isolation. In rolling bearings, CWT scalograms highlight energy concentrations at fault scales and can outperform the STFT for early-stage detection under speed fluctuations.
Feature extraction condenses raw or processed signals into scalar metrics for threshold-based detection rules. Root mean square (RMS) measures signal energy to detect increased vibration levels from faults like misalignment; kurtosis quantifies peakedness, rising above 3 (its value for a Gaussian signal) for impulsive faults such as bearing spalls; and crest factor, the ratio of peak to RMS, signals transients when it exceeds a threshold (values above roughly 6 are commonly treated as suspicious for bearings). These time-domain features enable simple rule-based detection (e.g., a kurtosis above 4 flagging an impulsive bearing fault) without probabilistic modeling, though they are often combined for robustness in applications like machinery condition monitoring. Such techniques process unmodeled sensor streams in real time, facilitating FDI in industrial settings like chemical plants or power grids.
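The three time-domain features above can be computed with a few lines of standard-library Python. The sketch below uses a synthetic sine wave as the healthy signal and superimposes periodic impulses to imitate an impulsive fault; the signal shapes and amplitudes are assumptions for illustration, not measured data.

```python
import math


def rms(x):
    """Root mean square: a measure of overall signal energy."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def kurtosis(x):
    """Fourth standardized moment (non-excess): about 3 for Gaussian data."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2)


def crest_factor(x):
    """Ratio of peak absolute value to RMS."""
    return max(abs(v) for v in x) / rms(x)


# Healthy signal: ten full periods of a unit sine (50 samples per period).
healthy = [math.sin(2 * math.pi * k / 50) for k in range(500)]
# Faulty signal: the same sine with an impulse of height 5 every 100 samples.
faulty = [v + (5.0 if k % 100 == 0 else 0.0) for k, v in enumerate(healthy)]

for name, sig in (("healthy", healthy), ("faulty", faulty)):
    print(name, round(rms(sig), 3), round(kurtosis(sig), 3), round(crest_factor(sig), 3))
```

A pure sine has a kurtosis of 1.5 and a crest factor of √2; the impulses raise both sharply while changing the RMS only moderately, which is why kurtosis and crest factor are favored for early detection of impulsive faults.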

Statistical and Parity Methods

Statistical and parity methods represent a class of data-driven fault detection and isolation (FDI) techniques that leverage historical process data to establish statistical models for identifying deviations indicative of faults. These approaches assume that normal operating conditions produce data following known statistical distributions, such as a multivariate Gaussian, allowing residuals or test statistics to signal anomalies when they exceed predefined thresholds. By focusing on empirical correlations and variances from data, these methods avoid reliance on explicit physical models, making them suitable for complex systems where full modeling is impractical.
In statistical process monitoring, multivariate data under normal conditions is often analyzed using Hotelling's T^2 statistic, which measures the squared Mahalanobis distance of a new observation from the process mean, accounting for correlations in the data. This statistic is defined as T^2 = (\mathbf{x} - \boldsymbol{\mu})^T \mathbf{S}^{-1} (\mathbf{x} - \boldsymbol{\mu}), where \mathbf{x} is the observation vector, \boldsymbol{\mu} is the mean vector estimated from historical data, and \mathbf{S} is the sample covariance matrix; under Gaussian assumptions, it follows a scaled F distribution (approximately chi-squared for large samples), enabling threshold setting for fault detection. For instance, in manufacturing processes, T^2 charts have been applied to detect shifts in multiple sensor readings, isolating faults by examining the contributions of individual variables to the statistic. Complementing T^2, the chi-squared (\chi^2) test is used for monitoring squared residuals from model predictions, assuming Gaussian noise, where the test statistic Q = \mathbf{e}^T \mathbf{e} (with \mathbf{e} the residuals, normalized to unit variance) follows a \chi^2 distribution and is used to detect non-conforming residual patterns. These tools are foundational in multivariate statistical process control for early fault alerting in industrial settings.
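As a worked illustration of the T^2 formula above, the following sketch handles the two-variable case with an explicit 2×2 matrix inverse, so that it needs no external libraries. The training data and the two test observations are made-up values chosen to show how the statistic accounts for correlation.

```python
def mean_and_cov(data):
    """Sample mean and 2x2 sample covariance of a list of (x, y) pairs."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    sxx = sum((p[0] - mx) ** 2 for p in data) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in data) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in data) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


def hotelling_t2(x, mean, cov):
    """T^2 = (x - mu)^T S^{-1} (x - mu) for the bivariate case."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # S^{-1} = 1/det * [[d, -b], [-c, a]]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


# Hypothetical healthy data: two positively correlated sensor readings.
healthy = [(1.0, 2.0), (1.2, 2.3), (0.8, 1.8), (1.1, 2.1),
           (0.9, 1.9), (1.0, 2.1), (1.1, 2.2), (0.9, 1.8)]
mu, S = mean_and_cov(healthy)

t2_consistent = hotelling_t2((1.2, 2.3), mu, S)  # follows the usual correlation
t2_violating = hotelling_t2((1.2, 1.8), mu, S)   # each value in range, pattern wrong
print(round(t2_consistent, 2), round(t2_violating, 2))
```

Both test points lie within the range of each individual variable, but the second one breaks the correlation between them and therefore yields a much larger T^2. A univariate limit check on each sensor would miss this kind of fault.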
Parity methods in a data-driven context generate parity vectors through dimensionality reduction techniques like principal component analysis (PCA), which decomposes historical data into principal components capturing normal variability, while residuals in the orthogonal space (non-principal directions) highlight faults. In PCA-based parity approaches, the parity vector is constructed as \mathbf{r} = \mathbf{P}^\perp \mathbf{y}, where \mathbf{P}^\perp is the projection matrix onto the residual subspace orthogonal to the principal components, and \mathbf{y} is the measurement vector; faults are isolated by identifying which variables contribute most to \|\mathbf{r}\|^2 exceeding its threshold. This method is well suited to high-dimensional systems, such as sensor networks, where it reduces noise and isolates actuator or sensor faults through structured partial PCA on variable subsets. For example, in dynamic systems, PCA-derived parities have been shown to isolate multiple sensor failures by reconstructing fault signatures from residual patterns.
Likelihood ratio tests provide a hypothesis-testing framework for FDI, comparing the likelihood of data under a null hypothesis (H_0: no fault, normal operation) against an alternative (H_1: fault present) using the statistic \Lambda = 2 \ln \left( \frac{L(H_1)}{L(H_0)} \right), which under Gaussian assumptions approximately follows a chi-squared distribution for threshold decisions. In chemical processes, such as distillation columns, these tests have been applied to detect equipment degradation or valve sticking by modeling fault-induced shifts in process variables, with high detection rates reported in simulation studies, while isolating faults via the maximized likelihood under specific fault hypotheses.
Covariance-based residuals in data-only setups focus on innovation sequences (differences between observed and predicted values from empirical covariance structures) without requiring a full state-space model.
These residuals are generated as \mathbf{\nu}(k) = \mathbf{y}(k) - \hat{\mathbf{y}}(k|k-1), with their covariance \mathbf{P}_\nu estimated directly from historical data to form test statistics such as the normalized squared innovation \mathbf{\nu}^T \mathbf{P}_\nu^{-1} \mathbf{\nu}, which is compared against chi-squared thresholds to detect anomalies. This approach, akin to Kalman filter innovations but purely data-driven, has been used to monitor deviations while remaining robust to process noise in applications like power grids.
For handling multiple faults, generalized likelihood ratio (GLR) tests extend standard likelihood ratios by jointly estimating fault parameters under H_1, using \Lambda_g = 2 \ln \left( \frac{\sup_{\theta \in \Theta_1} L(\theta)}{\sup_{\theta \in \Theta_0} L(\theta)} \right) to detect and isolate concurrent faults such as multiple sensor biases. In complex control systems, GLR tests have isolated multi-fault scenarios with low false alarm rates by partitioning the parameter space, outperforming single-fault methods in simulations with overlapping fault signatures.
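For the simplest GLR case, a constant bias of unknown size in scalar Gaussian residuals with known standard deviation σ, maximizing the likelihood over the bias gives the sample mean as its estimate and the statistic Λ_g = (Σ r_k)² / (n σ²), which is chi-squared with one degree of freedom under H_0. The sketch below implements this special case; the residual data and σ = 0.1 are assumed values for illustration.

```python
def glr_mean_shift(residuals, sigma):
    """GLR statistic for an unknown constant bias in Gaussian residuals.

    H0: residuals ~ N(0, sigma^2);  H1: residuals ~ N(theta, sigma^2), theta free.
    Returns (statistic, estimated_bias); the ML estimate of theta is the sample mean.
    """
    n = len(residuals)
    total = sum(residuals)
    statistic = total * total / (n * sigma * sigma)
    return statistic, total / n


CHI2_1DOF_99 = 6.635  # 99% quantile of chi-squared with 1 degree of freedom

healthy = [0.1, -0.2, 0.05, -0.1, 0.15, 0.0, -0.05, 0.05]
biased = [r + 0.5 for r in healthy]   # same noise plus a 0.5 sensor bias

stat_h, shift_h = glr_mean_shift(healthy, sigma=0.1)
stat_b, shift_b = glr_mean_shift(biased, sigma=0.1)
print(stat_h > CHI2_1DOF_99, stat_b > CHI2_1DOF_99, round(shift_b, 3))
```

Besides the detection decision, the test returns the estimated bias, which is what distinguishes the generalized test from a plain likelihood ratio with a fixed fault size.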

Artificial Intelligence in FDI

Machine Learning Techniques

Machine learning techniques in fault detection and isolation (FDI) leverage algorithms to identify patterns in data or features, enabling the classification or grouping of faults without relying on explicit physical models. These methods are particularly valuable in complex systems where abundant data allows learning from historical or simulated fault scenarios, improving accuracy and reducing human intervention in diagnostics. Supervised and unsupervised approaches form the core, with ensembles enhancing robustness, while feature selection and validation strategies support practical deployment.
Supervised methods, such as support vector machines (SVMs), classify fault types by constructing hyperplanes in high-dimensional spaces to separate normal operation from various fault classes, maximizing the margin between them for improved generalization. SVMs have been applied in FDI to detect and isolate sensor and actuator faults by training on simulated and operational data, achieving high classification accuracy in multi-fault scenarios. Similarly, k-nearest neighbors (k-NN) isolates faults by measuring proximity in feature space, assigning a query point to the fault class of its closest labeled neighbors, which proves useful for nonlinear processes where fault boundaries are irregular. In process monitoring, k-NN rules have demonstrated robust isolation performance by adapting to local data distributions without assuming underlying models.
Unsupervised methods address scenarios with limited labeled fault data by identifying anomalies through inherent data structures. K-means clustering partitions data into clusters representing normal and anomalous behaviors, detecting faults as points deviating from the dominant normal cluster, which has been used in industrial process monitoring to group sensor readings and flag outliers indicative of faults.
For novelty detection in normal operations, a one-class SVM learns a boundary enclosing typical data points (a hypersphere in the closely related support vector data description), flagging deviations as potential faults; this approach suits engineering systems like rotating machinery where only healthy data is abundant for training, enabling early isolation of unseen anomalies.
Ensemble techniques, such as random forests, aggregate multiple decision trees to enhance fault isolation robustness, particularly in handling the imbalanced datasets common in FDI, where fault events are rare. By employing bagging to create diverse trees from bootstrapped samples and random feature subsets, random forests reduce variance and improve decision boundaries for classifying multiple fault types in unsteady-state processes. This method has performed well in diagnosing faults in chemical plants by ranking feature importance and mitigating bias toward majority classes.
Feature selection is crucial in FDI to manage high-dimensional sensor data, with recursive feature elimination (RFE) iteratively removing the least important features based on model performance to retain discriminative ones. In wind turbine fault classification, RFE combined with classifiers like random forests selects key features, improving detection accuracy by focusing on fault-relevant signals and reducing computational overhead.
Training paradigms in ML-based FDI emphasize generalization through cross-validation, which partitions data into folds to evaluate model performance across subsets, preventing overfitting to specific fault instances. For imbalanced fault data, metrics like precision and recall are prioritized over accuracy; precision measures the proportion of true faults among detected positives, while recall captures the fraction of actual faults identified, often averaged via macro or weighted schemes in k-fold validation to guide hyperparameter tuning in applications like turbine diagnostics.
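The k-NN rule and the precision and recall metrics described above can be sketched together in plain Python. The two features (RMS and kurtosis), the class labels, and all data points below are invented for illustration; a real application would use many more samples and proper cross-validation.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class treated as the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical training set: (rms, kurtosis) feature pairs with labels.
train = [
    ((1.0, 3.0), "normal"), ((1.1, 2.9), "normal"), ((0.9, 3.1), "normal"),
    ((2.0, 6.0), "bearing_fault"), ((2.2, 5.8), "bearing_fault"),
    ((1.9, 6.3), "bearing_fault"),
]
queries = [(1.05, 3.0), (2.1, 6.1), (1.0, 2.8), (1.8, 5.5), (1.4, 4.2)]
y_true = ["normal", "bearing_fault", "normal", "bearing_fault", "bearing_fault"]

y_pred = [knn_predict(train, q) for q in queries]
precision, recall = precision_recall(y_true, y_pred, positive="bearing_fault")
print(y_pred, precision, round(recall, 3))
```

The last query represents an early-stage fault that still lies closer to the normal cluster, so it is missed. Precision stays at 1.0 while recall drops to 2/3, illustrating why recall is tracked separately when missed faults are costly.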

Deep Learning Techniques

Deep learning techniques in fault detection and isolation (FDI) use hierarchical neural architectures to extract intricate fault patterns automatically from raw sensor data, handling nonlinearities and high-dimensional inputs without manual feature engineering. These approaches, built on neural networks with multiple layers, enable end-to-end learning of fault representations and improve accuracy in complex systems such as rotating machinery and pipelines. They have proven effective in industrial applications where large datasets of vibration, acoustic, or other time-series signals allow models to generalize across fault types.

Convolutional neural networks (CNNs) are widely applied in FDI to process image-like representations, such as spectrograms derived from vibration signals, and classify component faults. The architecture typically includes convolutional layers that apply filters to detect local patterns, such as frequency peaks indicative of bearing faults, followed by pooling layers that reduce dimensionality and improve invariance, and finally fully connected layers with a softmax output for multi-class fault classification. In rotating machinery diagnostics, for instance, CNNs have been reported to exceed 95% accuracy by learning directly from time-frequency data, avoiding the need for hand-crafted features.

Recurrent neural networks (RNNs), particularly long short-term memory (LSTM) variants, are well suited to FDI tasks involving sequential data, where they capture temporal dependencies in time-series signals for early fault detection. LSTMs mitigate the vanishing gradient problem of standard RNNs through gating mechanisms that selectively retain relevant historical information, making them suitable for monitoring dynamic processes such as fluid leaks in pipelines. In such applications, LSTM models trained on pipeline sensor data have detected anomalies with reported precision above 90%, and fault locations can be estimated by analyzing patterns over time.
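The convolution and pooling stages described for CNNs above can be illustrated without a deep learning framework. This is a hedged sketch in plain Python on a one-dimensional spectrum: the magnitude bins and the fixed "peak detector" kernel are hypothetical, and in a real CNN the kernel weights would be learned rather than hand-set.

```python
def conv1d_valid(signal, kernel):
    """Cross-correlate signal with kernel (no padding), as a CNN layer does."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(values):
    """Rectified linear activation: negative responses are set to zero."""
    return [max(0.0, v) for v in values]

def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

# Hypothetical spectrum magnitudes with one narrow peak at bin 3.
spectrum = [0.1, 0.2, 0.1, 0.9, 0.1, 0.2, 0.1, 0.1]
peak_kernel = [-1.0, 2.0, -1.0]  # responds to a bin larger than its neighbours

feature_map = relu(conv1d_valid(spectrum, peak_kernel))
pooled = max_pool1d(feature_map, 2)
```

The feature map has six entries, with its largest response where the kernel is centred on the peak. Pooling halves the length while keeping that response, which is the sense in which pooling reduces dimensionality and adds tolerance to small shifts.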
Autoencoders provide an unsupervised framework for anomaly detection in FDI by learning compressed representations of normal system behavior and flagging deviations as potential faults through reconstruction-error thresholds. The encoder compresses the input into a latent representation, while the decoder reconstructs it; high reconstruction errors on test data indicate faults, which allows detection without labeled examples. Variational autoencoders (VAEs) extend this with probabilistic modeling, in which the latent variable follows a prior distribution (e.g., Gaussian), allowing generative sampling and probabilistic fault assessment in noisy conditions. VAEs have shown robust performance in process monitoring by quantifying uncertainty in fault likelihood, with reported fault detection rates of around 83-87%.

Transfer learning addresses data scarcity in FDI, particularly for rare faults, by fine-tuning pre-trained models such as ResNet on domain-specific datasets. ResNet's residual connections allow deep architectures to learn transferable features from large-scale image tasks, which are then adapted to vibration spectrograms, yielding reported accuracies of up to 98% even with limited fault samples in mechanical systems. This approach mitigates overfitting in scarce-data scenarios, such as infrequent actuator failures, by initializing with ImageNet weights and retraining only the classifier layers.

Hybrid models, such as CNN-LSTM architectures, combine spatial and temporal feature extraction for spatio-temporal FDI problems, such as fault diagnosis in networked systems. CNN layers process local patterns in signal spectra, while LSTM layers model how the sequence evolves, so that dynamic faults can be analyzed in both dimensions. Training uses backpropagation to minimize loss functions suited to FDI, such as cross-entropy for multi-class classification:

L = -\sum_{i=1}^{C} y_i \log(\hat{y}_i)

where y_i is the true label for class i among C fault types, and \hat{y}_i is the predicted probability from softmax.
Backpropagation computes the gradients of this loss via the chain rule, and an optimizer uses them to update the weights of each layer to improve fault discrimination; in bearing diagnostics, these hybrids have shown improved accuracy by fusing spectral features and temporal trends.

Recent advances as of 2025 include autonomous agents for fault detection and self-healing, as well as improved fault diagnosis for electric vehicles, both aimed at better real-time adaptability.
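The loss L defined above, and the gradient that the chain rule produces at the output layer, can be checked numerically. The sketch below is illustrative only: the three logits and the one-hot label are hypothetical values for a three-class fault problem. It relies on the standard result that, for softmax followed by cross-entropy, the gradient with respect to the logits is the predicted probability minus the true label.

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(y_onehot, probs):
    """L = -sum_i y_i * log(p_i); zero-label terms are skipped."""
    return -sum(y * math.log(p) for y, p in zip(y_onehot, probs) if y > 0)

def grad_wrt_logits(y_onehot, probs):
    """dL/dz_i = p_i - y_i for softmax combined with cross-entropy."""
    return [p - y for y, p in zip(y_onehot, probs)]

# Hypothetical network outputs for classes (normal, inner-race, outer-race).
logits = [2.0, 0.5, -1.0]
y = [1, 0, 0]  # the true class is "normal"

probs = softmax(logits)
loss = cross_entropy(y, probs)
grad = grad_wrt_logits(y, probs)
```

The gradient is negative for the true class and positive for the others, so a gradient-descent step raises the true-class logit and lowers the rest. Its components sum to zero because the probabilities sum to one.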

Robust and Advanced FDI

Handling Uncertainties and Disturbances

In robust fault detection and isolation (FDI), uncertainties arise from several sources that can degrade the performance of diagnostic schemes: parametric uncertainties due to modeling errors in system parameters, nonparametric uncertainties from unmodeled dynamics, and uncertainties that appear as measurement noise or process disturbances. These must be addressed explicitly to ensure reliable residual generation and evaluation, because they can mimic fault signatures and lead to false alarms or missed detections.

H∞ filtering provides a minimax approach to robust FDI by minimizing the worst-case energy gain from disturbances to residuals, achieving disturbance rejection while maintaining sensitivity to faults. In this framework, residual generators are designed as H∞ filters that bound the influence of uncertainties, ensuring that the H∞ norm of the transfer function from disturbances to residuals remains below a prescribed level. The design is often formulated as a standard filtering problem solvable via linear matrix inequalities (LMIs). This method extends basic observer-based techniques with robustness constraints, allowing effective isolation even under bounded-energy disturbances.

Adaptive thresholds improve robustness by placing time-varying bounds on residuals that account for estimated disturbance levels. They are often combined with unknown input decoupling in observer designs to eliminate the direct effect of disturbances on fault signatures. For instance, in linear parameter-varying (LPV) systems, interval observers generate adaptive thresholds that adjust dynamically to the uncertainty bounds, reducing false alarms without compromising fault detectability. This approach decouples unknown inputs, such as external disturbances, from the residual, so that the thresholds reflect only the residual's sensitivity to faults.
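The effect of a time-varying threshold can be shown with a toy residual sequence. This is a simplified sketch under stated assumptions: the threshold rule (a constant floor plus a term proportional to input magnitude), its two parameters, and the data are all hypothetical, and stand in for the interval-observer bounds mentioned above rather than reproducing them.

```python
def adaptive_threshold(inputs, base=0.05, input_gain=0.1):
    """Threshold that widens when the input is large, on the assumption
    that model mismatch grows with excitation (hypothetical rule)."""
    return [base + input_gain * abs(u) for u in inputs]

def detect(residuals, thresholds, persistence=2):
    """Index at which |r| has exceeded its threshold for `persistence`
    consecutive samples, or None if that never happens."""
    run = 0
    for k, (r, th) in enumerate(zip(residuals, thresholds)):
        run = run + 1 if abs(r) > th else 0
        if run >= persistence:
            return k
    return None

# Samples 2-4: large input, residual inflated by model mismatch (no fault).
# Samples 6-7: small input, residual inflated by an actual fault.
inputs    = [0.0, 0.0, 2.0, 2.0, 2.0, 0.0, 0.0, 0.0]
residuals = [0.01, -0.02, 0.20, 0.18, 0.15, 0.02, 0.30, 0.32]

alarm_adaptive = detect(residuals, adaptive_threshold(inputs))
alarm_fixed = detect(residuals, [0.05] * len(residuals))
```

With the fixed threshold the alarm fires at sample 3, during the disturbance-driven excursion, which is a false alarm. With the adaptive threshold it fires at sample 7, once the fault has persisted for two samples. The persistence count is a separate design choice that trades detection delay against noise immunity.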
Fuzzy logic addresses nonlinear uncertainties in FDI by using membership functions and rule bases to model vague or imprecise knowledge about system behavior under disturbances. Takagi-Sugeno fuzzy observers, for example, approximate nonlinear dynamics with local linear models weighted by fuzzy rules, generating residuals that are robust to parametric variations and unmodeled nonlinearities while isolating faults through defuzzified decision logic. This method is particularly effective for systems whose uncertainties defy precise quantification, allowing rule-based compensation for disturbances in real-time applications.

Performance guarantees in robust FDI involve explicit trade-offs between fault sensitivity, measured by the minimum gain from faults to residuals, and disturbance robustness, often quantified using condition numbers that assess how ill-conditioned a residual generator becomes under uncertainty. Analyses show that optimizing the H−/H∞ index balances these objectives: higher condition numbers indicate vulnerability to disturbances that could mask faults, which guides filter design toward specified detection rates with bounded false alarm probabilities. Such guarantees help robust FDI schemes remain effective across operating regimes, with the disturbance-to-fault sensitivity ratio serving as the main figure of merit.
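The sensitivity and robustness trade-off behind the H−/H∞ index can be written compactly. The notation below is an assumption introduced for illustration and does not appear in the text above: G_{rd} and G_{rf} denote the transfer functions from disturbances and from faults to the residual, \gamma and \beta are design bounds, and \Omega is the frequency range of interest.

```latex
\|G_{rd}\|_\infty = \sup_{\omega} \bar{\sigma}\big(G_{rd}(j\omega)\big) \le \gamma
\qquad
\|G_{rf}\|_- = \inf_{\omega \in \Omega} \underline{\sigma}\big(G_{rf}(j\omega)\big) \ge \beta
\qquad
\max_{\text{filter}} \; \frac{\beta}{\gamma}
```

The first condition bounds the worst-case disturbance gain, the second guarantees a minimum fault gain, and the design problem is commonly posed as maximizing their ratio.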

Integrated Fault-Tolerant Systems

Integrated fault-tolerant systems embed fault detection and isolation (FDI) mechanisms directly into control architectures to ensure continuous operation despite faults, enabling smooth transitions from nominal to degraded modes. Fault-tolerant control (FTC) strategies are categorized as passive or active. Passive FTC relies on robust controllers designed a priori to tolerate predefined faults without requiring diagnosis, using robust design techniques for inherent resilience against uncertainties. In contrast, active FTC uses FDI outputs to reconfigure the system dynamically, for example by adjusting control laws according to fault severity, which improves adaptability but demands faster computation.

Integrating FDI with control allows real-time fault estimates to inform controller adjustments directly, minimizing performance degradation. In this framework, FDI modules estimate fault parameters, such as magnitude and location, using observer-based or data-driven methods, and the estimates are then used to adjust controller gains and compensate for the anomaly. A prominent example is flight control with redundant actuators, where FDI detects partial failures in hydraulic or electro-mechanical actuators and the controller redistributes commands among healthy units while preserving stability during maneuvers. This approach has been demonstrated in simulations of civil aircraft.

Reconfigurable control within integrated FTC often employs model predictive control (MPC) to adjust trajectories based on isolated faults, optimizing future states under constraints such as actuator limits. Once a fault is isolated, MPC reformulates its optimization problem to incorporate the fault's effects, such as reduced effector authority, while maintaining constraint satisfaction and reference tracking. Stability is established through Lyapunov analysis, in which a Lyapunov function, typically quadratic in the state errors, is constructed to prove asymptotic convergence even under reconfiguration, with terminal constraints ensuring recursive feasibility. Such methods have shown robust performance in nonlinear systems.
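The redistribution of commands among healthy actuators described above can be sketched as a small allocation problem. This is a minimal illustration under simplifying assumptions: a single scalar demand, parallel actuators, no saturation limits, and effectiveness factors (1.0 healthy, 0.0 failed) supplied by a hypothetical FDI module. It uses the closed-form minimum-effort solution, not the MPC formulation discussed in the text.

```python
def allocate(demand, effectiveness):
    """Split a scalar demand across parallel actuators.

    Minimises sum(u_i**2) subject to sum(e_i * u_i) == demand, whose
    closed-form solution is u_i = e_i * demand / sum(e_j**2).
    """
    denom = sum(e * e for e in effectiveness)
    if denom == 0:
        raise ValueError("no healthy actuator available")
    return [e * demand / denom for e in effectiveness]

def delivered(commands, effectiveness):
    """Total effect actually produced once effectiveness losses apply."""
    return sum(e * u for e, u in zip(effectiveness, commands))

# Nominal case: three healthy actuators share the demand equally.
healthy = allocate(3.0, [1.0, 1.0, 1.0])

# After isolation: actuator 1 at 50% effectiveness, actuator 2 failed.
fault_estimate = [1.0, 0.5, 0.0]
degraded = allocate(3.0, fault_estimate)
```

In the degraded case the failed actuator receives no command and the remaining two are driven harder so that the delivered total still equals the demand. A practical allocator would also enforce actuator limits, which is where constrained optimization such as MPC comes in.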
Hierarchical architectures in integrated FTC place the FDI layer above the control layer, allowing modular fault handling across system scales. The FDI layer processes raw data for detection and isolation and passes refined fault signatures to the lower layer for reconfiguration, which promotes modularity in complex systems such as multi-agent networks. Voting mechanisms improve reliability in multi-sensor configurations by aggregating the outputs of redundant sensors to isolate faulty readings, improving FDI accuracy in noisy environments. This structure has been applied in distributed systems.

Compliance with standards such as ISO 26262 is essential for automotive implementations. The standard mandates hazard analysis and risk assessment, testing, and ASIL-rated architectures up to ASIL D, and it requires verifiable safety mechanisms, assessed through metrics such as diagnostic coverage exceeding 99% for high-risk items, which guide the design of redundant electronics and software partitioning.

Recent developments in FTC for unmanned aerial vehicles, as of 2024-2025, include reinforcement learning-based approaches for quadrotors and distributed control for swarms, improving adaptability to faults in dynamic environments.
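The aggregation of redundant sensor outputs described above can be sketched as a mid-value voter. This is one common voting rule, chosen here as an assumption; the text does not specify which scheme is used, and the readings and tolerance are hypothetical.

```python
from statistics import median

def vote(readings, tolerance):
    """Return (voted_value, suspect_indices) for redundant readings.

    The voted value is the median, which a single outlier cannot move
    when three or more sensors are present. Any reading further than
    `tolerance` from the median is flagged as suspect.
    """
    voted = median(readings)
    suspects = [i for i, x in enumerate(readings)
                if abs(x - voted) > tolerance]
    return voted, suspects

# Three redundant sensors; the third has drifted well away from the others.
value, suspects = vote([10.1, 9.9, 14.0], tolerance=0.5)
```

The voter both masks the fault, since the control layer receives 10.1, and isolates it, since index 2 is flagged. With only two sensors a disagreement can be detected but not attributed, which is why triplex arrangements are common.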

Fault Recovery

Accommodation Strategies

Accommodation strategies in fault detection and isolation (FDI) focus on immediate mitigation of fault effects through software-based adjustments, allowing continued operation without structural changes until repairs can be performed. These techniques typically activate after fault isolation, substituting or compensating for faulty components to maintain stability and performance. For sensor faults, virtual sensors generate estimates to replace erroneous measurements, while actuator faults may be addressed by adapting gains. Such approaches are essential in safety-critical systems, where rapid response minimizes downtime and prevents cascading failures.

Sensor fault accommodation often employs virtual sensors, which use state estimation algorithms to replace faulty readings with predicted values derived from system models and healthy sensor data. The sensor output is reconstructed with observer-based techniques, such as Kalman filters or sliding mode observers, to ensure continuity in the feedback loops. In wind turbines, virtual sensors have demonstrated effective fault hiding by maintaining control accuracy despite sensor degradation. Similarly, for grid-side converters, virtual sensors enable fault accommodation through estimation techniques.

Actuator fault accommodation commonly involves gain scheduling, where control parameters are adjusted dynamically according to the identified fault magnitude to redistribute control effort among the remaining actuators. This technique uses linear parameter-varying (LPV) models to interpolate gains that compensate for partial actuator losses, ensuring robust performance under varying operating conditions. In aeroengine control, gain-scheduled robust controllers accommodate degradation by estimating fault impacts and optimizing thrust response.
For networked systems, internal model control (IMC)-based architectures provide actuator fault tolerance through scheduled gains, minimizing overshoot in response to faults of up to 50% effectiveness loss.

Isolation-based responses use pre-computed lookup tables that map identified fault modes to predefined accommodation actions, allowing fast implementation in real-time systems. These tables store optimized adjustments for common fault scenarios, derived from offline simulations or historical data, and are particularly effective where computational resources are limited. In chemical plants, lookup tables enable rapid switching to backup control laws upon fault isolation, such as adjusting valve positions to maintain reaction stability; studies report accommodation times under 1 second for multi-variable processes. This approach reduces reliance on online optimization, improving reliability in environments where faults are highly predictable.

Optimization-based methods, such as model predictive fault accommodation, minimize a cost function that balances fault impact against control effort. The objective is formulated as:

\min J = \sum_{k=1}^{N} \left( \| \hat{y}(k) - y_{ref}(k) \|^2_Q + \| \Delta u(k) \|^2_R \right) + \sum_{k=1}^{N} \| f(k) \|^2_P

where J incorporates the predicted outputs \hat{y}, the reference y_{ref}, the control increments \Delta u, the estimated fault f, and the weighting matrices Q, R, P. This setup accommodates faults by constraining inputs to feasible sets while prioritizing performance recovery. In omni-directional mobile robots, such predictive schemes have achieved fault tolerance for wheel actuator failures, restoring tracking. For nonlinear systems such as two-rotor aero-dynamical setups, neural network-enhanced MPC provides accommodation without full reconfiguration.

Despite their efficacy, accommodation strategies are temporary measures that bridge the gap to physical repairs, and they are evaluated using metrics such as the response time from fault isolation to effective accommodation.
Limitations include dependence on accurate fault estimation, potential performance degradation under severe faults, and the increased computational load of optimization-based methods, which may exceed real-time constraints in resource-limited settings. In practice, response-time targets for critical systems help avoid safety violations.

A representative example is fault bypassing in hydraulic systems via alternative flow paths, where redundant routes activate when a blockage or leak is detected in the primary actuator path. This software-mediated rerouting maintains continuity of operation, as seen in heavy-duty mobile machinery, where cylinder configurations rephase to compensate for single-path failures, preserving lifting capacity with minimal speed loss (typically <10%). Such strategies highlight the role of accommodation in extending operational life without physical intervention.
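The virtual-sensor substitution described earlier in this section can be sketched for a first-order plant. This is a deliberately simplified illustration: the model x[k+1] = a*x[k] + b*u[k], its parameters, and the isolation index `faulty_from` are hypothetical, and the estimator is an open-loop predictor rather than the Kalman or sliding mode observers named in the text.

```python
def run_with_virtual_sensor(a, b, x0, inputs, measurements, faulty_from):
    """Return the signal handed to the controller at each step.

    Assumed plant model: x[k+1] = a * x[k] + b * u[k].
    faulty_from: step at which FDI isolated the sensor fault, or None.
    """
    x_hat = x0
    used = []
    for k, (u, y) in enumerate(zip(inputs, measurements)):
        if faulty_from is not None and k >= faulty_from:
            used.append(x_hat)     # substitute the model-based estimate
        else:
            used.append(y)         # sensor trusted: pass it through
            x_hat = y              # and re-anchor the model on it
        x_hat = a * x_hat + b * u  # one-step prediction for the next sample
    return used

# The sensor sticks at 9.9 from step 3; FDI isolates the fault at that step.
inputs = [1.0] * 5
measurements = [0.0, 1.0, 1.5, 9.9, 9.9]
feedback = run_with_virtual_sensor(0.5, 1.0, 0.0, inputs, measurements, 3)
```

After isolation the controller sees 1.75 and 1.875, which are the values the model predicts, instead of the stuck reading. Because the predictor runs open loop, model error accumulates over time, which is one reason accommodation is treated as a temporary measure.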

System Reconfiguration

System reconfiguration in fault detection and isolation (FDI) involves dynamically altering the system's architecture after a fault has been detected and isolated, in order to restore operational functionality and maintain performance objectives. Unlike accommodation, it relies on structural changes, such as rerouting resources or switching components, to adapt to the degraded state. Effective reconfiguration minimizes downtime and ensures that the system continues to meet safety and reliability requirements in critical applications.

Hardware reconfiguration relies primarily on redundancy to switch to backup components when a fault occurs. In avionics systems, failover techniques enable transition to redundant hardware, such as spare actuators or processors, to prevent mission failure. For instance, integrated modular avionics (IMA) employs multiprocessor reconfiguration algorithms that isolate faulty modules and redistribute tasks across healthy ones, improving overall fault tolerance. A prominent voting scheme is triple modular redundancy (TMR), in which three identical hardware modules process the same inputs in parallel and a majority vote determines the output, masking the failure of any single module; reliability improvement factors of up to 10^6 have been reported in radiation-prone environments. TMR was used, for example, in the Saturn V launch vehicle digital computer, allowing continued operation despite transient faults.

Software reconfiguration updates the control algorithms without hardware changes, often through adaptive mechanisms that modify system behavior in real time. Adaptive control laws, updated via parameter identification, allow the system to compensate for faults by recalibrating gains or switching to alternative controllers.
In cabin pressure control systems, simple adaptive control (SAC) reconfigures by incorporating a parallel compensator, maintaining stability during partial actuator failures (e.g., 50% loss) or sensor drifts without requiring explicit fault models. In robotics, reconfiguration handles joint failures by redistributing tasks among redundant degrees of freedom; for a 2-DOF manipulator with a locked joint, the control law adapts to preserve workspace functionality using kinematic redundancy. Hybrid fault-tolerant control (FTC) in industrial robots combines passive robustness with active reconfiguration, improving recovery in multi-joint scenarios.

Hybrid approaches integrate hardware and software by modeling the system as a graph, enabling topology changes for optimal fault recovery. Graph-based models represent components as nodes and connections as edges, allowing algorithms to identify and reroute paths after a fault. For industrial plants, directed weighted graphs simulate fault propagation, and genetic algorithms select which switches to activate, minimizing cascade effects while preserving service capacity (e.g., maintaining 80-90% of total service in failure simulations). Shortest-path algorithms such as Dijkstra's compute rerouting paths efficiently in sparse graphs; in a 100-node network, this reduces reconfiguration actions to 1-3 flips and raises survival to 99%. These methods exploit redundancy at both levels, for example by combining TMR with adaptive software overlays.

Key challenges in system reconfiguration include managing time delays during transitions and guaranteeing stability after reconfiguration. Detection and switching delays can destabilize the system, particularly in switched architectures where short dwell times conflict with closed-loop requirements; delays exceeding 10-20% of the time constant may lead to oscillations or instability. Stability assurance often relies on invariant sets, which define regions of the state space in which trajectories remain confined after reconfiguration, ensuring bounded errors.
For switching systems, maximal controlled invariant sets are computed offline to verify safety specifications, with online set-membership tests keeping computational overhead low while preserving the guarantees.

Performance is evaluated using metrics such as recovery success rate and post-reconfiguration performance. Recovery success rate measures the percentage of faults for which full or partial functionality is restored; it often exceeds 95% in redundant systems with TMR but drops to 70-80% in non-redundant systems without timely reconfiguration. Post-reconfiguration performance quantifies losses in measures such as throughput. Together these metrics expose the trade-off between rapid recovery and sustained performance, guiding designs toward minimal impact (e.g., <5% degradation in high-reliability applications).
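The graph-based rerouting described earlier in this section can be sketched with Dijkstra's algorithm from the Python standard library. The four-node network, its edge costs, and the helper names are hypothetical; this shows only the shortest-path step, not the genetic-algorithm switch selection also mentioned in the text.

```python
import heapq

def shortest_path(graph, source, target):
    """Dijkstra over {node: {neighbor: cost}}.
    Returns (cost, path), or (inf, []) if the target is unreachable."""
    queue = [(0.0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, w in graph.get(node, {}).items():
            if nxt not in visited:
                heapq.heappush(queue, (cost + w, nxt, path + [nxt]))
    return float("inf"), []

def without_edge(graph, u, v):
    """Copy of the graph with the faulty link u-v removed in both directions."""
    pruned = {n: dict(nbrs) for n, nbrs in graph.items()}
    pruned.get(u, {}).pop(v, None)
    pruned.get(v, {}).pop(u, None)
    return pruned

# Hypothetical supply network with a cheap route via A and a costly one via B.
network = {
    "source": {"A": 1.0, "B": 4.0},
    "A": {"source": 1.0, "load": 1.0},
    "B": {"source": 4.0, "load": 1.0},
    "load": {"A": 1.0, "B": 1.0},
}

nominal = shortest_path(network, "source", "load")
rerouted = shortest_path(without_edge(network, "A", "load"), "source", "load")
```

Before the fault the route passes through A at cost 2.0. Once the A-load link is isolated and removed from the graph, the same search returns the route through B at cost 5.0. An infinite cost signals that no reconfiguration can restore service.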

Applications

Industrial and Mechanical Systems

In industrial and mechanical systems, fault detection and isolation (FDI) plays a crucial role in maintaining equipment reliability, particularly through mechanical fault diagnosis targeting common failure points such as gearboxes and bearings. Vibration analysis is a primary technique for diagnosing gearbox faults. It uses time-domain methods, such as waveform analysis and statistical indices, to detect anomalies such as gear wear or misalignment, and frequency-domain approaches, including Fourier transforms, to identify characteristic fault frequencies. For bearing faults, which are reported to account for over 41% of machine breakdowns, techniques such as root mean square (RMS) measurements and spectral envelope methods allow early detection of defects such as inner race cracks by separating impulsive signals from noise. A representative case involves wind turbines, where FDI systems that monitor pitch system faults (responsible for up to 20% of downtime) have reduced unplanned outages by up to 12% through timely fault isolation and accommodation.

In process industries such as chemical plants, model-based FDI approaches are widely applied to detect and isolate faults such as leaks, which can compromise safety and efficiency. These methods generate residuals from discrepancies between observed and predicted system behavior, using techniques such as neural networks trained on performance metrics (e.g., overshoot) to diagnose faults including leakage or supply issues without additional hardware. For instance, in a fluid catalytic cracking (FCC) pilot plant, a causal model-based diagnostic module using hitting-set algorithms isolated a leak in 5 minutes, compared with 50 minutes for manual operator assessment, improving process reliability.
Statistical methods complement these in batch processes, where multivariate statistical process control (MSPC) techniques, such as principal component analysis (PCA) and partial least squares (PLS), monitor trajectory deviations to detect faults such as inconsistent reaction rates, enabling isolation in chemical batch reactors by aligning the phases of historical data.

Implementing FDI in industrial settings often involves integration with supervisory control and data acquisition (SCADA) systems, where FDI modules process sensor data to generate alarms and isolate faults. This has been demonstrated in mining machinery, where SCADA-enabled FDI reduced downtime by identifying shearer drum overloads. Another real-world example is the post-2010 deployment of FDI-enhanced systems in electronics manufacturing plants, which integrate AI-driven fault diagnostics into production lines to achieve near-zero defect rates, supporting the transition to Industry 4.0.

Challenges in these environments include sensor degradation under harsh conditions such as high temperatures, corrosive chemicals, and vibration, which can introduce false positives in FDI signals and calls for robust, high-temperature sensors. Industry 4.0 advances based on Internet of Things (IoT) data mitigate these problems through distributed sensing and cloud-based analytics, improving FDI accuracy by fusing multi-sensor inputs in real time and limiting fault propagation.

Quantitatively, FDI in heavy machinery yields significant cost savings from fewer unplanned outages: predictive approaches have been reported to lower maintenance expenses by 15-30% by minimizing reactive repairs and extending asset life.
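The time-domain vibration indices mentioned earlier in this section can be computed in a few lines. RMS is named in the text; kurtosis and crest factor are added here as commonly used companions and are an assumption, as are the synthetic signals. The sketch shows how periodic impacts, as produced by a localized bearing defect, raise both impulsiveness measures.

```python
import math

def rms(x):
    """Root mean square amplitude."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def crest_factor(x):
    """Peak amplitude divided by RMS."""
    return max(abs(v) for v in x) / rms(x)

def kurtosis(x):
    """Non-excess kurtosis: fourth central moment over squared variance."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2)

# Synthetic "healthy" signal: eight periods of a unit sine wave.
smooth = [math.sin(2 * math.pi * k / 32) for k in range(256)]

# Synthetic "faulty" signal: the same sine with an impact every 64 samples.
impulsive = list(smooth)
for k in range(0, 256, 64):
    impulsive[k] += 5.0
```

For the pure sine the kurtosis is 1.5 and the crest factor is the square root of 2. Adding four impacts changes the RMS only modestly but raises both indices several-fold, which is why they are useful early indicators. Thresholds on these values depend on the machine and are not given here.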

Aerospace and Automotive Systems

In aerospace systems, fault detection and isolation (FDI) is critical for maintaining operational safety in high-stakes settings such as engine health monitoring, where model-based methods analyze sensor data to identify anomalies in engine performance. For instance, ensemble-based hierarchical classifiers have been used to diagnose and isolate faults in Frame 9 gas turbines, using time-series sensor data to detect degradation in components such as compressors and turbines. Similarly, model-based architectures for gas path FDI in aircraft engines process measurements through Kalman filters to isolate faults such as sensor biases or actuator failures with minimal false alarms. These techniques provide early detection, often within sub-second timelines, to support decision-making during flight.

Flight control systems in aircraft incorporate redundancy and fault-tolerant control (FTC) to act on FDI results. The Boeing 787's primary flight computers use a triple-redundant fly-by-wire architecture, in which faults in actuators or sensors trigger automatic reconfiguration to backup channels, maintaining stability even under multiple failures. This design achieves high reliability in fault isolation for critical flight phases, in line with FAA and EASA requirements that mandate robust FDI validation through extensive simulation and testing to ensure system integrity under 14 CFR Part 25 and CS-25. A notable case is the Airbus A380, introduced in the 2000s, where electrohydrostatic actuators (EHAs) provide fault tolerance by switching to electrical backups when hydraulic pressure losses or leaks are detected.

In automotive applications, FDI focuses on real-time diagnostics for safety-critical components, with OBD-II standards enabling the detection of vehicle and engine faults through standardized diagnostic trouble codes (DTCs) and protocols such as ISO 15765-4 (CAN).
OBD-II systems monitor parameters such as pressures and engine misfires and trigger alerts to comply with emissions and safety regulations, often detecting issues in under a second. Advanced driver-assistance systems (ADAS) integrate learning-based methods for sensor fault detection, using neural networks to identify failures in cameras or radars, as seen in the evolution of Tesla's systems since 2018, where models process fleet data to improve accuracy and mitigate risks such as phantom braking. Compliance with ISO 26262 governs these systems, assigning Automotive Safety Integrity Levels (ASIL) from A to D based on severity, exposure, and controllability; FDI for the most safety-critical functions typically requires ASIL D, demanding hardware metrics such as diagnostic coverage of at least 99%.

Electric vehicle (EV) battery management exemplifies FDI advances of the 2020s, with model-based and data-driven methods isolating cell-level faults to prevent thermal runaway. Techniques such as electrochemical modeling detect imbalances in voltage or temperature, and battery management system (BMS) algorithms isolate faulty modules to avert propagation, achieving the sub-second response times that passenger safety requires. These approaches are designed to remain fault-tolerant under high-stress conditions such as fast charging.

Overall, aerospace and automotive FDI prioritizes sub-second detection and high reliability to meet stringent regulatory demands, enabling proactive recovery in dynamic transport scenarios.
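The standardized diagnostic trouble codes mentioned earlier in this section are transmitted as two bytes and shown as a five-character string. The sketch below decodes them on the assumption of the widely used OBD-II layout: the top two bits select the system letter, the next two bits give the first digit, and the remaining three nibbles are hexadecimal digits. The function name is hypothetical, and the governing standard should be consulted before relying on this layout.

```python
SYSTEM_LETTERS = "PCBU"  # powertrain, chassis, body, network

def decode_dtc(high, low):
    """Decode a two-byte trouble code into its five-character form."""
    if not (0 <= high <= 0xFF and 0 <= low <= 0xFF):
        raise ValueError("each DTC byte must be in 0..255")
    letter = SYSTEM_LETTERS[high >> 6]  # bits 7-6 of the first byte
    first_digit = (high >> 4) & 0x3     # bits 5-4 of the first byte
    # Low nibble of the first byte, then both nibbles of the second byte.
    return f"{letter}{first_digit}{high & 0xF:X}{low:02X}"

misfire = decode_dtc(0x03, 0x01)
network_fault = decode_dtc(0xC1, 0x00)
```

The bytes 0x03 0x01 decode to `P0301`, a powertrain code in the misfire family, and 0xC1 0x00 decodes to `U0100`, a network code. The leading letter gives a first level of fault isolation by subsystem before any code table is consulted.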

  39. [39]
    [PDF] Statistical Fault Detection with Applications to IMU Disturbances
    Often are sums of squared Gaussian residuals used as a test statistic and the result will then be chi-square distributed if the residuals are white and Gaussian ...
  40. [40]
    Sensor and actuator fault isolation by structured partial PCA with ...
    Partial PCA based on the link between PCA and parity relations is a useful method in fault isolation. By performing PCA on subsets of variables, a set of ...
  41. [41]
    Fault Detection and Isolation in Dynamic Systems Using Principal ...
    This paper proposes a decomposition of a global system in different subsystems by means of PCA framework, in respect to the IFATIS european project (EU-IST-2001 ...
  42. [42]
    Kernel Generalized Likelihood Ratio Test for Fault Detection of ...
    In this paper, we develop an improved fault detection (FD) technique in order to enhance monitoring abilities of nonlinear chemical processes.
  43. [43]
  44. [44]
    Testing the covariance matrix of the innovation sequence with ...
    Aug 6, 2025 · A new statistical fault detection technique based on the Kalman filter innovation covariance testing is proposed. The generalized variance ...
  45. [45]
    Fault detection of uncertain chemical processes using interval partial ...
    Therefore, this work addresses the problem of fault detection of uncertain chemical processes using interval input-output PLS-based generalized likelihood ratio ...
  46. [46]
    A fault detection, isolation, and identification technique for complex ...
    This paper presents a method, based on a generalized likelihood ratio test (GLRT), for combining fault detection, isolation, and identification in complex ...
  47. [47]
    Detection and Estimation of Multiple Fault Profiles Using ...
    This paper discusses a fault detection scheme based on a tunable generalized likelihood algorithm. We discuss the detector algorithm, and then demonstrate its ...
  48. [48]
    Convolutional Neural Network Based Fault Detection for Rotating ...
    Sep 1, 2016 · In this article we propose a feature learning model for condition monitoring based on convolutional neural networks.
  49. [49]
    Convolutional Neural Networks for Fault Diagnosis Using Rotating ...
    This paper will focus on developing a convolutional neural network (CNN) to learn features directly from frequency data of vibration signals and testing the ...
  50. [50]
    Application of long short-term memory recurrent neural networks for ...
    In this paper, long short-term memory recurrent neural networks were trained and tested to CH4 leakage source in a chemical process module.
  51. [51]
    [PDF] Real-time pipeline leak detection and localization using an attention ...
    Apr 12, 2023 · The second step is to utilize the LSTM network to learn the temporal information and classify the sequential data. The recurrent Neural. Network ...
  52. [52]
    Exploiting Autoencoder-Based Anomaly Detection to Enhance ...
    The paper proposes a semi-supervised hybrid deep learning model using AE-GRU and anomaly detection to enhance cybersecurity in smart grids, accurately ...
  53. [53]
    Fault Detection and Diagnosis in Industrial Processes with ... - NIH
    Dec 29, 2021 · This work considers industrial process monitoring using a variational autoencoder (VAE). As a powerful deep generative model, the variational ...
  54. [54]
    Deep transfer learning strategy for efficient domain generalisation in ...
    Apr 24, 2023 · Wen et al. proposed a TL strategy that includes the use of a pre-trained ResNet-50 network to identify fault by fine-tuning just the fully ...
  55. [55]
    A Novel Mechanical Fault Diagnosis Based on Transfer Learning ...
    For fault diagnosis, convolutional neural networks (CNN) have been performing as a data-driven method to identify mechanical fault features in forms of ...
  56. [56]
    Machine Fault Detection Using a Hybrid CNN-LSTM Attention ... - NIH
    In this paper, the issue of predicting electrical machine failures by predicting possible anomalies in the data is addressed through time series analysis.
  57. [57]
    Bearing fault diagnosis with parallel CNN and LSTM - AIMS Press
    Jan 16, 2024 · We construct a fault diagnostic model based on convolutional neural network (CNN) and long short-term memory (LSTM) parallel network to extract their temporal ...
  58. [58]
    Standard H∞ Filtering Formulation of Robust Fault Detection
    This paper studies the robust fault detection problem using the standard H∞ filtering formulation. With this formulation, the minimization of the ...
  59. [59]
    Robust fault detection based on adaptive threshold generation using ...
    Jul 6, 2011 · In this paper, robust fault detection based on adaptive threshold generation of a non-linear system described by means of a linear ...
  60. [60]
    [PDF] Robust fault detection based on adaptive threshold generation using ...
    SUMMARY. In this paper, robust fault detection based on adaptive threshold generation of a non-linear system described.
  61. [61]
    Using unknown input observers for robust adaptive fault detection in ...
    The purpose of this manuscript is to construct natural observers for vector second-order systems by utilising unknown input observer (UIO) methods.
  62. [62]
    Fuzzy‐Logic‐Based Control, Filtering, and Fault Detection for ...
    Oct 25, 2015 · This paper is concerned with the overview of the recent progress in fuzzy-logic-based filtering, control, and fault detection problems.
  63. [63]
    Fuzzy Model-Based Fault Detection Approach for Nonlinear Control ...
    This work uses Takagi-Sugeno fuzzy systems to approximate nonlinear processes, a fuzzy controller, image representations, and signal errors for fault detection.
  64. [64]
    Fuzzy Kalman observer for fault detection in nonlinear discrete ...
    Dec 10, 2015 · This paper presents an approach to design a fuzzy augmented state Kalman observer based on interval type-2 fuzzy logic for state estimation ...
  65. [65]
    Integrated trade‐off design of fault detection system for linear ...
    Feb 1, 2013 · Optimal trade-off between sensitivity to faults and robustness to disturbances does not guarantee optimal trade-off between FDR and FAR.
  66. [66]
    Norm-based design of robust FDI schemes for uncertain systems ...
    Aug 9, 2025 · To compare the fault sensitivity and robustness performances of ... disturbance. First, an internal dynamic variable is incorporated ...
  67. [67]
    [PDF] Distributionally robust trade-off design of parity relation based fault ...
    Inspired by robust control concepts, one worst-case approach employs system norms to describe robustness to disturbances and sensitivity to faults. It first ...
  68. [68]
    A Survey on Active Fault-Tolerant Control Systems - MDPI
    Fault, based on its location, can be classified to sensor, actuator, and plant (component or parameter) faults [17]. 2.1. Fault Types and Causes. In general, ...
  69. [69]
    Fault-tolerant control systems: A comparative study between active ...
    This paper demystifies active and passive fault-tolerant control systems (FTCSs) by examining the similarities and differences between these two approaches.
  70. [70]
    Integrated sensor/actuator FDI and reconfigurable control for fault ...
    Aug 6, 2025 · In this paper an approach to fault-tolerant flight control system design based on the integration of sensor/actuator fault detection and ...
  71. [71]
    Proactive fault-tolerant model predictive control - IEEE Xplore
    In this paper, we propose a proactive fault-tolerant Lyapunov-based model predictive controller (LMPC) that can effectively deal with an incipient control ...
  72. [72]
    (PDF) Hierarchical Design of Distributed Fault Tolerant Control ...
    This work deals with the description of a design procedure for hierarchical fault tolerant control (FTC) of large, distributed system. Following a.
  73. [73]
    Multi-Sensor Fault Detection, Identification, Isolation and Health ...
    This paper proposes a novel fault detection, isolation, identification and prediction (based on detection) architecture for multi-fault in multi-sensor systems ...
  74. [74]
    Model-based diagnosis and fault tolerant control for ensuring torque ...
    This paper studies the necessary steps for achieving functional safety using this model-based approach, and presents a case study on torque functional safety of ...
  75. [75]
    A Fault-Tolerant Control Architecture for Unmanned ... - Georgia Tech
    Like previous hierarchical fault-tolerant control architectures, this one is expandable vertically. However, this architecture separates itself from its ...
  76. [76]
    Angle of attack prediction using recurrent neural networks in flight ...
    Dec 17, 2021 · The best method to cope with faulty sensor measurements is to create trustworthy virtual sensors to replace them. The approach focuses on ...
  77. [77]
    Accommodation of actuator fault using local diagnosis and IMC-PID
    Oct 8, 2014 · This paper presents an Internal Model Control (IMC) based PID control system architecture that can tolerate faulty actuators in a networked ...
  78. [78]
    Model Predictive Fault Tolerant Control for Omni-directional Mobile ...
    May 27, 2019 · This paper describes the design of a Fault Tolerant Control scheme for an omni-directional mobile robot with four mecanum wheels.
  79. [79]
  80. [80]
    [PDF] Reliability of Fault Tolerant Control Systems: Part I 1
    This paper reports Part I of a two-part effort that is intended to delineate the relationship between reliability and fault tolerant control.
  81. [81]
    Reconfigurable Fault-tolerant Control: A Tutorial Introduction
    This paper provides a tutorial introduction to reconfigurable control and surveys recent advances on this topic.
  82. [82]
    [PDF] Fault-Tolerant Avionics - UNC Computer Science
    The Apollo guidance and control system employed proven, highly reliable components and triple modular redundancy (TMR) with voting to select the correct output.
  83. [83]
    Hardware reconfiguration algorithm in multiprocessor systems of ...
    Aug 10, 2025 · Reconfiguration of multiprocessor systems makes it possible to improve their failure-resistance that is especially important for the ...
  84. [84]
    Reliability analysis of the triple modular redundancy system under ...
    Sep 7, 2023 · Triple modular redundancy (TMR) is a robust technique utilized in safety-critical applications to enhance fault-tolerance and reliability.
  85. [85]
    Simple Adaptive Control‐Based Reconfiguration Design of Cabin ...
    Mar 30, 2021 · In particular, the reconfiguration system can update the control law online when the fault occurs without the system identification process.
  86. [86]
    Fault-tolerant control strategies for industrial robots: state of the art ...
    Aug 30, 2025 · For reconfiguration or control adaptation, an alternate control law is switched to maintain functionality of the robot. Reliable and fail-safe ...
  87. [87]
    [PDF] arXiv:2302.06473v1 [eess.SY] 10 Feb 2023
    Feb 10, 2023 · In this work we present a quantitative approach, employing directed graphs to the simulation and automatic reconfiguration of a fault in a ...
  88. [88]
    [PDF] Model-free reconfiguration mechanism for fault tolerance - HAL
    Jul 4, 2011 · Note that a short detection delay requirement will need a short dwell-time that clearly conflicts with the stability of the closed-loop switched ...
  89. [89]
    (PDF) Invariant Sets and Control Synthesis for Switching Systems ...
    Aug 7, 2025 · A structural procedure is proposed for solving the problem of maximal safe-set determination based on maximal controlled invariant sets.
  90. [90]
    [PDF] The Effect of Fault Detection, Diagnosis, and Recovery on ...
    Dec 22, 2023 · Faults cause system instability and performance degradation. ... system performance, sudden deterioration, and delayed recovery. In ...
  91. [91]
    An Overview of Vibration Analysis Techniques for the Fault ...
    This paper provides a fairly brief overview of methods and means for monitoring the condition and diagnosis of rolling bearings and also describes one of the ...
  92. [92]
    Effect of Drive and Power System Faults on Wind Turbine Shutdown ...
    Aug 5, 2025 · The proposed strategies are estimated to reduce unplanned downtime by up to 12%, potentially lowering maintenance costs by approximately 8%–10%, ...
  93. [93]
    Fault diagnosis and prognosis capabilities for wind turbine hydraulic ...
    Feb 1, 2025 · Defects in the pitch system are responsible for up to 20% of a wind turbine downtime. Thus, monitoring such defects is essential for avoiding it ...
  94. [94]
    Diagnosis of process valve actuator faults using a multilayer neural ...
    This paper investigates the ability of a multilayer neural network to diagnose actuator faults in a Fisher-Rosemount 667 process control valve.
  95. [95]
    Model Based Diagnostic Module for a FCC Pilot Plant
    It indicates that if the pressure stripper is low and the valve opening that regulates the pressure is 0%, then there is a leakage between the riser and ...
  96. [96]
    (PDF) Use of Multivariate Statistical Methods for Control of Chemical ...
    This thesis focuses on the study and application of multivariate statistical methods to control product quality in chemical batch processes. These multivariate ...
  97. [97]
    Fault Detection and Identification for Longwall Machinery Using ...
    Real-time fault detection and identification (FDI) offers maintenance personnel the ability to minimise, and potentially eliminate one or more of these factors, ...
  98. [98]
    Unlocking the Power of Artificial Intelligence in Manufacturing with ...
    Feb 19, 2024 · The example from electronics manufacturing is not an isolated case: Siemens already employs AI in numerous applications for quality ...
  99. [99]
    FDD: Get The Savings Rolling In! - Siemens Blog
    Sep 15, 2025 · Fault Detection & Diagnostics (FDD) applies rules, AI, and advanced analytics to continuously compare live system data against expected behavior ...
  100. [100]
    [PDF] Sensor Systems for Extremely Harsh Environments
    Dec 22, 2022 · Sensor systems for harsh environments include sensing elements, integrated electronics, and signal processing, designed for high temperatures, ...
  101. [101]
    Impact of IoT on Manufacturing Industry 4.0: A New Triangular ...
    Implementation of IoT has enabled the manufacturers to embrace digital transformations from multiple contexts such as customer focus, efficient productivity, ...
  102. [102]
    Predictive Maintenance Cost Savings | ATS
    Predictive maintenance can save 8-12% over preventive, up to 40% over reactive, and 18-25% in maintenance costs, with reduced downtime.
  103. [103]
    Is Predictive Maintenance Really Cost-Effective? - Infraspeak Blog
    Companies decreased maintenance costs by 12% (less than in the McKinsey study, which pointed to 18-25%) and availability improved by 9%. · Predictive maintenance ...
  104. [104]
    Fault detection and isolation of gas turbine - ScienceDirect.com
    The paper proposes an ensemble-based hierarchical classifier to diagnose and isolate faults in GE Frame9 gas turbines.
  105. [105]
    [PDF] An Integrated Architecture for Aircraft Engine Performance ...
    The model-based approach to gas path fault detection and isolation presented in this paper is a promising architecture for the processing of streaming ...
  106. [106]
    Fault detection and isolation in aircraft gas turbine engines. Part 1
    Jun 2, 2008 · This two-part paper formulates and validates a novel methodology of degradation monitoring of aircraft gas turbine engines with emphasis on ...
  107. [107]
    Boeing 787 – Flight controls - SmartCockpit - Airline training guides ...
    The flight control system automatically operates in the normal mode and any system fault will automatically reset the PFCs (Primary Flight Computers). The ...
  108. [108]
    International Aircraft Certification - Federal Aviation Administration
    Feb 27, 2025 · International aircraft certification includes bilateral agreements, working procedures, European Aviation Safety Agency information, and import ...
  109. [109]
    [PDF] the a380 flight control electrohydrostatic actuators, achievements ...
    The hydraulic actuators are normally active while the electrically powered actuators are normally stand-by and become operative in the event of a failure of the ...
  110. [110]
    [PDF] In-flight uncontained engine failure Airbus A380-842, VH-OQA
    Nov 4, 2010 · The A380 had two independent hydraulic systems identified as the Green system ... In the event of a hydraulic system failure, the following ...
  111. [111]
    Automotive Diagnostic Standards - x-engineer.org
    This article delves into the core diagnostic communication standards governing modern automotive systems, including ISO 15031, ISO 27145, ISO 14229, ...
  112. [112]
    Introduction to the OBD-II Standard - Kvaser
    The OBD-II standard defines DTCs (Diagnostic Trouble Codes) as well as other diagnostic information that is usually passed through a gateway to the OBD II port.
  113. [113]
    Safety assurance for automated systems in transport: A collective ...
    Tesla's Artificial Intelligence (AI) team uses data collected from Autopilot-equipped vehicles to continuously develop its ML models and algorithms, and ...
  114. [114]
    A Guide to Automotive Safety Integrity Levels (ASIL) - Jama Software
    ASIL is defined by the ISO 26262 standard, part nine, and is adapted from the Safety Integrity Level (SIL) guidance published in IEC 61508.
  115. [115]
    Fault Detection of Li–Ion Batteries in Electric Vehicles - MDPI
    A failure in just one cell can trigger thermal runaway, where the cell's temperature rises rapidly and uncontrollably, possibly causing nearby cells to overheat ...
  116. [116]
    Thermal runaway prevention and mitigation for lithium-ion battery ...
    This paper provides a comprehensive review of the current understanding of thermal runaway and the various technologies to prevent or mitigate thermal runaway.
  117. [117]
    Fault-Tolerant Platforms for Automotive Safety-Critical Applications
    Fault-tolerant electronic sub-systems are becoming a standard requirement in the automotive industrial sector as electronics becomes pervasive in present cars.