
Uncertainty analysis

Uncertainty analysis, closely associated with uncertainty quantification (UQ), is the systematic process of identifying, evaluating, and expressing the uncertainties associated with measurement results, model predictions, or simulations, characterizing the dispersion of values that could reasonably be attributed to the quantity being assessed. This involves quantifying the possible distribution of errors arising from various sources, such as instruments, environmental factors, or modeling assumptions, to provide a realistic estimate of reliability rather than a single deterministic value. In scientific and engineering contexts, uncertainty analysis is essential for ensuring the validity of experimental results and computational models, enabling informed decision-making by assessing how variations in inputs propagate to outputs. It addresses doubts in quantitative assessments, risk analyses, and epidemiological studies by distinguishing between aleatory (inherent randomness) and epistemic (knowledge-based) uncertainties, thereby improving the robustness of predictions in fields like engineering and environmental modeling. For instance, in finite element analysis for structural engineering, it evaluates variability in material properties and boundary conditions to predict structural performance under real-world conditions.

The core methodology follows standardized procedures, such as those outlined in the Guide to the Expression of Uncertainty in Measurement (GUM), which recommends a two-stage evaluation: Type A, based on statistical analysis of repeated observations to compute experimental standard deviations, and Type B, relying on scientific judgment, manufacturer specifications, or prior knowledge to estimate uncertainties from non-statistical sources. These component uncertainties are then combined using the law of propagation of uncertainty, typically through variance addition for independent inputs, accounting for sensitivities via partial derivatives in functional models: u_c^2(y) = \sum_{i=1}^N c_i^2 u^2(x_i), where c_i = \partial f / \partial x_i represents sensitivity coefficients. This framework promotes international comparability of results by requiring full disclosure of uncertainty sources and calculations. Beyond linear propagation, advanced techniques like Monte Carlo simulations and sensitivity analysis extend uncertainty propagation to complex, nonlinear systems, helping prioritize influential parameters and reduce overall uncertainty through targeted refinements. Recent advances as of 2025 have increasingly integrated UQ with machine learning and surrogate modeling techniques to enhance predictive reliability in complex systems.

In engineering design, such methods mitigate risks by quantifying forward uncertainty in outputs from input variabilities, supporting optimization and validation against experimental data. These approaches, grounded in probability and statistics, ensure that reported results include confidence intervals, fostering trust in applications ranging from computational simulations to civil infrastructure assessments.

Fundamentals

Definition and Scope

Uncertainty analysis is the process of identifying, characterizing, and quantifying uncertainties arising from inputs, models, or measurements to evaluate their effects on outputs or conclusions. This involves systematically assessing limitations in knowledge or data to provide a more complete understanding of results, often distinguishing between random and systematic components. The foundations of uncertainty analysis trace back to early statistical work on error propagation, notably Karl Pearson's 1898 contributions to calculating probable errors in frequency constants and their influence on variation and correlation. These ideas evolved through the 20th century, culminating in standardized frameworks such as the Guide to the Expression of Uncertainty in Measurement (GUM), first published in 1995 and updated with minor corrections in 2008 by the Joint Committee for Guides in Metrology (JCGM). As of 2024, an introductory part (ISO/IEC Guide 98-1) was published, and in July 2025, the JCGM proposed a new definition of measurement uncertainty under review for future editions, including potential updates to the International Vocabulary of Metrology (VIM). The GUM, developed in response to a 1977 request from the Comité International des Poids et Mesures (CIPM) for harmonized uncertainty reporting, provides general rules for evaluation and expression applicable across measurement domains. Uncertainty analysis finds broad application in engineering for predicting system performance, in physics for validating experimental results, in environmental science for modeling pollutant dispersion or climate impacts, and in decision-making under uncertainty for policy formulation. It plays a critical role in risk assessment by quantifying potential variabilities in hazard predictions and in reliability engineering by informing design margins and failure probabilities. A foundational tool in this process is the law of propagation of uncertainty for a measurand y = f(x_1, \dots, x_n) with n input quantities, where the combined standard uncertainty u_c(y) is given by u_c(y) = \sqrt{ \sum_{i=1}^n \left( \frac{\partial f}{\partial x_i} \right)^2 u^2(x_i) }, known as the root-sum-square method, which approximates the output uncertainty based on input standard uncertainties u(x_i) and sensitivity coefficients.
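The root-sum-square law lends itself to a direct numerical implementation when the sensitivity coefficients are approximated by finite differences. The sketch below is illustrative only: the function, input estimates, and standard uncertainties are assumed example values, not prescribed by the GUM.

```python
import numpy as np

def propagate_uncertainty(f, x, u, h=1e-6):
    """Combined standard uncertainty of y = f(x) via the GUM
    root-sum-square law, assuming independent inputs.

    f : callable taking a 1-D array of input estimates
    x : best estimates of the inputs
    u : standard uncertainties of the inputs
    h : relative step for central-difference sensitivity coefficients
    """
    x = np.asarray(x, dtype=float)
    u = np.asarray(u, dtype=float)
    c = np.zeros_like(x)                     # sensitivity coefficients dy/dx_i
    for i in range(x.size):
        step = h * max(abs(x[i]), 1.0)
        xp, xm = x.copy(), x.copy()
        xp[i] += step
        xm[i] -= step
        c[i] = (f(xp) - f(xm)) / (2.0 * step)
    return np.sqrt(np.sum((c * u) ** 2))     # u_c(y)

# Example: resistance R = V / I from assumed voltage and current estimates
resistance = lambda q: q[0] / q[1]           # q = [V, I]
u_R = propagate_uncertainty(resistance, x=[12.0, 2.0], u=[0.05, 0.01])
print(f"R = {12.0/2.0:.3f} ohm, u_c(R) = {u_R:.4f} ohm")
```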

Types of Uncertainty

In uncertainty analysis, uncertainties are broadly classified into aleatoric and epistemic types, reflecting fundamental distinctions in their origins and reducibility. Aleatoric uncertainty, also known as irreducible or stochastic uncertainty, arises from inherent randomness or variability in the system being studied, such as quantum effects, natural fluctuations, or stochastic processes that cannot be eliminated regardless of additional information. This type is typically modeled using probability distributions, like the Gaussian distribution, to capture the frequentist variability observed in repeated experiments under identical conditions. For instance, in weather forecasting, aleatoric uncertainty manifests in the unpredictable fluctuations of atmospheric patterns driven by chaotic dynamics. In contrast, epistemic uncertainty stems from a lack of knowledge or incomplete information, including measurement errors, model approximations, or insufficient data, and is potentially reducible through further investigation or improved methods. It represents subjective beliefs that can be updated as new evidence becomes available, often quantified in Bayesian frameworks where prior distributions evolve with incoming data. An example is the uncertainty in readings from an uncalibrated sensor, which diminishes once calibration is performed to correct for systematic biases. This distinction is crucial: aleatoric uncertainty sets a fundamental limit on predictability, while epistemic uncertainty highlights opportunities for refinement in analysis or experimentation. Beyond these primary categories, uncertainty analysis recognizes other specific classifications that arise in practical applications, particularly in modeling and simulation contexts. Parameter uncertainty pertains to variability or imprecision in the input values or coefficients of a model, often due to estimation errors from limited observations. Model form uncertainty, a subset of epistemic uncertainty, arises from inadequacies in the structural representation of the system, such as missing physical processes or oversimplified assumptions in the mathematical formulation. For example, in fluid dynamics simulations, model form uncertainty may occur if turbulence effects are inadequately captured by the chosen equations. Scenario uncertainty, meanwhile, addresses unknowns related to future conditions or alternative pathways, such as varying environmental forcings or decision contexts, requiring the consideration of multiple plausible futures to bound potential outcomes. These classifications provide a framework for dissecting complex uncertainties, enabling targeted reduction strategies in fields like physical experiments, where both aleatoric variability in measurements and epistemic gaps in understanding can coexist.
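A minimal numerical sketch of the distinction, under an assumed Gaussian measurement model: taking more readings shrinks the standard error of the mean (an epistemic component that reflects limited data) but leaves the scatter of individual readings (the aleatoric component) essentially unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
true_value, noise_sd = 10.0, 0.5   # assumed aleatoric scatter of a single reading

for n in (5, 50, 500):
    readings = true_value + noise_sd * rng.standard_normal(n)
    print(f"n={n:4d}  spread of readings (aleatoric) = {readings.std(ddof=1):.3f}  "
          f"uncertainty of the mean (epistemic) = {readings.std(ddof=1)/np.sqrt(n):.3f}")
```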

Uncertainty in Physical Experiments

Sources of Experimental Uncertainty

Experimental uncertainty arises primarily from two categories: systematic errors, which introduce consistent biases, and random errors, which cause variability in repeated measurements. Systematic errors often stem from imperfections in the experimental setup or procedure, such as instrument drift, where a scale's calibration shifts over time due to wear, leading to persistent over- or underestimation of values. Environmental factors, like temperature fluctuations affecting the expansion of measurement scales or sensors, can also induce systematic biases by altering the physical properties of equipment during the experiment. Operator inconsistencies, including biased reading techniques or procedural variations, further contribute to these errors by introducing human-induced offsets that repeat across trials. Random errors, in contrast, manifest as unpredictable fluctuations due to inherent noise in the system, such as electronic noise in detectors or unresolved aleatoric effects from physical processes. These are typically quantified through repeated measurements, where the spread in results reflects sampling variability or transient disturbances like vibrations in the environment. In experiments, aleatoric uncertainties represent irreducible randomness, such as quantum fluctuations in particle detection, while epistemic uncertainties arise from incomplete knowledge of experimental conditions. Beyond these broad categories, specific sources include the finite resolution limits of instruments, where the smallest detectable change exceeds the true variation, leading to quantization errors in readouts. Uncertainties can also propagate from auxiliary measurements, for instance, the voltage input in a resistance measurement introducing additional uncertainty through its own tolerances. Human factors, such as parallax errors from off-angle readings of analog dials, add further variability by depending on the observer's position and precision. A practical example occurs in tensile testing of materials, where grip slippage at the specimen-machine interface causes systematic underestimation of applied force, while hysteresis (residual deformation in the gauge after load cycles) introduces random variability in elongation measurements. To ensure reliability, experimental uncertainties must comply with ISO standards, particularly the Guide to the Expression of Uncertainty in Measurement (GUM; JCGM GUM-1:2023), which emphasizes traceability to international references for calibrating instruments and validating procedures.

Quantification and Propagation in Experiments

In experimental settings, uncertainty quantification begins with evaluating the standard uncertainty associated with individual measurements. Type A evaluation employs statistical methods based on repeated observations of the quantity under the same conditions. The experimental standard deviation of the mean from these observations provides the Type A standard uncertainty, calculated as u(X_i) = \frac{s(X_i)}{\sqrt{n}}, where s(X_i) = \sqrt{\frac{1}{n-1} \sum_{k=1}^n (X_{i,k} - \bar{X}_i)^2} is the experimental standard deviation, \bar{X}_i is the arithmetic mean, and n is the number of observations. This approach captures random variations, such as those from instrument noise or environmental fluctuations in a repeated measurement of voltage. Type B evaluation, in contrast, relies on non-statistical information, including prior knowledge, manufacturer specifications, calibration data, or assumed probability distributions for error bounds. For instance, if an instrument's specification provides a rectangular (uniform) distribution over an interval of width 2a, the standard uncertainty is u(x_i) = \frac{a}{\sqrt{3}}. This method addresses systematic effects, such as calibration drift, where repeated measurements are infeasible, and is often informed by datasheets or expert judgment in experimental protocols. Both Type A and Type B uncertainties contribute equally to the overall assessment, regardless of their origin. The GUM framework, updated as JCGM GUM-1:2023, standardizes this process, with ongoing discussions as of 2025 on refinements to definitions. Propagation of these uncertainties occurs when the measurand is a function y = f(x_1, x_2, \dots, x_N) of multiple input quantities. The law of propagation of uncertainty combines the standard uncertainties via quadrature, yielding the combined standard uncertainty u_c(y) = \sqrt{ \sum_{i=1}^N \left( \frac{\partial f}{\partial x_i} \right)^2 u^2(x_i) }, assuming uncorrelated inputs; covariances are included if dependencies exist. This partial differentiation approach linearizes the function around the best estimates, suitable for small uncertainties in physical experiments like determining electrical resistance from voltage and current measurements. In practice, the combined standard uncertainty informs error bars on experimental plots, representing the dispersion of the result at a one-standard-deviation level. For example, in a plot of temperature versus time, error bars derived from u_c visualize the propagated uncertainty from sensor readings and environmental controls. Under the GUM framework, the analysis culminates in the expanded uncertainty U = k \cdot u_c(y), where the coverage factor k (typically 2 for approximately 95% coverage assuming normality) defines a coverage interval around the result.
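The two evaluation types and their combination can be carried out in a few lines. The following sketch uses hypothetical voltage readings and an assumed meter resolution rather than data from any particular instrument.

```python
import numpy as np

# Type A: hypothetical repeated voltage readings (volts)
readings = np.array([5.012, 5.007, 5.011, 5.009, 5.013, 5.008])
n = readings.size
mean_v = readings.mean()
s = readings.std(ddof=1)                 # experimental standard deviation
u_typeA = s / np.sqrt(n)                 # standard uncertainty of the mean

# Type B: assumed meter resolution of 0.001 V -> half-width a = 0.0005 V,
# treated as a rectangular distribution
a = 0.0005
u_typeB = a / np.sqrt(3)

# Combined and expanded uncertainty (k = 2 for ~95 % coverage, normality assumed)
u_c = np.sqrt(u_typeA**2 + u_typeB**2)
U = 2 * u_c
print(f"V = {mean_v:.4f} V +/- {U:.4f} V  (k = 2)")
```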

Uncertainty in Mathematical Modeling

Sources of Modeling Uncertainty

Modeling uncertainty arises in mathematical models due to inherent limitations in representing complex real-world phenomena through theoretical constructs and approximations. These uncertainties differ from experimental ones by originating in the model's formulation rather than in measurement processes. Key sources include parameter uncertainty, model structure uncertainty, input data uncertainty, and discretization errors in numerical implementations. Parameter uncertainty stems from the variability in estimated coefficients or values used within the model, often resulting from limited observational data or reliance on assumed distributions for estimation. For instance, in dynamical systems models, parameters like reaction rates in chemical kinetics may be calibrated from sparse datasets, leading to broad posterior distributions that reflect this variability. This type of uncertainty is particularly pronounced in inverse problems where parameters are inferred indirectly, amplifying errors in predictions. Model structure uncertainty originates from simplifying assumptions made during model development, such as choosing linear approximations over nonlinear ones in differential equations, which can introduce systematic biases. These errors occur because no single model can fully capture all relevant physics or interactions without trade-offs in complexity and computational feasibility. For example, approximating turbulent flows with Reynolds-averaged Navier-Stokes equations ignores finer-scale fluctuations, leading to discrepancies in simulated outcomes. Model structure uncertainty represents a form of epistemic uncertainty tied to incomplete knowledge of the appropriate model form. Input data uncertainty enters mathematical models through boundary conditions, forcing functions, or initial states derived from experimental measurements, propagating empirical errors into theoretical predictions. Even high-quality data carries inherent variability that affects model reliability, such as fluctuating environmental inputs in simulations. This source bridges empirical and modeling domains but is distinctly modeled as an input within the theoretical framework. In numerical models, discretization errors arise from approximating continuous differential equations via finite difference, finite element, or other schemes, introducing errors that depend on grid resolution and time steps. For example, coarser spatial grids in computational fluid dynamics simulations can lead to deviations of up to several percent in key quantities like drag coefficients, with error magnitudes scaling as the grid size to a power determined by the method's order of accuracy. These errors are epistemic in nature, stemming from the approximation process rather than randomness. A prominent application of these uncertainties appears in climate models, where unresolved sub-grid processes, such as cloud formation or ocean mixing, are handled via parameterization schemes that introduce both structural and parameter uncertainties. For instance, convective parameterization in general circulation models can vary equilibrium climate sensitivity by 1–3°C, highlighting how simplifications in representing small-scale physics contribute to overall projection spreads. These schemes often rely on tunable parameters fitted to observations, compounding input data uncertainties from historical records.
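Discretization error and its scaling with resolution can be illustrated with a deliberately simple case. The sketch below applies the first-order forward Euler scheme to du/dt = -u (an assumed toy problem, not a production solver) and shows the error at t = 1 shrinking roughly in proportion to the step size.

```python
import numpy as np

def euler_error(h):
    """Global error at t = 1 of forward Euler applied to du/dt = -u, u(0) = 1."""
    steps = int(round(1.0 / h))
    u = 1.0
    for _ in range(steps):
        u += h * (-u)
    return abs(u - np.exp(-1.0))          # compare with the exact solution e^{-1}

# Halving h roughly halves the error, consistent with a first-order method
for h in (0.1, 0.05, 0.025, 0.0125):
    print(f"h = {h:<7} error = {euler_error(h):.5f}")
```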

Propagation Methods in Models

In mathematical modeling, uncertainty propagation refers to the process of quantifying how input uncertainties, such as those arising from model parameters or structural assumptions, affect the uncertainty in model outputs. This is essential for assessing the reliability of predictions in fields like engineering and environmental science, where models often involve nonlinear relationships between inputs and outputs. Propagation methods can be broadly classified into analytical and numerical approaches, each balancing computational efficiency with accuracy depending on the model's complexity. Analytical propagation methods rely on local approximations, typically using Taylor series expansions around nominal input values to estimate output uncertainty. For a model output y = f(\mathbf{x}), where \mathbf{x} is a vector of inputs with known means and variances/covariances, the first-order expansion yields the propagated variance as: \text{Var}(y) \approx \sum_{i=1}^n \left( \frac{\partial f}{\partial x_i} \right)^2 \text{Var}(x_i) + 2 \sum_{i=1}^n \sum_{j>i}^n \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j} \text{Cov}(x_i, x_j), evaluated at the mean values of \mathbf{x}. This law of propagation of uncertainty assumes small input uncertainties and local linearity, making it computationally inexpensive but less accurate for highly nonlinear models. Perturbation methods extend this by treating input deviations as small perturbations around a base state, using first-order sensitivities (partial derivatives) to approximate output changes; for instance, the output perturbation \delta y \approx \sum_i \frac{\partial f}{\partial x_i} \delta x_i. These techniques are particularly useful for quick assessments in linear or mildly nonlinear systems, such as preliminary design stages. Numerical methods, in contrast, handle nonlinearities more robustly by simulating the model under varied input conditions. Monte Carlo simulation involves drawing random samples from the input probability distributions, evaluating the model for each sample to generate an empirical distribution of outputs, and estimating moments like the mean and variance from the results. This approach provides unbiased estimates of the full output probability density function (PDF) without assuming linearity, though it requires many model evaluations for convergence, especially in high dimensions. To improve efficiency, Latin hypercube sampling (LHS) stratifies the input space into equally probable intervals and samples systematically across them, reducing the number of required simulations while maintaining low variance in estimates compared to simple random sampling. LHS was introduced as a stratified alternative for computer model uncertainty analysis, ensuring better coverage of the input domain. A key consideration in selecting methods is the trade-off between computational cost and accuracy: analytical and perturbation approaches are fast (often requiring only a few model evaluations for sensitivities) but approximate, suiting nearly linear cases, while Monte Carlo simulation excels in nonlinear scenarios at higher cost, reducible by stratified schemes such as LHS.
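As a concrete comparison, the following sketch propagates two assumed normal inputs through a small nonlinear model using plain Monte Carlo sampling and Latin hypercube sampling (via scipy.stats.qmc); the model and distributions are illustrative choices, not taken from any particular application.

```python
import numpy as np
from scipy.stats import norm, qmc

# Illustrative nonlinear model y = x1 * exp(x2) with independent normal inputs
model = lambda x: x[:, 0] * np.exp(x[:, 1])
means, sds = np.array([2.0, 0.5]), np.array([0.2, 0.1])
n = 10_000

# Plain Monte Carlo: independent random draws from the input distributions
rng = np.random.default_rng(1)
x_mc = rng.normal(means, sds, size=(n, 2))
y_mc = model(x_mc)

# Latin hypercube sampling: stratified uniform samples mapped through the
# inverse normal CDF for more even coverage of the input space
lhs = qmc.LatinHypercube(d=2, seed=1).random(n)
x_lhs = norm.ppf(lhs, loc=means, scale=sds)
y_lhs = model(x_lhs)

for label, y in (("Monte Carlo", y_mc), ("Latin hypercube", y_lhs)):
    print(f"{label:16s} mean = {y.mean():.4f}  std = {y.std(ddof=1):.4f}")
```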

Parameter Calibration and Output Analysis

Calibration Techniques

Calibration techniques in uncertainty analysis involve estimating model parameters to align predictions with observed data, thereby reducing epistemic uncertainty arising from incomplete knowledge of parameter values. These methods refine parameter values by minimizing discrepancies between model outputs and measurements, often incorporating statistical frameworks to quantify remaining uncertainties. Parameter uncertainty from modeling serves as the primary target, guiding the selection of calibration approaches that balance fit quality with robustness to data limitations. One foundational approach is least-squares optimization, which estimates parameters by minimizing the sum of squared residuals between observed data and model predictions. Formally, the parameter vector \theta is determined as \theta = \arg\min_{\theta} \sum_{i=1}^{n} (y_{\text{obs},i} - f(x_i; \theta))^2, where y_{\text{obs},i} are observations, f(x_i; \theta) is the model function, and n is the number of data points. This method is particularly effective for nonlinear models, providing efficient estimates with small datasets while enabling approximate confidence intervals for parameters. However, it requires iterative algorithms like Levenberg-Marquardt and can be sensitive to outliers or poor initial guesses. For probabilistic models, maximum likelihood estimation (MLE) extends this by maximizing the likelihood of observing the data given the parameters, explicitly accounting for measurement errors and noise distributions. In Gaussian process models, MLE calibrates kernel scale parameters to adapt to potential misspecifications, ensuring uncertainty estimates remain conservative rather than overly precise. This approach yields parameter estimates that incorporate error structures, improving reliability in settings with deterministic models but noisy observations. To capture full posterior distributions of parameters under Bayesian frameworks, Markov chain Monte Carlo (MCMC) methods sample from the joint posterior, integrating prior beliefs with likelihoods to propagate uncertainties. MCMC explores high-dimensional parameter spaces, generating chains that converge to the target distribution for robust inference, especially in complex computer models where traditional optimization may overlook multimodal posteriors. This technique enhances predictions by quantifying all sources of discrepancy, as demonstrated in applications like environmental simulations. A practical example is the calibration of the Soil and Water Assessment Tool (SWAT) hydrological model, where parameters governing soil hydrology are estimated using rainfall and streamflow data. In studies of watersheds like the Little Washita River basin, initial adjustments to the curve number (CN2) align surface runoff responses to observed rainfall events, followed by refinements to soil-related parameters such as groundwater delay (GW_DELAY) and baseflow recession (ALPHA_BF) to match hydrographs. This iterative process, often manual or automated, improves simulations of water balance components while highlighting parameter interactions. Parameter identifiability challenges, such as correlations leading to equifinality, where multiple parameter sets yield similar outputs, are addressed through regularization techniques or informative priors to constrain the search space. The Generalized Likelihood Uncertainty Estimation (GLUE) methodology, for instance, evaluates ensembles of parameter sets using likelihood measures updated with priors, rejecting implausible combinations to enhance identifiability without assuming unique optima.
However, GLUE has faced criticisms for lacking formal Bayesian coherence and potentially overestimating uncertainties, though its proponents argue it pragmatically handles model equifinality. Standards like ASTM E2935 support verification by providing equivalence testing protocols to confirm calibrated models align with reference processes within specified limits.
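A hedged sketch of least-squares calibration on synthetic data, assuming a simple exponential-decay model rather than any of the models discussed above; scipy's curve_fit returns both the fitted parameters and an approximate covariance matrix from which parameter standard errors can be read off.

```python
import numpy as np
from scipy.optimize import curve_fit

# Assumed exponential-decay model y = a * exp(-b * t) with synthetic observations
model = lambda t, a, b: a * np.exp(-b * t)

rng = np.random.default_rng(2)
t_obs = np.linspace(0, 5, 25)
y_obs = model(t_obs, 2.5, 0.8) + 0.05 * rng.standard_normal(t_obs.size)

# Nonlinear least-squares calibration; pcov approximates the parameter covariance
popt, pcov = curve_fit(model, t_obs, y_obs, p0=[1.0, 1.0])
perr = np.sqrt(np.diag(pcov))            # approximate standard errors

for name, val, err in zip(("a", "b"), popt, perr):
    print(f"{name} = {val:.3f} +/- {err:.3f}")
```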

Output Uncertainty Assessment

Output uncertainty assessment evaluates the range of possible values in the final results of calibrated models or experiments, providing a measure of reliability for predictions or decisions. After parameter calibration, uncertainty in outputs arises from propagated input variabilities and inherent model limitations, necessitating methods to quantify and interpret this spread. Calibrated parameters serve as fixed or probabilistic inputs to forward propagation techniques, enabling the generation of output distributions. Confidence intervals represent a primary tool for summarizing output uncertainty, derived from either analytical propagation of variances or empirical quantiles from simulations. In Monte Carlo approaches, samples from input distributions are propagated through the model to obtain an empirical distribution of outputs, from which a 95% interval is typically constructed as the interval between the 2.5th and 97.5th percentiles. This non-parametric method avoids assumptions about the output distribution and provides a direct probabilistic bound on the output variability. To attribute output uncertainty to specific inputs, variance-based global sensitivity analysis decomposes the total output variance into contributions from individual parameters and their interactions. The first-order Sobol index for input x_i, denoted S_i, quantifies the fraction of output variance attributable to x_i alone, calculated as: S_i = \frac{\mathrm{Var}\left( \mathbb{E}[y \mid x_i] \right)}{\mathrm{Var}(y)}, where y is the model output. Higher-order indices capture interactions, offering a complete decomposition that aids in identifying dominant sources post-calibration. Reporting output uncertainty emphasizes coverage probability, the likelihood that the true value falls within the assessed interval, and its implications for risk-based decisions. For instance, in risk analysis, uncertainty assessments inform decision thresholds by comparing predicted failure probabilities against acceptable risk levels, ensuring conservative margins where coverage is incomplete. Ensembles of models or simulations enhance robustness by averaging multiple realizations, reducing epistemic components and yielding tighter, more reliable uncertainty bounds. A key distinction in output assessment is between calibration uncertainty, stemming from parameter non-uniqueness during fitting, and prediction uncertainty, which encompasses additional sources like future inputs. Calibration uncertainty is isolated by propagating posterior parameter distributions, while total prediction uncertainty also includes aleatory variations; failing to separate them can overestimate reliability. In structural reliability, uncertainty bands around simulation outputs visualize propagated variabilities, such as in demand estimation models where fractile bands about the median highlight epistemic and aleatory effects on structural predictions. These bands guide design adjustments by quantifying the extent of potential risks under uncertainties.
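A minimal sketch of a Monte Carlo output assessment, assuming hypothetical posterior samples for two parameters of a simple exponential-decay model; the 95% interval is read directly from the empirical output quantiles.

```python
import numpy as np

# Forward propagation of calibrated parameters through a model, followed by a
# non-parametric 95 % interval from the empirical output distribution.
rng = np.random.default_rng(3)
model = lambda a, b: a * np.exp(-b * 2.0)     # prediction at t = 2 (illustrative)

# Hypothetical posterior samples for the calibrated parameters a and b
a_samples = rng.normal(2.5, 0.1, 5000)
b_samples = rng.normal(0.8, 0.05, 5000)
y = model(a_samples, b_samples)

lo, hi = np.percentile(y, [2.5, 97.5])
print(f"median = {np.median(y):.3f}, 95% interval = [{lo:.3f}, {hi:.3f}]")
```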

Advanced Methods

Sensitivity Analysis

Sensitivity analysis within uncertainty analysis evaluates how variations in input parameters affect model outputs, enabling the identification of influential factors to guide model refinement and uncertainty reduction. Local sensitivity analysis focuses on the linear response of outputs to small perturbations in inputs around a nominal point, typically computed via partial derivatives \frac{\partial y}{\partial x_i}, where y is the output and x_i an input parameter. This approach approximates the rate of change in output for small input changes, providing insights into immediate impacts but assuming local linearity and neglecting interactions or distributions beyond the local region. Global sensitivity analysis extends this by assessing input influences across their full ranges, accounting for nonlinearities, interactions, and input distributions. Sobol indices decompose the total output variance into contributions from individual inputs (first-order or main effects, S_i = \frac{V(E(y|x_i))}{V(y)}), groups of inputs (higher-order interactions, S_{ij} = \frac{V(E(y|x_i, x_j)) - V(E(y|x_i)) - V(E(y|x_j))}{V(y)}), and total effects (S_{T_i}, including all interactions involving x_i). The indices in the full variance decomposition sum to unity, allowing prioritization of parameters that explain most uncertainty in outputs. Common methods for global sensitivity include Morris screening, which uses elementary effects, the finite differences \frac{y(x_1, \dots, x_i + \Delta, \dots, x_k) - y(x)}{\Delta}, computed across multiple trajectories to rank parameters by mean absolute effect (\mu^*) and nonlinearity (\sigma), efficiently screening large parameter sets at low computational cost. The Fourier Amplitude Sensitivity Test (FAST) employs spectral analysis, transforming inputs into sinusoidal functions with unique frequencies and computing sensitivity indices via Fourier coefficients of the output, isolating main effects and interactions through variance contributions in the frequency domain. In epidemiological modeling, sensitivity analysis identifies key parameters influencing outbreak predictions; for instance, in a dengue model, the mosquito biting rate (B) and mosquito lifespan (\mu_m) exhibited the highest sensitivity indices (approximately +1 and -1.04, respectively) on the basic reproduction number \mathcal{R}_0, indicating that interventions targeting these factors could substantially reduce predicted infections. Sensitivity analysis integrates with uncertainty propagation by attributing output variance to specific inputs, prioritizing reducible epistemic uncertainties (e.g., those from parameter ignorance) for targeted data collection over irreducible aleatory ones; in this way it focuses on attributing the output uncertainty produced by propagation methods rather than merely quantifying it. However, high-dimensional problems pose computational challenges, as estimating indices like Sobol's requires O(kN) model evaluations (k inputs, N samples), often mitigated by grouping similar factors or using surrogate models to reduce effective dimensionality.
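A compact illustration of first-order Sobol index estimation with a pick-freeze (Saltelli-type) estimator, using the Ishigami function as a standard test case; the sample size and estimator variant are illustrative choices.

```python
import numpy as np

def ishigami(x, a=7.0, b=0.1):
    """Classic Sobol test function with inputs uniform on [-pi, pi]."""
    return np.sin(x[:, 0]) + a * np.sin(x[:, 1])**2 + b * x[:, 2]**4 * np.sin(x[:, 0])

def sobol_first_order(f, k, n, rng):
    """Pick-freeze (Saltelli-type) estimates of first-order Sobol indices."""
    A = rng.uniform(-np.pi, np.pi, size=(n, k))
    B = rng.uniform(-np.pi, np.pi, size=(n, k))
    fA, fB = f(A), f(B)
    var_y = np.var(np.concatenate([fA, fB]), ddof=1)
    S = np.empty(k)
    for i in range(k):
        ABi = A.copy()
        ABi[:, i] = B[:, i]               # replace column i of A with that of B
        S[i] = np.mean(fB * (f(ABi) - fA)) / var_y
    return S

rng = np.random.default_rng(4)
print(sobol_first_order(ishigami, k=3, n=50_000, rng=rng))
# The known analytical values for the Ishigami function are roughly [0.31, 0.44, 0.00]
```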

Bayesian Uncertainty Quantification

Bayesian uncertainty quantification provides a probabilistic framework for incorporating prior knowledge and updating beliefs with observed data to assess both parameter and predictive uncertainties in complex models. At its core, this approach relies on Bayes' theorem, which yields the posterior distribution of parameters \theta given data y as p(\theta \mid y) \propto p(y \mid \theta) p(\theta), where p(y \mid \theta) is the likelihood and p(\theta) is the prior distribution reflecting initial beliefs about the parameters. This posterior encapsulates parameter uncertainty by representing the updated probability distribution over \theta, while predictive uncertainty arises from integrating over this distribution to obtain the posterior predictive distribution p(\tilde{y} \mid y) = \int p(\tilde{y} \mid \theta) p(\theta \mid y) \, d\theta, which accounts for both aleatoric variability in new observations \tilde{y} and epistemic uncertainty in \theta. In settings with multi-level uncertainties, such as those involving spatial data, hierarchical Bayesian models extend this framework by structuring parameters across levels, allowing shared information to propagate through the hierarchy. For instance, in disease mapping, random effects can model unstructured heterogeneity and structured spatial correlations using pairwise difference priors, enabling "borrowing strength" from neighboring areas to stabilize estimates in regions with sparse data. This approach quantifies uncertainties at multiple scales, producing posterior distributions that capture both local variability and global patterns. To compute these posteriors, Markov chain Monte Carlo (MCMC) methods are commonly employed, but alternatives like variational inference, as well as specialized schemes such as Gibbs sampling, offer scalable options for high-dimensional problems. Variational inference approximates the posterior by optimizing a simpler distribution to minimize the Kullback-Leibler divergence, providing faster computation at the cost of some accuracy compared to MCMC, while Gibbs sampling iteratively samples from conditional distributions to explore the joint posterior. These techniques facilitate inference in large-scale applications by generating samples for estimating credible intervals. A practical example is the calibration of climate models, where Bayesian updating integrates observational data to refine parameters like equilibrium climate sensitivity (ECS). In the Pathfinder model, priors informed by CMIP6 and IPCC AR6 assessments are updated with historical CO₂ and global mean surface temperature data from 1750–2021, yielding posterior estimates such as ECS = 3.3 ± 0.7 K (1σ) and transient climate response (TCR) = 1.9 ± 0.3 K (1σ), quantifying remaining uncertainties in projections. Model uncertainty, arising from competing model structures, is addressed through Bayesian model averaging (BMA), which computes predictions as a weighted average over plausible models, with weights given by their posterior model probabilities. This method avoids overconfident inferences from relying on a single model by explicitly accounting for uncertainty across the model space, leading to improved out-of-sample performance. BMA is particularly advantageous over frequentist methods in small-sample scenarios, where it leverages priors to achieve narrower credible intervals and higher precision with fewer observations, often requiring only 2–3 times as many data points as parameters, compared to 4–5 for maximum likelihood approaches, thus enhancing reliability in data-limited settings.
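A small sketch of Bayesian updating with a random-walk Metropolis sampler, assuming synthetic normal data with known noise level and a normal prior on the mean; the posterior samples express epistemic uncertainty in the parameter, and adding observation noise yields posterior predictive draws.

```python
import numpy as np

# Minimal Metropolis sampler for the mean (theta) of normally distributed data
# with known noise standard deviation and a normal prior.
rng = np.random.default_rng(5)
data = rng.normal(1.5, 0.5, size=20)       # synthetic observations (sigma = 0.5)
sigma, prior_mu, prior_sd = 0.5, 0.0, 2.0

def log_post(theta):
    log_lik = -0.5 * np.sum((data - theta)**2) / sigma**2
    log_prior = -0.5 * (theta - prior_mu)**2 / prior_sd**2
    return log_lik + log_prior

samples, theta = [], 0.0
for _ in range(20_000):
    prop = theta + 0.2 * rng.standard_normal()          # random-walk proposal
    if np.log(rng.random()) < log_post(prop) - log_post(theta):
        theta = prop                                     # accept the proposal
    samples.append(theta)

post = np.array(samples[5_000:])                         # discard burn-in
pred = post + sigma * rng.standard_normal(post.size)     # posterior predictive draws
print(f"posterior mean = {post.mean():.3f} +/- {post.std(ddof=1):.3f}")
print(f"predictive 95% interval = {np.percentile(pred, [2.5, 97.5]).round(3)}")
```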