
Variance-based sensitivity analysis

Variance-based sensitivity analysis (VBSA) is a global sensitivity analysis method that quantifies how the uncertainty in the output of a mathematical or computational model can be apportioned to different sources of uncertainty in its input variables, by decomposing the total variance of the output into contributions from individual inputs and their interactions. This approach provides a comprehensive measure of input importance, capturing both main effects and higher-order interactions, unlike local sensitivity methods that examine changes around a single point. The foundations of VBSA were laid by Ilya M. Sobol' in the early 1990s, who introduced sensitivity indices derived from the functional analysis of variance (ANOVA) to assess the relative importance of inputs in nonlinear models. Sobol's seminal work demonstrated that any square-integrable function can be uniquely decomposed into orthogonal terms, allowing the variance of the model output to be expressed as a sum of variances from subsets of inputs. Subsequent developments by researchers such as Andrea Saltelli refined estimation techniques, making VBSA practical for complex, computationally expensive models through efficient sampling strategies. Central to VBSA are the Sobol' indices, which include first-order indices S_i measuring the fraction of output variance attributable solely to input i, and total-order indices S_{T_i} capturing the total contribution of input i including all interactions. These indices are typically estimated using Monte Carlo or quasi-Monte Carlo methods, such as Sobol' sequences or Latin hypercube sampling, which generate input ensembles to approximate the expected conditional variances. The method assumes input independence and focuses on unconditional sensitivities, providing interpretable results even for black-box models where internal structure is unknown. VBSA has broad applications across disciplines, including engineering design for uncertainty quantification in simulations, environmental modeling to identify key drivers of climate or hydrological outputs, and risk assessment in finance and epidemiology. For instance, it has been used to analyze biomass energy scenarios by ranking model parameters influencing predictions, and in structural mechanics to evaluate nonlinear behaviors under uncertainty. Its ability to handle high-dimensional problems and reveal interaction effects makes it a preferred tool for model validation and simplification.

Introduction

Definition and Principles

Variance-based sensitivity analysis (VBSA) is a sensitivity analysis technique that quantifies the contribution of uncertainty in model inputs to the uncertainty in the model output by decomposing the total output variance into partial variances attributable to individual input factors and their interactions. The method models the system as Y = f(X_1, \dots, X_k), where Y represents the scalar output, X_1, \dots, X_k are the input factors treated as independent random variables with specified probability distributions, and f is the mathematical or computational model relating inputs to output. The core principle involves analyzing output uncertainty via variance decomposition, apportioning \mathrm{Var}(Y) to contributions from each X_i and combinations thereof, under the assumption that the model is square integrable and that integrals over the input space exist. As part of global sensitivity analysis, VBSA evaluates effects across the entire input domain, in contrast to local methods that rely on derivatives at specific points and overlook interactions or nonlinearities. This comprehensive exploration enables VBSA to identify how input variability propagates through the model, with Sobol' indices serving as the primary measures derived from the decomposition.

Historical Development

The origins of variance-based sensitivity analysis trace back to the 1970s, when early methods emerged to quantify how uncertainties in model inputs contribute to output variability through variance decomposition. A foundational approach was the Fourier amplitude sensitivity test (FAST), introduced by Cukier et al. in 1978, which utilized Fourier series expansions to decompose the variance of model outputs into contributions from individual inputs and their interactions, enabling global sensitivity assessment for nonlinear systems. This method marked a shift from local, one-at-a-time perturbations toward comprehensive exploration of input spaces, laying groundwork for subsequent variance-focused techniques. A pivotal milestone occurred in 1993 with Ilya M. Sobol''s introduction of a rigorous framework, formalizing the allocation of output variance to specific input factors via Sobol' indices and establishing variance-based sensitivity analysis as a cornerstone of global sensitivity methods. Building on this, the 2000s saw significant advancements in computational efficiency, particularly through Monte Carlo-based estimation procedures developed by Andrea Saltelli and collaborators, which optimized sampling designs like Latin hypercube and radial designs to reduce the number of model evaluations required for accurate index computation in complex models. These innovations, detailed in Saltelli et al.'s primer, democratized the application of Sobol' indices across engineering and environmental sciences. In the 2010s, extensions addressed real-world challenges such as correlated input variables, with methods like those proposed by Most (2012) adapting variance decomposition to dependent inputs by incorporating copula-based transformations or modified estimators, preserving the interpretability of indices. More recently, from the 2020s to 2025, variance-based analysis has integrated with machine learning for surrogate modeling, where data-driven approximations accelerate computations in high-dimensional settings, such as climate models analyzing atmospheric variability; for instance, frameworks combining Gaussian processes or neural networks with Sobol' indices have enabled scalable assessments of parameter impacts in simulations exceeding thousands of dimensions.

Mathematical Foundations

Variance Decomposition

Variance-based sensitivity analysis relies on the decomposition of the total variance of a model's output into contributions from individual input variables and their interactions. Consider a model Y = f(\mathbf{X}), where \mathbf{X} = (X_1, \dots, X_k) are the input variables assumed to be independent and distributed according to their respective probability distributions. The total unconditional variance of the output is given by \operatorname{Var}(Y) = \int f^2(\mathbf{x}) \, d\mu(\mathbf{x}) - \left( \int f(\mathbf{x}) \, d\mu(\mathbf{x}) \right)^2, where \mu denotes the joint probability measure over the input space, and the integrals are taken with respect to the distributions of the X_i. This variance can be additively decomposed using the Hoeffding-Sobol decomposition, which expresses the model function as a sum of terms depending on subsets of the inputs: Y = \sum_{u \subseteq \{1, \dots, k\}} Y_u(\mathbf{X}_u), where \mathbf{X}_u = (X_i : i \in u) are the variables in subset u, and each Y_u depends only on \mathbf{X}_u. The terms Y_u are orthogonal in the L^2 sense, meaning \mathbb{E}[Y_u Y_v] = 0 for u \neq v, which ensures the additivity of variances: \operatorname{Var}(Y) = \sum_{u \subseteq \{1, \dots, k\}} \operatorname{Var}(Y_u). This decomposition generalizes Hoeffding's original work on multivariate functions and was adapted by Sobol' for sensitivity analysis. The partial variance attributable to subset u, denoted V_u = \operatorname{Var}(Y_u), is expressed through conditional expectations as V_u = \operatorname{Var}_{\mathbf{X}_u} \left( \mathbb{E}_{\mathbf{X}_{\sim u}} (Y \mid \mathbf{X}_u) \right), where \mathbf{X}_{\sim u} denotes the variables not in u, and the expectations are taken over the distributions of the respective variables. Expanding this yields the full decomposition: \operatorname{Var}(Y) = \sum_i V_i + \sum_{i < j} V_{ij} + \cdots + V_{1 \dots k}, with each V_u capturing the contribution of the interactions among the variables in u, after accounting for lower-order effects via the conditional variance structure. The integrals involved in these expectations are computed over the marginal distributions of the inputs, ensuring the decomposition holds under the independence assumption.
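
The decomposition can be checked numerically on a toy model. The following sketch is illustrative only: the model Y = X_1 + X_2 + X_1 X_2 with independent uniform inputs and all variable names are chosen for this example. It estimates each V_i as the variance of a conditional mean obtained from an inner Monte Carlo loop, and confirms that the partial variances sum to \operatorname{Var}(Y):

```python
# Minimal numerical check of Var(Y) = V1 + V2 + V12 for the toy model
# Y = X1 + X2 + X1*X2 with X1, X2 ~ U(0,1) independent (illustrative example).
import numpy as np

rng = np.random.default_rng(42)

def model(x1, x2):
    return x1 + x2 + x1 * x2

# Total variance from a large plain Monte Carlo sample.
x1, x2 = rng.random(1_000_000), rng.random(1_000_000)
var_y = model(x1, x2).var()

# V_i = Var( E[Y | X_i] ): outer loop over fixed values of X_i,
# inner Monte Carlo average over the complementary input.
n_outer, n_inner = 2_000, 2_000
cond_mean_1 = np.array([model(x, rng.random(n_inner)).mean() for x in rng.random(n_outer)])
cond_mean_2 = np.array([model(rng.random(n_inner), x).mean() for x in rng.random(n_outer)])

V1, V2 = cond_mean_1.var(), cond_mean_2.var()
V12 = var_y - V1 - V2  # residual = pure interaction term

# Analytic values: V1 = V2 = 3/16, V12 = 1/144, Var(Y) = 55/144.
print(f"V1={V1:.4f}  V2={V2:.4f}  V12={V12:.4f}  Var(Y)={var_y:.4f}")
```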

Sobol' Sensitivity Indices

Sobol' sensitivity indices arise from the variance decomposition of a model's output and provide normalized measures of input importance. For a model Y = f(\mathbf{X}), where \mathbf{X} = (X_1, \dots, X_p) are the input variables assumed independent and uniformly distributed over [0,1]^p, the total variance \mathrm{Var}(Y) is decomposed into additive terms V_u corresponding to subsets u \subseteq \{1, \dots, p\}, such that \mathrm{Var}(Y) = \sum_{u \subseteq \{1,\dots,p\}} V_u. The Sobol' index for subset u is then defined as S_u = \frac{V_u}{\mathrm{Var}(Y)}, where V_u represents the partial variance attributable to the inputs in u and their interactions, excluding lower-order terms. This formulation originates from the functional ANOVA decomposition introduced by Sobol'. These indices quantify the contribution of input subsets to the output uncertainty: S_u gives the fraction of \mathrm{Var}(Y) explained solely by the variables in u, capturing both main effects (when |u| = 1) and higher-order interaction effects (when |u| > 1). A key property is that the indices are nonnegative and bounded, 0 \leq S_u \leq 1, reflecting proportions of variance. Moreover, the set of all Sobol' indices sums to unity, \sum_{u \subseteq \{1,\dots,p\}} S_u = 1, ensuring a complete partitioning of the output variance. For purely additive models, the first-order indices alone sum to 1; in general, the remaining variance is allocated to interacting subsets. The Sobol' indices are particularly robust to model non-linearities and interactions, as they derive from a global variance-based framework that averages effects over the entire input space, unlike local methods sensitive to operating points. This makes them suitable for complex, nonlinear systems where inputs may exhibit synergistic or antagonistic effects. Extensions to cases with correlated inputs exist, such as copula-based adjustments that transform dependent variables into independent ones while preserving marginal distributions, allowing computation of the standard indices.
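
As a worked illustration, consider again the toy model Y = X_1 + X_2 + X_1 X_2 with independent X_i \sim U(0,1) from the numerical check above. The conditional mean \mathbb{E}(Y \mid X_1) = \tfrac{3}{2} X_1 + \tfrac{1}{2} gives V_1 = \operatorname{Var}(\tfrac{3}{2} X_1) = \tfrac{9}{4} \cdot \tfrac{1}{12} = \tfrac{3}{16}, and by symmetry V_2 = \tfrac{3}{16}. Direct computation yields \operatorname{Var}(Y) = \tfrac{55}{144}, so the interaction term is V_{12} = \operatorname{Var}(Y) - V_1 - V_2 = \tfrac{1}{144}. The resulting indices are S_1 = S_2 = \tfrac{27}{55} \approx 0.491 and S_{12} = \tfrac{1}{55} \approx 0.018, which sum to unity as required.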

Types of Sensitivity Measures

First-Order Indices

The first-order Sobol' index, denoted S_i, measures the fraction of the total variance of the model output Y that is attributable to the input variable X_i alone, disregarding any interactions with other inputs. It is formally defined as S_i = \frac{V_{X_i} \left( E_{X_{\sim i}} (Y \mid X_i) \right)}{\mathrm{Var}(Y)}, where V represents the variance operator, E the expectation, \mathrm{Var}(Y) the total unconditional variance of the output, and X_{\sim i} denotes all model inputs excluding X_i. This formulation captures the expected reduction in output variance when X_i is fixed, isolating its main or direct effect on model uncertainty. Interpretation of S_i provides insight into the relative importance of individual inputs: a value close to 1 indicates that X_i alone accounts for nearly all output variability, suggesting it dominates the model's response. The sum of all first-order indices satisfies \sum_i S_i \leq 1, with equality if the model is purely additive (lacking interactions); any shortfall 1 - \sum_i S_i reflects the influence of higher-order interactions among inputs. These indices are valuable for factor screening, helping prioritize inputs that most strongly drive output variability in complex systems. First-order indices assume independence among input variables, a condition that ensures the variance decomposition remains valid without correlations confounding the individual effects. For example, in a physically based snow model used for climate simulations, the first-order index for the melt factor parameter (which scales with air temperature) might yield a value around 0.4, indicating that temperature variations directly explain about 40% of the variance in simulated snow water equivalent, separate from interactions with other climatic drivers like precipitation. Higher-order indices can complement this by quantifying interaction effects if needed.
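
A first-order index can also be approximated from a single plain Monte Carlo sample. The sketch below is illustrative, not a library routine: the binning scheme and the helper name first_order_given_data are assumptions for this example. It approximates \mathrm{Var}(E(Y \mid X_i)) by the variance of within-bin means of the output after partitioning X_i into quantile bins, reusing the toy model from the preceding subsections:

```python
# Illustrative "given-data" estimate of a first-order index: the variance of
# within-bin means of Y over quantile bins of X_i approximates Var(E[Y | X_i]).
import numpy as np

def first_order_given_data(x, y, n_bins=50):
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    bin_means = np.array([y[idx == b].mean() for b in range(n_bins)])
    weights = np.array([np.mean(idx == b) for b in range(n_bins)])
    grand_mean = np.sum(weights * bin_means)
    return np.sum(weights * (bin_means - grand_mean) ** 2) / np.var(y)

rng = np.random.default_rng(1)
X = rng.random((200_000, 2))
Y = X[:, 0] + X[:, 1] + X[:, 0] * X[:, 1]   # toy model from the sections above
print(first_order_given_data(X[:, 0], Y))    # ~0.491 (analytic 27/55)
print(first_order_given_data(X[:, 1], Y))    # ~0.491; shortfall ~0.018 equals S_12
```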

Higher-Order Interaction Indices

Higher-order Sobol' indices extend the variance-based sensitivity analysis framework to quantify the contributions of interactions among multiple input variables to the output variance, beyond the effects captured by first-order indices. For a subset u of input indices with |u| > 1, the higher-order index is defined as S_u = V_u / \mathrm{Var}(Y), where V_u = \mathrm{Var}_{X_u} \left( \mathbb{E}_{X_{\sim u}} (Y \mid X_u) \right) - \sum_{v \subset u, |v| < |u|} V_v represents the portion of the conditional variance attributable solely to the interaction among the variables in u, after subtracting the contributions from all proper subsets v. This formulation ensures that S_u isolates the "pure" interaction effect, building on the first-order indices as subtracted building blocks in the decomposition. These indices provide insight into complex dependencies among inputs, where a second-order index S_{ij} specifically measures the interaction between inputs X_i and X_j. Since all Sobol' indices are non-negative, a non-zero S_{ij} indicates that the interaction contributes additional variance to the output beyond the sum of the individual main effects of X_i and X_j. Higher-order terms like S_{ijk} reveal even more intricate multi-way interactions, helping to uncover non-additive behaviors in the model that might otherwise be overlooked. For instance, in an ecological model of crop yield, S_{ij} for temperature and rainfall could demonstrate their joint influence on yield variability, showing how their interaction contributes to outcomes beyond the separate effects of each factor, such as enhanced risk during extreme weather combinations. In practice, higher-order indices are often limited to low orders (up to 2 or 3) because they typically account for a small fraction of the total variance, while full computation across all possible subsets faces a combinatorial explosion, with 2^k - 1 terms required for k inputs, rendering it infeasible for high-dimensional models. This sparsity in significant interactions allows analysts to focus on pairwise or triplet effects to achieve a comprehensive understanding without exhaustive evaluation, as demonstrated in applications where second-order terms explain a substantial yet manageable portion of the output variance.

Total-Order Indices

The total-order Sobol' indices, also known as total-effect indices, quantify the overall contribution of an input variable X_i to the output variance \operatorname{Var}(Y), encompassing both its main effect and all possible interactions with other inputs. Formally, the total-order index for input i is defined as S_{T_i} = 1 - \frac{V_{\sim i}}{\operatorname{Var}(Y)}, where V_{\sim i} = \operatorname{Var}_{X_{\sim i}} \left( \mathbb{E}_{X_i} (Y \mid X_{\sim i}) \right) represents the variance of the conditional expectation of Y given all inputs except X_i. This measure captures the total effect of X_i by isolating the portion of output variance that cannot be explained without considering X_i. In interpretation, the difference S_{T_i} - S_i (where S_i is the first-order index) indicates the relative importance of interactions involving X_i; a value close to zero suggests negligible interactions, while a large difference highlights significant synergies or antagonisms with other variables. If S_{T_i} \approx 1, the input X_i is highly influential overall, dominating the model's uncertainty regardless of other factors. Conversely, S_{T_i} \approx 0 implies that X_i has minimal impact, even through interactions, allowing it to be fixed without substantially altering output variance. These indices thus provide a comprehensive per-input assessment, complementing the variance decomposition by aggregating all interaction terms for each variable. Key properties include S_{T_i} \geq S_i for all i, reflecting the non-negative contribution of interactions to total effects, and the normalization 0 \leq S_{T_i} \leq 1. They are particularly useful for input reduction in complex models, as inputs with low S_{T_i} can be screened out to simplify analysis while preserving most output variance explanations. Additionally, S_{T_i} = 1 - S_{\sim i}, where S_{\sim i} is the closed first-order index of the complementary input set, linking total effects directly to the variance explained by all inputs except X_i. In engineering design applications, such as reliability assessment of structural systems, the total-order index for a key parameter like material strength might reveal its full influence on failure probability, including couplings with load and geometry variables that amplify or mitigate risks under uncertainty. For instance, in a bridge design model, a high S_{T_i} for wind load could justify prioritizing its robust characterization over less interactive factors.
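
Continuing the toy example from the preceding subsections (Y = X_1 + X_2 + X_1 X_2 with independent uniform inputs), the total-order index for X_1 collects its main effect and its interaction share: S_{T_1} = S_1 + S_{12} = \tfrac{27}{55} + \tfrac{1}{55} = \tfrac{28}{55} \approx 0.509. The small gap S_{T_1} - S_1 = S_{12} \approx 0.018 signals that X_1 acts almost entirely through its main effect, which is exactly the diagnostic the difference S_{T_i} - S_i provides in practice.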

Computational Methods

Sampling Designs

Sampling plays a crucial role in variance-based sensitivity analysis by generating a set of N input samples drawn from the prescribed joint probability distributions of the model factors, which are then used to evaluate the model at these points and approximate the multidimensional Monte Carlo integrals underlying the variance decomposition. This process enables the estimation of conditional variances and expectations without requiring analytical derivations of the model. Among the key sampling designs, Sobol' low-discrepancy sequences stand out for their space-filling properties, which promote uniform coverage of the input parameter space and yield superior convergence rates in index estimation compared to traditional random sampling. In the Sobol' method, two independent matrices A and B, each of size N × k (where k is the number of input factors), are generated using these sequences; additional resampled matrices, such as A_B (columns from A except the i-th column from B) and B_A, are then formed to facilitate computations of conditional expectations for first- and total-order indices. A common recommendation for the base sample size N is at least 500 to ensure reliable estimates, with total model evaluations scaling as N(2k + 2) for comprehensive index computation. Latin hypercube sampling (LHS) provides an alternative random sampling strategy that achieves efficient uniform coverage by dividing the range of each input factor into N equiprobable strata and selecting one sample from each stratum, thereby reducing clustering and improving representativeness in the input space. LHS is particularly effective for models with independent inputs and is frequently paired with variance-based index estimation due to its balance of simplicity and variance reduction over simple random sampling. The extended Fourier Amplitude Sensitivity Test (eFAST) employs a Fourier-based sampling design that generates input samples along a one-dimensional space-filling curve, parameterized by sine functions to explore the k-dimensional hypercube: for each factor i, samples are produced as x_i(s) = \frac{1}{2} + \frac{1}{\pi} \arcsin(\sin(\omega_i s + \phi_i)), where s ranges over (-π, π), ω_i are distinct integer frequencies, and φ_i are phase shifts. This approach transforms the multidimensional integration into a spectral analysis, allowing efficient variance apportionment with sample sizes typically on the order of several thousand points per factor. Quasi-Monte Carlo techniques, such as Sobol' and Halton sequences, outperform plain Monte Carlo sampling by minimizing discrepancy and thus reducing estimator variance, especially for models with smooth response surfaces, which accelerates convergence to true sensitivity indices.
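
A minimal sketch of this matrix construction is shown below, assuming SciPy's quasi-Monte Carlo module (scipy.stats.qmc, available in SciPy 1.7 and later); the helper name build_matrices is illustrative. Drawing a single 2k-dimensional Sobol' sample and splitting it column-wise yields the two independent matrices A and B, from which the pick-freeze matrices A_B(i) are assembled:

```python
# Illustrative construction of the A, B, and A_B(i) matrices with a Sobol'
# low-discrepancy sequence (scipy.stats.qmc.Sobol).
import numpy as np
from scipy.stats import qmc

def build_matrices(k, m, seed=0):
    # Draw N = 2**m points in 2k dimensions, then split into A and B;
    # using one joint sequence keeps the two halves independent.
    sampler = qmc.Sobol(d=2 * k, scramble=True, seed=seed)
    ab = sampler.random_base2(m)          # shape (2**m, 2k), values in [0, 1)
    A, B = ab[:, :k], ab[:, k:]
    # A_B(i): all columns from A except the i-th, which comes from B.
    AB = []
    for i in range(k):
        M = A.copy()
        M[:, i] = B[:, i]
        AB.append(M)
    return A, B, AB

A, B, AB = build_matrices(k=3, m=10)      # 1024 base samples per matrix
```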

Estimation Procedures

The estimation of variance-based sensitivity indices involves a structured workflow that follows the generation of input sample matrices, typically using low-discrepancy sequences like Sobol' sequences to ensure efficient exploration of the parameter space. In the standard Saltelli scheme, the process starts by creating two independent matrices A and B, each of size N × k, where N denotes the number of samples and k the number of input factors, with entries uniformly distributed on [0,1]. For each factor i from 1 to k, auxiliary matrices are then formed: A_B(i) by copying all columns of A except the i-th column, which is taken from B, and B_A(i) by copying all columns of B except the i-th column, which is taken from A. The model function f is subsequently evaluated across all rows of A to obtain f(A), all rows of B to obtain f(B), all A_B(i) for each i to obtain f(A_B(i)), and all B_A(i) for each i to obtain f(B_A(i)), resulting in N(2k + 2) total model evaluations. To compute the first-order sensitivity index for factor i, the conditional expectation of the output given X_i is approximated by averaging model outputs where X_i is held constant while the other inputs vary, drawing from the paired evaluations of f(A) and f(A_B(i)). This approach isolates the main effect of X_i by fixing it across the two sets of evaluations and allowing variation only in the complementary inputs. For the total-order index of factor i, which accounts for its full contribution including all interactions, the procedure approximates the variance attributable to all factors except i by using evaluations from f(B) and f(B_A(i)), where all inputs but X_i are resampled to capture the residual effects. When input factors follow distributions other than uniform on [0,1], the procedure requires preprocessing: the uniform samples are transformed by applying the inverse cumulative distribution function (CDF) of each factor's distribution, thereby preserving the probabilistic structure while aligning with the uniformity assumption underlying the sampling design. An efficient variant of the estimators, known as the Janon procedure, achieves optimal asymptotic variance and normality using replicated experimental designs within the standard budget of N(2k + 2) evaluations, enabling robust estimation of both first- and total-order indices with improved precision for the same computational cost.
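
The following self-contained sketch implements the pick-freeze workflow in the cheaper N(k + 2) variant that uses only the A_B(i) matrices; plain random sampling is used for brevity, and the three-input linear test model is purely illustrative, with known indices S_i = 1/14, 4/14, 9/14 and no interactions (so S_{T_i} = S_i):

```python
# Illustrative pick-freeze workflow, N(k+2) variant with A_B(i) matrices only.
import numpy as np

def model(X):
    # Linear test model: analytic S_i proportional to the squared coefficients.
    return X[:, 0] + 2.0 * X[:, 1] + 3.0 * X[:, 2]

k, N = 3, 100_000
rng = np.random.default_rng(0)
A = rng.random((N, k))
B = rng.random((N, k))

fA, fB = model(A), model(B)
var_y = np.var(np.concatenate([fA, fB]))

for i in range(k):
    AB_i = A.copy()
    AB_i[:, i] = B[:, i]                  # replace only column i with B's column
    fAB = model(AB_i)
    S_i = np.mean(fB * (fAB - fA)) / var_y          # first-order (Saltelli et al. 2010)
    ST_i = 0.5 * np.mean((fA - fAB) ** 2) / var_y   # total-order (Jansen 1999)
    print(f"X{i+1}: S = {S_i:.3f}, ST = {ST_i:.3f}")  # analytic: 1/14, 4/14, 9/14
```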

Estimators and Algorithms

Variance-based sensitivity analysis relies on Monte Carlo estimators to approximate the sensitivity indices from model evaluations at sampled input points. These estimators typically use two independent input matrices, denoted as A and B, along with derived matrices such as A_{B_i} (where all columns of A are retained except the i-th column, which is taken from B) and B_{A_i} (similarly, all columns of B except the i-th from A). Let f(A), f(B), f(A_{B_i}), and f(B_{A_i}) represent the model outputs evaluated at these matrices, with N denoting the sample size. The total output variance \hat{\text{Var}}(Y) is estimated as \hat{\text{Var}}(Y) = \frac{1}{N} \sum_{n=1}^N f(B)_n^2 - \left( \frac{1}{N} \sum_{n=1}^N f(B)_n \right)^2. The standard estimator for the first-order sensitivity index \hat{S}_i quantifies the main effect of input X_i and is given by \hat{V}(E(Y|X_i)) = \frac{1}{N} \sum_{n=1}^N f(B)_n \left( f(A_{B_i})_n - f(A)_n \right), such that \hat{S}_i = \hat{V}(E(Y|X_i)) / \hat{\text{Var}}(Y). This formulation, derived from Monte Carlo integration, provides an asymptotically unbiased estimate for large N and is recommended for its efficiency in capturing conditional expectations. For the total-order index \hat{S}_{T_i}, which includes all interactions involving X_i, a common estimator is \hat{S}_{T_i} = 1 - \frac{1}{N \hat{\text{Var}}(Y)} \sum_{n=1}^N f(A)_n \left( f(A_{B_i})_n - f(B)_n \right). This approach estimates the closed first-order index of all inputs except X_i and subtracts it from unity, effectively isolating the total contribution of X_i. It is particularly useful for identifying non-influential factors when \hat{S}_{T_i} \approx 0. Both estimators require N(k+2) model evaluations for k inputs and converge to their true values as N \to \infty via the law of large numbers in Monte Carlo integration. Improved estimators proposed in Saltelli et al. (2010) address potential biases in finite samples by refining the matrix configurations and correlation structures. For the total-order index, an enhanced version uses the product f(A)_n f(A_{B_i})_n in place of differences to better approximate the inner expectation, reducing effective sample size requirements and variance in estimates: \hat{U}_i = \frac{1}{N} \sum_{n=1}^N f(A)_n f(A_{B_i})_n - f_0^2, \quad \hat{S}_{T_i} = 1 - \frac{\hat{U}_i}{\hat{\text{Var}}(Y)}, where f_0 is the estimated mean output. This radial sampling design minimizes dependence between samples, improving accuracy for correlated outputs. For higher-order interaction indices, correlation-based estimators leverage the covariance between f(A_{B_i}) and f(B_{A_j}) for i \neq j, enabling efficient computation of pairwise or multi-way effects without exhaustive enumeration. These refinements are shown to lower bias by up to 20% in benchmark tests compared to earlier formulations. Algorithms for implementing these estimators emphasize Monte Carlo integration for numerical approximation, with convergence assessed through the central limit theorem, where the standard error scales as O(1/\sqrt{N}). Practical convergence is monitored by repeating estimations with increasing N until indices stabilize within a tolerance, often requiring N \geq 10^4 for smooth models. Confidence intervals are commonly obtained via bootstrapping: resample the paired outputs \{f(B)_n, f(A_{B_i})_n, f(A)_n\} with replacement B times (typically B=1000), recompute the indices for each bootstrap sample, and take the percentile interval (e.g., 2.5%–97.5%) to quantify uncertainty.
This non-parametric method accounts for sampling variability without assuming normality. For time-dependent models where outputs Y(t) vary over time t, standard estimators are extended by integrating the variance over the time horizon, yielding time-integrated sensitivity indices \hat{S}_i = \frac{\int V(E(Y(t)|X_i)) \, dt}{\int \text{Var}(Y(t)) \, dt}. This approach, using discretized integrals or functional approximations, handles temporal correlations by treating the integrated output as a scalar, preserving the variance decomposition while capturing dynamic effects. Algorithms incorporate time-averaging in the Monte Carlo sums, with bootstrapping adapted to resample time series jointly for robust intervals.
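
The percentile bootstrap described above takes only a few lines. The sketch below is illustrative, reusing the f(A), f(B), and f(A_B(i)) output arrays produced by a pick-freeze design such as the one in the previous subsection; the function name is an assumption for this example. Rows are resampled jointly so that the pairing between matrices is preserved:

```python
# Illustrative percentile-bootstrap confidence interval for a first-order index;
# fA, fB, fAB are the model outputs on A, B, and A_B(i), paired row-wise.
import numpy as np

def bootstrap_first_order(fA, fB, fAB, n_boot=1000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    N = len(fA)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, N, size=N)            # joint row resampling
        var_y = np.var(np.concatenate([fA[idx], fB[idx]]))
        stats[b] = np.mean(fB[idx] * (fAB[idx] - fA[idx])) / var_y
    return np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
```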

Computational Cost and Efficiency

Variance-based sensitivity analysis methods, such as the Sobol' method, demand substantial computational resources primarily due to the extensive model evaluations required for estimating the sensitivity indices. In the basic Monte Carlo implementation using Saltelli's sampling scheme, computing first-order and total-order indices necessitates (k + 2)N evaluations of the model function f, where k denotes the number of input variables and N is the base sample size. To ensure convergence and accuracy, N typically ranges from 500 to several thousand, but must increase with k to mitigate the curse of dimensionality, resulting in costs that escalate rapidly for high-dimensional problems (k > 20). This scaling arises because reliable variance decomposition demands sufficient sampling density across the input space, which grows combinatorially with dimensionality. Efficiency enhancements often involve surrogate modeling to approximate the computationally intensive f, thereby minimizing direct evaluations. Gaussian process emulators, for instance, construct probabilistic approximations from a subset of model runs and enable variance-based estimation with far fewer original simulations, achieving speedups of orders of magnitude for complex systems. Sequential adaptive sampling designs complement this by iteratively selecting evaluation points based on current uncertainty in the surrogate, concentrating resources on influential regions and further reducing total costs compared to fixed one-shot sampling. These strategies are particularly effective when integrated, as the surrogate guides adaptive refinements until estimates stabilize within desired confidence intervals. Trade-offs in implementation balance accuracy and expense; for example, quasi-random (quasi-Monte Carlo) sampling with low-discrepancy sequences like Sobol' points can achieve the same precision as random sampling but with 10 to 100 times fewer evaluations, thanks to superior convergence rates in low to moderate dimensions. Additionally, prioritizing low-order indices (first- and second-order) circumvents the full exploration of all 2^k interaction terms, which is infeasible for k > 10, allowing practical analysis by assuming higher-order effects are negligible in many applications. Recent advances in the 2020s have leveraged parallel computing paradigms to distribute the (k + 2)N evaluations across multiple processors or GPU clusters, enabling scalability for large-scale models with k up to 100 and N exceeding 10^5. Variance-reduction techniques, such as control variates that correlate estimators with known low-variance proxies (e.g., from simplified models), further decrease the effective sample size needed for reliable indices by up to 50% in targeted scenarios. These methods, often combined with surrogates, address the core bottlenecks while preserving the interpretability of variance-based measures.

Applications and Examples

Real-World Applications

Variance-based sensitivity analysis (VBSA) has been extensively applied in aerospace engineering to support simulation-based design under uncertainty, particularly in reliability assessments. In the simulation-based design of commercial aircraft such as the Boeing 737-800, VBSA ranks input uncertainties like performance parameters and flight conditions based on their contributions to output variance in metrics such as fuel consumption, enabling targeted reduction of simulations by focusing on influential factors. At NASA, VBSA quantifies epistemic uncertainties in multidisciplinary analysis and optimization for launch vehicles and flight systems, identifying key parameters that drive variance in performance predictions and supporting calibration of complex models. In environmental science, VBSA facilitates uncertainty attribution in climate models by attributing variance in projections to factors like emissions scenarios versus natural variability. Integrated assessment models (IAMs), which inform IPCC assessments, employ VBSA to decompose multivariate outputs such as temperature rise and sea-level projections, revealing dominant drivers like socioeconomic pathways and interactions that amplify risks in long-term forecasts. VBSA supports risk assessment in finance by quantifying sensitivities of portfolio models to market factors, enhancing decision-making under uncertainty. In value-at-risk (VaR) computations for portfolios, variance-based methods analyze how input variances in asset returns and correlations contribute to overall risk metrics, guiding hedging strategies and allocation adjustments. For instance, in mean-variance portfolio optimization, VBSA evaluates the impact of parameter perturbations on expected returns and risk, providing global measures analogous to Sobol' indices for robust sensitivity profiling. In healthcare, VBSA analyzes variance in pharmacokinetic (PK) models to understand drug response variability due to patient covariates such as age, genetics, and organ function. Applied to physiologically based PK models in the Open Systems Pharmacology Suite, Sobol' and extended Fourier amplitude sensitivity test (eFAST) methods identify key parameters like lipophilicity and unbound fractions that dominate variance in exposure metrics (e.g., Cmax and AUC) for drugs like itraconazole, informing personalized dosing. VBSA also evaluates the contribution of covariate effects to variability in pharmacokinetic/pharmacodynamic outcomes in population models.

Illustrative Examples

Variance-based sensitivity analysis (VBSA) is often illustrated using test functions that allow for analytical verification of results, enabling clear demonstration of computational procedures and interpretive insights. One such example is the Ishigami-Homma function, a three-input nonlinear model designed to exhibit varying degrees of main effects and interactions among inputs. The Ishigami-Homma function is defined as Y = \sin(X_1) + 7 \sin^2(X_2) + 0.1 X_3^4 \sin(X_1), where the inputs X_1, X_2, X_3 are independent and uniformly distributed over [-\pi, \pi]. This function is particularly useful for VBSA because its analytical first-order Sobol' indices are known: S_1 \approx 0.314, S_2 \approx 0.442, S_3 = 0, highlighting the dominance of X_2's main effect while showing that X_3 contributes only through its interaction with X_1 (evident when comparing to total-order indices, where S_{T_3} \approx 0.244 > S_3). To compute these indices step-by-step for the Ishigami function, begin with sample generation using a quasi-random sequence, such as the Sobol' low-discrepancy sequence, to create an efficient experimental design. For k=3 inputs and N=1000 base samples, generate two independent matrices A and B, each of size N \times k; then form the pick-freeze matrices A_{B_i} by replacing the i-th column of A with the corresponding column of B. This design, proposed by Saltelli, requires N(k+2) model evaluations and efficiently estimates both first- and total-order indices from the same samples. Next, evaluate the model Y at these matrices to obtain the output vectors Y_A, Y_B, and Y_{A_{B_i}}. The first-order index for input i is estimated as \hat{S}_i = \frac{\frac{1}{N} \sum_{j=1}^N Y_B(j) \left( Y_{A_{B_i}}(j) - Y_A(j) \right)}{\hat{V}(Y)}, where \hat{V}(Y) is the empirical variance of the output, capturing the fraction of output variance attributable to X_i alone. For total-order indices, the Jansen estimator \hat{S}_{T_i} = \frac{\frac{1}{2N} \sum_{j=1}^N \left( Y_A(j) - Y_{A_{B_i}}(j) \right)^2}{\hat{V}(Y)} includes interactions; the fact that S_{T_3} \approx 0.244 while S_3 = 0 underscores the role of higher-order interactions in the model's behavior. These estimates converge to the analytical values with sufficient N, demonstrating VBSA's ability to decompose variance even in strongly nonlinear settings. Another illustrative application of VBSA arises in chaotic dynamical systems, such as the Lorenz '63 model, which simulates atmospheric convection through the equations \frac{dx}{dt} = \sigma(y - x), \frac{dy}{dt} = x(\rho - z) - y, \frac{dz}{dt} = xy - \beta z with standard parameters \sigma=10, \rho=28, \beta=8/3. Despite its sensitivity to initial conditions—a hallmark of chaos—VBSA applied to short-term forecasts (e.g., over 0.5 time units) reveals parameter sensitivities in output variables like the x-state, with the Rayleigh parameter \rho often dominating variance contributions across states, while interactions remain minimal due to the model's structure. This example illustrates VBSA's robustness in identifying influential factors amid deterministic unpredictability, relevant to weather simulation. For reproducible computations of such examples, the Python library SALib provides implementations of sampling designs (e.g., Sobol' sequences) and estimators for Sobol' indices, allowing users to generate sample matrices via sobol.sample(problem, N) and analyze results with sobol.analyze(problem, Y). This open-source tool facilitates didactic explorations, ensuring alignment with analytical benchmarks for functions like Ishigami.
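
A complete SALib run for the Ishigami function might look as follows; this is a sketch assuming SALib 1.4.6 or later, where the Sobol' sampler lives in SALib.sample.sobol (earlier releases expose an equivalent SALib.sample.saltelli.sample), and SALib ships the Ishigami test function:

```python
# Illustrative SALib workflow for the Ishigami function (SALib >= 1.4.6).
import numpy as np
from SALib.sample import sobol as sobol_sample
from SALib.analyze import sobol as sobol_analyze
from SALib.test_functions import Ishigami

problem = {
    "num_vars": 3,
    "names": ["x1", "x2", "x3"],
    "bounds": [[-np.pi, np.pi]] * 3,
}

# N = 1024 base samples; with second-order terms the design has N*(2k+2) rows.
X = sobol_sample.sample(problem, 1024, calc_second_order=True)
Y = Ishigami.evaluate(X)

Si = sobol_analyze.analyze(problem, Y, calc_second_order=True)
print(Si["S1"])        # ~ [0.314, 0.442, 0.0]   (first-order)
print(Si["ST"])        # ~ [0.558, 0.442, 0.244] (total-order)
print(Si["S2"][0, 2])  # ~ 0.244, the X1-X3 interaction index
```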

Advantages and Limitations

Strengths

Variance-based sensitivity analysis (VBSA) provides global coverage by evaluating the effects of input factors across the entire input domain, thereby accounting for non-linear relationships and higher-order interactions that local, derivative-based methods overlook, since such methods are limited to perturbations around nominal points. This holistic approach ensures a comprehensive understanding of model behavior under varying conditions, making it particularly suited for complex systems where inputs exhibit non-monotonic influences. A key strength lies in its decomposability, which enables a full partitioning of the output variance among individual inputs, pairwise interactions, and higher-order terms, allowing for precise attribution and the prioritization of the most influential factors to streamline model calibration and simplification. This variance decomposition underpins VBSA's ability to quantify contributions exhaustively, supporting tasks like factor fixing and model reduction without loss of fidelity. VBSA is inherently model-agnostic, treating models as black boxes and requiring only input-output evaluations, which renders it applicable to any computational framework regardless of internal assumptions about linearity or additivity; moreover, the resulting sensitivity indices remain distribution-independent following transformations of the input variables to a standard space. Additionally, the indices offer high interpretability by expressing sensitivities as straightforward percentages of total explained variance, facilitating effective communication in applied settings, such as environmental assessments where stakeholders need to identify dominant drivers of uncertainty for informed decision-making.

Challenges and Limitations

Variance-based sensitivity analysis (VBSA) relies on the fundamental assumption that input variables are independent, which often fails in real-world models where correlations exist, such as in hydrological or biological systems. This assumption underpins the ANOVA-high-dimensional model representation (HDMR) decomposition, but dependent inputs invalidate the standard variance partitioning, leading to biased sensitivity indices that misattribute output variance. Extensions like copula-based methods or regional sensitivity analysis can address dependencies but introduce added complexity, including the need for careful selection of dependence structures and increased computational demands. A major limitation is the curse of dimensionality, where the computational cost grows exponentially with the number of input factors k, as the full HDMR involves 2^k - 1 terms. For high-dimensional problems (k > 10), this renders exhaustive decomposition infeasible, often missing subtle effects in large parameter spaces and requiring approximations like total-effect indices that aggregate influences without detailing specific interactions. VBSA also suffers from poor sample efficiency, necessitating large sample sizes N for reliable convergence of estimators, with error scaling as O(1/\sqrt{N}). In practice, thousands to tens of thousands of model evaluations are typically required to achieve stable indices, and the method is particularly sensitive to stochastic noise in model outputs, which amplifies estimation error in noisy simulations. Interpretation of VBSA indices poses further challenges, as they quantify contributions to output variance, assuming variance fully captures output uncertainty, which overlooks other distributional aspects like means, tails, or higher moments relevant for risk analysis. This variance-centric view is less suitable for non-variance-based metrics, such as expected values, and critiques highlight over-reliance on these indices in high-stakes domains like policy analysis, where incomplete representation due to ignored correlations can mislead evaluations.

References

  1. [1]
    Variance based sensitivity analysis of model output. Design and ...
    Variance based methods have assessed themselves as versatile and effective among the various available techniques for sensitivity analysis of model output.
  2. [2]
    [PDF] Variance based sensitivity analysis of model output. Design and ...
In his original work Sobol' proposes to use Monte Carlo probable error to estimate the error in the computation of the sensitivity indices [37]. In [1] ...
  3. [3]
    [PDF] Global sensitivity indices for nonlinear mathematical models and ...
[12] I.M. Sobol', Sensitivity estimates for nonlinear mathematical models, Matem. Modelirovanie 2 (1) (1990) 112–118 (in Russian); MMCE 1 (4) (1993) 407 ...
  4. [4]
    [PDF] Variance-Based Sensitivity Analysis to Support Simulation-Based ...
Sensitivity analysis plays a critical role in quantifying uncertainty in the design of engineering systems. A variance-based global sensitivity analysis ...
  5. [5]
    Application of a variance‐based sensitivity analysis method to the ...
Jul 8, 2018 · The goal of this paper is to document the application of Sobol's method to the biomass learning model, including the method requirements, ...
  6. [6]
    (PDF) Variance-Based Global Sensitivity Analysis for Multiple ...
Aug 7, 2025 · Sensitivity analysis is a vital tool in hydrological modeling to identify influential parameters for inverse modeling and uncertainty analysis, ...
  7. [7]
    [PDF] Sensitivity Analysis in Practice - Andrea Saltelli
    2 Variance based measures are generally estimated numerically using either the method of Sobol' or FAST (Fourier Analysis Sensitivity Test), or extensions ...
  8. [8]
    An annotated timeline of sensitivity analysis - ScienceDirect.com
In 1993, Sobol' introduced an innovative approach to sensitivity analysis based on the decomposition of the output variance (Sobol', 1993). This method, known ...
  9. [9]
    [PDF] Variance-based sensitivity analysis in the presence of correlated ...
Abstract. In this paper we propose an extension of the classical Sobol' estimator for the estimation of variance based sensitivity indices.
  10. [10]
    Surrogate-Based Global Sensitivity Analysis with Statistical ...
    Surrogates can be either data-driven machine learning models or lower-fidelity simulation models that reduce the resolution or number of components of the ...
  11. [11]
    Sensitivity Estimates for Nonlinear Mathematical Models
    Semantic Scholar extracted view of "Sensitivity Estimates for Nonlinear Mathematical Models" by I. Sobol.
  12. [12]
    [PDF] Estimation of global sensitivity indices for models with dependent ...
    Formulas and Monte Carlo numerical estimates similar to Sobol' formulas are derived. A copula-based approach is proposed for sampling from arbitrary ...
  13. [13]
    [PDF] Global Sensitivity Analysis. The Primer - Andrea Saltelli
An introductory treatment of models, uncertainty, and sensitivity analysis methods.
  14. [14]
    [PDF] Importance measures in global sensitivity analysis of nonlinear models
    The computational novelty of this study is the introduction of the 'total effect' parameter index. This index provides a measure of the total effect of a given.
  15. [15]
    Latin hypercube sampling and the propagation of uncertainty in ...
    This presentation will emphasize sensitivity measures that can be obtained when Latin hypercube sampling is used to evaluate the integral in Eq. (1.4). One ...
  16. [16]
    [PDF] v4101039 A Quantitative Model-Independent Method for Global ...
    Mar 12, 2012 · A new method for sensitivity analysis (SA) of model output is introduced. It is based on the Fourier amplitude sensitivity test (FAST) and ...
  20. [20]
    [PDF] Construction of bootstrap confidence intervals on sensitivity indices ...
    Jan 20, 2014 · This paper focuses on variance-based ones, computed by polynomial chaos expansion. Sensitivity indices, coming from variance decomposition ( ...
  21. [21]
    An Efficient Approach to Deal with the Curse of Dimensionality in ...
    This has been referred to as the curse of dimensionality and makes the complete decomposition unfeasible in most practical applications. In this paper we show ...
  22. [22]
    Variance-based sensitivity analysis of model outputs using ...
    Aug 9, 2025 · These meta models are utilised for a variance-based sensitivity analysis that is able to evaluate the sensitivities and interactions.
  23. [23]
    [PDF] Emulation methods and adaptive sampling increase the efficiency of ...
    We find that the emulation and adaptive sampling approaches are faster than Sobol' method for slow models. The Bayesian adaptive spline surface method is the ...
  24. [24]
    Efficient dimension reduction and surrogate-based sensitivity ...
    The proposed method uses surrogate model to replace the expensive model for sensitivity analysis, and tackle the problem of building surrogate models for high- ...
  25. [25]
    Global Sensitivity Analysis of a Large Agent-Based Model of Spatial ...
    A parallel computing approach leveraging multi-GPU clusters is developed to address the computational challenge of Sobol's approach. The heterogeneous parallel ...
  26. [26]
    Application of the control variate technique to estimation of total ...
    To improve the efficiency of the Monte Carlo estimates for the Sobol׳ total sensitivity indices we apply the control variate reduction technique and develop a ...
  28. [28]
    Probabilistic Methods for Sensitivity Analysis and Calibration in the ...
    Feb 6, 2015 · In this paper, a series of algorithms are proposed to address the problems in the NASA Langley Research Center Multidisciplinary Uncertainty ...
  29. [29]
    [PDF] Method for System-Level Multidisciplinary Uncertainty Analysis of ...
    The Sobol sensitivity index, a variance-based method for performing global sensitivity analysis that uses variance as the basis for quantifying the ...
  30. [30]
    Global sensitivity analysis of integrated assessment models with ...
    Feb 22, 2025 · At the core of the approach are novel sensitivity measures based on the theory of optimal transport. We apply the approach to the uncertainty ...
  31. [31]
    [PDF] A Probabilistic Simulation Based VaR Computation and Sensitivity ...
    This paper presents a new method to compute VaR (value at risk) and perform corresponding variance based sensitivity analysis. VaR has a long history of being ...
  32. [32]
    Global sensitivity analysis of Open Systems Pharmacology Suite ...
    Sensitivity analyses reveal how such uncertainty or variability affects PBPK model predictions. They also play an important role in model optimization, either ...
  33. [33]
    Evaluation of covariate effects using variance-based global ...
Aug 4, 2021 · Variance-based global sensitivity analysis (GSA) can simultaneously quantify contribution of each covariate effect to the variability for the ...
  34. [34]
    Variance based sensitivity analysis of model output. Design and ...
Variance based methods have assessed themselves as versatile and effective among the various available techniques for sensitivity analysis of model output.
  35. [35]
    Probabilistic human health risk assessment and Sobol sensitivity ...
    Jul 6, 2023 · The Sobol sensitivity analysis was used to determine the effective input parameters and to quantify the share of each parameter in the variance ...
  36. [36]
    [PDF] Variance-Based Sensitivity Indices For Models With Dependent Inputs
    Jun 6, 2014 · The drawback of their approach is that only linear models with correlations. (linear dependences) are supported. In Li et al. (2010), the ...
  37. [37]
    Variance-based sensitivity indices for models with dependent inputs
    These measures allow us to distinguish between the mutual dependent contribution and the independent contribution of an input to the model response variance.
  38. [38]
    Considerations and Caveats when Applying Global Sensitivity ... - NIH
    Jul 17, 2020 · Generally, local sensitivity analysis is based on model parameters set at baseline values with consideration of only minor perturbations to ...
  39. [39]
    VISCOUS: A Variance‐Based Sensitivity Analysis Using Copulas for ...
    Jun 20, 2021 · This is mainly due to the high number of interacting, and uncertain input factors causing the “curse of dimensionality” problem. When using a ...
  40. [40]
    [PDF] Variance-based sensitivity analysis: The quest for better estimators ...
    The sensitivity analysis of mathematical models aims to 'apportion the output uncertainty to the uncertainty in the input factors' (Saltelli and Sobol', 1995).
  41. [41]
    [PDF] Convergence study in global sensitivity analysis - OSTI.GOV
Aug 11, 2016 · First, the error in estimation of an MC-based quantity of interest converges at a rate of 1/√N, where N is the number of samples. As the ...
  42. [42]
    (PDF) A new uncertainty importance measure - Academia.edu
A New Uncertainty Importance Measure, E. Borgonovo. Discusses global sensitivity analysis and a moment-independent importance measure.
  43. [43]
    A new uncertainty importance measure - ScienceDirect.com
    A new uncertainty importance measure · Abstract · Introduction · Section snippets · Global sensitivity analysis · A moment independent importance measure · A test ...