
Variogram

The variogram, also known as the semivariogram, is a key statistical function in geostatistics that quantifies the spatial dependence and variability of a regionalized variable, such as ore grades or environmental measurements, as half the expected squared difference between values at two locations separated by a given lag h. Mathematically, it is defined as \gamma(\mathbf{h}) = \frac{1}{2} \mathbb{E} \left[ (Z(\mathbf{x}) - Z(\mathbf{x} + \mathbf{h}))^2 \right], where Z represents the spatial random field. Because it is built from differences, it remains well defined for processes that satisfy only the intrinsic hypothesis and may lack a finite variance, unlike covariance functions. Originating from the work of South African mining engineer D.G. Krige in the 1950s, who applied early geostatistical methods to gold ore valuation, the variogram was formalized by French mathematician Georges Matheron in the 1960s as part of the development of kriging techniques.
The empirical variogram, computed from observed data as \hat{\gamma}(h) = \frac{1}{2|N(h)|} \sum_{(i,j) \in N(h)} [Z(\mathbf{x}_i) - Z(\mathbf{x}_j)]^2 over the set N(h) of pairs separated by approximately h, serves as the foundation for fitting theoretical models that capture spatial structure. Common models include the spherical, which rises nearly linearly to a plateau; the exponential, which approaches its sill gradually; and the Gaussian, which increases smoothly from the origin, each parameterized to reflect real-world behaviors. Central to the variogram's utility are its key components: the nugget effect, representing micro-scale variability or measurement error at near-zero lag; the sill, the level at which the variogram stabilizes, equal to the total variance; and the range, the distance beyond which observations become uncorrelated, consistent with Tobler's first law of geography that near things are more related than distant ones. The same framework can describe anisotropy (directional dependence), and robust estimators such as the Cressie-Hawkins estimator are available for noisy data.
In practice, variograms underpin applications in resource estimation, such as mining and reservoir modeling, where they inform kriging for predicting values at unsampled locations, as well as environmental monitoring and hydrological simulations that assess spatial patterns in pollutant concentrations or water levels. For instance, in unconventional oil and gas reservoirs, variogram analysis supports sensitivity studies and cross-validation to improve prediction accuracy.

Definition and Fundamentals

Definition

In geostatistics, the variogram serves as a key tool for quantifying the degree of spatial dependence or dissimilarity between observations of a regionalized variable (a spatially distributed property such as ore grades or environmental attributes) separated by a given lag distance h. It captures how the variability of differences between points grows with increasing separation, thereby providing insight into the underlying structure of spatial continuity within the data. This measure is essential for understanding patterns in natural systems where nearby locations tend to be more similar than distant ones, a principle central to its applications across the earth and environmental sciences.
The variogram is formally defined as the expectation of the squared difference between values at points separated by h, which under the assumptions below equals the variance of that difference. A distinction exists between the full variogram 2\gamma(h), the complete variance of the difference, and the semivariogram \gamma(h), which is half that variance. Historically, the terms "variogram" and "semivariogram" have been used interchangeably in the geostatistical literature since Georges Matheron's foundational work, so stating which convention is in use avoids a factor-of-two ambiguity when modeling spatial processes.
The applicability of the variogram depends on specific statistical assumptions about the underlying random field. Under second-order stationarity, the field has a constant mean and a covariance that depends solely on the separation vector, so the variogram is translation-invariant. More flexibly, the intrinsic hypothesis, introduced by Matheron, assumes stationarity only of the increments (differences between points), allowing the variogram to describe fields whose variance may not be finite but whose spatial differences remain well-behaved.
A practical illustration of the variogram's role appears in analyzing spatial continuity for phenomena like groundwater levels or mineral deposit concentrations, where it delineates the range beyond which observations become effectively uncorrelated, informing interpolation techniques for unsampled locations.

Mathematical Formulation

The semivariogram, denoted \gamma(\mathbf{h}), is defined for a spatial random field Z(\mathbf{x}) as half the expected squared difference between values at two points separated by a lag vector \mathbf{h}:
\gamma(\mathbf{h}) = \frac{1}{2} \mathbb{E} \left[ \left( Z(\mathbf{x}) - Z(\mathbf{x} + \mathbf{h}) \right)^2 \right],
where the expectation is taken over realizations of the field and, by assumption, does not depend on the location \mathbf{x}. This formulation arises under the assumption of intrinsic stationarity, which requires that the mean of the increments is zero, \mathbb{E}[Z(\mathbf{x} + \mathbf{h}) - Z(\mathbf{x})] = 0, and that the variance of these increments depends only on \mathbf{h}, not on \mathbf{x}. Intrinsic stationarity does not require the field itself to have a finite variance, making it a weaker condition than second-order stationarity; a systematic trend in the mean, however, violates the zero-mean increment condition and has to be handled separately. The full variogram is then 2\gamma(\mathbf{h}) = \mathbb{E} \left[ \left( Z(\mathbf{x}) - Z(\mathbf{x} + \mathbf{h}) \right)^2 \right], the expected squared difference itself for point pairs separated by \mathbf{h}.
Under the stronger assumption of second-order stationarity, the field Z(\mathbf{x}) has a constant mean \mu = \mathbb{E}[Z(\mathbf{x})] and a covariance function C(\mathbf{h}) that depends only on \mathbf{h}, with finite variance \sigma^2 = C(\mathbf{0}). In this case, the semivariogram relates to the covariance via \gamma(\mathbf{h}) = \sigma^2 - C(\mathbf{h}), so it can be derived from the covariance structure and the second moments of the differences are guaranteed to exist.
In the isotropic case, the semivariogram depends solely on the magnitude of the lag vector, \gamma(\mathbf{h}) = \gamma(\|\mathbf{h}\|), implying no directional dependence in spatial continuity. For anisotropic processes, \gamma(\mathbf{h}) varies with both the length and direction of \mathbf{h}, generalizing the formulation to account for directional differences in spatial structure, such as elongation along geological features.

Properties

Basic Properties

The variogram function \gamma(\mathbf{h}), which gives half the expected squared difference between values of a spatial random field separated by lag vector \mathbf{h}, possesses several fundamental mathematical properties that underpin its use in geostatistics.
One key property is non-negativity: \gamma(\mathbf{h}) \geq 0 for all \mathbf{h}, because it is half the variance of an increment. This follows directly from the definition under the intrinsic hypothesis of stationarity. A related property is the origin condition \gamma(\mathbf{0}) = 0, which holds because \mathbb{E}[(Z(\mathbf{s}) - Z(\mathbf{s}))^2] = 0 for any location \mathbf{s}.
When the field is mean-square continuous, the variogram is continuous at the origin, so \gamma(\mathbf{h}) \to 0 as \mathbf{h} \to \mathbf{0}. A limit greater than zero is a discontinuity at the origin, known as the nugget effect, and reflects measurement error or variation at scales smaller than the sampling spacing.
The variogram is also a conditionally negative definite (CND) function, meaning that for any finite set of locations \mathbf{s}_1, \dots, \mathbf{s}_n and real numbers \lambda_1, \dots, \lambda_n satisfying \sum_{i=1}^n \lambda_i = 0, the quadratic form satisfies \sum_{i=1}^n \sum_{j=1}^n \lambda_i \lambda_j \gamma(\mathbf{s}_i - \mathbf{s}_j) \leq 0. This CND property is crucial for kriging: the variance of any linear combination with weights summing to zero equals the negative of this quadratic form, so the property guarantees non-negative prediction variances. Unbiasedness, by contrast, comes from the constraint on the kriging weights rather than from the CND property.
For large lags, under second-order stationarity where the process has finite variance \sigma^2 and the covariance C(\mathbf{h}) vanishes at large separations, the variogram approaches the sill value \sigma^2 as \|\mathbf{h}\| \to \infty. This asymptotic behavior indicates that spatial dependence diminishes and distant observations become uncorrelated.
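The conditional negative definiteness condition can be checked numerically for a given model. The following is a minimal sketch, assuming a spherical model with illustrative parameters (zero nugget, unit sill, range 10) and twelve randomly placed points; the function names `spherical` and `quadratic_form` are hypothetical helpers, not part of any library.

```python
import math
import random

def spherical(h, nugget=0.0, psill=1.0, a=10.0):
    """Spherical semivariogram with gamma(0) = 0 (illustrative parameters)."""
    if h <= 0.0:
        return 0.0
    if h >= a:
        return nugget + psill
    r = h / a
    return nugget + psill * (1.5 * r - 0.5 * r ** 3)

def quadratic_form(points, weights, gamma):
    """Return sum_i sum_j w_i * w_j * gamma(|s_i - s_j|)."""
    total = 0.0
    for (xi, yi), wi in zip(points, weights):
        for (xj, yj), wj in zip(points, weights):
            total += wi * wj * gamma(math.hypot(xi - xj, yi - yj))
    return total

random.seed(0)
points = [(random.uniform(0, 30), random.uniform(0, 30)) for _ in range(12)]
raw = [random.gauss(0, 1) for _ in points]
mean = sum(raw) / len(raw)
weights = [w - mean for w in raw]  # centring makes the weights sum to zero

q = quadratic_form(points, weights, spherical)
print(f"sum of weights = {sum(weights):.2e}, quadratic form = {q:.4f}")
```

For a valid model the printed quadratic form should not be positive; its negative is the variance of the corresponding weighted combination of field values.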

Model Parameters

In fitted variogram models, three primary parameters capture the essential features of spatial dependence: the nugget effect, the sill, and the range. These parameters provide interpretable insights into the underlying spatial structure of a stationary random process, linking mathematical descriptions to physical phenomena such as measurement inaccuracies and correlation distances.
The nugget effect, denoted c_0, quantifies the apparent discontinuity in the variogram as the lag distance h approaches 0, arising from measurement errors or unresolved variability at scales finer than the sampling interval. This parameter reflects short-range heterogeneity that cannot be modeled explicitly due to data limitations.
The sill, given by c_0 + c where c is the structural variance component (partial sill), is the asymptotic value that the variogram approaches as h increases, corresponding to the total variance of the spatial process under second-order stationarity. It indicates the maximum dissimilarity between observations separated by large distances, equivalent to the a priori variance of the random field.
The range, denoted a, is the lag distance at which the variogram reaches, or practically reaches, the sill, marking the point where spatial dependence becomes negligible and observations can be treated as uncorrelated. Beyond this distance, correlations in the process dissipate, defining the practical extent of spatial continuity.
Together, these parameters describe the spatial correlation length through the range, which delineates zones of influence, and the error component through the nugget relative to the sill: a high nugget-to-sill ratio signals substantial micro-scale noise or measurement error. In practice, a prominent nugget effect implies the need for denser sampling to resolve fine-scale variations and reduce estimation uncertainty in applications like kriging.

Empirical Estimation

Computing the Empirical Variogram

The computation of the empirical variogram begins with pairing observed data points z(\mathbf{x}_i) and z(\mathbf{x}_j) according to their spatial separation distance, the lag h = \|\mathbf{x}_i - \mathbf{x}_j\|. These pairs are then grouped into bins of approximately equal lag distance, so that many pairs contribute to each estimate and stabilize it. Binning is essential because exact matches of distance are rare in irregularly sampled data, so a lag size (or bin width) defines the interval for grouping, typically starting from small values near zero to capture short-range behavior. A lag tolerance, often set to half the lag size, admits pairs whose distances fall within a small deviation from the bin center, ensuring sufficient data per bin without excessive smoothing.
Within each bin, the empirical variogram value \hat{\gamma}(h) is calculated using Matheron's classical estimator:
\hat{\gamma}(h) = \frac{1}{2 |N(h)|} \sum_{(i,j) \in N(h)} [z(\mathbf{x}_i) - z(\mathbf{x}_j)]^2
where N(h) is the set of pairs in the bin for lag h and |N(h)| is their number. This is half the mean squared difference between paired values, an estimate of the theoretical semivariance under the intrinsic hypothesis. The factor of 1/2 aligns the empirical estimate with the theoretical semivariogram definition, and the summation runs over all qualifying pairs in the bin. Computationally, this involves iterating through all pairs of observations, computing distances, assigning them to bins, and applying the formula per bin.
To account for potential anisotropy, directional variograms are computed by restricting pairs to those aligned within a specified angular tolerance (e.g., ±22.5°) around a chosen direction, such as north-south or east-west. This allows detection of direction-dependent spatial continuity, where the variogram shape may differ by direction, revealing structural features like elongated geological formations. Multiple directions are often evaluated at increments of 30° or 45° to map anisotropy comprehensively.
The selection of bin size and tolerance is guided by data density and sampling design; larger bins increase the number of pairs per bin but may obscure fine-scale variation, while smaller bins risk instability from few pairs. A common guideline requires at least 30 pairs per bin for reliable estimates, as fewer can lead to high variance and poor approximation of the underlying structure. The maximum lag is typically limited to half the domain extent to avoid edge effects, and omnidirectional variograms pool pairs from all directions when isotropy is assumed.
Once computed, the empirical variogram is visualized by plotting \hat{\gamma}(h) against the bin midpoint lag distances h, often with error bars indicating the spread within bins, or with the number of pairs annotated for reliability assessment. This reveals key trends, such as a nugget effect at small h (a discontinuity at the origin), a roughly linear rise indicating increasing dissimilarity with distance, or a plateau approaching the sill (total variance). Such plots guide subsequent modeling by highlighting the range of spatial dependence.
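The pairing, binning, and averaging steps described above can be sketched in a few lines of plain Python. This is an illustrative omnidirectional implementation under simplifying assumptions: bins are the half-open intervals [k·lag_size, (k+1)·lag_size), the representative lag of each bin is the mean pair distance, and the function name `empirical_variogram` and the four-point data set are made up for the example.

```python
import math

def empirical_variogram(coords, values, lag_size, n_lags):
    """Matheron's estimator on bins [k*lag_size, (k+1)*lag_size), k = 0..n_lags-1.

    Returns one (mean pair distance, semivariance, pair count) tuple per
    non-empty bin.
    """
    sq_sums = [0.0] * n_lags
    dist_sums = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(coords)
    for i in range(n - 1):
        for j in range(i + 1, n):          # visit each unordered pair once
            h = math.dist(coords[i], coords[j])
            k = int(h // lag_size)
            if k < n_lags:                 # pairs beyond the maximum lag are ignored
                sq_sums[k] += (values[i] - values[j]) ** 2
                dist_sums[k] += h
                counts[k] += 1
    return [
        (dist_sums[k] / counts[k], sq_sums[k] / (2.0 * counts[k]), counts[k])
        for k in range(n_lags)
        if counts[k] > 0
    ]

# Tiny transect, small enough to check the arithmetic by hand.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 3.0, 2.0, 5.0]
for lag, gamma_hat, n_pairs in empirical_variogram(coords, values, 1.0, 3):
    print(f"h = {lag:.1f}  gamma = {gamma_hat:.3f}  pairs = {n_pairs}")
```

With only four points the pair counts are far below the 30-pair guideline, so the sample serves only to trace the calculation. The double loop is quadratic in the number of observations, which is acceptable for small data sets but would need spatial indexing or vectorization for large ones.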

Estimation Challenges and Robust Methods

Empirical variogram estimation is susceptible to several challenges that can introduce bias and variability in the results. Outliers, often present in geostatistical data due to measurement errors or extreme events, can disproportionately inflate semivariance values because differences are squared, leading to distorted representations of spatial structure. Clustering of sampling locations, common in exploratory surveys, causes over-representation of short-distance pairs and under-representation of longer ones, resulting in biased estimates of the nugget effect and range. Edge effects arise near study area boundaries, where fewer pairs are available for larger lags, restricting reliable estimation to roughly half the domain size. Additionally, insufficient pairs per lag bin (typically fewer than 30) amplify sampling variability, producing unstable variograms that fail to capture the true spatial dependence.
To address these issues, robust estimators have been developed to reduce sensitivity to outliers and clustering. The Cressie-Hawkins robust estimator applies a fourth-root transformation to the squared differences (equivalently, a square root to the absolute differences) before averaging, then raises the mean to the fourth power and applies a bias correction. It is more stable than the classical method for heavy-tailed distributions, though it remains somewhat vulnerable to severe contamination. For even higher robustness, estimators based on the median absolute deviation (MAD) use the median of absolute pairwise deviations as a scale measure, rescaled for consistency; they have been reported to outperform the Cressie-Hawkins method in simulations with up to 30% outliers by preserving variogram shape and reducing mean squared error. Median polish, an iterative nonparametric technique, removes row and column effects (approximating trends) from gridded data before variogram computation, yielding residuals suitable for robust estimation.
Non-stationarity in the mean, manifesting as trends, further complicates global variogram estimation by violating the constant mean assumption: the semivariance then keeps growing with lag (quadratically for a linear trend) instead of leveling off at a sill. To mitigate this, local variograms are computed within subregions assumed stationary, allowing spatially varying models, while global approaches detrend the data, via regression or universal kriging, to isolate residuals whose variogram reflects local variability without trend-induced bias.
Adequate sample sizes are crucial for reliable estimation, with at least 30 pairs recommended per lag bin to limit variance and bias; bins with fewer than about 20 pairs often behave erratically, overestimating the sill or underestimating the range. Diagnostics such as the variogram cloud, which plots the semivariance of every individual pair against its distance, reveal outliers and clustering artifacts as points far from the general trend, while variogram maps visualize directional dependencies and can expose edge-induced asymmetries. For instance, in small datasets from clustered soil sampling, classical variograms may show spurious structure because the clusters dominate the short lags and few long-range pairs are available; declustering weights or robust estimators can reduce this bias and give more accurate kriging inputs.
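The effect of a single outlier on the classical and robust estimators can be illustrated for one lag bin. The sketch below assumes the commonly quoted form of the Cressie-Hawkins estimator, with bias-correction term 0.457 + 0.494/N; the pair differences are invented for illustration and are not drawn from any real survey.

```python
def matheron(diffs):
    """Classical semivariance: half the mean squared pair difference."""
    return sum(d * d for d in diffs) / (2.0 * len(diffs))

def cressie_hawkins(diffs):
    """Robust semivariance: fourth power of the mean square-root absolute
    difference, divided by the bias-correction term, then halved."""
    n = len(diffs)
    mean_root = sum(abs(d) ** 0.5 for d in diffs) / n
    return 0.5 * mean_root ** 4 / (0.457 + 0.494 / n)

# Hypothetical pair differences z(x_i) - z(x_j) within one lag bin.
clean = [0.9, -1.1, 1.0, -0.8, 1.2, -1.0, 0.7, -1.3, 1.1, -0.9]
contaminated = clean[:-1] + [15.0]   # replace one difference with an outlier

classical_inflation = matheron(contaminated) / matheron(clean)
robust_inflation = cressie_hawkins(contaminated) / cressie_hawkins(clean)
print(f"classical estimate inflated by a factor of {classical_inflation:.1f}")
print(f"robust estimate inflated by a factor of {robust_inflation:.1f}")
```

Both estimates increase when the outlier is introduced, but the robust one increases far less, which is the behavior the fourth-root transformation is designed to produce. The bias correction assumes approximately Gaussian differences, so the two estimators need not agree closely on small or non-Gaussian samples like this one.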

Variogram Modeling

Common Theoretical Models

In geostatistics, theoretical variogram models provide parametric functions to approximate the empirical variogram, usually under the assumptions of second-order stationarity and isotropy, enabling smooth interpolation and prediction of spatial processes. These models typically incorporate a nugget effect c_0, representing micro-scale variability or measurement error that causes a discontinuity at the origin; a structured component that describes the rise in semivariance with lag distance h; and a sill c + c_0, the plateau value approached as h increases, corresponding to the total variance. Common models include bounded forms like the spherical, exponential, and Gaussian, which reach or approach a finite sill, and unbounded forms like the linear model for cases without a clear limit to spatial dependence, such as trends or fractal patterns. In all of them \gamma(0) = 0 by definition, so a nonzero c_0 appears as a jump immediately away from the origin.
The spherical model is one of the most widely used due to its realistic representation of many natural phenomena, featuring a linear rise near the origin followed by a smooth transition that reaches the sill exactly at the range a. Its equation is given by:
\gamma(h) = \begin{cases} 0 & \text{if } h = 0 \\ c_0 + c \left[ 1.5 \left( \frac{h}{a} \right) - 0.5 \left( \frac{h}{a} \right)^3 \right] & \text{if } 0 < h \leq a \\ c_0 + c & \text{if } h > a \end{cases}
where c is the partial sill and a is the range parameter. Near zero the curve is approximately linear with slope 1.5c/a, reflecting moderate continuity, and it flattens completely once h reaches a.
The exponential model captures processes with continuous but irregular variation, showing an initial rapid rise that approaches the sill only asymptotically, so no finite range exists. A practical range A = 3a is used instead, where a is the scale parameter and \gamma(A) is about 95% of the sill. Its formulation for h > 0 is:
\gamma(h) = c_0 + c \left[ 1 - \exp\left( -\frac{3h}{A} \right) \right]
The curve starts steeply and the remaining gap to the sill decays exponentially, which suits datasets with high short-range variability.
The Gaussian model describes smoothly varying phenomena, such as those influenced by diffusion, with a parabolic rise near the origin that transitions to an asymptotic approach to the sill; the practical range is r = a \sqrt{3}, where a is the scale parameter. For h > 0 it is expressed as:
\gamma(h) = c_0 + c \left[ 1 - \exp\left( -\frac{3 h^2}{r^2} \right) \right]
The model's smoothness at the origin reflects very high continuity, which can lead to overly smooth kriging results if it is not combined with a nugget effect.
For unbounded spatial dependence, such as in trending data or fractal structures, the linear model applies, featuring a straight-line increase from the nugget with slope b, without reaching a sill. For h > 0 the equation is:
\gamma(h) = c_0 + b h
Its shape implies persistent growth in dissimilarity with distance, appropriate for processes that are intrinsically but not second-order stationary and therefore lack a clear correlation length.
Complex spatial structures often require nested variogram models, which are sums of the above basic forms that capture multiple scales of variation, such as short-range noise plus longer-range continuity. For example, a nugget plus an exponential component (partial sill 0.2, range 1000 m) nested with a spherical component (partial sill 0.5, range 5000 m) yields a total sill of 0.7 plus the nugget, with a rise that shows two distinct scales. The overall variogram is the sum \gamma(h) = \sum_i \gamma_i(h), which remains conditionally negative definite, and hence valid, if each component is valid. This approach allows flexible fitting to empirical data exhibiting hierarchical patterns.
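The model equations above translate directly into code. The following sketch implements the spherical, exponential, and Gaussian forms with the practical-range parameterization used in the text, plus the nested example. The nugget value of 0.1 in `nested` is an assumption made for illustration, since the example in the text does not state one.

```python
import math

def spherical(h, nugget, psill, a):
    """Spherical model; reaches the sill exactly at range a."""
    if h <= 0.0:
        return 0.0
    if h >= a:
        return nugget + psill
    r = h / a
    return nugget + psill * (1.5 * r - 0.5 * r ** 3)

def exponential(h, nugget, psill, practical_range):
    """Exponential model; about 95% of the partial sill at the practical range."""
    if h <= 0.0:
        return 0.0
    return nugget + psill * (1.0 - math.exp(-3.0 * h / practical_range))

def gaussian(h, nugget, psill, practical_range):
    """Gaussian model; parabolic near the origin."""
    if h <= 0.0:
        return 0.0
    return nugget + psill * (1.0 - math.exp(-3.0 * h ** 2 / practical_range ** 2))

def nested(h, nugget=0.1):
    """Nugget + exponential (0.2, 1000 m) + spherical (0.5, 5000 m).

    The nugget of 0.1 is an illustrative assumption.
    """
    if h <= 0.0:
        return 0.0
    return (nugget
            + exponential(h, 0.0, 0.2, 1000.0)
            + spherical(h, 0.0, 0.5, 5000.0))

for h in (0.0, 500.0, 1000.0, 5000.0, 10000.0):
    print(f"h = {h:7.0f} m  nested gamma = {nested(h):.4f}")
```

Each component is called with a zero nugget inside `nested` so that the nugget is counted only once; at large lags the nested model levels off at the nugget plus 0.7.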

Model Selection and Fitting

Model selection and fitting for variograms involves choosing a theoretical model that adequately represents the spatial structure captured by the empirical variogram while remaining valid for subsequent geostatistical analyses such as kriging. Visual fitting remains a foundational approach, where practitioners manually adjust model parameters, such as the nugget, sill, and range, to align the theoretical curve with key features of the empirical variogram, including the initial rise near the origin, the plateau at the sill, and the distance at which spatial dependence diminishes. This method relies on expert judgment to match the overall shape, providing an intuitive starting point for more rigorous optimization.
Automated methods enhance objectivity in fitting theoretical models to empirical variograms. Weighted least squares (WLS) estimation, proposed by Cressie, minimizes the weighted sum of squared differences between empirical and theoretical variogram values, with weights that approximate the inverse variance of the empirical points and so emphasize lags estimated from more pairs. Maximum likelihood (ML) estimation treats the variogram parameters as part of a likelihood under a Gaussian assumption, choosing them to maximize the probability of the observed data given the model; this is particularly useful for incorporating measurement errors and for obtaining asymptotic uncertainty estimates for the parameters. Cross-validation techniques, such as minimizing the prediction error from kriging, further refine fits by evaluating how well the model predicts withheld data points.
Criteria for selecting among candidate models include statistical measures like the Akaike information criterion (AIC), which balances goodness-of-fit against model complexity by penalizing excess parameters to avoid overfitting. Visual inspection of residuals between empirical and fitted variograms helps identify systematic deviations, while assessments of kriging performance, for example through prediction accuracy metrics, check that the chosen model actually improves spatial predictions.
In practice, software tools facilitate these processes; the gstat package in R implements weighted least-squares fitting of variogram models together with cross-validation options, allowing users to refine models iteratively. Similarly, the scikit-gstat library in Python supports model fitting to empirical variograms using optimization routines, integrating with scientific computing workflows.
Validation of the fitted model typically employs leave-one-out cross-validation (LOOCV), where each data point is removed in turn, its value is predicted by kriging from the remaining data (with the variogram either held fixed or refitted), and the prediction is compared with the omitted value to compute errors such as the mean squared error or standardized residuals. This approach quantifies predictive accuracy and helps confirm the model's robustness across the dataset.
Common pitfalls in model selection include overfitting, by selecting overly complex models that capture noise rather than true spatial structure, and neglecting the nugget effect, which can lead to underestimation of short-scale variability and biased predictions. To mitigate these, practitioners often combine multiple criteria and start with simpler models before considering nested extensions.
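The weighted least-squares idea can be sketched without any fitting library by scanning a parameter grid. The example below assumes a spherical model and Cressie-style weights, in which each lag contributes |N(h)| (γ̂(h)/γ(h; θ) − 1)² to the loss. The "empirical" points are generated from a known model so that the result can be checked; a real workflow would normally use a numerical optimizer or a package such as gstat or scikit-gstat rather than this hypothetical `grid_search` helper.

```python
def spherical(h, nugget, psill, a):
    """Spherical semivariogram with gamma(0) = 0."""
    if h <= 0.0:
        return 0.0
    if h >= a:
        return nugget + psill
    r = h / a
    return nugget + psill * (1.5 * r - 0.5 * r ** 3)

def wls_loss(params, lags, gammas, counts):
    """Weighted loss: sum over lags of N * (gamma_hat / gamma_model - 1)^2."""
    nugget, psill, a = params
    loss = 0.0
    for h, g, n in zip(lags, gammas, counts):
        model = spherical(h, nugget, psill, a)
        loss += n * (g / model - 1.0) ** 2
    return loss

def grid_search(lags, gammas, counts, nuggets, psills, ranges):
    """Return (loss, nugget, psill, range) with the smallest loss on the grid."""
    best = None
    for c0 in nuggets:
        for c in psills:
            for a in ranges:
                loss = wls_loss((c0, c, a), lags, gammas, counts)
                if best is None or loss < best[0]:
                    best = (loss, c0, c, a)
    return best

# Synthetic "empirical" variogram generated from a known model (illustrative).
lags = [5.0 * k for k in range(1, 11)]
gammas = [spherical(h, 0.1, 0.9, 30.0) for h in lags]
counts = [40, 60, 80, 90, 100, 100, 90, 80, 60, 40]

best = grid_search(
    lags, gammas, counts,
    nuggets=[0.0, 0.05, 0.1, 0.15, 0.2],
    psills=[0.7, 0.8, 0.9, 1.0, 1.1],
    ranges=[20.0, 25.0, 30.0, 35.0, 40.0],
)
print("loss, nugget, partial sill, range:", best)
```

Because the synthetic points lie exactly on the generating curve, the grid search recovers the generating parameters with zero loss. With noisy empirical values the minimum would be nonzero and a finer grid or continuous optimizer would be needed.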

Advanced Topics

Anisotropy

In geostatistics, anisotropy refers to the directional dependence of spatial variability captured by the variogram, where the structure differs by direction despite overall stationarity. Two primary types are distinguished: geometric anisotropy, in which the range varies by direction while the sill remains constant, resulting in elliptical contours of equal variogram value; and zonal anisotropy, where the sill varies by direction but the range is approximately constant, often reflecting layered or stratified structures.
Detection of anisotropy typically involves computing directional empirical variograms at multiple angles (e.g., every 30° or 45°) and visualizing the results with rose diagrams, which plot range or sill values against azimuth to highlight the principal axes of anisotropy. For geometric anisotropy, rose diagrams often reveal an elliptical pattern indicating the major and minor directions of continuity.
Geometric anisotropy is modeled by applying an affine transformation to the lag vector \mathbf{h}, transforming it to \mathbf{h}' = A \mathbf{h}, where A = T R is a matrix combining a rotation R and a scaling T:
R = \begin{pmatrix} \cos \phi & \sin \phi \\ -\sin \phi & \cos \phi \end{pmatrix}, \quad T = \begin{pmatrix} 1/a_{\max} & 0 \\ 0 & 1/a_{\min} \end{pmatrix}
The variogram is then \gamma(\mathbf{h}) = \gamma_{\text{iso}}(\|\mathbf{h}'\|), where \gamma_{\text{iso}} is an isotropic base model with unit range, because the scaling makes \|\mathbf{h}'\| dimensionless. Key parameters include the major axis length a_{\max} (longest range), the minor axis length a_{\min} (shortest range), and the orientation angle \phi (angle of the major axis relative to a reference direction).
Zonal anisotropy modeling often nests directional components, such as \gamma(\mathbf{h}_u, \mathbf{h}_v) = \gamma_1(|\mathbf{h}_u|) + \gamma_2(|\mathbf{h}_v|), where \mathbf{h}_u and \mathbf{h}_v are the components of the lag along the principal directions, with the component sills chosen so that the total sill matches the overall variance.
A classic geological example is sedimentary deposits, where the variogram range is longer along bedding planes (major axis) due to depositional continuity, but shorter perpendicular to them (minor axis). A similar pattern has been reported for salinity data from a coastal zone, with a major range of approximately 13,786 m toward the northeast and a minor range of 2,097 m toward the southeast. In kriging, accounting for anisotropy elongates search neighborhoods along the major axis, improving prediction accuracy by honoring directional continuity and reducing estimation variance where samples are aligned with it.
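The rotation-and-scaling transform for geometric anisotropy can be sketched as follows, using the ranges quoted in the salinity example. The code assumes that x points east, y points north, and the angle φ is measured counterclockwise from east, so a northeast major axis corresponds to φ = 45°; these conventions, the unit-sill spherical base model, and the function names are illustrative assumptions.

```python
import math

def reduced_distance(hx, hy, a_max, a_min, phi_deg):
    """Length of h' = T R h: rotate into the principal axes, then scale by the ranges."""
    phi = math.radians(phi_deg)
    u = math.cos(phi) * hx + math.sin(phi) * hy    # component along the major axis
    v = -math.sin(phi) * hx + math.cos(phi) * hy   # component along the minor axis
    return math.hypot(u / a_max, v / a_min)

def spherical_unit(r, nugget=0.0, psill=1.0):
    """Isotropic spherical base model with unit range."""
    if r <= 0.0:
        return 0.0
    if r >= 1.0:
        return nugget + psill
    return nugget + psill * (1.5 * r - 0.5 * r ** 3)

def gamma_aniso(hx, hy, a_max=13786.0, a_min=2097.0, phi_deg=45.0):
    """Geometrically anisotropic variogram built from the isotropic base model."""
    return spherical_unit(reduced_distance(hx, hy, a_max, a_min, phi_deg))

# The same 2000 m separation taken along the major and minor directions.
d = 2000.0
ne = (d * math.cos(math.radians(45.0)), d * math.sin(math.radians(45.0)))
se = (d * math.cos(math.radians(-45.0)), d * math.sin(math.radians(-45.0)))
print(f"gamma along NE (major axis): {gamma_aniso(*ne):.3f}")
print(f"gamma along SE (minor axis): {gamma_aniso(*se):.3f}")
```

At the same separation, the value along the minor axis is much closer to the sill than the value along the major axis, which is the signature of geometric anisotropy: same sill, different ranges.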

Non-Stationarity and Extensions

In geostatistics, non-stationarity arises when the statistical properties of a spatial process vary across the domain, violating the assumptions of second-order stationarity required for standard variogram models. This can manifest as trends or drifts that cause the mean to change systematically with location, leading to biased variogram estimates if not addressed. To handle such cases, extensions to the variogram framework model fields that are intrinsically stationary but not second-order stationary, focusing on the structure of increments rather than absolute values.
Intrinsic random functions provide a foundational extension for fields that are not second-order stationary, describing the expected squared increments without assuming a finite variance for the process itself. Introduced by Matheron, intrinsic random functions of order k require that generalized increments of order k have zero mean and finite variance, which filters out polynomial drifts up to degree k. Order 0 corresponds to the intrinsic hypothesis underlying ordinary kriging, where the variogram need not be bounded; for higher orders, the variogram is replaced by a generalized covariance. This approach is particularly useful in mining and environmental applications where global stationarity fails due to underlying geological trends.
Local variograms address heterogeneity by estimating variogram parameters within subregions or moving windows, allowing the sill, range, or nugget effect to vary in space and so capture non-stationary behavior. The domain is partitioned into approximately homogeneous zones, based on auxiliary data or adaptive meshing, and a separate model is fitted to each zone. Such techniques can improve prediction accuracy in landscapes with varying depositional environments, for example in digital elevation modeling, where a single global variogram can overestimate variability in parts of a heterogeneous terrain.
Drift removal preprocesses data by subtracting estimated trends, typically linear or polynomial, prior to variogram computation, isolating the residual component for standard analysis. Universal kriging extends this idea by modeling the drift explicitly within the estimation framework, so that the variogram reflects local fluctuations rather than global trends. In practice, least-squares fitting identifies the drift order, with higher-degree polynomials applied to surfaces such as ore grades influenced by depth.
For categorical or non-Gaussian data, indicator variograms transform variables into binary indicators at multiple thresholds, computing the variogram as \gamma_I(h) = \frac{1}{2} P[I(x) \neq I(x+h)], where I is the indicator function. This nonparametric approach, developed by Journel, accommodates multimodal distributions and order relations among indicators, facilitating nonlinear geostatistical simulations without assuming normality. It is widely applied to discrete geological units, where it reveals spatial connectivity patterns.
The madogram extends the variogram for robustness against outliers by using the expected absolute difference, defined as \nu(h) = \frac{1}{2} \mathbb{E}[|Z(x) - Z(x+h)|]. This measure, less sensitive to extreme values than the classical squared difference, aids in estimating extremal coefficients and fractal dimensions in settings with contaminated data. Variants like the block madogram further adapt it for support effects in aggregated samples.
In non-ergodic cases, where the spatial average does not converge to the ensemble mean, theoretical variograms may be unbounded, lacking a finite sill and indicating persistent large-scale heterogeneity. This occurs in intrinsic models that have no sill, and it limits the applicability of standard estimators that assume ergodicity. Such limits highlight the need for simulation-based validation to assess estimation variance in finite domains.
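The indicator variogram and the madogram differ from the classical estimator only in what is averaged over the pairs. A minimal sketch for a regularly spaced transect follows, where the lag is expressed as a number of sampling steps; the data values and the threshold of 5 are invented for illustration, and the indicator is taken as 1 when the value does not exceed the threshold.

```python
def lagged_pairs(values, k):
    """Pairs (z_i, z_{i+k}) along a regularly spaced transect."""
    return [(values[i], values[i + k]) for i in range(len(values) - k)]

def semivariogram(values, k):
    """Classical estimator: half the mean squared difference."""
    pairs = lagged_pairs(values, k)
    return sum((a - b) ** 2 for a, b in pairs) / (2.0 * len(pairs))

def madogram(values, k):
    """Half the mean absolute difference."""
    pairs = lagged_pairs(values, k)
    return sum(abs(a - b) for a, b in pairs) / (2.0 * len(pairs))

def indicator_variogram(values, k, threshold):
    """Semivariogram of the indicator I = 1 if z <= threshold, else 0."""
    indicators = [1 if z <= threshold else 0 for z in values]
    return semivariogram(indicators, k)

z = [1, 2, 8, 9, 2, 1, 9, 8]   # hypothetical measurements along a transect
for k in (1, 2):
    print(f"lag {k}: gamma = {semivariogram(z, k):.3f}, "
          f"madogram = {madogram(z, k):.3f}, "
          f"indicator = {indicator_variogram(z, k, 5):.3f}")
```

Because the squared difference of two indicators is 1 exactly when they differ, the indicator variogram equals half the proportion of pairs that fall on opposite sides of the threshold, matching the probability form given above.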

Applications

Geostatistical Applications

In geostatistics, the variogram plays a central role in ordinary kriging by quantifying spatial dependence to determine optimal weights for interpolating values at unsampled locations. The kriging weights, denoted \lambda_i, are obtained by solving a linear system in which the variogram function \gamma(h) supplies the dissimilarity between every pair of data points and between each data point and the prediction location, subject to the weights summing to one, which yields unbiased, minimum-variance estimates. This approach, formalized by Matheron in the 1960s, allows precise spatial prediction in resource evaluation by incorporating the variogram's characterization of dissimilarity as a function of separation distance h.
Block kriging extends ordinary kriging to estimate average values over larger volumes, such as mining blocks, by regularizing the point-support variogram to account for the support effect. Regularization involves averaging the variogram over the block dimensions, which smooths the spatial structure and reduces estimation variance for volume-based predictions like ore reserves. This technique is essential in applications where point samples must be upscaled to block models for practical decision-making.
For uncertainty quantification, sequential Gaussian simulation (SGS) uses a fitted variogram to generate multiple realizations of spatial fields, producing probabilistic maps that capture variability beyond a single deterministic interpolation. In SGS, data are transformed to Gaussianity, the variogram models the correlation structure, and conditional simulations are drawn sequentially along a random path, providing equiprobable scenarios for risk assessment. This method honors the fitted variogram's sill, nugget, and range parameters to reproduce the spatial continuity observed in the data.
In mining, variograms are applied to grade estimation and reserve calculation by informing kriging-based block models that delineate high-grade zones and quantify resources with associated uncertainty. For instance, variogram analysis reveals anisotropy in deposits, guiding the design of selective mining units.
Historically, Matheron's pioneering work in the 1960s at the Centre de Morphologie Mathématique applied variograms to South African gold mine data, establishing geostatistics for practical reserve evaluation in non-homogeneous deposits.
Environmental science employs geostatistics for mapping contaminant plumes, where the variogram models the spatial correlation of pollutant concentrations to characterize dispersal patterns. In contaminant studies, variogram fitting supports kriging to interpolate sparse monitoring data, delineating plume extents and informing remediation strategies in aquifers. The variogram's relation to the covariance function under stationarity assumptions further links it to broader spatial statistics frameworks in these applications.
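The way the variogram enters the ordinary kriging system can be sketched in plain Python. The system solved below is Σ_j λ_j γ(x_i − x_j) + μ = γ(x_i − x_0) for each data point i, together with Σ_j λ_j = 1, where μ is a Lagrange multiplier. The exponential model parameters, the four sample locations, and the helper names are illustrative assumptions; production code would rely on an established linear algebra library rather than the small elimination routine shown here.

```python
import math

def exponential(h, psill=1.0, practical_range=20.0):
    """Exponential semivariogram without nugget (illustrative parameters)."""
    return 0.0 if h <= 0.0 else psill * (1.0 - math.exp(-3.0 * h / practical_range))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ordinary_kriging(coords, values, target, gamma):
    """Return (estimate, kriging variance, weights) at the target location."""
    n = len(coords)
    a = [[gamma(math.dist(coords[i], coords[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])            # unbiasedness: weights sum to one
    b = [gamma(math.dist(c, target)) for c in coords] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]
    estimate = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return estimate, variance, weights

coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
values = [1.0, 2.0, 3.0, 4.0]
est, var, w = ordinary_kriging(coords, values, (5.0, 5.0), exponential)
print(f"estimate = {est:.3f}, kriging variance = {var:.3f}, weights = {w}")
```

Partial pivoting matters here because the diagonal of the kriging matrix is zero (γ(0) = 0). At the center of the square all four samples are equally informative, so they receive equal weights; predicting at a sampled location instead returns the observed value with zero variance, since no nugget is included.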

Modern and Interdisciplinary Uses

In recent years, the integration of machine learning techniques with variogram analysis has advanced automated fitting in geostatistics, particularly through learned regressors and optimization algorithms for parameter estimation. A 2024 study introduced a Bayesian-optimized regressor to model experimental variograms, reporting improved accuracy and efficiency over traditional methods by handling complex nonlinear relationships in spatial data.
In rainfall analysis, variogram models have been applied to the spatial variability of rainfall data, with the quadratic-exponential variogram providing a flexible framework for model fitting. A study of rainfall records from stations in the Río Bravo–San Juan basin in Mexico used this model to fit experimental variograms, reporting robust behavior across diverse conditions. Such approaches have supported spatial interpolation of precipitation data, aiding water management under changing conditions.
Variogram-based texture analysis remains a key tool in remote sensing for land-cover classification from satellite imagery, quantifying spatial heterogeneity to distinguish different surface types. Recent applications integrate variograms with machine learning classifiers, such as random forests, to extract textural features from high-resolution images, with reported accuracies of up to 90% in identifying land-cover changes. For instance, cross-variograms between spectral bands have improved delineation of agricultural and forested regions in Mediterranean landscapes.
In epidemiology, variograms facilitate modeling the spatial spread of diseases during pandemics. Geostatistical analyses of COVID-19 data have employed semivariograms to quantify spatial correlations among counties, revealing post-holiday surges with incidence rate ratios of 1.3 to 1.41 and aiding targeted intervention planning. Such models enhance predictions of outbreak dynamics in urban settings.
For mineral resource exploration, multivariate geostatistics combined with cokriging has enabled mapping of transmissivity in aquifers, which is crucial for water management in mining contexts. A 2024 study applied cross-variograms to interpolate transmissivity values across aquifers in support of sustainable resource extraction. This approach integrates auxiliary variables, improving model reliability for large-scale hydrological assessments.
Hybrid frameworks that pair geostatistics with artificial intelligence address the challenge of modeling large datasets by combining variogram-based spatial structure with machine learning for greater efficiency. A 2025 framework merged geostatistical simulations with neural networks to characterize geochemical distributions in mine tailings, reporting RMSE reductions of 88–91% compared to ordinary kriging. These integrations enable scalable analysis of high-dimensional environmental data, extending traditional geostatistics toward real-time applications.

Relation to Covariance and Autocorrelation

Under second-order stationarity, where the covariance function C(\mathbf{h}) = \operatorname{Cov}(Z(\mathbf{x}), Z(\mathbf{x} + \mathbf{h})) depends only on the lag vector \mathbf{h}, the variogram \gamma(\mathbf{h}) is directly related to the covariance by the equation \gamma(\mathbf{h}) = C(0) - C(\mathbf{h}), with C(0) representing the variance of the process. This arises from expanding the expected squared difference in the variogram definition. With a constant mean, the increment has zero expectation, so \gamma(\mathbf{h}) = \frac{1}{2} \mathbb{E}[(Z(\mathbf{x} + \mathbf{h}) - Z(\mathbf{x}))^2] = \frac{1}{2} [\operatorname{Var}(Z(\mathbf{x} + \mathbf{h})) + \operatorname{Var}(Z(\mathbf{x})) - 2 \operatorname{Cov}(Z(\mathbf{x}), Z(\mathbf{x} + \mathbf{h}))] = C(0) - C(\mathbf{h}). The relation holds for stationary processes whose variogram is bounded, with the sill equal to the variance C(0).

The autocorrelation function \rho(\mathbf{h}), defined as the normalized covariance \rho(\mathbf{h}) = C(\mathbf{h}) / C(0), further links the variogram to traditional correlation measures. Substituting into the variogram-covariance relation yields \rho(\mathbf{h}) = 1 - \gamma(\mathbf{h}) / C(0), where C(0) is the sill of the variogram. The correlogram, which plots \rho(\mathbf{h}) against lag, thus provides a normalized view of spatial dependence equivalent to a rescaled variogram, emphasizing similarity rather than dissimilarity. Variograms and correlograms therefore capture the same spatial autocorrelation, though the variogram is expressed in units of variance: \gamma(\mathbf{h}) = C(0) (1 - \rho(\mathbf{h})).

For a variogram to be valid, it must be conditionally negative definite. When the variogram is bounded, this ensures that the corresponding covariance function C(\mathbf{h}) = C(0) - \gamma(\mathbf{h}) is positive definite, a requirement for Gaussian processes. This correspondence between valid variograms and covariances allows conversion in either direction, but only for bounded variograms, where the process has finite variance. In Gaussian process modeling, the covariance is routinely derived from a fitted variogram model using this formula, facilitating prediction and simulation.
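The conversions described above can be illustrated numerically. The following is a minimal sketch, assuming a spherical variogram with hypothetical parameters (nugget 0.1, total sill 1.0, range 10); the function names are illustrative and not taken from any particular library. It derives the covariance and correlogram from the variogram and checks that the resulting covariance matrix on a set of 1-D locations has no negative eigenvalues.

```python
import numpy as np

def spherical_variogram(h, nugget=0.0, sill=1.0, rng=10.0):
    """Spherical model: reaches the total `sill` at distance `rng`, flat beyond it."""
    h = np.abs(np.asarray(h, dtype=float))
    s = np.minimum(h / rng, 1.0)
    gamma = nugget + (sill - nugget) * (1.5 * s - 0.5 * s**3)
    return np.where(h == 0.0, 0.0, gamma)  # gamma(0) = 0 by definition

def covariance_from_variogram(h, **params):
    """C(h) = C(0) - gamma(h), with C(0) equal to the total sill."""
    sill = params.get("sill", 1.0)
    return sill - spherical_variogram(h, **params)

def correlogram_from_variogram(h, **params):
    """rho(h) = 1 - gamma(h) / C(0)."""
    sill = params.get("sill", 1.0)
    return 1.0 - spherical_variogram(h, **params) / sill

# Hypothetical parameters chosen only for illustration.
params = dict(nugget=0.1, sill=1.0, rng=10.0)
lags = np.array([0.0, 2.5, 5.0, 10.0, 20.0])

gamma = spherical_variogram(lags, **params)
cov = covariance_from_variogram(lags, **params)
rho = correlogram_from_variogram(lags, **params)

# Covariance matrix on a 1-D transect; a valid model gives no negative eigenvalues.
x = np.linspace(0.0, 30.0, 16)
C = covariance_from_variogram(x[:, None] - x[None, :], **params)
min_eig = np.linalg.eigvalsh(C).min()

for h, g, c, r in zip(lags, gamma, cov, rho):
    print(f"h={h:5.1f}  gamma={g:.4f}  C={c:.4f}  rho={r:.4f}")
print("smallest eigenvalue of C:", min_eig)
```

Beyond the range, the variogram equals the sill and both the covariance and the correlogram drop to zero, which is the "uncorrelated beyond the range" behavior in numerical form.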
A key advantage of the variogram over direct covariance estimation is its robustness to unknown or non-constant means, as it relies solely on pairwise differences without requiring mean subtraction or estimation. Covariance computation, in contrast, is sensitive to mean misspecification, which can bias results in non-ergodic or trending fields. This property makes the variogram particularly suitable for intrinsically stationary processes, extending its applicability beyond full second-order stationarity.
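A small numerical experiment can make this robustness concrete. The sketch below assumes a regularly spaced 1-D series simulated as a moving average of white noise; the helper names and the shift of 5 units are hypothetical choices for illustration. Adding an unknown constant to the data leaves the empirical semivariogram unchanged, whereas a covariance estimate that plugs in the wrong mean is biased by roughly the square of the error in the mean.

```python
import numpy as np

def empirical_semivariogram(z, max_lag):
    """Matheron estimator for a regularly spaced 1-D series, lags 1..max_lag."""
    z = np.asarray(z, dtype=float)
    return np.array([0.5 * np.mean((z[k:] - z[:-k]) ** 2)
                     for k in range(1, max_lag + 1)])

def covariance_with_assumed_mean(z, max_lag, mean):
    """Covariance estimate that subtracts an assumed (possibly wrong) mean."""
    d = np.asarray(z, dtype=float) - mean
    return np.array([np.mean(d[k:] * d[:-k]) for k in range(1, max_lag + 1)])

# Simulated zero-mean, unit-variance correlated series (moving average of noise).
rng = np.random.default_rng(0)
noise = rng.normal(size=520)
z = np.convolve(noise, np.ones(20) / np.sqrt(20), mode="valid")

shift = 5.0  # unknown offset in the mean, hypothetical value
gamma_original = empirical_semivariogram(z, 10)
gamma_shifted = empirical_semivariogram(z + shift, 10)

cov_true_mean = covariance_with_assumed_mean(z, 10, mean=0.0)
cov_wrong_mean = covariance_with_assumed_mean(z + shift, 10, mean=0.0)

print("max change in semivariogram:", np.abs(gamma_shifted - gamma_original).max())
print("bias in covariance per lag: ", cov_wrong_mean - cov_true_mean)
```

The differences Z(x + h) - Z(x) cancel any constant offset, which is why the first printed value is at floating-point noise level while the covariance bias is on the order of shift squared.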

Cross-Variogram and Multivariate Extensions

The cross-variogram extends the univariate variogram to measure spatial dependence between two distinct random fields, Z_1(\mathbf{x}) and Z_2(\mathbf{x}), assumed to be jointly intrinsically stationary. It is defined as \gamma_{12}(\mathbf{h}) = \frac{1}{2} \mathbb{E} \left[ (Z_1(\mathbf{x}) - Z_1(\mathbf{x} + \mathbf{h})) (Z_2(\mathbf{x}) - Z_2(\mathbf{x} + \mathbf{h})) \right], where \mathbf{h} is the lag vector separating locations \mathbf{x} and \mathbf{x} + \mathbf{h}. This quantifies the average product of the increments of the two fields at separation \mathbf{h}, enabling the analysis of cross-spatial correlations that may differ from the individual auto-variograms. Unlike the univariate variogram, the cross-variogram can take negative values when the increments of the two fields tend to have opposite signs. Because it is built from products of increments, it is an even function of \mathbf{h} and symmetric in the two variables, so, unlike the cross-covariance function, it cannot represent asymmetric (lagged) dependence between the fields.

The linear model of coregionalization (LMC) provides a flexible framework for modeling multivariate variograms by expressing each variable as a linear combination of independent latent processes with shared spatial structures. In the LMC, a p-variate field \mathbf{Z}(\mathbf{x}) is written as \mathbf{Z}(\mathbf{x}) = \sum_{k=1}^K \mathbf{A}_k \mathbf{Y}_k(\mathbf{x}), where the \mathbf{A}_k are p \times L_k coefficient matrices and each \mathbf{Y}_k(\mathbf{x}) is a vector of L_k mutually uncorrelated fields sharing the basic variogram \gamma_k(\mathbf{h}), with the vectors for different k independent of one another. The resulting cross-variogram is \gamma_{12}(\mathbf{h}) = \sum_{k=1}^K \mathbf{a}_{1k} \mathbf{a}_{2k}^\top \gamma_k(\mathbf{h}), where \mathbf{a}_{ik} denotes the i-th row of \mathbf{A}_k; in matrix form, \boldsymbol{\Gamma}(\mathbf{h}) = \sum_{k=1}^K \mathbf{B}_k \gamma_k(\mathbf{h}) with \mathbf{B}_k = \mathbf{A}_k \mathbf{A}_k^\top. The model is valid by construction because each coregionalization matrix \mathbf{B}_k is positive semidefinite and each \gamma_k is a valid variogram. This reduces multivariate fitting to a small set of univariate basic structures, which is useful when direct cross-data are sparse. Seminal formulations trace to early geostatistical texts, with refinements emphasizing parsimonious nested structures for practical implementation.
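The LMC construction above can be sketched in a few lines. The example below assumes two variables, a nugget structure and a spherical structure with a range of 50 units, and hypothetical coefficient matrices A_k; none of these values come from the article. It forms the coregionalization matrices B_k = A_k A_k^T, checks that they are positive semidefinite, and evaluates the matrix-valued variogram at a few lags.

```python
import numpy as np

def nugget_structure(h):
    """Unit-sill nugget: 0 at h = 0, 1 elsewhere."""
    return (np.abs(np.asarray(h, dtype=float)) > 0).astype(float)

def spherical_structure(h, rng=50.0):
    """Unit-sill spherical structure with range `rng` (assumed value)."""
    s = np.minimum(np.abs(np.asarray(h, dtype=float)) / rng, 1.0)
    return 1.5 * s - 0.5 * s**3

def lmc_variogram_matrix(h, coeff_matrices, basic_variograms):
    """Gamma(h) = sum_k B_k * gamma_k(h), with B_k = A_k A_k^T.

    Returns an array of shape (n_lags, p, p).
    """
    h = np.atleast_1d(np.asarray(h, dtype=float))
    p = coeff_matrices[0].shape[0]
    gamma = np.zeros((h.size, p, p))
    for A_k, g_k in zip(coeff_matrices, basic_variograms):
        B_k = A_k @ A_k.T                      # coregionalization matrix
        gamma += g_k(h)[:, None, None] * B_k   # broadcast over lags
    return gamma

# Hypothetical coefficient matrices for p = 2 variables.
A_nugget = np.array([[0.3, 0.0],
                     [0.1, 0.2]])        # p x 2
A_spherical = np.array([[0.8],
                        [0.6]])          # p x 1
coeffs = [A_nugget, A_spherical]
structures = [nugget_structure, spherical_structure]

# Each B_k must be positive semidefinite for the model to be valid.
min_eigs = [np.linalg.eigvalsh(A @ A.T).min() for A in coeffs]

lags = np.array([0.0, 25.0, 100.0])
G = lmc_variogram_matrix(lags, coeffs, structures)
for h, g in zip(lags, G):
    print(f"h={h:6.1f}  gamma_11={g[0, 0]:.4f}  gamma_12={g[0, 1]:.4f}  gamma_22={g[1, 1]:.4f}")
```

Because the B_k are built as A_k A_k^T rather than fitted entry by entry, the cross-variogram automatically satisfies the Cauchy-Schwarz bound against the two direct variograms at every lag.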
In applications, the cross-variogram underpins cokriging, an extension of kriging that incorporates auxiliary variables correlated with the target to improve prediction accuracy, particularly when the primary variable is undersampled. For instance, in reservoir characterization, cokriging uses porosity (densely measured via logs) as a secondary variable to estimate permeability (sparsely cored), leveraging their positive correlation to reduce estimation variance. Collocated cokriging further simplifies this by assuming the auxiliary variable is available at prediction locations and retaining only the collocated secondary datum, thus avoiding the inversion of large cokriging systems while maintaining efficiency for large datasets. This variant is widely adopted in environmental and petroleum geostatistics for its computational tractability.

Matrix-valued variograms generalize the approach to full multivariate covariance structures, representing the entire cross-variogram matrix \boldsymbol{\Gamma}(\mathbf{h}) as a single entity with off-diagonal elements capturing inter-variable dependencies. These models ensure the matrix is conditionally negative definite for valid cokriging, and are often fitted via simultaneous diagonalization of experimental matrices to enforce intrinsic stationarity across all components. Such formulations are essential for high-dimensional fields, like multispectral data, where joint modeling preserves correlations without assuming independence. High-impact contributions emphasize parsimonious parameterizations to mitigate overfitting in sparse multivariate settings.
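The collocated cokriging idea described above can be sketched for a simple 1-D case. This is a minimal illustration under stated assumptions, not a reference implementation: both variables are standardized (zero mean, unit variance), the primary correlogram is exponential with a hypothetical practical range of 20, and the cross-correlogram follows a Markov-type assumption \rho_{12}(h) = \rho_{12}(0) \rho_1(h). The function names, data locations, and the value \rho_{12}(0) = 0.8 are all illustrative.

```python
import numpy as np

def exp_correlogram(h, rng):
    """Exponential correlogram with practical range `rng` (assumed model)."""
    return np.exp(-3.0 * np.abs(h) / rng)

def collocated_cokriging_weights(x_data, x0, rho12, rng):
    """Simple collocated cokriging for standardized variables.

    Uses primary data at `x_data` plus one secondary datum at the target `x0`.
    Returns (primary weights, secondary weight, estimation variance).
    """
    x_data = np.asarray(x_data, dtype=float)
    n = x_data.size
    R = exp_correlogram(x_data[:, None] - x_data[None, :], rng)  # primary-primary
    r0 = exp_correlogram(x_data - x0, rng)                        # primary-target

    K = np.empty((n + 1, n + 1))
    K[:n, :n] = R
    K[:n, n] = K[n, :n] = rho12 * r0   # primary data vs. collocated secondary
    K[n, n] = 1.0                      # variance of the secondary datum
    rhs = np.append(r0, rho12)

    w = np.linalg.solve(K, rhs)
    variance = 1.0 - w @ rhs           # simple cokriging variance, unit sill
    return w[:n], w[n], variance

x_data = [0.0, 10.0, 30.0]   # hypothetical primary sample locations
x0 = 12.0                    # prediction location

# With no cross-correlation the system reduces to simple kriging.
lam_sk, nu_sk, var_sk = collocated_cokriging_weights(x_data, x0, rho12=0.0, rng=20.0)
# With a correlated secondary variable the estimation variance drops.
lam_cck, nu_cck, var_cck = collocated_cokriging_weights(x_data, x0, rho12=0.8, rng=20.0)

print("simple kriging:       weights", lam_sk, "variance", var_sk)
print("collocated cokriging: weights", lam_cck, "secondary", nu_cck, "variance", var_cck)
```

The estimate itself would be the weighted sum of the standardized primary data plus the secondary weight times the standardized secondary value at x0. The system has only one more row than the kriging system, which is the computational saving that the collocated variant offers over full cokriging.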
