
Empirical orthogonal functions

Empirical orthogonal functions (EOFs) are a statistical technique used to decompose a multivariate data set, particularly spatiotemporal fields in geophysics, into a set of orthogonal spatial patterns and corresponding time series that maximize the explained variance of the data. Also known as empirical principal component analysis, EOFs identify dominant modes of variability by computing the eigenvectors and eigenvalues of the data's covariance matrix, often via singular value decomposition for efficient handling of large datasets. This method partitions the data into uncorrelated components, with the leading modes typically accounting for the majority of the total variance, making it valuable for dimensionality reduction and pattern recognition. Originating from principal component analysis developed by Karl Pearson in 1901 and Harold Hotelling in 1933, EOFs were adapted for meteorological applications by Andrey Obukhov in 1947 and popularized by Edward Lorenz in 1956 for statistical weather prediction. In practice, the analysis involves anomaly fields weighted by factors like latitude-dependent cosine for spherical domains, yielding principal components as the temporal amplitudes of each spatial EOF mode. Widely applied in atmospheric and oceanic sciences, EOFs elucidate climate variability patterns such as the El Niño-Southern Oscillation, the North Atlantic Oscillation, the Madden-Julian Oscillation, and the Quasi-Biennial Oscillation through analyses of variables like sea level pressure, zonal winds, and sea surface temperature. While EOFs excel in data compression and highlighting prominent structures in "red" (persistent) processes, their orthogonality constraint can lead to nonlocal patterns that mix physically distinct phenomena, complicating interpretation as true dynamical modes. Extensions like rotated EOFs improve regional interpretability by relaxing strict orthogonality, and complex EOFs handle propagating signals, such as in equatorial oscillations. Domain choice and sampling errors further influence results, underscoring the need for cautious application alongside physical validation.

Introduction

Definition and Purpose

Empirical orthogonal functions (EOFs) constitute a set of basis functions derived directly from a given dataset to capture the dominant modes of variability in multivariate or spatiotemporal fields. This empirical approach ensures that the basis functions reflect the intrinsic structures within the data, rather than relying on preconceived mathematical forms. By focusing on patterns of covariability, EOFs enable the extraction of key features from complex datasets, such as those involving interrelated variables over space and time. The primary purpose of EOF analysis is to reduce the dimensionality of high-dimensional datasets while retaining the maximum possible variance through a minimal number of modes. This identifies the most significant patterns that explain the bulk of the observed variability, simplifying analysis and interpretation without substantial loss of information. EOFs are obtained via the eigenvalue decomposition of the covariance structure, where the leading eigenvectors correspond to the modes explaining the largest variances. In this representation, the data is reconstructed as a linear combination of spatial patterns (the EOF modes), each weighted by temporal amplitudes known as principal components. These components separate the spatial structure from the time-varying signals, highlighting both the geographic distribution of variability and its temporal dynamics. For example, in a simple univariate dataset of temperature anomalies recorded at multiple sites, the first EOF mode might depict a uniform spatial pattern across locations, accounting for a large fraction of the total variance through a global trend, while subsequent modes isolate site-specific fluctuations. This variance ranking illustrates how EOFs prioritize influential patterns to distill essential insights from the data.
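As a concrete illustration of the multi-site example, the following minimal sketch (NumPy, on synthetic data; the names n_times, n_sites, and common_trend are illustrative assumptions, not from any cited study) builds temperatures sharing a global trend and verifies that the leading mode is nearly uniform in space and dominates the variance:

```python
# Minimal illustrative sketch: EOFs of synthetic multi-site temperature anomalies.
import numpy as np

rng = np.random.default_rng(0)
n_times, n_sites = 200, 5

# Synthetic temperatures: a shared warming trend plus independent site noise.
common_trend = np.linspace(0.0, 2.0, n_times)
data = common_trend[:, None] + 0.3 * rng.standard_normal((n_times, n_sites))

# Anomalies: subtract the temporal mean at each site.
anom = data - data.mean(axis=0)

# EOFs are the eigenvectors of the covariance matrix, sorted by variance.
cov = anom.T @ anom / n_times
eigvals, eofs = np.linalg.eigh(cov)            # eigh returns ascending order
order = eigvals.argsort()[::-1]
eigvals, eofs = eigvals[order], eofs[:, order]

print("fraction of variance in mode 1:", eigvals[0] / eigvals.sum())
print("EOF 1 loadings (roughly uniform):", eofs[:, 0])
```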

Historical Development

The concept of empirical orthogonal functions (EOFs) traces its roots to the development of principal component analysis (PCA) in statistics. Karl Pearson introduced the foundational ideas in 1901, describing a method for finding lines and planes of closest fit to systems of points in space, which laid the groundwork for decomposing correlated variables into orthogonal components. Harold Hotelling further advanced this in the 1930s, particularly through his 1933 work on analyzing complexes of statistical variables into principal components, emphasizing the mathematical formalism of orthogonal transformations for dimensionality reduction. These statistical techniques provided the theoretical basis for EOFs, though their application to spatiotemporal data in geosciences emerged later. EOFs gained prominence in meteorology during the mid-20th century as a practical tool for handling geophysical datasets. Andrey Obukhov applied the method in 1947 for smoothing meteorological fields, marking one of the earliest uses in the field. This was followed by A. Fukuoka's 1951 mention of similar techniques, but it was Edward N. Lorenz who popularized EOFs in 1956 through his work on statistical weather prediction at the Massachusetts Institute of Technology, where he coined the term and demonstrated its utility for forecasting by reducing dimensionality while preserving variance. Lorenz's approach shifted EOFs from pure theory to an empirical tool suited for large-scale atmospheric datasets, influencing early models in the post-1950s era. By the 1970s, EOFs had evolved into a standard tool for numerical weather prediction (NWP), aiding in data compression and pattern identification amid growing computational capabilities. Their adoption extended to oceanography in the 1980s, propelled by Rudolph W. Preisendorfer's comprehensive 1988 monograph, Principal Component Analysis in Meteorology and Oceanography, which detailed EOF applications for spatiotemporal analysis in marine and atmospheric sciences. This period marked EOFs' transition into a versatile empirical method for climate modeling, enabling efficient processing of extensive datasets from observations and simulations.

Mathematical Formulation

Data Representation and Covariance

In empirical orthogonal function (EOF) analysis, spatiotemporal data are typically represented as an anomaly matrix X of dimensions n \times p, where n denotes the number of time points and p the number of spatial locations or grid points. Each row of X corresponds to a time snapshot of the field across all spatial points, and anomalies are computed by subtracting the temporal mean (climatology) at each spatial point to ensure zero mean: x'_{tk} = x_{tk} - \bar{x}_k, where \bar{x}_k = \frac{1}{n} \sum_{t=1}^n x_{tk}. This demeaning step centers the data, focusing the analysis on deviations from the long-term average rather than absolute values. The covariance matrix C is then constructed from the anomaly matrix as C = \frac{1}{n} X^T X, a p \times p matrix that quantifies the statistical relationships within the field. This formulation captures the spatial covariance averaged over time, with each element c_{ij} = \frac{1}{n} \sum_{t=1}^n x'_{ti} x'_{tj} representing the covariance between the time series at spatial points i and j. The diagonal elements c_{ii} correspond to the variances at individual spatial points, while off-diagonal elements reflect correlations, enabling the identification of coherent spatial patterns in the variability. EOF analysis assumes the underlying data process is stationary, meaning statistical properties like the covariance structure remain constant over time, which supports the validity of the sample covariance matrix as an estimate of the population covariance. The zero-mean condition is enforced by construction through anomaly computation, but real-world datasets often include missing values due to observational gaps, which are handled via imputation methods such as iterative EOF-based reconstruction to fill gaps before covariance estimation. This setup of the covariance matrix provides the foundation for the subsequent eigenvalue decomposition, which yields the orthogonal modes.
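The demeaning and covariance construction translate directly into a few lines of code. The following sketch (NumPy; the array field and its shape are illustrative assumptions) forms the anomaly matrix and checks that the diagonal of C recovers the per-point variances:

```python
# Sketch of anomaly computation and covariance construction (synthetic field).
import numpy as np

rng = np.random.default_rng(1)
n, p = 500, 8
field = 0.05 * rng.standard_normal((n, p)).cumsum(axis=0)  # persistent toy field

xbar = field.mean(axis=0)        # climatology \bar{x}_k at each point
anom = field - xbar              # x'_{tk} = x_{tk} - \bar{x}_k, zero column means

C = anom.T @ anom / n            # p x p covariance matrix, C = X^T X / n
assert np.allclose(np.diag(C), anom.var(axis=0))  # c_ii are the point variances
```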

Eigenvalue Problem and Modes

The eigenvalue problem forms the mathematical core of empirical orthogonal function (EOF) analysis, where the covariance matrix \mathbf{C} of the centered data is decomposed to identify the principal modes of variability. The problem is formulated as \mathbf{C} \mathbf{v}_k = \lambda_k \mathbf{v}_k, with \lambda_k denoting the eigenvalues that quantify the amount of variance explained by the k-th mode and \mathbf{v}_k the corresponding eigenvectors representing the spatial EOF patterns. These eigenvalues \lambda_k are non-negative and indicate the relative importance of each mode in capturing the data's total variance, often expressed as a percentage \frac{\lambda_k}{\sum_j \lambda_j} \times 100\%. The EOF modes are conventionally ordered by descending eigenvalues, such that \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_p, where p is the rank of \mathbf{C}; this ordering ensures that the first few modes account for the majority of the variability, facilitating data reduction and focus on dominant patterns. The original data can be reconstructed or approximated using the leading modes via \mathbf{X} \approx \sum_k \mathbf{a}_k \mathbf{v}_k^T, where \mathbf{a}_k = \mathbf{X} \mathbf{v}_k are the principal component time series (also called scores) that describe the temporal evolution of the k-th mode and have variance \lambda_k. Truncating the sum to the first r < p modes provides a low-dimensional representation that retains most of the signal while filtering noise. The eigenvectors \mathbf{v}_k are normalized to be orthonormal, satisfying \mathbf{v}_i^T \mathbf{v}_j = \delta_{ij} (where \delta_{ij} is the Kronecker delta), which guarantees that the modes are mutually orthogonal and of unit length, forming a complete basis for the data space. In physical terms, the EOF modes reveal coherent spatial structures of variability; for instance, the leading EOF of Northern Hemisphere sea-level pressure often exhibits a dipole pattern with opposite-signed anomalies over the Arctic and mid-latitudes, representing the Arctic Oscillation and influencing hemispheric weather patterns.
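A short self-contained sketch of this eigenvalue problem (NumPy, on synthetic anomalies; the data shapes and truncation r are illustrative) confirms the stated properties: descending mode ordering, PC variance equal to \lambda_k, and truncated reconstruction:

```python
# Sketch: eigendecomposition of C, mode ordering, PCs, and reconstruction.
import numpy as np

rng = np.random.default_rng(2)
n, p = 400, 10
anom = rng.standard_normal((n, p)) @ rng.standard_normal((p, p))  # correlated field
anom -= anom.mean(axis=0)                     # enforce zero mean at each point

C = anom.T @ anom / n
eigvals, V = np.linalg.eigh(C)                # solves C v_k = lambda_k v_k
order = eigvals.argsort()[::-1]               # lambda_1 >= lambda_2 >= ...
eigvals, V = eigvals[order], V[:, order]

pcs = anom @ V                                # principal components a_k = X v_k
assert np.allclose(pcs.var(axis=0), eigvals)  # each PC has variance lambda_k

r = 3                                         # truncate to the leading r modes
recon = pcs[:, :r] @ V[:, :r].T               # X ~ sum_k a_k v_k^T
print("variance retained:", eigvals[:r].sum() / eigvals.sum())
```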

Computation Methods

Step-by-Step Procedure

The computation of empirical orthogonal functions (EOFs) involves a systematic sequence of steps to extract dominant patterns from spatiotemporal datasets, ensuring the analysis captures variability while minimizing artifacts from data inconsistencies. This procedure assumes a dataset consisting of observations at multiple spatial points over time, typically represented as anomalies to focus on fluctuations rather than means. A code sketch of the full procedure appears after the steps below.

Step 1: Collect and preprocess data. Begin by assembling the raw spatiotemporal data into a suitable format, such as a matrix where rows correspond to time points and columns to spatial locations or variables. Preprocessing is essential to isolate variability: remove the temporal mean (climatology) at each spatial point to obtain anomalies, which centers the data and eliminates the influence of long-term averages. Detrending may be applied to remove linear trends if the focus is on oscillatory or nonlinear variations, using techniques like least-squares fitting. Normalization, such as dividing by the standard deviation or applying spatial weights (e.g., \sqrt{\cos(\phi)} for latitude \phi to account for grid distortions), ensures equitable contribution from different locations. Handle missing values or gaps through interpolation or exclusion to avoid biasing the covariance structure, particularly in irregularly sampled oceanographic or atmospheric data.

Step 2: Form the data matrix and compute the covariance matrix. Organize the preprocessed anomalies into a data matrix X of dimensions n \times p, where n is the number of time samples (observations) and p is the number of spatial points (variables). If n \geq p, compute the sample covariance matrix S = \frac{1}{n-1} X^T X (or \frac{1}{n} for large n); this p \times p matrix quantifies the variances and covariances among spatial points over time. If p > n, compute the temporal covariance matrix S = \frac{1}{n-1} X X^T (or \frac{1}{n} for large n); this n \times n matrix is smaller and numerically more stable, forming the basis for deriving orthogonal modes. This matrix is symmetric and positive semi-definite, with its trace equaling the total variance of the dataset.

Step 3: Perform eigenvalue decomposition and sort modes by eigenvalue magnitude. If n \geq p, apply eigenvalue decomposition to the spatial covariance matrix, solving for eigenvalues \lambda_k and eigenvectors v_k (EOFs) that represent the spatial patterns of variability. If p > n, apply eigenvalue decomposition to the temporal covariance matrix to obtain eigenvalues \lambda_k and eigenvectors u_k (temporal patterns), then derive the spatial EOFs as v_k = \frac{1}{\sqrt{\lambda_k}} X^T u_k. The eigenvalues indicate the amount of variance explained by each mode. Sort the modes in descending order of eigenvalue magnitude, so the leading EOFs capture the largest fractions of total variance; typically, the first few modes account for the bulk of the signal. This ordering prioritizes physically meaningful dominant structures over noise-dominated lower modes.

Step 4: Compute and interpret scores (temporal coefficients). Project the anomaly matrix onto the sorted EOFs to obtain the principal component (PC) scores, or temporal coefficients, given by a_k = X v_k where v_k is the k-th spatial EOF; these scores form orthogonal time series that describe how each spatial mode evolves temporally. (In the p > n case, the u_k serve as the temporal patterns, scaled appropriately.) Interpretation involves examining the PCs for patterns like oscillations or trends, linking them to physical processes; for example, the leading PC might correspond to a seasonal cycle in sea surface temperatures. The unit-norm normalization of the EOFs ensures that the variance of each PC equals its corresponding eigenvalue.

Step 5: Validate via reconstruction error or diagnostic plots. Reconstruct the original anomalies as \hat{X} = \sum_{k=1}^m a_k v_k^T using the first m modes and compute the reconstruction error (e.g., root-mean-square difference) to quantify how well the retained modes approximate the data; low error confirms the modes capture essential variability. Generate plots of eigenvalues or cumulative variance explained (e.g., the first few modes often capture over 90% of total variance in coherent fields like sea surface temperature), and use the North et al. rule to assess mode separability based on the sampling error \delta \lambda_k \approx \lambda_k \sqrt{2/n^*}, where n^* is the effective sample size accounting for autocorrelation. Considerations for sample size are crucial: if the number of time points n greatly exceeds spatial points p (n \gg p), compute the covariance matrix from X^T X to avoid ill-conditioning; conversely, if p \gg n, use X X^T and relate modes via the derivation above for efficiency. Inadequate sample size relative to domain complexity can lead to spurious modes, so the effective sample size must exceed the number of retained modes for reliable results.
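The steps above can be condensed into a single routine. The following sketch (NumPy; eof_analysis, field, and lats are hypothetical names, the SVD route stands in for the explicit covariance matrix of Steps 2-3, and the North et al. estimate uses the naive choice n^* = n):

```python
# Sketch of the full EOF procedure for a gridded field (hypothetical inputs).
import numpy as np

def eof_analysis(field, lats, n_modes=5):
    """field: array (n_time, n_lat, n_lon); lats: latitudes in degrees."""
    n_time, n_lat, n_lon = field.shape
    anom = field - field.mean(axis=0)              # Step 1: anomalies
    w = np.sqrt(np.cos(np.deg2rad(lats)))          # sqrt(cos(phi)) area weights
    anom = anom * w[None, :, None]
    X = anom.reshape(n_time, n_lat * n_lon)        # Step 2: data matrix

    # Steps 2-3 via SVD, avoiding the explicit covariance matrix.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    eigvals = s**2 / (n_time - 1)                  # mode variances lambda_k
    eofs = Vt[:n_modes].reshape(n_modes, n_lat, n_lon)
    pcs = U[:, :n_modes] * s[:n_modes]             # Step 4: temporal scores

    # Step 5: North et al. sampling error with n* = n_time (no autocorrelation).
    errors = eigvals[:n_modes] * np.sqrt(2.0 / n_time)
    frac = eigvals[:n_modes] / eigvals.sum()
    return eofs, pcs, frac, errors
```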

Numerical Implementation

The primary algorithm for computing empirical orthogonal functions (EOFs) is the singular value decomposition (SVD) of the centered data matrix X, which factors it as X = U \Sigma V^T, where the columns of V provide the EOFs, \Sigma contains the singular values, and U gives the normalized principal components. SVD is preferred over direct eigendecomposition of the covariance matrix due to its superior numerical stability, particularly for large or high-dimensional datasets where forming the covariance matrix explicitly can amplify rounding errors and lead to instability. This approach is especially advantageous in geophysical applications, where data matrices often have dimensions n \times m with n (time points) or m (spatial points) exceeding thousands, as SVD avoids the need to compute and store the full m \times m covariance matrix. For ill-conditioned data matrices, where small perturbations can cause large changes in the EOFs due to near-zero singular values, regularization techniques such as ridge (Tikhonov) methods are applied by adding a penalty term \lambda I to the covariance matrix before decomposition, stabilizing the solution while shrinking less important modes. This regularization, equivalent to modifying the SVD by thresholding small singular values, helps mitigate overfitting in noisy or sparse datasets common in observations. Implementations are available in several programming languages and libraries tailored for scientific computing. In Python, the scikit-learn library's PCA class performs EOF analysis via SVD on centered data, supporting efficient handling of large arrays through options like randomized SVD for approximation. The xarray ecosystem extends this with the xeofs package, which integrates EOF computation directly with labeled multidimensional arrays, preserving metadata for climate datasets. MATLAB's pca function in the Statistics and Machine Learning Toolbox computes EOFs using SVD, with built-in support for economy-size decompositions to reduce memory usage. In R, the prcomp function from the stats package offers SVD-based EOFs, optimized for statistical workflows. For high-performance needs in large-scale simulations, Fortran libraries such as the EOF software developed at Scripps Institution of Oceanography, which requires LAPACK and BLAS, support integration with parallel processing in climate modeling. The computational complexity of SVD-based EOF analysis is O(\min(n,m)^2 \max(n,m)), making it feasible for moderate-sized problems but challenging for very high-dimensional data exceeding available memory. To address this, techniques such as the snapshot method or randomized algorithms can reduce effective dimensions while retaining accuracy for the leading modes.
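As a sketch of these implementation choices (NumPy and scikit-learn; the synthetic low-rank field is an assumption for illustration), an exact thin SVD and scikit-learn's randomized solver recover the same leading mode:

```python
# Sketch: exact SVD versus randomized SVD (scikit-learn) for the leading EOF.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
t = np.linspace(0.0, 20.0, 1000)
pattern = rng.standard_normal(5000)                     # one spatial mode
X = np.outer(np.sin(t), pattern) + 0.1 * rng.standard_normal((1000, 5000))
X -= X.mean(axis=0)

# Exact route: thin SVD of the data matrix, no covariance matrix formed.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Approximate route: randomized SVD for the leading modes only.
pca = PCA(n_components=10, svd_solver="randomized", random_state=0)
scores = pca.fit_transform(X)        # principal components
eofs = pca.components_               # spatial patterns, one per row

assert abs(Vt[0] @ eofs[0]) > 0.99   # leading mode agrees up to sign
```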

Applications

Atmospheric and Climate Sciences

Empirical orthogonal functions (EOFs) have been extensively applied in atmospheric and climate sciences to identify and characterize large-scale patterns of variability in fields such as sea surface temperature (SST) and sea level pressure (SLP). These modes help isolate coherent spatial structures associated with climate phenomena, enabling researchers to quantify their influence on regional and global climate dynamics. By decomposing multivariate datasets into orthogonal modes ranked by explained variance, EOF analysis reveals dominant signals that underpin teleconnections (remote linkages between atmospheric circulation and surface conditions), facilitating improved understanding of variability beyond local scales. A prominent application is the identification of teleconnection patterns like the El Niño-Southern Oscillation (ENSO) through EOF analysis of tropical Pacific SST anomalies. The leading EOF mode typically captures the mature phase of ENSO, exhibiting a structure with warming in the central-eastern Pacific and cooling in the western Pacific, explaining a substantial portion (often 20-40%) of total variance in the region. This mode's principal component aligns closely with ENSO indices such as Niño-3.4, allowing for the monitoring and prediction of associated global impacts, including altered precipitation and temperature patterns. Seminal work demonstrated this by applying EOFs to monthly Pacific SST data from 1947-1972, where the first mode highlighted ENSO's interannual signature. Similarly, EOFs are used to detect the North Atlantic Oscillation (NAO), a key mode of atmospheric variability, via analysis of SLP fields over the North Atlantic sector. The first EOF mode represents the NAO, characterized by a north-south dipole in SLP anomalies (low pressure near Iceland and high pressure near the Azores), accounting for up to 40% of wintertime SLP variance and influencing European and North American weather through shifts in storm tracks and temperature gradients. Rotated EOF techniques, as refined in foundational studies, enhance the isolation of this pattern from other variability, confirming its persistence across seasons and its role in modulating westerly winds. In climate model validation, EOFs provide a robust framework for comparing simulated atmospheric variability against observations, such as those from the ERA5 reanalysis dataset, which integrates historical measurements with model dynamics to represent past climate states. By computing EOF modes from model outputs and reanalysis SLP or SST fields, researchers assess how well simulations reproduce patterns like the NAO or ENSO, evaluating metrics such as spatial pattern correlation and explained variance ratios. For instance, multi-model ensembles under CMIP reveal that while many models capture the leading modes' structures, discrepancies in variance partitioning highlight biases in tropical-extratropical teleconnections, aiding refinements in model physics. This approach has been instrumental in evaluating internal variability in coupled models against ERA5-derived modes. An illustrative case study involves the first EOF mode of global surface air temperature anomalies from post-1980s datasets, which predominantly captures the monotonic warming signal driven by anthropogenic forcings. This mode shows a nearly uniform positive loading across latitudes, explaining over 50% of total variance in recent decades, with its principal component exhibiting a clear upward trend aligned with rising greenhouse gas concentrations. Analysis of datasets like HadCRUT5 confirms this mode's dominance, distinguishing the forced warming response from oscillatory variability in higher modes, and underscoring EOFs' utility in attributing long-term climate change.

Oceanographic and Geophysical Data

Empirical orthogonal functions (EOFs) have been extensively applied to oceanographic datasets, such as sea surface height anomaly (SSHA) fields, to identify dominant circulation patterns like gyres. In one monsoon-influenced marginal sea, EOF analysis of SSHA data from the TOPEX/Poseidon satellite mission revealed two primary spatial patterns: a basin-wide gyre mode accounting for 24.16% of the variance, characterized by anticyclonic circulation during winter monsoons and cyclonic flow in summer, and a north-south double gyre mode explaining 14.59% of the variance, highlighting interactions between seasonal and interannual processes. Similarly, in a northern subpolar basin, EOF analysis of satellite-derived sea surface heights identified a leading mode explaining approximately 64% of the total variance, depicting a semi-annual cyclonic subpolar gyre intensified by surface forcing in summer and wind stress in winter. EOFs are also employed with salinity fields to uncover upwelling dynamics, which bring nutrient-rich subsurface waters to the surface and influence marine productivity. Off the northwest coast of Africa, EOF analysis of sea surface temperature in the upwelling zone demonstrated cooling events accompanied by sea surface salinity increases, with the primary mode capturing basin-scale variability linked to wind-driven upwelling during summer. In the Taiwan Strait, spatiotemporal EOFs applied to summer coastal upwelling indices revealed interannual fluctuations driven by wind variability, where the first mode explained over 50% of the variance and isolated persistent upwelling-favorable conditions associated with enhanced northeasterly winds. In geophysical contexts, spatiotemporal EOFs facilitate the analysis of seismic waves and postseismic deformations to elucidate fault dynamics following major earthquakes. After the 2011 Tohoku-Oki earthquake (Mw 9.0), a functional spatiotemporal model incorporating an EOF-like decomposition of GNSS data modeled postseismic relaxation over a decade, capturing afterslip and viscoelastic responses with spatial patterns correlating to fault slip rates up to 1 cm/year and residuals below 0.8 cm vertically. This approach highlights how EOFs separate transient deformation signals from steady-state plate motion, aiding in the quantification of fault reloading processes. A prominent example of EOF application in geophysical fields is the identification of the Pacific Decadal Oscillation (PDO), a basin-scale mode influencing interdecadal climate variability and ecosystems. The PDO is defined as the leading EOF mode of North Pacific SST anomalies (north of 20°N), explaining 20-26% of the variance and featuring positive anomalies in the eastern Pacific during its warm phase, which historically boosted fisheries in the northeast Pacific but has shifted to negative correlations post-2014 due to overriding pan-basin warming. These temperature patterns, derived from EOFs, link PDO phases to fishery yields, with positive phases enhancing catches through favorable ocean conditions for juvenile survival. Integration of EOFs with satellite altimetry data, such as from the Jason series missions, has enabled the extraction of global ocean modes from multiyear records. Cyclostationary EOF analysis of Jason-1, Jason-2, and related altimetry data (2004-2016) isolated key modes including a low-frequency decadal signal (11% variance) akin to the PDO and an ENSO-related mode (10% variance), providing insights into internal ocean variability separate from long-term trends. These modes, often explaining substantial fractions of total variance, reveal basin-wide teleconnections in dynamic height fields critical for understanding global circulation shifts.

Relations to Other Techniques

Connection to Principal Component Analysis

Empirical orthogonal functions (EOFs) represent a specialized application of principal component analysis (PCA) in the geosciences, particularly for analyzing spatiotemporal data such as climate and oceanographic fields. In this context, the spatial patterns derived from EOF analysis correspond directly to the PCA loadings, which capture the orthogonal modes of variability across spatial dimensions, while the associated time series, known as principal components or expansion coefficients, align with the PCA scores that describe temporal evolution. This mapping allows EOFs to decompose complex datasets into a set of uncorrelated modes ordered by explained variance, facilitating dimensionality reduction and pattern identification in high-dimensional geophysical observations. Both EOFs and PCA share the fundamental objective of identifying an orthogonal transformation that maximizes the variance captured in successive projections of the data. This shared principle stems from solving the eigenvalue problem of the data covariance matrix, where the eigenvectors represent the directions of maximum variance. Historically, PCA was formalized by Harold Hotelling in 1933 as a statistical method for reducing the dimensionality of multivariate data while preserving variance. EOFs emerged as an empirical variant tailored to non-stationary spatiotemporal fields in meteorology and oceanography, with early applications traced to the late 1940s and 1950s, building on Hotelling's framework to handle geophysical datasets. The methods coincide precisely when applied to centered data, where EOF analysis on the covariance matrix yields identical results to PCA. Centering the data by subtracting the temporal mean ensures that the decomposition focuses on anomalies rather than the overall state, aligning the variance maximization across both techniques. This equivalence underscores EOFs as PCA adapted for geoscientific contexts, where spatial and temporal structures are explicitly separated to reveal dominant modes of variability, such as those in sea level pressure or sea surface temperature fields.
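The equivalence can be checked numerically. In this sketch (NumPy and scikit-learn on synthetic data; the shapes are illustrative), the eigenvectors of the sample covariance matrix of centered data match the PCA loadings up to sign:

```python
# Sketch: EOFs of centered data coincide with PCA loadings.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
X = rng.standard_normal((300, 12))
Xc = X - X.mean(axis=0)                      # centering is the key step

# EOF route: eigenvectors of the sample covariance matrix.
C = Xc.T @ Xc / (Xc.shape[0] - 1)
eigvals, V = np.linalg.eigh(C)
V = V[:, eigvals.argsort()[::-1]]

# PCA route (scikit-learn centers the data internally).
pca = PCA().fit(X)

for k in range(3):                           # leading modes agree up to sign
    assert abs(V[:, k] @ pca.components_[k]) > 0.999
```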

Distinctions from Proper Orthogonal Decomposition

Empirical orthogonal functions (EOFs) and proper orthogonal decomposition (POD) share a common mathematical foundation in eigenvalue decomposition of data covariance structures but diverge in their theoretical emphasis and practical implementation. POD originates from turbulence theory, where it provides an optimal basis for representing flow fields by maximizing the kinetic energy captured in a finite number of modes, defined through an inner product that reflects the physical energy norm, such as \langle \mathbf{u}, \mathbf{v} \rangle = \int \mathbf{u} \cdot \mathbf{v} \, dV for velocity fields. In contrast, EOFs prioritize statistical optimality by maximizing the variance explained in empirical datasets, typically employing the standard L2 inner product \langle \phi, \psi \rangle = \int \phi \psi \, dV to yield orthogonal spatial patterns that capture the dominant variability in observational records. These distinctions arise from their disciplinary origins: POD is rooted in theoretical fluid dynamics for analyzing coherent structures in turbulent flows, often applied in contexts like aerodynamic analysis and reduced-order modeling of simulations. EOFs, developed for statistical weather prediction, are empirically driven and suited to geophysical observations, such as identifying teleconnection patterns in climate records where physical norms are less relevant than variance. A subtle computational difference lies in the snapshot method commonly used in POD, which constructs the basis from a finite set of flow snapshots by solving an eigenvalue problem on the correlation matrix of these snapshots, ensuring optimality in the energy sense to prioritize energetically dominant modes. EOFs, handling spatiotemporal datasets, typically emphasize the spatial covariance matrix for direct variance maximization across the full data ensemble, though both approaches yield similar decompositions when the inner product aligns. This energy-based normalization in POD underscores its physics-based focus, while EOFs remain agnostic to specific physical metrics beyond statistical variance.
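The snapshot idea can be sketched in a few lines (NumPy, on synthetic data; the shapes are illustrative): when spatial points far outnumber snapshots, eigenanalysis of the small n x n matrix recovers the same leading spatial mode as a direct SVD of the full data matrix:

```python
# Sketch of the snapshot method: eigenanalysis of the small n x n matrix.
import numpy as np

rng = np.random.default_rng(5)
n, p = 30, 2000                      # few snapshots, many grid points
X = rng.standard_normal((n, p))
X -= X.mean(axis=0)

S = X @ X.T / n                      # n x n snapshot (temporal) matrix
lam, U = np.linalg.eigh(S)
order = lam.argsort()[::-1]
lam, U = lam[order], U[:, order]

V = X.T @ U                          # spatial modes v_k from X^T u_k
V /= np.linalg.norm(V, axis=0)       # normalize to unit length

# Direct route for comparison (expensive when p is very large).
_, s, Vt = np.linalg.svd(X, full_matrices=False)
assert abs(V[:, 0] @ Vt[0]) > 0.999  # leading mode agrees up to sign
```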

Limitations and Extensions

Common Challenges

One significant challenge in empirical orthogonal function (EOF) analysis is the non-uniqueness of the derived modes, particularly when employing rotation techniques such as varimax to enhance physical interpretability. Rotated EOFs can preserve the total explained variance while yielding substantially different spatial patterns, as the choice of the number of modes to rotate introduces arbitrariness and affects stability. For instance, rotating fewer modes (e.g., the leading six) versus a larger set (e.g., 20) can lead to varying representations of teleconnection patterns like the North Atlantic Oscillation. This ambiguity complicates the identification of physically meaningful structures, as equivalent variance explanations may support multiple interpretations. EOF analysis relies on key assumptions of stationarity and linearity in the underlying data, which are frequently violated in geophysical datasets, resulting in spurious or misleading modes. Non-stationary processes, such as those driven by long-term climate shifts, can contaminate the modes; for example, a monotonic trend often dominates the leading EOF as a uniform spatial pattern with a linearly increasing temporal component, masking underlying oscillatory variability. Nonlinear dynamics further exacerbate this issue by producing modal mixing, where multiple physical processes are conflated into a single EOF rather than separated orthogonally. Such violations lead to interpretations that do not align with the true dynamics of the system, as seen in analyses of sea surface temperature fields where non-stationarity distorts mode extraction. Sampling errors pose another critical limitation, especially with short observational records common in climate and oceanographic studies, which undermine the reliability of low-frequency modes. Finite sample sizes introduce uncertainty in eigenvalue estimates, potentially causing overlapping modes where distinct signals are indistinguishable, particularly for weakly energetic higher-order EOFs. This effect is pronounced in datasets with limited temporal coverage, such as decadal records, where Monte Carlo simulations are often required to assess mode separability and significance. Equivocation arises when EOFs impose artificial orthogonality constraints that do not correspond to physical processes, often resulting in the leading mode manifesting as a basin-wide or global pattern that obscures regional dynamics. For example, in tropical sea surface temperature analyses, the first EOF may capture a basin-wide signal, superimposing diverse influences like the El Niño-Southern Oscillation and local trends, thereby equivocating the representation of localized variability. This orthogonality in space and time can generate spurious dipoles or patterns that lack dynamical basis, hindering the linkage of modes to underlying physics.

Advanced Variants

Complex empirical orthogonal functions (CEOFs) extend traditional EOF analysis to handle oscillatory or propagating signals in spatiotemporal data, particularly those with phase relationships, by incorporating complex-valued representations. This variant is particularly useful for analyzing fields like wind velocities or wave motions where real-valued EOFs may fail to capture coherent propagation. To implement CEOFs, the field is complexified, often by applying the Hilbert transform to the original time series to obtain the imaginary part, which effectively shifts the phase by 90 degrees and reveals the instantaneous amplitude and phase of oscillations. The resulting complex data matrix is then subjected to eigenvalue decomposition of its Hermitian covariance matrix, yielding complex spatial patterns and temporal coefficients that describe both standing and traveling waves. For instance, in analyzing the Quasi-Biennial Oscillation (QBO) in stratospheric zonal winds from 1958 to 2001, the leading CEOF explained 71.32% of the variance and captured the downward propagation at approximately 1 km per month. This approach was pioneered in applications to geophysical fields, such as sea surface temperature variability.

Extended empirical orthogonal functions (EEOFs) build on standard EOFs by incorporating temporal lags to account for both spatial and time-delayed correlations, making them suitable for identifying teleconnections and propagating phenomena like the Madden-Julian Oscillation (MJO). In EEOF analysis, the data matrix is augmented with multiple time lags (e.g., M consecutive time steps), forming an extended data matrix that captures evolution over short time windows, such as 10-15 days for MJO studies. The covariance matrix is then computed from this augmented matrix, and eigenvalue decomposition reveals modes that represent coherent space-time patterns, often visualized as Hovmöller diagrams to show phase speeds and growth regions. For example, applying EEOFs to outgoing longwave radiation (OLR) data over 5 years identified the MJO's 30-60 day periodicity and eastward propagation across the tropics. This method enhances the detection of dynamic structures in climate data, such as ENSO teleconnections, by filtering out noise from uncorrelated variability.

Rotated empirical orthogonal functions (REOFs) address limitations in the physical interpretability of standard EOFs by applying a rotation to the leading modes, which relaxes the strict orthogonality constraint to produce more localized and regionally coherent patterns. Rotation criteria, such as the varimax method, maximize the variance of the squared loadings on each component, simplifying structures and reducing domain dependence that can mix physically distinct features in unrotated EOFs. The rotated patterns maintain the total explained variance but redistribute it to emphasize simpler, interpretable modes; for instance, in sea-level pressure data, REOFs isolated the North Atlantic Oscillation (NAO) as a regional dipole without contamination from Pacific influences. This technique is widely used in teleconnection studies, where unrotated EOFs might blend multiple circulation regimes. Seminal applications demonstrated its value in classifying low-frequency atmospheric patterns across seasons.

Multichannel empirical orthogonal functions (MEOFs), also known as multivariate EOFs, generalize the method to simultaneous analysis of multiple interrelated fields or variables, such as temperature, salinity, and nutrients in oceanographic data, by treating them as a single vector-valued field. This approach computes a joint covariance matrix across all channels, revealing coupled modes that capture inter-variable covariability, which is crucial for understanding complex systems like ocean biogeochemistry. For example, MEOFs applied to temperature, salinity, and potential density profiles reconstructed historical distributions with improved accuracy over univariate methods. In nonlinear contexts, kernel EOFs (kEOFs) further extend this to nonlinear manifolds by employing the kernel trick, mapping data into a high-dimensional feature space via nonlinear functions (e.g., Gaussian kernels) without explicit computation, akin to kernel PCA. The eigenvalue problem is solved in the kernel-induced feature space using the Gram matrix, enabling the extraction of nonlinear principal modes; applications to global sea surface height anomalies from 1992-2008 highlighted curved structures in ENSO and ocean currents that linear EOFs missed. These variants are particularly impactful in data-driven modeling of nonlinear dynamics.
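As an illustration of the Hilbert-transform complexification behind CEOFs (NumPy and SciPy; the traveling wave is synthetic and all names are illustrative), a single complex mode captures a propagating signal that real EOFs would split into two quadrature modes:

```python
# Sketch: complex (Hilbert) EOF of a synthetic eastward-propagating wave.
import numpy as np
from scipy.signal import hilbert

n_time, n_space = 500, 64
t = np.arange(n_time)[:, None]
x = np.arange(n_space)[None, :]
field = np.cos(2.0 * np.pi * (t / 50.0 - x / 16.0))  # traveling wave

Z = hilbert(field, axis=0)           # analytic signal: 90-degree phase shift
Z -= Z.mean(axis=0)

C = Z.conj().T @ Z / n_time          # Hermitian covariance matrix
lam, V = np.linalg.eigh(C)
lam, V = lam[::-1], V[:, ::-1]       # descending eigenvalues

print("variance in leading CEOF:", lam[0] / lam.sum())  # near 1 for a pure wave
phase = np.angle(V[:, 0])            # spatial phase ramp encodes propagation
```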
