
Least-squares spectral analysis

Least-squares spectral analysis (LSSA) is a statistical method for estimating the frequency spectrum of a time series by fitting a model composed of sinusoids to the data using the principle of least squares, which minimizes the sum of squared residuals between observations and the model. This approach is particularly advantageous for handling irregularly spaced or gapped data, where traditional methods like the fast Fourier transform fail due to requirements for uniform sampling. Introduced by Petr Vaníček in 1969, LSSA originated as an approximate method known as "successive spectral analysis," building on earlier ideas of fitting sine waves to pre-selected frequencies from periodograms. Vaníček's work emphasized its application to geophysical and astronomical data, where measurement irregularities are common, and it was further refined in subsequent publications to incorporate covariance structures and deterministic components like trends. The method computes the spectral power as the ratio of the model's explained variance to the total variance, often expressed through the design matrix \Phi, where the fitted function is g = \Phi x and s = \frac{g^T C_f^{-1} g}{f^T C_f^{-1} f}, with C_f as the data covariance matrix. LSSA excels in reducing spectral leakage—a common artifact in Fourier analysis—by explicitly accounting for data correlations and allowing the removal of known signals before residual analysis. It also supports rigorous statistical testing of spectral peaks using F-distributions to assess significance, making it valuable for detecting periodicities in noisy environments. Applications span diverse fields, including seismic data processing for data regularization and noise attenuation, astronomy for periodograms of unevenly sampled light curves (as in the related Lomb-Scargle method), and geodesy for analyzing GPS time series with gaps. Implementations, such as the LSSA software developed at the University of New Brunswick, have evolved through revisions, with version 5.02 incorporating advanced features for practical use in research.

Overview

Definition and principles

Least-squares spectral analysis (LSSA) is a statistical method for identifying periodic components in time series data, particularly those with uneven or irregular sampling intervals, by fitting a model composed of sinusoids to the observed data through minimization of the sum of squared residuals. This approach treats the problem as a linear regression task, where the amplitudes and phases of potential frequencies are estimated via ordinary least squares, providing a spectrum that quantifies the power at trial frequencies. The primary purpose of LSSA is to detect and characterize hidden periodic signals in noisy datasets where traditional Fourier-based methods, such as the fast Fourier transform (FFT), are inapplicable due to their requirement for uniformly spaced observations. By directly optimizing the fit of the sinusoidal model without assuming even sampling, LSSA enables the analysis of real-world records, such as astronomical observations or geophysical measurements, that often feature gaps or irregular timing. At its core, LSSA models the observed data y(t_i) at times t_i as a sum of sinusoids plus a mean level and noise term: y(t_i) = a_0 + \sum_{k=1}^m \left[ a_k \cos(\omega_k t_i) + b_k \sin(\omega_k t_i) \right] + \epsilon_i, where a_0 is the mean level, a_k and b_k are the cosine and sine coefficients for the k-th angular frequency \omega_k, m is the number of frequencies, and \epsilon_i represents noise. The coefficients are estimated by solving the least-squares problem to minimize \sum_i \epsilon_i^2, yielding the spectral power as a measure of fit quality for each \omega_k. A key advantage of LSSA is its ability to handle gapped or irregularly spaced data points natively, avoiding the need for interpolation or resampling that can introduce artifacts in FFT-based analyses. This makes it particularly suitable for applications involving sparse observations, where the method's flexibility preserves the integrity of the underlying signal without distorting its frequency content.
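As a concrete illustration, the model above can be fitted by ordinary least squares on synthetic, unevenly sampled data; the seed, sampling span, frequency, and amplitudes below are assumed purely for the example, a minimal numpy sketch rather than any reference implementation:

```python
import numpy as np

# Synthetic unevenly sampled series: mean level 1.0 plus a 0.5 Hz
# sinusoid of amplitude 2.0 and Gaussian noise (all values assumed).
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 20.0, 200))
y = 1.0 + 2.0 * np.cos(2 * np.pi * 0.5 * t) + 0.1 * rng.standard_normal(t.size)

# Design matrix for y(t) = a0 + a1*cos(w t) + b1*sin(w t) at one trial
# frequency; least squares minimizes the sum of squared residuals.
w = 2 * np.pi * 0.5
Phi = np.column_stack([np.ones_like(t), np.cos(w * t), np.sin(w * t)])
coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)

a0, a1, b1 = coef
amplitude = np.hypot(a1, b1)   # recovered amplitude of the sinusoid
```

Despite the irregular sampling, the fit recovers the injected mean level and amplitude, which is exactly the property that interpolation-based FFT workflows can distort.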

Relation to other techniques

Spectral analysis methods seek to decompose data into frequency components to reveal periodic structures and oscillations within the signal. Least-squares spectral analysis (LSSA) achieves this by fitting sinusoidal basis functions to the data via least-squares optimization, providing a flexible framework for spectrum estimation. Unlike the discrete Fourier transform (DFT) and fast Fourier transform (FFT), which presuppose uniform sampling and may introduce artifacts when confronted with irregular or gapped data, LSSA accommodates non-uniform sampling directly by minimizing the residuals between observed values and the fitted periodic model. This contrast highlights LSSA's robustness in scenarios where preprocessing like interpolation would otherwise distort the signal. LSSA relates to classical periodogram techniques as a least-squares-based counterpart for power spectral density estimation; the traditional periodogram derives from the squared DFT coefficients under even spacing assumptions, whereas LSSA generalizes this to uneven data through explicit sinusoidal fitting. The Lomb-Scargle periodogram emerges as a prominent variant within this paradigm, equivalent to a least-squares fit for single-frequency sinusoids in irregularly sampled series. LSSA proves particularly valuable for astrophysical light curves, where observations are often irregularly timed due to satellite scheduling or weather variability, enabling reliable periodicity detection in variable star data. In geophysics, it excels with gapped datasets from instruments like superconducting gravimeters, allowing analysis of tidal or seismic signals without aliasing from missing observations.

Mathematical foundations

Least-squares fitting for periodic models

Least-squares fitting forms the core of least-squares spectral analysis (LSSA) by estimating parameters of periodic models that best match observed data, thereby identifying dominant frequencies while accounting for noise. The method minimizes the sum of squared errors S = \sum_{i=1}^n (y_i - f(t_i; \theta))^2, where y_i are the data points at times t_i, f(t_i; \theta) is the model prediction, and \theta represents the unknown parameters. This minimization is solved analytically for linear models using the normal equations, yielding unbiased estimates under the assumption of uncorrelated errors. In the context of periodic models, the function f(t) is formulated as a linear superposition of trigonometric terms at predetermined frequencies \omega_k: f(t) = \sum_{k=1}^m \left[ a_k \cos(\omega_k t) + b_k \sin(\omega_k t) \right], where a_k and b_k are the coefficients to be fitted, and m is the number of frequency components. This representation captures the periodic structure of the signal and enables the problem to be cast as a linear regression, with the design matrix comprising columns of \cos(\omega_k t_i) and \sin(\omega_k t_i) evaluated at each t_i. The solution for \hat{\theta} = [\hat{a}_k, \hat{b}_k] is then \hat{\theta} = (\Phi^T \Phi)^{-1} \Phi^T \mathbf{y} for unweighted cases, or weighted equivalents when measurement errors are known. This adaptation allows LSSA to handle both evenly and unevenly spaced data, providing a flexible alternative to traditional Fourier methods for spectral decomposition. For computational efficiency and to mitigate parameter correlations, the fitting leverages trigonometric identities to establish an orthogonal basis for the model functions.
Specifically, the cosine-sine pairs at each frequency can be orthogonalized relative to the data sampling, often by successive subtraction of fitted components or by exploiting identities like \sin(\omega t + \phi) = A \cos(\omega t) + B \sin(\omega t) to reparameterize amplitude and phase as the linear coefficients A and B, thereby decorrelating estimates across harmonics. In cases of evenly spaced observations, the trigonometric basis is inherently orthogonal, simplifying the inversion of the normal matrix to diagonal form and enabling independent estimations. This reduces numerical ill-conditioning and enhances the method's robustness for high-dimensional fits. Error estimation in periodic least-squares fitting assumes additive white Gaussian noise in the observations, with the noise variance \sigma^2 inferred from the residuals after model fitting: \hat{\sigma}^2 = \frac{1}{n - 2m} \sum_{i=1}^n (y_i - \hat{f}(t_i))^2, where n is the number of data points and 2m accounts for the degrees of freedom consumed by the fitted parameters. The residuals r_i = y_i - \hat{f}(t_i) are orthogonal to the fitted model by the projection theorem, ensuring that the estimate \hat{\sigma}^2 is unbiased for homoscedastic Gaussian errors. This framework supports statistical tests for the significance of fitted frequencies, such as comparing the explained variance to noise levels via F-tests.
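The orthogonality claim for evenly spaced samples can be checked numerically; in this sketch the sample count and the two Fourier harmonics (k = 3 and k = 7) are arbitrary choices for illustration:

```python
import numpy as np

# For n evenly spaced samples over one unit span, cos/sin columns at
# distinct Fourier harmonics form an orthogonal design matrix, so the
# normal matrix Phi^T Phi is diagonal and the estimates decouple.
n = 100
t = np.arange(n) / n
k1, k2 = 3, 7
Phi = np.column_stack([
    np.cos(2 * np.pi * k1 * t), np.sin(2 * np.pi * k1 * t),
    np.cos(2 * np.pi * k2 * t), np.sin(2 * np.pi * k2 * t),
])
N_mat = Phi.T @ Phi   # normal matrix

# Off-diagonal entries vanish (to rounding) and each diagonal entry
# equals n/2, so inverting the normal equations is trivial.
```

For unevenly spaced samples the off-diagonal entries no longer vanish, which is precisely why explicit orthogonalization (or a full matrix solve) is needed in the general case.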

Periodogram formulation in LSSA

In least-squares spectral analysis (LSSA), the periodogram quantifies the spectral power at a candidate angular frequency \omega by determining the least-squares fit of a single sinusoidal model y(t) = a \cos(\omega t) + b \sin(\omega t) to the observed time series data \{y_i, t_i\}_{i=1}^N. This fit minimizes the sum of squared residuals, yielding estimates for the amplitudes a and b that maximize the explained variance at that frequency. The resulting periodogram value P(\omega) serves as a direct measure of the goodness-of-fit for the periodic component. The explicit formulation of the LSSA periodogram for unweighted data is given by P(\omega) = \frac{1}{2} \left[ \frac{ \left( \sum_{i=1}^N y_i \cos(\omega t_i) \right)^2 }{ \sum_{i=1}^N \cos^2(\omega t_i) } + \frac{ \left( \sum_{i=1}^N y_i \sin(\omega t_i) \right)^2 }{ \sum_{i=1}^N \sin^2(\omega t_i) } \right], where the sums are over the N data points. This expression arises from solving the normal equations of the least-squares problem, with the two terms corresponding to the squared contributions from the cosine and sine components, respectively. To interpret P(\omega) as a fraction of total variance, it is typically normalized by the variance \sigma_y^2 = \frac{1}{N} \sum_{i=1}^N y_i^2, yielding P(\omega)/\sigma_y^2, which ranges between 0 and 1 for a single-frequency fit. To construct the full spectrum, P(\omega) is computed over a discrete grid of frequencies spanning the range of interest, such as from near-zero to the effective Nyquist limit determined by the sampling. Peaks in this scanned spectrum highlight frequencies where the sinusoidal model best captures the data's periodic structure, with the height of a peak indicating the relative strength of that component. The grid resolution is chosen to balance computational cost and frequency discrimination, often with \Delta \omega \approx 1/T where T is the total observation span.
Under the null hypothesis of white Gaussian noise with no periodic signal, 2P(\omega)/\sigma_y^2 follows a \chi^2 distribution with 2 degrees of freedom, enabling statistical significance testing. Exceedance probabilities are assessed by comparing the normalized periodogram to critical values from the \chi^2 or equivalent F-distribution (with 2 and N-2 degrees of freedom), where values above a threshold (e.g., corresponding to 95% confidence) indicate significant periodicity. This property holds asymptotically for large N and supports false alarm probability estimates across the scanned spectrum. For signals involving multiple harmonics, the LSSA periodogram extends to simultaneous least-squares fitting of a multi-frequency model, such as y(t) = \sum_{k=1}^K [a_k \cos(k \omega t) + b_k \sin(k \omega t)], using an augmented design matrix with columns for each harmonic. The total spectral power is then the cumulative reduction in residual variance from this joint fit, distributed as \chi^2 with 2K degrees of freedom under the null, allowing detection of complex periodicities while accounting for harmonic interactions.
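The unweighted periodogram formula above can be scanned over a frequency grid in a few lines; the synthetic data (seed, 0.2 Hz signal, noise level) and the grid bounds are assumptions of this sketch:

```python
import numpy as np

def lssa_periodogram(t, y, omegas):
    """Single-sinusoid least-squares power P(w), following the unweighted
    formula (mean removed before fitting)."""
    y = y - y.mean()
    P = np.empty(omegas.size)
    for j, w in enumerate(omegas):
        c, s = np.cos(w * t), np.sin(w * t)
        P[j] = 0.5 * ((y @ c) ** 2 / (c @ c) + (y @ s) ** 2 / (s @ s))
    return P

# Unevenly sampled 0.2 Hz signal in noise (values assumed for the example)
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 50.0, 300))
y = np.sin(2 * np.pi * 0.2 * t) + 0.5 * rng.standard_normal(t.size)

# Frequency grid with resolution well below ~1/T over the 50 s span
freqs = np.linspace(0.01, 1.0, 2000)
P = lssa_periodogram(t, y, 2 * np.pi * freqs)
f_peak = freqs[np.argmax(P)]   # should land near the injected 0.2 Hz
```

The dominant peak sits at the injected frequency; comparing peak heights against the \chi^2-based thresholds described above then gives a significance test.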

Historical development

Origins and early work

The method of least-squares spectral analysis has its conceptual roots in the late 18th and early 19th centuries, when the least-squares technique was pioneered for fitting models to astronomical data. Adrien-Marie Legendre formally introduced the method in 1805 to determine the orbits of comets by minimizing the sum of squared differences between observed and predicted positions. Independently, Carl Friedrich Gauss had conceived the approach around 1795 and applied it to predict the orbit of the asteroid Ceres after its discovery in 1801, publishing a detailed theoretical justification in 1809 that emphasized its probabilistic foundations under the assumption of Gaussian errors. These advancements established least-squares as an essential tool in astronomy for handling noisy and imprecise observations, providing a framework for parameter estimation that would later extend to periodic signal detection. By the late 19th century, the analysis of periodic phenomena in unevenly sampled records became a pressing concern in astronomy and geophysics, driven by datasets like sunspot and meteorological records that featured irregular intervals. In 1898, Arthur Schuster developed the periodogram as a technique to investigate hidden periodicities, computing the squared amplitude of sine waves fitted to the data via a measure akin to least-squares for each trial period. This method addressed the limitations of direct transforms on gapped data, offering a way to quantify the strength of potential periodic components without requiring evenly spaced points. Schuster's innovation marked an early bridge between least-squares fitting and spectral estimation, particularly valuable for astronomical records affected by observational constraints. In the mid-20th century, further progress emerged in applying least-squares directly to model planetary perturbations from irregularly spaced observations, tackling inherent challenges in celestial mechanics.
Sylvio Ferraz-Mello utilized least-squares fitting in his analyses of satellite and planetary orbits during this period, accounting for data gaps caused by limited visibility windows, Earth's shadowing effects, and other observational interruptions common in astronomical monitoring. These efforts highlighted the superiority of least-squares over classical methods for uneven data, as it allowed flexible modeling of periodic perturbations without interpolation artifacts, setting the stage for more systematic spectral applications in the following decade.

Evolution of key methods

The least-squares spectral analysis (LSSA) method was first introduced by Petr Vaníček in 1969 as a robust technique for analyzing geophysical time series data, particularly to address challenges in identifying periodic signals amid irregular sampling and systematic noise. Vaníček's approach framed spectral estimation as a least-squares fit of multiple sinusoids to the data, allowing for the simultaneous inclusion of trend and noise models to mitigate artifacts common in traditional Fourier methods. This foundational work emphasized the method's flexibility for unevenly spaced observations, marking a shift from classical periodograms toward optimization-based spectral decomposition. In 1976, Nicholas R. Lomb adapted LSSA specifically for astronomical applications, focusing on the analysis of unevenly sampled time series such as stellar light curves. Lomb's formulation derived the statistical properties of the least-squares spectrum, demonstrating its equivalence to a least-squares fit of sine waves and providing guidelines for frequency grid selection to handle sparse sampling effectively. This adaptation highlighted LSSA's advantages over discrete Fourier transforms in preserving spectral information without interpolation, influencing its adoption in fields requiring precise periodicity detection. Jeffrey D. Scargle extended the method in 1982 by incorporating measurement uncertainties and deriving rigorous statistical tests for periodicity significance in unevenly spaced astronomical data. Scargle's generalization normalized the periodogram to follow an exponential distribution under white Gaussian noise assumptions, enabling false alarm probability assessments and improving reliability for weak signals. This work solidified LSSA's role in time series analysis by addressing noise variability, a key limitation in prior formulations. During the 1980s and 1990s, refinements focused on enhancing LSSA's handling of multi-periodic signals and correlated noise, alongside computational optimizations.
Horne and Baliunas (1986) introduced a normalized variant with Monte Carlo-derived significance thresholds, facilitating the detection of multiple harmonics in noisy datasets. For multi-periodicity, extensions built on Vaníček's multi-sine framework allowed iterative fitting of multiple frequencies, improving resolution in complex signals like geophysical records or variable stars. Noise handling advanced through weighted least-squares models that accounted for colored noise, reducing bias in gapped records. Computational advances in the late 1980s transformed LSSA from labor-intensive manual optimizations to efficient numerical implementations. Press and Rybicki (1989) developed a fast algorithm using nonequispaced fast Fourier transforms, reducing complexity from O(N²) to O(N log N) and enabling analysis of large-scale datasets in astronomy and geophysics. This innovation shifted practical applications toward automated processing on emerging computers, broadening LSSA's use in real-time signal processing and long-term monitoring.

Core methods and variants

Vaníček's approach

Petr Vaníček introduced least-squares spectral analysis (LSSA) in 1969 as a method for detecting periodic signals in time series data, particularly suited for unevenly spaced observations common in geophysical measurements, with further development in 1971. The approach involves a direct least-squares fit of sinusoidal functions across a discrete grid of trial frequencies, estimating the amplitudes and phases of potential harmonics to approximate the observed series. This formulation builds on ideas of fitting sine waves to pre-selected frequencies from periodograms. The core of Vaníček's method is to model the time series f(t_i) at observation times t_i as a linear combination of cosine and sine terms for a given frequency \omega_j: g(t_i) = \hat{x}_{1j} \cos(\omega_j t_i) + \hat{x}_{2j} \sin(\omega_j t_i), where \hat{x}_{1j} and \hat{x}_{2j} are the least-squares estimates obtained by solving \hat{\mathbf{x}}_j = (\Phi_j^T \Phi_j)^{-1} \Phi_j^T \mathbf{f}, with \Phi_j as the N \times 2 design matrix of the trigonometric basis functions and \mathbf{f} the vector of observations. The spectral power at \omega_j is then computed as the normalized explained variance: s(\omega_j) = \frac{\mathbf{f}^T \hat{\mathbf{g}}_j}{\mathbf{f}^T \mathbf{f}}, which quantifies the goodness-of-fit for that frequency. For models with multiple harmonics, the residuals are minimized simultaneously over an extended basis including several frequencies and additional terms for systematic effects (e.g., linear trends), forming a larger design matrix \Phi to solve \hat{\mathbf{x}} = (\Phi^T \Phi)^{-1} \Phi^T \mathbf{f}, enabling the identification of interacting periodic components without iterative removal. A key innovation in Vaníček's approach is the orthogonal search procedure, which projects the data onto subspaces spanned by the trigonometric basis functions for each trial frequency independently, circumventing the need for pre-whitening— the sequential removal of dominant signals that can introduce biases in traditional spectral methods.
This orthogonality is achieved through the least-squares framework itself, allowing unbiased estimation even in the presence of data gaps or irregular spacing, as the method treats each frequency test as a separate projection onto a two-dimensional subspace. In geodesy, Vaníček's LSSA has been applied to analyze Earth's rotation variations, such as polar motion and length-of-day fluctuations, by fitting periodic models to historical astronomical observations like UT1-TAI series to reveal tidal and atmospheric influences. It has also proven valuable for gravity field studies, including the processing of superconducting gravimeter records to detect subtle periodic signals from ocean tides and atmospheric loading effects on the gravity field. These applications leverage LSSA's robustness to uneven sampling in long-term geophysical monitoring. Despite its advantages, Vaníček's original approach is computationally intensive for large datasets, as the inversion of the normal matrix \Phi^T \Phi must be performed for each frequency in the grid, with complexity scaling as O(N) per trial frequency for single-frequency fits and higher for multi-harmonic models involving dozens of parameters. This limits its efficiency on modern scales without optimizations like fast solvers.
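The single-frequency projection and the percentage-variance spectrum s(\omega_j) can be sketched directly from the formulas above; the gapped sampling pattern and signal parameters below are assumed for illustration only:

```python
import numpy as np

def vanicek_spectrum(t, f, omegas):
    """Percentage-variance spectrum s(w) = f^T g / (f^T f): for each
    trial frequency, project the (mean-removed) data onto the 2-column
    trigonometric basis via the normal equations.  A sketch of the
    single-frequency case; the multi-harmonic form enlarges Phi."""
    f = f - f.mean()
    ff = f @ f
    s = np.empty(omegas.size)
    for j, w in enumerate(omegas):
        Phi = np.column_stack([np.cos(w * t), np.sin(w * t)])
        x_hat = np.linalg.solve(Phi.T @ Phi, Phi.T @ f)  # normal equations
        s[j] = f @ (Phi @ x_hat) / ff                     # explained fraction
    return s

# Irregular sampling with gaps (assumed values): a 0.3 Hz sinusoid
# should explain most of the variance at its own trial frequency.
rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0.0, 30.0, 250))
f = 1.5 * np.cos(2 * np.pi * 0.3 * t) + 0.3 * rng.standard_normal(t.size)

freqs = np.linspace(0.05, 1.0, 500)
s = vanicek_spectrum(t, f, 2 * np.pi * freqs)
f_best = freqs[np.argmax(s)]
```

Because each s(\omega_j) is a projection onto a two-dimensional subspace, the values stay between 0 and 1 and no pre-whitening of the data is required before scanning.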

Lomb-Scargle periodogram

The Lomb-Scargle periodogram represents a key adaptation of least-squares spectral analysis specifically tailored for unevenly sampled data, enabling the detection of periodic signals without requiring data interpolation. Introduced by Nicholas R. Lomb in 1976, the method formulates a spectrum by performing a least-squares fit of sinusoidal models to the data at various trial frequencies, adjusting for irregular sampling through direct computation over the observed points. This approach avoids the distortions introduced by traditional Fourier transforms on gapped datasets, providing a more robust estimate of spectral power. Jeffrey D. Scargle extended Lomb's formulation in 1982 by incorporating a time-shift parameter, τ, to orthogonalize the basis functions, which simplifies the least-squares fitting process and ensures the periodogram's statistical properties mirror those of the classical periodogram under white-noise assumptions. The core formula for the power at ω is given by P(\omega) = \frac{1}{\sigma^2} \frac{ \left[ \sum_i y_i \cos(\omega (t_i - \tau)) \right]^2 }{ \sum_i \cos^2(\omega (t_i - \tau)) } + \frac{1}{\sigma^2} \frac{ \left[ \sum_i y_i \sin(\omega (t_i - \tau)) \right]^2 }{ \sum_i \sin^2(\omega (t_i - \tau)) }, where y_i are the observed values at times t_i, \sigma^2 is the data variance, and τ is chosen such that the cross-term vanishes, making the fit equivalent to a phase-shifted least-squares fit of a pure sinusoid. This direct summation over the N data points eliminates the need for grid interpolation, reducing computational bias and leakage effects common in evenly spaced methods. A significant advantage of the Lomb-Scargle periodogram lies in its well-characterized statistical behavior for significance testing. Under the null hypothesis of white Gaussian noise, the normalized power values follow an exponential distribution, allowing the false alarm probability (FAP) for the maximum peak to be computed analytically as FAP ≈ 1 - (1 - e^{-P_max})^M over M independent trial frequencies, where P_max is the highest normalized power; this enables reliable thresholding for periodicity detection without extensive simulations.
Scargle demonstrated that this distribution holds even for uneven sampling, provided the sampling is not overly clustered, offering a direct analog to the classical periodogram's properties. In astronomy, the Lomb-Scargle periodogram has become a standard tool for identifying periodicities in light curves, where observations are often irregularly spaced due to weather, satellite orbits, or other observational constraints. For instance, it has been instrumental in detecting pulsation periods in Cepheid and RR Lyrae stars from sparse photometric datasets, facilitating discoveries of binary systems and intrinsic variabilities that underpin distance measurements and stellar models. The method builds on foundational least-squares fitting principles for periodic models, such as those developed by Vaníček.
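A direct transcription of the τ-shifted formula illustrates the construction; this is a minimal sketch with assumed synthetic data, not a production implementation (libraries such as astropy provide optimized versions):

```python
import numpy as np

def lomb_scargle(t, y, omegas):
    """Classical Lomb-Scargle power with Scargle's time shift tau,
    which makes the shifted cos/sin terms orthogonal; normalized here
    by the sample variance."""
    y = y - y.mean()
    var = y.var(ddof=1)
    P = np.empty(omegas.size)
    for j, w in enumerate(omegas):
        # tau chosen so the cos-sin cross term vanishes at this frequency
        tau = np.arctan2(np.sum(np.sin(2 * w * t)),
                         np.sum(np.cos(2 * w * t))) / (2 * w)
        c = np.cos(w * (t - tau))
        s = np.sin(w * (t - tau))
        P[j] = ((y @ c) ** 2 / (c @ c) + (y @ s) ** 2 / (s @ s)) / (2 * var)
    return P

# Irregularly timed observations of a 0.1 Hz sinusoid (assumed values)
rng = np.random.default_rng(3)
t = np.sort(rng.uniform(0.0, 100.0, 200))
y = 0.8 * np.sin(2 * np.pi * 0.1 * t) + 0.4 * rng.standard_normal(t.size)

freqs = np.linspace(0.005, 0.5, 1000)
P = lomb_scargle(t, y, 2 * np.pi * freqs)
f_best = freqs[np.argmax(P)]
```

With this normalization, noise-only powers are approximately exponentially distributed, which is what makes the analytic false-alarm thresholds described above usable.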

Generalized Lomb-Scargle extensions

The generalized Lomb-Scargle periodogram extends the classical method by incorporating measurement uncertainties through weights and allowing for a floating mean in the model fit. Scargle's framework was later extended with a weighted version that accounts for individual data point variances σ_i, defining weights as w_i = 1/σ_i² to emphasize more precise measurements in the least-squares fitting process. This weighting reduces bias in spectral estimates for heterogeneous error structures common in astronomical data. The generalized power spectrum P(ω) is formulated as the normalized reduction in chi-squared from a constant model to a sinusoidal model: P(\omega) = \frac{\chi_0^2 - \chi^2(\omega)}{\chi_0^2}, where χ₀² is the chi-squared for the weighted mean model, and χ²(ω) is for the weighted sinusoidal fit. The weighted sums involved, such as the variance terms YY = ∑ w_i (y_i - Y)² (with Y the weighted mean), incorporate w_i = 1/σ_i² normalized by the total weight W = ∑ 1/σ_i², yielding an analytic expression that avoids numerical optimization. This form maintains the periodogram's efficiency while handling unequal errors, outperforming unweighted variants in noisy, unevenly sampled datasets. A key advancement is the floating mean, which jointly fits an offset c alongside the sinusoidal parameters, eliminating the assumption of a zero or pre-subtracted mean. This addresses biases from incomplete phase coverage or long-period trends, as the model becomes y(t) = a cos(ωt) + b sin(ωt) + c, with c estimated via least squares. Zechmeister and Kürster (2009) derived the corresponding terms, showing improved power estimates and false-alarm probability assessments compared to fixed-mean approaches. Multi-harmonic extensions further generalize the method to fit multiple sinusoids at harmonics of a base frequency, useful for non-sinusoidal periodic signals.
Zechmeister and Kürster (2009) provided the multi-harmonic formulation by summing powers over K harmonics, P_K(ω) = ∑_{k=1}^K P(ω_k) with ω_k = kω, while adjusting the model basis functions to preserve normalization and statistical properties. This enables detection of complex periodicities, such as those in eclipsing binaries, with reduced computational overhead relative to independent fits. Bayesian interpretations frame the generalized Lomb-Scargle as a posterior probability for sinusoidal versus null models, linking the power directly to model evidence under Gaussian noise assumptions. This perspective, explored by Mortier et al. (2015), facilitates model comparison by incorporating priors on parameters like amplitude and offset, enhancing interpretability for significance testing in period searches. Such formulations align the method with probabilistic frameworks, allowing seamless integration with MCMC sampling for parameter estimation.
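The weighted, floating-mean power can be written directly as the normalized chi-squared reduction defined above; this sketch solves the small weighted least-squares problem per frequency rather than using the closed-form sums, and all data values are assumed for illustration:

```python
import numpy as np

def gls_power(t, y, dy, omegas):
    """Generalized Lomb-Scargle sketch: weighted fit of offset +
    sinusoid, reported as P = (chi0^2 - chi^2(w)) / chi0^2."""
    wgt = 1.0 / dy ** 2
    ybar = np.sum(wgt * y) / np.sum(wgt)        # weighted-mean model
    chi2_0 = np.sum(wgt * (y - ybar) ** 2)
    sw = np.sqrt(wgt)
    P = np.empty(omegas.size)
    for j, om in enumerate(omegas):
        # floating-mean model: y = c + a*cos(wt) + b*sin(wt)
        Phi = np.column_stack([np.ones_like(t), np.cos(om * t), np.sin(om * t)])
        coef, *_ = np.linalg.lstsq(sw[:, None] * Phi, sw * y, rcond=None)
        chi2 = np.sum(wgt * (y - Phi @ coef) ** 2)
        P[j] = (chi2_0 - chi2) / chi2_0
    return P

# Heteroscedastic data (assumed): larger error bars receive less weight
rng = np.random.default_rng(4)
t = np.sort(rng.uniform(0.0, 40.0, 150))
dy = rng.uniform(0.1, 0.5, 150)
y = 2.0 + np.sin(2 * np.pi * 0.25 * t) + dy * rng.standard_normal(150)

freqs = np.linspace(0.02, 1.0, 800)
P = gls_power(t, y, dy, 2 * np.pi * freqs)
f_best = freqs[np.argmax(P)]
```

Since the constant model is nested in the sinusoid-plus-offset model, P(ω) lies in [0, 1] by construction, matching the interpretation as a fraction of weighted variance explained.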

Other advanced variants

Korenberg's fast orthogonal search (FOS), introduced in the late 1980s, extends least-squares spectral analysis by enabling the stepwise addition of sinusoids to model data through Gram-Schmidt orthogonalization. This approach constructs an orthogonal function expansion from candidate sinusoidal terms at discrete frequencies, allowing efficient identification of the most significant components without recomputing the full least-squares fit at each step. By iteratively selecting terms that maximize the reduction in residual variance, FOS produces parsimonious models suitable for noisy, irregularly sampled data, with applications in biological signal processing. Palmer's chi-squared method, developed in the late 2000s but building on earlier nonlinear fitting techniques from the 1990s, adapts least-squares spectral analysis for non-sinusoidal models by minimizing the chi-squared statistic over multiple harmonics of a fundamental frequency. Unlike standard sinusoidal fits, this nonlinear least-squares approach accommodates arbitrary periodic shapes by jointly optimizing amplitudes for a user-specified number of harmonics, using fast Fourier transforms to achieve O(N log N) computational complexity for irregularly sampled data with non-uniform errors. This enables detection of complex periodic signals in astronomical time series where pure sinusoids fail. Post-2000 developments include conditional variants of least-squares spectral analysis designed to handle correlated noise, such as the antileakage least-squares spectral analysis (ALLSSA) method, which iteratively regularizes irregularly sampled data while suppressing spectral leakage and attenuating correlated random noise through adaptive weighting. These extensions incorporate noise covariance structures into the fitting process, improving spectrum estimation in geophysical signals like seismic data where traditional LSSA assumes white noise.
Additionally, hybrids integrate least-squares spectral analysis with neural networks or ensemble methods; for instance, LSSA-extracted frequency features serve as inputs to neural networks for enhanced accuracy in cost prediction, or to ensemble models for prediction tasks in geotechnical applications. These hybrids leverage LSSA's spectral feature extraction to preprocess data, boosting model performance on high-dimensional, noisy inputs. Orthogonal methods like FOS offer significant efficiency advantages in high-dimensional fits compared to direct least-squares approaches, as the Gram-Schmidt process avoids redundant computations in matrix inversions by maintaining orthogonality incrementally, reducing complexity from O(M^3) to near-linear in the number of candidate frequencies M for large datasets. This makes them particularly suitable for scenarios with thousands of potential spectral components, where standard variants become prohibitive.
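A greatly simplified sketch in the spirit of FOS shows the stepwise idea: each candidate cos/sin pair is deflated against the already-chosen basis, and the pair removing the most residual variance is added without refitting the full model. The candidate grid, signal frequencies, and noise level are all assumptions of this example:

```python
import numpy as np

def fos_select(t, y, omegas, n_terms=2):
    """Greedy stepwise frequency selection (simplified FOS-style sketch):
    orthogonalize each candidate pair against the chosen basis via
    Gram-Schmidt, pick the pair with the largest residual-variance drop."""
    resid = y - y.mean()
    chosen, basis = [], []           # picked frequencies, orthonormal columns
    for _ in range(n_terms):
        best = None
        for w in omegas:
            if any(np.isclose(w, c) for c in chosen):
                continue
            B = np.column_stack([np.cos(w * t), np.sin(w * t)])
            for q in basis:          # deflate against already-chosen columns
                B = B - np.outer(q, q @ B)
            Qb, _ = np.linalg.qr(B)  # orthonormalize the deflated pair
            gain = np.sum((Qb.T @ resid) ** 2)  # variance this pair removes
            if best is None or gain > best[0]:
                best = (gain, w, Qb)
        _, w, Qb = best
        chosen.append(w)
        resid = resid - Qb @ (Qb.T @ resid)     # update residual
        basis.extend(Qb.T)
    return [w / (2 * np.pi) for w in chosen]    # frequencies in Hz

# Two injected tones at 0.2 Hz and 0.4 Hz (assumed values)
rng = np.random.default_rng(5)
t = np.sort(rng.uniform(0.0, 40.0, 200))
y = np.sin(2 * np.pi * 0.2 * t) + 0.7 * np.cos(2 * np.pi * 0.4 * t) \
    + 0.2 * rng.standard_normal(t.size)

cands = 2 * np.pi * np.linspace(0.05, 0.95, 19)  # 0.05 Hz steps
found = fos_select(t, y, cands, n_terms=2)
```

The stronger 0.2 Hz tone is selected first, then the 0.4 Hz tone, without ever re-solving the full multi-frequency least-squares system.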

Applications

Astronomy and time series analysis

In astronomy, least-squares spectral analysis (LSSA), particularly through methods like the Lomb-Scargle periodogram, has been instrumental in detecting exoplanet transits from irregularly sampled photometric data. For instance, during the Kepler mission, which monitored thousands of stars for transiting exoplanets between 2009 and 2018, LSSA techniques identified periodic dips in light curves indicative of planetary orbits, enabling the confirmation of 2,778 exoplanets by analyzing unevenly spaced observations affected by spacecraft pointing and data download gaps. This approach excels in handling the mission's quarter-long observing segments, where data gaps arise from quarterly roll maneuvers, allowing robust periodogram peaks to reveal transit signals amid noise. Similarly, the Transiting Exoplanet Survey Satellite (TESS), launched in 2018, leverages LSSA for characterizing stellar variability in its full-sky survey, processing short-cadence light curves to distinguish rotational modulation from potential transit signals. In TESS data, LSSA has been applied to classify variability in hundreds of thousands of stars, identifying periodic behaviors such as starspot-induced brightness changes with periods ranging from hours to weeks, which is crucial for validating planet candidates by subtracting stellar noise. Recent analyses of TESS sectors have used generalized LSSA extensions to detect multi-periodic variability in cool dwarfs, improving the false-positive rate for transits in post-2018 observations. As of 2025, enhanced methods have classified over 63,000 periodic variable stars in TESS data. A notable case study involves the analysis of irregularly sampled light curves for pulsars, where LSSA addresses the challenges of sparse, non-uniform observations from radio telescopes. For pulsar timing arrays, such as those from the North American Nanohertz Observatory for Gravitational Waves (NANOGrav), LSSA via the Lomb-Scargle periodogram is used to search for periodic signals in timing residuals amid data gaps due to instrumental downtime or interference.
In simulated pulsar light curves with 30-50% data gaps mimicking real survey irregularities, LSSA recovers true frequencies with less spectral leakage than traditional Fourier methods, achieving detection significances above 5σ for signals with amplitudes as low as 1% of the noise level. The primary benefit of LSSA in these astronomical contexts is its robustness to data gaps, as it fits sinusoidal models directly to available observations without interpolation, reducing bias in power estimates for low-frequency signals. This is particularly advantageous for missions like Kepler and TESS, where gaps can alias power into side lobes in standard periodograms; simulations show LSSA suppresses such leakage by up to 40% in gapped datasets, enabling clearer peak detection. Beyond astronomy, LSSA extends to general time series analysis, including climate cycles where datasets often feature missing observations from sparse monitoring stations. In assessing multiscale variability in temperature and precipitation records, LSSA has identified dominant annual and semi-annual cycles in data spanning 1950-2018, handling gaps from seasonal instrument failures without distorting low-frequency components. For economic indicators, such as GDP or other macroeconomic series with irregular reporting due to revisions or economic shocks, LSSA estimates spectral densities to uncover periodicities around 4-7 years, outperforming interpolation-based methods in datasets with up to 20% missing values. These applications highlight LSSA's versatility for unevenly sampled non-astronomical time series, where the Lomb-Scargle variant provides a least-squares foundation for gap-tolerant spectral estimation.

Engineering and signal processing

In engineering and signal processing, least-squares spectral analysis (LSSA) is particularly valuable for fault detection in machinery where sensor data may exhibit gaps or irregular sampling due to operational constraints or failures. For instance, multichannel antileakage LSSA has been applied to vibration and acoustic signals from marine diesel engines to identify malfunctions such as cylinder misfires or injector faults by estimating dominant frequencies and attenuating noise, even with unevenly spaced measurements from accelerometers and microphones. This approach avoids the need for data interpolation, which can introduce artifacts, and instead fits sinusoids via least squares to the available samples, enabling reliable power spectral density estimation up to the Nyquist frequency. In broader engineering contexts, LSSA supports seismic data processing for oil exploration by regularizing irregularly sampled traces and suppressing random noise, facilitating accurate spectral estimation for migration and inversion tasks. The antileakage variant iteratively minimizes spectral leakage, achieving lower reconstruction errors (e.g., a residual norm of 0.004 in synthetic tests) compared to methods like antialiasing interpolation, and has been demonstrated on field data to interpolate missing traces and enhance subsurface imaging. Similarly, in biomedical signal processing, LSSA analyzes unevenly sampled heart-rate series derived from electrocardiogram (ECG) signals, mitigating artifacts from ectopic beats or irregular rhythms by modeling sampling deviations as perturbations and estimating unbiased power spectra without low-pass filtering effects.
A representative example of LSSA in industrial diagnostics involves gearbox vibration analysis with uneven samples, where it extracts natural frequencies from non-stationary signals to detect cracks or wear; for rotating machinery like turbine blades (analogous to gearbox components), LSSA combined with time-frequency features identifies degradation by fitting spectra to sparse data, outperforming traditional methods in noisy environments. Relative to the fast Fourier transform (FFT), LSSA reduces spectral leakage in non-stationary signals by directly accounting for irregular spacing and trends through least-squares fitting, preserving signal integrity without windowing artifacts. Post-2010 industrial integrations include MATLAB toolboxes like LSWAVE, which implement LSSA and antileakage extensions for geodetic and seismic applications, supporting graphical analysis of engineering time series with built-in handling for trends.
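The subtract-and-refit idea behind the antileakage variant can be sketched as follows. This is a toy illustration of the general principle rather than the published algorithm, and every signal parameter here is invented: at each step the strongest frequency is fit by least squares and its sinusoid is removed from the residuals, so its leakage cannot bias the next pick.

```python
import numpy as np

def ls_fit(t, y, w):
    """Least-squares coefficients (a, b) of a*cos(w t) + b*sin(w t)."""
    A = np.column_stack([np.cos(w * t), np.sin(w * t)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A, coef

def antileakage_sketch(t, y, omegas, n_components=2):
    """Pick the strongest frequency, subtract its fitted sinusoid, repeat."""
    resid = y.astype(float).copy()
    found = []
    for _ in range(n_components):
        # reduction in the sum of squares at each trial angular frequency
        powers = [np.sum((A @ c) ** 2) for A, c in (ls_fit(t, resid, w) for w in omegas)]
        w_best = omegas[int(np.argmax(powers))]
        A, coef = ls_fit(t, resid, w_best)
        resid -= A @ coef                      # remove the fitted component
        found.append(w_best / (2 * np.pi))
    return sorted(found), resid

rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 20, 200))           # uneven sampling
y = (np.sin(2 * np.pi * 0.5 * t) + 0.6 * np.sin(2 * np.pi * 1.3 * t)
     + 0.1 * rng.normal(size=t.size))
found, resid = antileakage_sketch(t, y, 2 * np.pi * np.linspace(0.1, 2.0, 400))
print(found)                                   # close to [0.5, 1.3]
```

Each subtraction shrinks the residual variance, and the loop stops after a fixed number of components here; a practical implementation would instead stop at a statistical significance threshold.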

Practical implementation

Computational algorithms

Least-squares spectral analysis (LSSA) requires careful selection of the frequency grid to account for the challenges posed by uneven sampling, where traditional Nyquist limits based on uniform spacing do not directly apply. The highest resolvable frequency is typically limited to approximately half the inverse of the mean inter-sample spacing, though it can extend up to half the inverse of the smallest spacing when some points lie close together. Users often specify a grid by defining lower and upper bounds (e.g., from a fundamental period set by the record length up to a practical Nyquist equivalent) and the number of points within that range, ensuring sufficient resolution for detecting periods of interest without excessive computational cost. For efficient computation, initial scans over a coarse grid can employ FFT-based approximations, which map unevenly sampled data onto a regular mesh via interpolation or "extirpolation" techniques, reducing the complexity from O(MN) to O(N log N), where M is the number of frequencies and N is the number of data points. These methods provide rapid identification of candidate frequencies but may introduce minor inaccuracies due to the gridding approximation. For higher accuracy, especially around peaks, direct evaluation is used, computing the least-squares fit from sums of sines, cosines, and data products at each frequency independently, which maintains exactness but scales linearly with N per frequency. Multi-frequency searches in LSSA often involve iterative peak-finding strategies to model multiple sinusoids: greedy algorithms select the frequency with the highest spectral power, fit and subtract the corresponding sinusoid from the residuals, and repeat until no peaks remain above a significance threshold (e.g., 95% confidence). This approach efficiently handles correlated frequencies and leakage effects in uneven data. Genetic algorithms can optimize non-linear extensions or multi-component fits, though they are less common for the core linear LSSA because of their higher overhead.
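The grid-selection heuristics above can be written down directly. In the sketch below, the sampling times, bounds, and oversampling factor are illustrative choices, not fixed prescriptions:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 50.0, 300))   # hypothetical uneven sample times

dt = np.diff(t)
span = t[-1] - t[0]

f_min = 1.0 / span           # lowest frequency: one cycle over the whole record
f_typ = 0.5 / dt.mean()      # practical Nyquist equivalent from the mean spacing
f_max = 0.5 / dt.min()       # optimistic upper limit from the smallest spacing

oversample = 5               # grid points per natural resolution element (1/span)
n_freq = int(oversample * span * (f_typ - f_min))
freqs = np.linspace(f_min, f_typ, n_freq)
print(f"{f_min:.3g} to {f_typ:.3g} Hz in {n_freq} steps (hard limit {f_max:.3g} Hz)")
```

Oversampling the 1/span resolution element keeps narrow peaks from falling between grid points; direct evaluation can then refine any candidate peak on a finer local grid.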
Handling large datasets (N > 10^5) in LSSA benefits from subsampling strategies, in which representative subsets are analyzed to estimate the spectrum before full computation, and from parallelization across frequency evaluations, since each frequency is independent. GPU accelerations, leveraging the parallel nature of the least-squares summations, have been available since the early 2010s, enabling computations for batches of objects or high-resolution frequency grids in seconds rather than hours on CPUs; for instance, implementations achieve speedups of 10-100x for datasets up to millions of points.
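Because each trial frequency is independent, the per-frequency sums vectorize cleanly, which is the same structure that GPU implementations exploit. The NumPy sketch below evaluates an entire grid at once for a mean-subtracted sinusoid-plus-noise test signal (all parameters invented); it solves the 2x2 normal equations for a cosine/sine fit at every frequency in one broadcast pass.

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.sort(rng.uniform(0, 10, 1000))                      # uneven times
y = np.sin(2 * np.pi * 1.5 * t) + 0.2 * rng.normal(size=t.size)
y = y - y.mean()                                           # remove the mean level

omega = 2 * np.pi * np.linspace(0.1, 5.0, 2000)[:, None]   # (M, 1) angular freqs
C, S = np.cos(omega * t), np.sin(omega * t)                # (M, N) design terms

# Normal-equation sums for y ~ a*cos(w t) + b*sin(w t), all frequencies at once
CC, SS, CS = (C * C).sum(1), (S * S).sum(1), (C * S).sum(1)
Cy, Sy = C @ y, S @ y
det = CC * SS - CS**2
a = (SS * Cy - CS * Sy) / det
b = (CC * Sy - CS * Cy) / det
power = a * Cy + b * Sy        # reduction in the residual sum of squares

best = float(omega.ravel()[np.argmax(power)] / (2 * np.pi))
print(f"peak at {best:.3f} Hz")  # near the injected 1.5 Hz
```

The (M, N) broadcast trades memory for speed; on a GPU the same sums map directly onto per-frequency threads or matrix kernels.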

Software tools and examples

Several open-source libraries facilitate the implementation of least-squares spectral analysis (LSSA), with a strong emphasis on the Lomb-Scargle periodogram for handling unevenly sampled data. In Python, the Astropy package provides the LombScargle class, which computes periodograms, models periodic signals, and assesses significance through false alarm probabilities, making it particularly suitable for astronomical applications. Similarly, SciPy's scipy.signal.lombscargle function implements the generalized Lomb-Scargle periodogram, supporting normalization options and efficient computation for large datasets. For R users, the lomb package offers functions to calculate Lomb-Scargle periodograms for both even and uneven sampling, including actogram plotting and randomization-based p-value estimation for periodicity detection. In MATLAB, the LSWAVE toolbox extends LSSA to include antileakage variants and wavelet-based analysis, enabling robust processing of time series without preprocessing for gaps or trends. Julia developers can utilize the LPVSpectral.jl package, which supports least-squares spectral estimation, sparse spectral methods, and linear parameter-varying extensions, optimized for performance. GNU Octave, as a MATLAB-compatible environment, features the lssa package for spectral decompositions of irregularly spaced time series using Lomb-Scargle-based algorithms. Additionally, LSSASOFT is an open-source C-language software package released in 2024 for estimating periodicities in time series via LSSA. It handles unequally spaced data, trends, and datum shifts without preprocessing, includes automatic period detection, and supports weighted observations with statistical testing. The package is available on GitHub under the GNU General Public License. A practical implementation example in Python using SciPy demonstrates basic periodogram computation on simulated data:
```python
import numpy as np
from scipy.signal import lombscargle
from matplotlib import pyplot as plt

# Generate unevenly sampled sinusoidal data with noise
rng = np.random.default_rng(42)
t = np.sort(rng.uniform(0, 10, 50))                        # uneven sample times
y = np.sin(2 * np.pi * t / 2) + 0.5 * rng.normal(size=50)  # period-2 signal plus noise
freqs = 2 * np.pi * np.linspace(0.1, 5, 1000)              # angular frequency grid

# Compute the Lomb-Scargle periodogram (normalized)
pgram = lombscargle(t, y, freqs, normalize=True)

# Plot power against period
plt.plot(2 * np.pi / freqs, pgram)
plt.xlabel('Period')
plt.ylabel('Power')
plt.title('Lomb-Scargle Periodogram')
plt.show()
```
This code fits sinusoids via least squares to identify the dominant period near 2 units, where the peak power indicates the signal frequency. Tutorials often involve fitting simulated data with added noise to illustrate LSSA's robustness; for instance, generating uneven timestamps, injecting a known periodic signal, and using Astropy's LombScargle to recover the frequency while quantifying uncertainty through bootstrap resampling. Interpreting significance typically relies on the periodogram's false alarm probability (FAP), computed analytically for white noise or via Monte Carlo simulations in libraries like lomb for R, where p-values below 0.05 confirm periodic components against null hypotheses of randomness.

Open-source tools like SciPy offer free, extensible implementations with community support and integration into workflows such as Jupyter notebooks, achieving accuracy comparable to proprietary options with fast execution on large datasets thanks to optimized C backends. In contrast, MATLAB's proprietary Signal Processing Toolbox provides built-in functions such as plomb for Lomb-Scargle periodograms of unevenly sampled data, and least-squares workflows can be extended via add-ons like LSWAVE, though this requires licensing. This comparison highlights SciPy's accessibility for research, while MATLAB excels in integrated visualization for engineering prototypes.
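The Monte Carlo route to a false alarm probability can be sketched with plain SciPy by shuffling the observations to destroy any real periodicity while keeping the sample times; the sample size, period, and noise level below are arbitrary illustration values.

```python
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(7)
t = np.sort(rng.uniform(0, 30, 120))                     # uneven sample times
y = 0.8 * np.sin(2 * np.pi * t / 3.0) + rng.normal(size=t.size)
y = y - y.mean()

omega = 2 * np.pi * np.linspace(0.05, 2.0, 500)
obs_max = lombscargle(t, y, omega, normalize=True).max()

# Randomization FAP: shuffled series have the same values and times but no
# coherent signal, so their peak powers show what noise alone can produce.
n_shuffles = 200
exceed = 0
for _ in range(n_shuffles):
    y_perm = rng.permutation(y)
    if lombscargle(t, y_perm, omega, normalize=True).max() >= obs_max:
        exceed += 1
fap = exceed / n_shuffles
print(f"peak power {obs_max:.2f}, FAP ~ {fap:.3f}")
```

A FAP below 0.05 would, as above, reject the null hypothesis of pure noise; the resolution of the estimate is limited to 1/n_shuffles, so very small FAPs require more shuffles or an analytic approximation.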
