Spectral flatness

Spectral flatness, also known as the tonality coefficient or Wiener entropy, is a metric in digital signal processing that quantifies the uniformity of a signal's power across a frequency band, distinguishing between noise-like (flat) and tonal (peaked) characteristics. It is computed as the ratio of the geometric mean to the arithmetic mean of the power spectrum values, yielding a value between 0 and 1, where 1 represents perfect flatness akin to white noise and 0 indicates a highly tonal spectrum with energy concentrated in a few frequency bins. This measure was formalized by James D. Johnston in 1988 as part of perceptual models for audio coding, where it helps estimate signal tonality to optimize noise shaping and masking thresholds.

In practice, spectral flatness is often expressed in decibels as 10 log₁₀ of the ratio for finer granularity, with typical values ranging from -60 dB (tonal) to 0 dB (noisy). The decibel value is used to derive a tonality coefficient α (ranging from 0 for noise-like to 1 for tonal) that scales perceptual masking levels: tonal signals (higher α) apply higher masking offsets (e.g., 14.5 dB) than noise-like ones (lower α, e.g., 5.5 dB). The computation typically involves the fast Fourier transform (FFT) to obtain the power spectrum, followed by mean calculations over critical bands or the full spectrum, making it efficient for real-time analysis. Originally developed for perceptual transform coding of audio, with the aim of transparent quality at 128 kbit/s, it has since been generalized for non-Gaussian processes to detect excessive structure beyond simple tonality.

Beyond audio coding, spectral flatness finds applications in robust signal matching, where it aids identification under distortions by comparing spectral uniformity; in passband measurements, to evaluate deviations from a flat response; and in acoustic analysis for segmentation, such as distinguishing speech from music based on tonal content. Its perceptual relevance stems from human hearing's sensitivity to spectral structure, which has influenced standards like MPEG audio layers, and it remains a key feature in modern tools for audio processing and machine learning-based sound classification.

Fundamentals

Definition

Spectral flatness, also known as the tonality coefficient or Wiener entropy, is a measure in digital signal processing that assesses the uniformity of a signal's power spectral density (PSD). It quantifies the degree to which the signal's frequency content approximates the even distribution characteristic of white noise, where power is equally spread across all frequencies. This metric was first introduced by James D. Johnston in 1988 as part of developing perceptual models for audio coding, enabling the differentiation between structured and random spectral components in sound signals. At its core, spectral flatness highlights the contrast between tonal signals, such as pure sinusoids whose energy is concentrated at discrete frequencies and produces a peaked spectrum, and noise-like signals, whose broad, even power distribution mimics white noise. In audio signal processing, it serves as a foundational tool for evaluating signal characteristics relevant to subsequent processing.

Interpretation

Spectral flatness quantifies the uniformity of a signal's power spectrum, with values approaching 1 indicating a nearly flat spectrum typical of white noise or uncorrelated random processes, where energy is evenly distributed across frequencies. Conversely, values near 0 reflect a peaked spectrum with energy concentrated in a limited number of bins, characteristic of tonal or harmonic signals such as pure sinusoids or periodic waveforms. This distinction arises because the measure compares the geometric mean to the arithmetic mean of the power values, yielding a normalized ratio that highlights deviations from noise-like behavior.

In psychoacoustics, spectral flatness serves as an indicator of perceived noisiness, linking spectral characteristics to human auditory perception. Higher flatness values correspond to sounds perceived as noisier due to their broadband, unstructured nature, while lower values evoke a sense of tonality, akin to pitched or musical elements with harmonic structure. This relevance stems from its use in models that differentiate noise-like maskers from tonal ones in masking experiments.

From an information-theoretic perspective, spectral flatness is related to the Wiener entropy of the power spectrum, offering a measure of signal predictability: flatter spectra imply higher entropy and thus greater unpredictability, while peaked spectra suggest lower entropy and more deterministic structure. This connection underscores its role in assessing stochasticity in signals. In standards like MPEG-7, it functions as an audio descriptor for characterizing spectral properties of multimedia content.

Formulation

Mathematical Expression

Spectral flatness, denoted as SF, is mathematically defined as the ratio of the geometric mean to the arithmetic mean of the power spectral density (PSD) values across N frequency bins. Let x(n) for n = 0, 1, \dots, N-1 represent the PSD values. The arithmetic mean is given by \frac{1}{N} \sum_{n=0}^{N-1} x(n), while the geometric mean is \left( \prod_{n=0}^{N-1} x(n) \right)^{1/N}, which is equivalently expressed as \exp\left( \frac{1}{N} \sum_{n=0}^{N-1} \ln x(n) \right) to enhance numerical stability in computation. Thus, SF = \frac{\exp\left( \frac{1}{N} \sum_{n=0}^{N-1} \ln x(n) \right)}{\frac{1}{N} \sum_{n=0}^{N-1} x(n)}. Where Wiener entropy is reported on a logarithmic scale, it corresponds to the logarithm of this ratio. This formulation originates from early work on linear prediction in speech analysis, where the measure quantifies spectral uniformity.

The bounds follow directly from the inequality between arithmetic and geometric means, which states that the arithmetic mean is always greater than or equal to the geometric mean for non-negative real numbers, with equality holding if and only if all values are identical. For a constant PSD, where x(n) = c for all n and some constant c > 0, both means equal c, yielding SF = 1, indicating perfect flatness akin to white noise. Conversely, for a delta-like spectrum, such as when one x(k) > 0 and all others are zero, the geometric mean is zero because the product includes zero terms (in the logarithmic form, \ln 0 requires careful handling, typically by excluding zeros or taking limits), while the arithmetic mean remains positive, resulting in SF \to 0, reflecting high tonality or peaked energy concentration.

In sub-band analysis, the formula is applied analogously to the PSD values restricted to a specific frequency band, allowing localized assessment of flatness without altering the core expression. The measure is inherently confined to the interval [0, 1], providing a bounded indicator of uniformity.
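As an illustration of the expression above, the following minimal Python sketch computes SF from a list of PSD values using the log-domain form of the geometric mean. The function name `spectral_flatness` and the test spectra are assumptions made for this example; returning 0 when any bin is exactly zero is one of several possible conventions for the \ln 0 case.

```python
import math

def spectral_flatness(psd):
    """Ratio of geometric mean to arithmetic mean of non-negative PSD values."""
    n = len(psd)
    if n == 0:
        raise ValueError("psd must contain at least one value")
    arithmetic_mean = sum(psd) / n
    if arithmetic_mean == 0.0:
        raise ValueError("flatness is undefined for an all-zero spectrum")
    # Convention chosen here: a zero bin makes the product, and so SF, zero.
    if any(v == 0.0 for v in psd):
        return 0.0
    # Log-domain geometric mean avoids overflow/underflow of the raw product.
    geometric_mean = math.exp(sum(math.log(v) for v in psd) / n)
    return geometric_mean / arithmetic_mean

flat = [2.0] * 8                  # constant PSD, x(n) = c
peaked = [1e-9] * 7 + [1.0]       # nearly delta-like spectrum
sf_flat = spectral_flatness(flat)
sf_peaked = spectral_flatness(peaked)
sf_delta = spectral_flatness([0.0] * 7 + [1.0])
```

With these inputs, the constant spectrum gives a value of 1 up to rounding, while the peaked and delta-like spectra give values at or very near 0, matching the two limiting cases discussed above.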

Normalization and Units

Spectral flatness is typically expressed on a linear scale, where values range from 0 to 1, with 1 indicating a perfectly flat spectrum akin to white noise and values approaching 0 signifying a highly tonal or peaked spectrum. In audio engineering, it is often converted to a decibel (dB) scale for perceptual analysis, defined as SF_{\text{dB}} = 10 \log_{10} (SF), where SF is the linear spectral flatness value; this yields 0 dB for perfect flatness and approaches -\infty dB for highly tonal signals. To handle numerical issues such as zero-valued frequency bins that would lead to undefined logarithms in the geometric mean computation, a small positive constant (e.g., 10^{-10}) is commonly added to the power spectrum values before calculation. For multi-resolution analysis, spectral flatness can be computed within sub-bands by evaluating the measure separately for each band, using the ratio of the band's geometric mean to its arithmetic mean, and then averaging across bands to obtain an overall value, enabling localized assessments of spectral uniformity.
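The decibel conversion, the small additive constant, and the sub-band averaging described above can be sketched as follows. This is a minimal example under stated assumptions: the helper names, the two-band split in `bands`, and the sample spectrum are hypothetical and chosen only for illustration.

```python
import math

EPS = 1e-10  # small constant added to every bin to avoid log(0)

def flatness(psd, eps=EPS):
    """Linear spectral flatness of PSD values after adding eps to each bin."""
    shifted = [v + eps for v in psd]
    n = len(shifted)
    log_mean = sum(math.log(v) for v in shifted) / n
    return math.exp(log_mean) / (sum(shifted) / n)

def flatness_db(psd, eps=EPS):
    """Spectral flatness on the decibel scale, 10 * log10(SF)."""
    return 10.0 * math.log10(flatness(psd, eps))

def subband_flatness(psd, band_edges, eps=EPS):
    """Average of per-band flatness; band_edges holds (start, stop) bin indices."""
    values = [flatness(psd[a:b], eps) for a, b in band_edges]
    return sum(values) / len(values)

psd = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 4.0, 0.0]
bands = [(0, 4), (4, 8)]          # hypothetical split into two equal-width bands
full_db = flatness_db(psd)        # strongly negative: the full spectrum is uneven
band_avg = subband_flatness(psd, bands)  # flat lower band, peaked upper band
```

Here the lower band is perfectly flat and the upper band is dominated by a single bin, so the band average lands near 0.5 even though the full-band value in dB is strongly negative. This shows how sub-band analysis localizes where the spectrum departs from flatness.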

Properties

Range and Bounds

Spectral flatness (SF), also known as the spectral flatness measure, is bounded on the linear scale between 0 and 1, with values approaching 0 for highly tonal signals and 1 for perfectly noise-like signals. The upper bound of exactly 1 is attained when the power spectral density (PSD) is uniform across all frequencies, as in the case of white noise. Conversely, the lower bound of 0 is reached in the limiting case of an ideal impulse in the frequency domain, representing energy concentrated at a single frequency. On the decibel scale, defined as \text{SF}_\text{dB} = 10 \log_{10} (\text{SF}), the measure ranges from -\infty dB, corresponding to the tonal extreme, to 0 dB for white noise. This logarithmic representation is commonly used for practical reporting due to its alignment with perceptual scales in audio processing. SF is monotonic with respect to spectral concentration: it decreases as peaks in the spectrum sharpen, indicating a transition from noise-like to more tonal characteristics. The measure is invariant to multiplicative scaling of the spectrum, since it relies on the ratio of the geometric mean to the arithmetic mean, but it is sensitive to bin resolution in transform-based implementations, where insufficient resolution can introduce empty bins or distort spectral representations.
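The scale invariance noted above follows in one line from the definition. As a brief derivation, let c > 0 be an arbitrary gain (a symbol introduced here for illustration) applied to every PSD value x(n):

```latex
\begin{aligned}
SF(c\,x)
  &= \frac{\left( \prod_{n=0}^{N-1} c\,x(n) \right)^{1/N}}
          {\frac{1}{N} \sum_{n=0}^{N-1} c\,x(n)}
   = \frac{c \left( \prod_{n=0}^{N-1} x(n) \right)^{1/N}}
          {c \cdot \frac{1}{N} \sum_{n=0}^{N-1} x(n)}
   = SF(x).
\end{aligned}
```

The gain cancels because the geometric mean of N values each scaled by c picks up exactly one factor of c. When a small constant is added to the bins for numerical stability, the invariance holds only approximately, since that constant is not scaled along with the signal.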

Relation to Other Measures

Spectral flatness, also known as the Wiener entropy, is an information-theoretic measure that assesses the uniformity or predictability of a signal's power spectral density (PSD). It is closely related to the Shannon entropy of the normalized spectrum, H = -\sum_k p_k \log_2 p_k, where p_k represents the fraction of total power in frequency bin k; both increase with greater spectral uniformity, with maximum values corresponding to a flat, white-noise-like spectrum. This connection highlights the role of spectral flatness in quantifying the informational uniformity of spectra.

In contrast to measures like spectral centroid and spectral flux, spectral flatness provides a complementary perspective on spectral characteristics by emphasizing flatness over location or dynamics. The spectral centroid computes the power-weighted average frequency, often interpreted as the "center of mass" or perceptual brightness of the spectrum, focusing on the location of the distribution rather than its uniformity. Spectral flux, meanwhile, quantifies the magnitude of changes between consecutive spectral frames, capturing temporal evolution and supporting onset detection. Together these metrics offer a multifaceted analysis of a signal: spectral flatness highlights noise-like versus tonal content through global evenness, whereas centroid and flux address positional and transitional aspects, respectively, enabling more robust signal classification in audio processing tasks.

Spectral flatness also relates to broader information-theoretic constructs, particularly the dual total correlation, which measures multivariate dependencies among frequency components. As established by Dubnov (2004), for Gaussian processes spectral flatness is equivalent to the dual total correlation (or multi-information) of the spectral variables, reflecting the total redundancy or structure imposed by linear dependencies in the signal. This equivalence extends the measure's interpretive power, portraying deviations from flatness as indicators of correlated, non-independent frequency behavior, and it has been generalized to non-Gaussian linear processes to account for higher-order dependencies. Such ties position spectral flatness as a bridge between classical spectral analysis and multivariate information theory.
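To make the comparison between these measures concrete, the sketch below computes spectral flatness, the Shannon entropy of the normalized spectrum, and the spectral centroid for three toy spectra. All function names, the bin frequencies in `freqs`, and the spectra themselves are assumptions made for this example.

```python
import math

def spectral_entropy_bits(psd):
    """Shannon entropy H = -sum p_k log2 p_k of the normalized spectrum."""
    total = sum(psd)
    probs = [v / total for v in psd]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def spectral_flatness(psd):
    """Geometric mean over arithmetic mean of strictly positive PSD values."""
    n = len(psd)
    return math.exp(sum(math.log(v) for v in psd) / n) / (sum(psd) / n)

def spectral_centroid(psd, freqs):
    """Power-weighted mean frequency of the spectrum."""
    return sum(f * v for f, v in zip(freqs, psd)) / sum(psd)

freqs = [100.0 * k for k in range(1, 9)]   # hypothetical bin centres in Hz
flat = [1.0] * 8                           # uniform spectrum
low_peak = [50.0] + [1.0] * 7              # energy concentrated at 100 Hz
high_peak = [1.0] * 7 + [50.0]             # same shape, peak moved to 800 Hz

results = {
    name: (spectral_flatness(p), spectral_entropy_bits(p), spectral_centroid(p, freqs))
    for name, p in [("flat", flat), ("low_peak", low_peak), ("high_peak", high_peak)]
}
```

The two peaked spectra contain the same set of values in a different order, so their flatness and entropy agree while their centroids differ. This is the sense in which flatness and entropy describe evenness and the centroid describes location.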

Computation

Estimation Techniques

To estimate spectral flatness from a discrete-time signal, the power spectral density (PSD) is first obtained, typically through the discrete Fourier transform (DFT) applied to windowed segments of the signal or, for time-varying analysis, via the short-time Fourier transform (STFT). The STFT divides the signal into short, overlapping frames to capture local spectral characteristics, with common frame lengths of 20 to 50 milliseconds for audio signals sampled at rates like 22.05 kHz or 44.1 kHz.

The computation then follows these steps on each frame or spectral slice. First, a window function such as the Hann or Hamming window is applied to the frame to mitigate spectral leakage caused by finite-length segmentation. Overlaps between frames, often 50% or more (e.g., a 10 ms shift for 20 ms frames), ensure smooth transitions and reduce artifacts. Next, the DFT or fast Fourier transform (FFT) is computed on the windowed frame, with the squared magnitude yielding the periodogram-based PSD estimate; typical FFT sizes range from 512 to 2048 points to balance resolution and efficiency, giving frequency bins spaced roughly 11 to 86 Hz apart at the sampling rates above. The arithmetic mean of the PSD values is then calculated across the relevant bins, often limited to a band of interest (e.g., 500 Hz to 4 kHz) to focus on perceptually important content. For the geometric mean, the logarithms of the PSD values are averaged and the result is exponentiated; to prevent logarithms of zero-valued bins, a small positive constant such as 10^{-10} is added to all PSD values for numerical stability. Finally, spectral flatness is obtained as the ratio of this geometric mean to the arithmetic mean, yielding a value between 0 and 1 that quantifies spectral uniformity.
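The steps above can be sketched with NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `frame_flatness`, the 1024-sample frame with 50% overlap, the 22.05 kHz sampling rate, and the test signals are all assumptions for this example, and the full band is used rather than a restricted frequency range.

```python
import numpy as np

def frame_flatness(signal, frame_len=1024, hop=512, eps=1e-10):
    """Spectral flatness per Hann-windowed frame, from the periodogram."""
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)                 # symmetric Hann window
    n_frames = 1 + (signal.size - frame_len) // hop
    out = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        psd = np.abs(np.fft.rfft(frame)) ** 2 + eps  # periodogram plus small constant
        geometric = np.exp(np.mean(np.log(psd)))     # log-domain geometric mean
        out[i] = geometric / np.mean(psd)
    return out

sr = 22050                                    # assumed sampling rate in Hz
t = np.arange(sr) / sr                        # one second of sample times
tone = np.sin(2 * np.pi * 1000.0 * t)         # pure 1 kHz sinusoid
noise = np.random.default_rng(0).standard_normal(sr)  # white Gaussian noise
tone_sf = frame_flatness(tone)
noise_sf = frame_flatness(noise)
```

The sinusoid yields frame values very close to 0, while the noise frames sit well above it. Single-frame periodograms of white noise fluctuate from bin to bin, so their flatness typically stays noticeably below 1 even though the underlying PSD is flat; averaging periodograms before taking the ratio would move the estimate closer to 1.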

Practical Implementation

In practical implementations of spectral flatness computation, numerical stability is a key concern because the logarithmic operations are applied to power spectral density (PSD) values, which can include zeros or near-zeros that lead to undefined or infinite results. To mitigate underflow or log(0) errors, a common approach is to apply a small positive floor, such as \epsilon = 10^{-10}, by thresholding the PSD magnitudes before computing the means; replacing values below this floor ensures finite logarithms without significantly altering the measure for typical audio signals.

Software libraries facilitate efficient PSD estimation and flatness calculation. In Python, NumPy provides vectorized array operations for mean computations, while Librosa offers a dedicated spectral_flatness function that internally handles the short-time Fourier transform (STFT) via FFT, applies the necessary thresholding for stability, and returns frame-wise flatness values, making it suitable for audio analysis pipelines. Similarly, MATLAB's Audio Toolbox includes the spectralFlatness function, which computes the measure directly from signals or from spectrograms, incorporating built-in handling for edge cases in PSD estimation.

For real-time or large-scale processing, efficiency optimizations are essential. STFT framing with overlapping frames (typically 50-75% overlap with Hann windows), as used in overlap-add (OLA) processing, allows continuous analysis and low-latency updates of spectral flatness without buffering the full signal, as in audio streaming applications. Vectorized FFT routines further accelerate PSD computation; for example, FFTPACK, a Fortran package for fast Fourier transforms, supports efficient real and complex transforms and has been wrapped by tools such as SciPy for use with numerical arrays. Non-stationary signals, common in audio, require averaging spectral flatness across multiple STFT frames to capture temporal variations robustly, as in long-term spectral flatness measures that aggregate over extended windows for stable estimates in voice activity detection.
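The flooring and multi-frame averaging described above can be sketched in vectorized NumPy. The function names, the five-frame context, and the synthetic spectrogram are assumptions for this example, and the plain moving average is only a simple stand-in for published long-term flatness measures, which may aggregate frames differently.

```python
import numpy as np

def flatness_per_frame(power_spec, floor=1e-10):
    """Flatness of each column of a (bins, frames) power spectrogram."""
    floored = np.maximum(power_spec, floor)      # replace values below the floor
    log_geo = np.mean(np.log(floored), axis=0)   # log of geometric mean per frame
    return np.exp(log_geo) / np.mean(floored, axis=0)

def long_term_flatness(power_spec, context=5, floor=1e-10):
    """Moving average of frame-wise flatness over `context` consecutive frames."""
    sf = flatness_per_frame(power_spec, floor)
    kernel = np.ones(context) / context
    return np.convolve(sf, kernel, mode="valid")

rng = np.random.default_rng(1)
noisy = rng.exponential(size=(257, 20))   # stand-in for noise-like frames
tonal = np.zeros((257, 20))
tonal[40, :] = 1.0                        # one active bin in every frame
spec = np.concatenate([noisy, tonal], axis=1)
smoothed = long_term_flatness(spec)       # high at the start, near 0 at the end
```

Flooring with `np.maximum` keeps every logarithm finite even for the all-zero bins in the tonal frames. Librosa's spectral_flatness exposes a comparable threshold through its `amin` parameter, so in practice the library function can replace the first helper.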

Applications

Audio and Music Processing

In perceptual audio coding, spectral flatness plays a key role in estimating the tonality of audio signals to model psychoacoustic masking thresholds and optimize bit allocation. Introduced by Johnston in 1988, the spectral flatness measure (SFM) quantifies how noise-like or tonal a signal is by comparing the geometric and arithmetic means of its power spectrum, enabling efficient compression in standards like MPEG audio while minimizing perceptual distortion. This approach was foundational for subsequent codecs, including Advanced Audio Coding (AAC), where SFM informs noise-to-mask ratio calculations in filter bank-based psychoacoustic models to allocate fewer bits to noise-like components.

The MPEG-7 multimedia content description standard incorporates AudioSpectralFlatness as a low-level audio descriptor to characterize the flatness of a signal's short-term power spectral density, facilitating content-based retrieval and similarity matching in databases. This descriptor, computed over frequency bands, supports applications like audio indexing by providing a normalized measure (0 to 1) of spectral uniformity, aiding in the discrimination of tonal segments from percussive or noisy ones.

Machine learning applications from 2020 to 2025 have used spectral flatness as a feature in audio processing tasks. In singing voice detection, it is one of the spectral descriptors used to improve classification accuracy in polyphonic music, where tonality cues help distinguish vocals from instruments. For speech enhancement, spectral flatness aids noise profiling in machine learning frameworks by assessing the noise-like quality of interfering signals, improving denoising performance in real-time systems. Additionally, a 2025 approach encodes audio features into images for voice characteristic representation, incorporating spectral flatness in the red channel to capture tonal versus noisy attributes and support analysis of speaker traits.
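As a rough illustration of how flatness feeds a perceptual model, the sketch below maps a linear flatness value to the tonality coefficient α using the -60 dB reference mentioned in the introduction, then blends between the example offsets of 5.5 dB (noise-like) and 14.5 dB (tonal). The function names and the purely linear blend are assumptions for this example; published psychoacoustic models may include band-dependent terms that are omitted here.

```python
import math

SFM_DB_MAX = -60.0   # flatness in dB treated as fully tonal

def tonality_coefficient(sf_linear):
    """Map linear flatness in (0, 1] to alpha in [0, 1]; 1 means tonal."""
    sf_db = 10.0 * math.log10(max(sf_linear, 1e-12))  # guard against log10(0)
    return min(max(sf_db / SFM_DB_MAX, 0.0), 1.0)

def masking_offset_db(alpha, tonal_offset=14.5, noise_offset=5.5):
    """Simplified linear blend between noise-like and tonal offsets, in dB."""
    return alpha * tonal_offset + (1.0 - alpha) * noise_offset

alpha_noise = tonality_coefficient(1.0)     # flat spectrum, 0 dB
alpha_mid = tonality_coefficient(1e-3)      # -30 dB, halfway to the reference
alpha_tonal = tonality_coefficient(1e-9)    # -90 dB, clipped to fully tonal
offsets = [masking_offset_db(a) for a in (alpha_noise, alpha_mid, alpha_tonal)]
```

Clipping α at 1 means any spectrum at or below the -60 dB reference is treated as fully tonal, so the offset never exceeds the tonal value.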

Biomedical and Other Fields

In biomedical signal processing, spectral flatness serves as a key feature for analyzing electroencephalogram (EEG) signals to detect epileptic seizures, where lower values indicate tonal or structured activity typical of ictal states, contrasting with higher flatness in noisy interictal periods. For instance, in noise-robust seizure detection algorithms, spectral flatness is combined with bandwidth and entropy measures to quantify signal irregularities under Gaussian noise, achieving improved classification accuracy on EEG datasets. Similarly, sub-band decomposition combined with spectral flatness and related spectral features has been employed in automated EEG seizure detection systems, improving computational efficiency while distinguishing seizure onsets through flatness variations across frequency bands.

In bioacoustics, particularly for assessing birdsong complexity, spectral flatness quantifies the gradient from tonality to noisiness in vocalizations, with values near 0 indicating pure tones and higher values reflecting noisy, complex spectra associated with behavioral diversity. This measure has been integrated into bioacoustic analyses to evaluate phylogenetic signals in vocal learning species, revealing how spectral flatness correlates with evolutionary adaptations. In scoping reviews of bioacoustics for animal behavior, spectral flatness is listed among standard metrics for measuring vocal features in animals, aiding studies on communication and environmental influences without requiring exhaustive feature sets.

Beyond bioacoustics, spectral flatness contributes to psychoacoustic models of tonality, where partial variants like the partial spectral flatness measure (PSFM) estimate perceived tonal content by assessing uniformity, outperforming traditional metrics in perceptual audio tasks. Perceptual evaluations confirm that spectral flatness variants predict listener judgments of spectral variance, with higher flatness linked to noise-like sensations in controlled listening experiments.

In emerging quantum signal processing, a quantum-adapted spectral flatness measure, computed via quantum transforms, has been proposed for audio steganalysis, where it is combined with neural networks to detect hidden embeddings and improve detection rates. In virtual reality (VR) applications, spectral flatness informs machine learning-based spatial audio rendering by serving as an input feature for estimating directional parameters, as reviewed in surveys of data-driven audio processing. This application underscores its role in perceptual audio synthesis for VR, where flatness helps balance tonal and diffuse components in rendering.

References

  1. [1]
    spectralFlatness - Spectral flatness for signals and spectrograms
    flatness = spectralFlatness(x,f) returns the spectral flatness of the signal, x, over time. How the function interprets x depends on the shape of f.
  2. [2]
    [PDF] Transform coding of audio signals using perceptual noise criteria
    Johnston, “A method of estimating the perceptual entropy of an audio signal,” submitted to ICASSP '88. [7] -, “Digital coding of musical sound-Some statistics ...
  3. [3]
    [PDF] Generalization of Spectral Flatness Measure for Non-Gaussian ...
    The Spectral Flatness Measure (SFM) quantifies how much tone-like a sound is, and is equivalent to the rate of growth of multi-information (MIR) for Gaussian ...
  4. [4]
    Robust matching of audio signals using spectral flatness features
    This paper discusses the problem of robust identification of audio signals by matching them to a known reference. In order to perform well under realworld ...
  5. [5]
    Spectral Flatness - Crystal Instruments
    Aug 28, 2021 · Spectral flatness is a way to quantify the deviation of a passband from being perfectly flat across the frequency spectrum.
  6. [6]
    [PDF] Using a Spectral Flatness Based Feature for Audio Segmentation ...
    The Spectral Flatness Measure. (SFM) and the corresponding tonality coefficient (Johnston 1988) are used to quantify the tonal quality, i.e. how much tone ...
  7. [7]
    Note on measures for spectral flatness | Electronics Letters
    Spectral flatness is a feature of acoustic signals that has been useful in many audio signal processing applications. The traditional definition of spectral ...
  8. [8]
    Perceptual evaluation of measures of spectral variance
    Jun 5, 2018 · Some of the common measures of whiteness include the Wiener Entropy or Spectral Flatness Measure (SFM),5 Ljung-Box test,6 and Drouiche Test.7.
  9. [9]
    Spectral Flatness - an overview | ScienceDirect Topics
    It serves to distinguish between noise-like signals, which exhibit a flat spectrum, and tone-like signals, which have a peaked spectrum. 1. Formally, spectral ...
  10. [10]
    [PDF] a psychoacoustic model with partial spectral flatness measure for ...
    [10] J.D. Johnston, “Estimation of perceptual entropy using noise masking criteria,” in Acoustics, Speech, and Signal Processing, 1988. ICASSP-88., 1988 ...
  11. [11]
    Note on measures for spectral flatness - ResearchGate
    Aug 6, 2025 · This is confirmed by the values of spectral flatness determined using Wiener entropy described by the following formula [37] : ...
  12. [12]
    [PDF] Content-based Identification of Audio Material Using MPEG-7 Low ...
    The so-called SFM (Spectral Flatness Measure) [16] is a function which is related to the tonality aspect of the audio signal and can therefore be used as a ...
  14. [14]
    [PDF] Linear prediction of audio signals - ISCA Archive
    SFM_R(dB) = 10 \log_{10} \frac{\exp\left( \frac{1}{M} \sum_{f=0}^{f_s} \ln |R(f)|^2 \right)}{\frac{1}{M} \sum_{f=0}^{f_s} |R(f)|^2}, (2) where ... sacrifices part of its zeros to achieve spectral flatness in the high-.
  15. [15]
    librosa.feature.spectral_flatness — librosa 0.11.0 documentation
    A high spectral flatness (closer to 1.0) indicates the ... spectral flatness for each frame. The returned value is in [0, 1] and often converted to dB scale.
  16. [16]
    [PDF] Feature Vectors
    Spectral flatness. • It reflects the flatness properties of the power ... the average of the sub-band flatness values. SF_b = \frac{\left( \prod_{k_b} |X(k_b)|^2 \right)^{1/N_b}}{\frac{1}{N_b} \sum_{k_b} |X(k_b)|^2}.
  17. [17]
    [PDF] A Segmental Spectral Flatness Measure for Harmonic-Percussive ...
    Knowing if an audio signal originates from a harmonic or a percussive source can be very helpful for further processing in a lot of audio signal processing ...
  18. [18]
    [PDF] Modified Spectral Flatness Approach for Robust Train Localisation
    [5] N. Madhu “Note on measures for spectral flatness” in ELECTRONICS LET-. TERS 5th November 2009 Vol. 45 No. 23. [6] Localisation Working Group (LWG) ...
  19. [19]
    [PDF] Speech Enhancement Using Spectral Flatness Measure Based ...
    [6]. GRAY, A.H., and MARKEL, J.D. “A spectral-flatness measure for studying the autocorrelation method of linear prediction of speech analysis,” IEEE ...
  23. [23]
    Efficient voice activity detection algorithm using long-term spectral ...
    Jul 16, 2013 · A low spectral flatness indicates that the spectral power is less uniform in frequency structure, and this would typically sound like speech.
  24. [24]
    Source code for librosa.feature.spectral
    Compute the spectral centroid. Each frame of a magnitude spectrogram is normalized and treated as a distribution over frequency bins.
  25. [25]
    Overlap-Add STFT Processing - Stanford CCRMA
    This chapter discusses use of the Short-Time Fourier Transform (STFT) to implement linear filtering in the frequency domain.
  26. [26]
    FFTPACK - NetLib.org
    FFTPACK is a package of Fortran subprograms for the fast Fourier transform of periodic and other symmetric sequences. It includes complex, real, sine, cosine, ...
  27. [27]
    [DOC] ISO/IEC JTC 1/SC 29 N
    Description of the audio spectral flatness of the audio signal. loEdge ... MPEG-7 description. There are additional descriptive tags such as key that ...
  28. [28]
    (PDF) Streaming Audio Using MPEG–7 Audio Spectrum Envelope to ...
    Apr 13, 2017 · 60-74. [19] MPEG–7. MPEG 7 Library: A Complete API to Manipulate ... Audio spectral flatness (the flatness properties of the short-term ...
  29. [29]
    Singing Voice Detection: A Survey - PMC - NIH
    Jan 12, 2022 · Singing voice detection or vocal detection is a classification task that ... spectral flatness) as well as special features such as fluctograms [15].
  30. [30]
    Noise profiling for speech enhancement employing machine ...
    Dec 16, 2022 · Noise profiling for speech enhancement employing machine learning models ... Spectral flatness is a measure of an audio sound spectrum that provides ...
  31. [31]
    Audio-to-Image Encoding for Improved Voice Characteristic ... - arXiv
    Mar 7, 2025 · ... spectral flatness, spectral contrast, chroma, and harmonic-to-noise ratio), and the blue channel comprises subframes representing these ...
  32. [32]
    Investigating the effects of Gaussian noise on epileptic seizure ...
    Investigating the effects of Gaussian noise on epileptic seizure detection: The role of spectral flatness, bandwidth, and entropy ... frequency resolution, and ...
  33. [33]
    A computationally efficient automated seizure detection method ...
    ... EEG segments into sub-bands, a total of four different spectral features including spectral centroid, spectral flatness, spectral spread, and spectral slope ...
  34. [34]
    Phylogenetic signal in the vocalizations of vocal learning and vocal ...
    Spectral flatness indicates the tonality versus noisiness of a signal, on a gradient from 0 for white noise (equal energy at all frequencies) to 1.0 for a ...
  35. [35]
    A scoping review of the use of bioacoustics to assess various ...
    Spectral flatness of a sound, calculated as the ratio of a power spectrum's geometric mean to its arithmetic mean measured on a logarithmic scale (higher ...
  36. [36]
    Towards quantum audio steganalysis using synergy of quantum ...
    The statistical analysis of these features includes the quantum spectral center (QSC), quantum spectral bandwidth (QSB), quantum spectral flatness measurement ( ...
  37. [37]
    An overview of machine learning and other data-based methods for ...
    May 16, 2022 · The input to the network were well-known hand-crafted audio features such as spectral centroid, spectral flatness, or spectral flux. More ...