
Audio compression

Audio compression refers to techniques used in digital audio processing and storage, encompassing two distinct processes: audio data compression, which reduces the size of audio files by encoding signals more efficiently, typically by eliminating redundancies and information imperceptible to the human ear, thereby conserving storage space and transmission bandwidth while preserving acceptable audio quality; and dynamic range compression, which reduces the difference in level between the loudest and quietest parts of an audio signal to achieve more consistent volume, often for broadcast, recording, or live sound applications. The following sections primarily address audio data compression, with dynamic range compression covered separately.

Audio data compression includes two primary categories: lossless compression, which enables perfect reconstruction of the original signal without any loss of information, achieving typical reductions to 40-80% of the original file size; and lossy compression, which discards inaudible components to attain much higher ratios, often 10% or less of the original size, but introduces irreversible alterations that are designed to be psychoacoustically transparent. At the core of audio compression, especially lossy variants, lies perceptual coding, which leverages models of human audition to identify and remove signal elements below auditory thresholds, exploiting phenomena such as frequency masking (where louder sounds obscure nearby frequencies) and temporal masking (where sounds briefly mask subsequent or preceding ones).

Key techniques include transform coding via methods such as the modified discrete cosine transform (MDCT) or fast Fourier transform (FFT) to shift data into the frequency domain for selective quantization; predictive coding such as differential pulse-code modulation (DPCM) or adaptive DPCM (ADPCM) to encode differences between samples; filter banks for subband decomposition; and entropy coding (e.g., Huffman or arithmetic coding) to further compact the quantized data. These approaches often feature asymmetric processing, with computationally intensive encoding and rapid decoding to support real-time playback.

Major standards have driven widespread adoption, including MPEG-1 Audio Layer III (MP3), released in 1992, which uses hybrid filter banks and psychoacoustic modeling for bit rates as low as 112 kbps and compression ratios around 10:1; Advanced Audio Coding (AAC) from MPEG-2 and MPEG-4, offering improved efficiency for multichannel audio at 320 kbps or higher; and lossless options such as the Free Lossless Audio Codec (FLAC). Other notable formats include Dolby AC-3 for surround sound at 384 kbps and telephony standards such as G.711 using μ-law or A-law companding at 64 kbps. Subsequent developments, fueled by advances in digital signal processing and hardware, have made audio compression essential for applications ranging from music streaming and portable devices to broadcasting and telecommunications.
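For the telephony codecs mentioned above, companding is the key idea: a logarithmic curve allocates finer quantization steps to quiet samples before 8-bit coding. The following sketch (Python with NumPy, written purely for illustration) shows the μ-law curve and its inverse for samples normalized to [-1, 1]; it demonstrates the principle rather than the exact G.711 bit layout.

```python
import numpy as np

def mu_law_encode(x, mu=255):
    """Apply the mu-law companding curve to samples normalized to [-1, 1]."""
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_law_decode(y, mu=255):
    """Invert the companding curve back to linear amplitude."""
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu

x = np.linspace(-1.0, 1.0, 9)                            # a few sample amplitudes
y = mu_law_encode(x)
codes = np.round((y + 1) / 2 * 255).astype(np.uint8)     # illustrative 8-bit codewords
# 8 bits per sample at an 8 kHz sampling rate gives the 64 kbps telephony rate.
print(codes)
print(np.max(np.abs(x - mu_law_decode(y))))              # ~0: the companding curve itself is reversible
```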

Audio Data Compression

Principles of Audio Data Compression

Audio data compression refers to the process of encoding digital audio signals into a more compact representation to reduce storage and transmission requirements while maintaining sufficient fidelity for playback. This technique exploits inherent redundancies in the signal and perceptual limitations of the human auditory system to achieve significant data reduction without unacceptable degradation in perceived quality.

Digital audio signals are fundamentally represented in uncompressed pulse-code modulation (PCM) format, where analog waveforms are sampled at regular intervals and quantized to discrete amplitude levels. Standard CD-quality audio employs a sampling rate of 44.1 kHz and a 16-bit depth per sample per channel, resulting in a data rate of approximately 1.41 Mbps for stereo playback. This uncompressed format generates large file sizes, roughly 10 MB per minute for CD-quality audio, due to the high volume of raw sample data.

Audio signals contain multiple forms of redundancy that compression algorithms target to eliminate superfluous data. Temporal redundancy arises from correlations between consecutive samples, as audio waveforms exhibit smooth changes over short time periods. Spectral redundancy stems from correlations among frequency components, where certain spectral bands are predictable from others. Statistical redundancy involves broader patterns in the signal's probability distribution, such as non-uniform amplitude occurrences that can be encoded more efficiently.

A key aspect of effective audio compression, particularly in perceptual methods, is the integration of psychoacoustic models that account for human hearing limitations. The absolute threshold of hearing defines the minimum sound level detectable across frequencies, typically around 0 dB SPL near 4 kHz but rising sharply at the lower and higher extremes, allowing inaudible components to be discarded. Critical bands represent the frequency resolution of the ear, modeled as approximately 24 nonuniform bands (narrower at low frequencies, around 100 Hz wide below 500 Hz, widening to roughly 20% of the center frequency above), which group spectral energy for analysis. Masking effects further enable data omission: simultaneous masking occurs when a louder sound renders nearby frequencies inaudible within the same critical band, while temporal masking allows brief pre-masking (1-2 ms before) or post-masking (50-300 ms after) of quieter sounds by transients.

Performance in audio compression is evaluated using metrics such as the compression ratio, defined as the ratio of the original file size to the compressed size, which quantifies overall data reduction. Bitrate, measured in kilobits per second (kbps), indicates the average data rate of the encoded stream; for instance, rates around 128 kbps can achieve near-transparent quality for many signals compared to the uncompressed 1,411 kbps. These metrics highlight inherent trade-offs: higher compression ratios or lower bitrates reduce file sizes but may introduce perceptible artifacts if the perceptual models are insufficiently accurate.

The foundations of audio data compression trace back to early developments in the 1970s, including adaptive differential pulse-code modulation (ADPCM), which improved upon basic PCM by predicting samples and encoding the differences to exploit temporal redundancy. These innovations, initially applied to speech coding, evolved through the 1980s with the rise of digital storage media such as the compact disc, driving demand for efficient coding amid bandwidth constraints. This progression culminated in standardized perceptual frameworks by the 1990s, establishing modern principles for both lossless and lossy techniques.
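As a quick check of the figures above, the uncompressed data rate, per-minute storage, and the ratio achieved by a 128 kbps encode all follow from simple arithmetic, as the short calculation below illustrates.

```python
# CD-quality PCM parameters
sample_rate = 44_100      # samples per second
bit_depth = 16            # bits per sample
channels = 2              # stereo

bitrate_bps = sample_rate * bit_depth * channels
print(f"Bitrate: {bitrate_bps / 1e6:.3f} Mbps")        # 1.411 Mbps

bytes_per_minute = bitrate_bps / 8 * 60
print(f"Per minute: {bytes_per_minute / 1e6:.1f} MB")  # ~10.6 MB

# Compression ratio of a 128 kbps lossy encode of the same audio
print(f"Ratio: {bitrate_bps / 128_000:.1f}:1")         # ~11:1
```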

Lossless Audio Compression

Lossless audio compression refers to techniques that reduce the size of digital audio files while enabling bit-identical reconstruction of the original upon decoding, ensuring no loss of information or fidelity. This approach is particularly advantageous for applications requiring archival fidelity, such as music mastering and long-term storage, where preserving every detail of the source material is essential. Typical compression ratios reduce files to 40-60% of the original size, for example shrinking a 50 MB file to 20-30 MB without altering the audio data.

The core methods employed in lossless audio compression exploit redundancies in the signal through reversible processes. Predictive coding uses linear prediction to estimate each audio sample from prior ones, encoding only the prediction errors to minimize data volume. Entropy coding, such as Huffman or Rice coding, then assigns shorter codes to more probable symbols in the error stream, further optimizing the representation based on symbol statistics. Transform-based approaches, such as integer-reversible modulated lapped transforms, convert the audio into a frequency-domain representation without quantization, allowing lossless reversal while concentrating energy for efficient encoding.

Key algorithms illustrate these methods in practice. Shorten, developed in the early 1990s by Tony Robinson, relies on simple predictive modeling with Rice coding of residuals to compress waveform files. Monkey's Audio (APE) employs channel decorrelation, adaptive prediction, and range coding for efficient handling of audio blocks, offering high compression with tagging support. The Free Lossless Audio Codec (FLAC) integrates linear predictive coding, Rice coding for residuals, and checksums for data integrity verification, making it a widely adopted open-source standard.

FLAC's design emphasizes versatility and robustness. As an open-source format maintained by the Xiph.Org Foundation, it supports Vorbis Comments for metadata such as artist and album details, enables streaming via mappings to containers like Ogg, and incorporates error detection through frame headers and CRC checksums. For Rice coding, the parameter k is selected by the encoder based on the distribution of residual values to optimize efficiency. In performance comparisons, FLAC introduces minimal CPU overhead during decoding compared to uncompressed WAV, as the process involves lightweight prediction reversal and entropy decoding, often reading roughly half the data volume for equivalent playback. WAV decoding requires no processing at all, but FLAC's efficiency ensures negligible impact on modern hardware, with real-time decoding achievable even on embedded devices. FLAC also enjoys broad hardware support, from smartphones to professional audio interfaces, unlike some proprietary formats.

Despite these strengths, lossless audio compression faces inherent limitations. It cannot surpass the Shannon entropy limit of the source data, which represents the theoretical minimum number of bits required for unique representation, bounding achievable ratios for highly entropic signals such as noise. Consequently, lossless files remain larger than their lossy counterparts, which discard imperceptible information to attain higher reductions.
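The following sketch illustrates, in simplified form, the two stages described above for FLAC-style coders: a fixed second-order linear predictor followed by Rice coding of the zigzag-mapped residuals. It omits framing, checksums, channel decorrelation, and adaptive order selection, and the Rice parameter estimate is a rough heuristic rather than FLAC's actual selection procedure.

```python
import numpy as np

def zigzag(residuals):
    """Map signed residuals to non-negative integers: 0,-1,1,-2,2 -> 0,1,2,3,4."""
    return np.where(residuals >= 0, 2 * residuals, -2 * residuals - 1)

def rice_encode(values, k):
    """Rice-code non-negative integers as a unary quotient plus a k-bit remainder."""
    out = []
    for v in values:
        q, r = int(v) >> k, int(v) & ((1 << k) - 1)
        out.append("1" * q + "0" + format(r, f"0{k}b"))
    return "".join(out)

# Toy 16-bit signal: a smooth waveform is well predicted from its recent past.
x = (1000 * np.sin(np.linspace(0, 4 * np.pi, 256))).astype(np.int64)

# Fixed second-order predictor (one of FLAC's fixed predictors): 2*x[n-1] - x[n-2]
residual = x[2:] - (2 * x[1:-1] - x[:-2])

mapped = zigzag(residual)
k = max(1, int(np.ceil(np.log2(mapped.mean() + 1))))    # crude Rice parameter choice
bitstream = rice_encode(mapped, k)
print(f"raw: {x.size * 16} bits, coded: {len(bitstream)} bits (k = {k})")
```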

Lossy Audio Compression

Lossy audio compression employs perceptual coding techniques that irreversibly remove portions of the audio signal deemed inaudible to human listeners, enabling compression ratios of 90-95% relative to uncompressed PCM formats. For instance, standard CD audio at 1,411 kbps can be reduced to about 128 kbps in MP3 format, yielding roughly 1 MB per minute of stereo audio, which facilitates efficient streaming, mobile playback, and limited-bandwidth transmission without requiring excessive storage.

Central to lossy compression is the application of psychoacoustic models that exploit the limits of human auditory perception. Simultaneous (frequency) masking occurs when a louder sound obscures weaker tones in nearby bands, while temporal masking hides sounds occurring shortly before or after a dominant one, allowing masked components to be discarded with minimal perceptual impact. Critical-band analysis, often using the Bark scale—a psychoacoustic scale approximating the ear's critical bandwidths (roughly 100 Hz wide at low frequencies, increasing to 3-4 kHz at the highest ones)—divides the audible spectrum into about 24 bands for precise computation of masking thresholds.

The core pipeline of perceptual coding typically begins with time-to-frequency analysis via the modified discrete cosine transform (MDCT), which converts time-domain audio into frequency subbands for efficient representation. Quantization follows, allocating fewer bits to spectral components below masking thresholds to minimize audible distortion, guided by bit allocation algorithms that prioritize perceptual relevance. Encoding concludes with entropy coding, such as Huffman coding, to further compact the quantized data for transmission or storage.

Prominent lossy codecs include MP3 (MPEG-1 Audio Layer III), developed in the early 1990s, which uses a hybrid filter bank for subband decomposition and supports joint stereo coding to exploit inter-channel redundancies. Its successor, Advanced Audio Coding (AAC), offers superior efficiency through enhanced perceptual modeling and MDCT-based transforms, achieving better quality at equivalent bitrates and serving as a default format on major streaming platforms. More recently, Opus, standardized in 2012, combines the SILK speech codec for low bitrates with the CELT music codec in a hybrid design, enabling low-latency operation and adaptive bitrates up to 510 kbps for versatile applications such as VoIP and music streaming.

In MP3 specifically, audio is processed in frames of 1,152 samples (handled in Layer III as two granules of 576 samples each), with a bit reservoir mechanism allowing bitrate variability by borrowing bits across frames for a smoother quality distribution. Bit allocation relies on the signal-to-masking ratio (SMR), defined as \text{SMR}(k) = \frac{E_s(k)}{E_m(k)}, where E_s(k) is the signal energy in critical band k and E_m(k) is the masking threshold energy; higher SMR values indicate regions needing more quantization precision to avoid audible noise.

Despite these advances, lossy compression introduces trade-offs, including artifacts such as pre-echo—where quantization noise precedes sharp transients due to block-based processing—and general quantization noise at low bitrates, which can manifest as smearing or ringing in the decoded signal. Transparent quality, where differences from the original are imperceptible to most listeners, is typically achieved above 192 kbps for music in codecs such as MP3 under standard listening conditions. The evolution of lossy codecs has progressed from patent-encumbered formats like MP3 toward open alternatives such as Ogg Vorbis, which offers competitive quality as a direct MP3 rival.
In the 2020s, codecs such as LC3 (Low Complexity Communication Codec) have emerged for Bluetooth LE Audio, offering improved efficiency and lower latency for wireless applications while maintaining high perceptual quality at reduced bitrates.
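To make the SMR-driven bit allocation concrete, the sketch below computes per-band signal energy from an FFT of one frame and converts an assumed flat masking threshold into a bit count at roughly 6 dB of signal-to-noise ratio per bit. The band edges are commonly cited approximate critical-band boundaries; real encoders derive the masking threshold from a full psychoacoustic model rather than a constant, so this is an illustration of the bookkeeping, not a working codec.

```python
import numpy as np

# Commonly cited approximate critical-band (Bark) edges in Hz
band_edges = np.array([0, 100, 200, 300, 400, 510, 630, 770, 920, 1080,
                       1270, 1480, 1720, 2000, 2320, 2700, 3150, 3700,
                       4400, 5300, 6400, 7700, 9500, 12000, 15500])

fs = 44_100
t = np.arange(1024) / fs
frame = np.sin(2 * np.pi * 1000 * t) + 0.01 * np.random.randn(t.size)   # tone plus faint noise

power = np.abs(np.fft.rfft(frame * np.hanning(t.size))) ** 2
freqs = np.fft.rfftfreq(t.size, d=1 / fs)

for lo, hi in zip(band_edges[:-1], band_edges[1:]):
    e_signal = power[(freqs >= lo) & (freqs < hi)].sum() + 1e-12
    e_mask = 1e-3                                  # assumed flat masking-threshold energy per band
    smr_db = 10 * np.log10(e_signal / e_mask)
    bits = max(0, int(np.ceil(smr_db / 6.02)))     # roughly 6 dB of SNR per quantizer bit
    print(f"{lo:5.0f}-{hi:<5.0f} Hz  SMR {smr_db:7.1f} dB  ->  {bits} bits")
```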

Dynamic Range Compression

Principles of Dynamic Range Compression

Dynamic range compression is an audio signal processing technique that functions as an automatic gain control, attenuating portions of the signal that exceed a predefined threshold to reduce the overall dynamic range—the difference between the loudest and quietest parts—while preserving the signal's essential content. This process narrows the dynamic span, for example from values exceeding 90 dB in uncompressed audio to as little as 20 dB in heavily processed material, producing more uniform levels without introducing tonal alterations. Unlike data compression methods that minimize digital file sizes for storage efficiency, dynamic range compression manipulates the analog or digital signal in real time to control perceived dynamics.

The technique serves several key purposes in audio production and playback: it prevents clipping by capping peak amplitudes that could distort amplifiers or speakers, improves listenability in inconsistent environments such as broadcast or consumer playback systems, and enhances perceived loudness by redistributing energy across the signal. By evening out extremes, it allows quieter elements to remain audible without requiring excessive overall gain, which might otherwise amplify noise. These benefits are particularly valuable in professional recording, live sound reinforcement, and mastering, where consistent levels contribute to a polished, engaging listening experience.

At its core, the mechanism involves continuously comparing the input signal's level to a threshold; signals below the threshold pass unchanged, but those above it trigger a gain reduction applied in proportion to the excess, yielding an output equal to the input multiplied by a gain factor less than one. This reduction is typically implemented via a variable-gain element, such as a voltage-controlled amplifier, driven by a side-chain detector that analyzes the signal envelope. The mathematical foundation relies on the compression ratio R, defined for levels in decibels (dB) as R = \frac{\text{input level} - \text{threshold}}{\text{output level} - \text{threshold}}, where R = 1 indicates no compression (linear pass-through), ratios below 1 correspond to expansion, and R > 1 applies compression, with higher values yielding stronger attenuation—for instance, a 4:1 ratio means every 4 dB of input above the threshold produces only 1 dB of output increase. The full output level in dB can be expressed as y_{dB} = x_{dB} + c_{dB} + M, where x_{dB} is the input level, c_{dB} is the (negative) compression gain, and M is makeup gain applied to restore the average level.

This process differs fundamentally from other audio effects: equalization modifies frequency-specific balances without affecting overall amplitude dynamics, while limiting enforces an effectively infinite ratio strictly on peak transients to prevent overload, lacking the graduated, sustained adjustment of standard compression. Dynamic range compression instead offers time-varying amplitude control, responding nonlinearly to the signal's envelope over time.

The origins of dynamic range compression trace back to tube-based devices developed in the 1930s for radio broadcasting, where early models such as the 1937 Western Electric 110A and the 1938 RCA Model 96-A used variable-mu tubes to automatically manage signal levels and prevent overmodulation. These vacuum-tube circuits provided the first practical automatic level control for live transmissions and recordings. By the 1960s, the field transitioned to solid-state technology, exemplified by the 1967 UREI 1176 FET compressor/limiter, which introduced faster response times and greater transparency through field-effect transistors, revolutionizing studio applications.

Perceptually, compression increases the sustain and density of sounds by raising decaying note tails relative to their peaks, creating a fuller, more consistent texture that enhances presence in a mix.
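A brief worked example of the static characteristic implied by the ratio definition: with an illustrative threshold of -20 dB, a 4:1 ratio, and 6 dB of makeup gain, an input peak at -8 dB (12 dB over the threshold) comes out at -11 dB, while material below the threshold receives only the makeup gain.

```python
def compress_level_db(x_db, threshold_db=-20.0, ratio=4.0, makeup_db=6.0):
    """Static compressor characteristic: levels at or below the threshold pass
    unchanged; above it, R dB of input excess yields 1 dB of output excess,
    after which makeup gain restores the average level."""
    if x_db <= threshold_db:
        y_db = x_db
    else:
        y_db = threshold_db + (x_db - threshold_db) / ratio
    return y_db + makeup_db

# 12 dB above a -20 dB threshold at 4:1 becomes 3 dB above it (-17 dB);
# +6 dB of makeup gain then brings the peak to -11 dB.
print(compress_level_db(-8.0))    # -11.0
print(compress_level_db(-30.0))   # -24.0 (below threshold, only makeup applied)
```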
However, overuse can reduce the impact of transients—such as drum attacks or plucked-string onsets—leading to a flatter, less lively presentation that diminishes the natural punch and emotional contrast of the audio. Studies confirm that while moderate compression improves clarity and consistency, excessive application often lowers perceived quality by introducing audible pumping or listener fatigue.

Key Parameters of Compressors

The threshold is the signal level, typically measured in dB, above which the compressor begins to reduce gain; for example, a threshold of -20 dB means compression activates only when the input exceeds this value, allowing quieter signals to pass unaffected while louder ones are attenuated. Lowering the threshold engages compression on more of the signal, effectively narrowing the dynamic range, whereas a higher threshold preserves more natural dynamics.

The ratio determines the degree of gain reduction applied once the threshold is exceeded; expressed as a ratio such as 4:1, it indicates that for every 4 dB the input signal rises above the threshold, the output increases by only 1 dB. Higher ratios, such as 10:1 or infinity:1 (used in limiting), provide more aggressive compression, while ratios closer to 1:1 have minimal effect. The knee describes the transition around the threshold: a hard knee applies gain reduction abruptly, creating a sharp change, whereas a soft knee introduces a gradual curve that often sounds more natural by smoothing the onset of compression.

Attack time specifies the duration, usually in milliseconds (e.g., 1-30 ms), for the compressor to reach full gain reduction after the signal exceeds the threshold; a fast attack quickly tamps down peaks to control transients, while a slower attack lets the initial punch pass through before compression engages. Release time defines how long it takes (e.g., 50-500 ms) for the gain to return to unity once the signal falls below the threshold; short releases can cause audible "pumping" artifacts as the gain fluctuates rapidly, whereas longer releases maintain sustain but may dull subsequent notes if not tuned properly. Makeup gain compensates for the overall level reduction caused by compression, typically applied as a fixed output boost (e.g., +5 dB) so that the processed audio matches the perceived loudness of the original.

Additional features enhance flexibility. Sidechain filtering, often a high-pass filter on the detection path, prevents low-frequency content from triggering excessive compression, allowing smoother control over midrange and high frequencies; in de-essing, for example, a 2-8 kHz detection band targets sibilance. Look-ahead delay, a digital capability introducing 1-10 ms of latency, enables pre-detection of peaks for more precise transient control without overshoot. Multiband compression applies independent parameters to separate frequency bands, permitting targeted dynamic control, such as taming low-end punch while preserving treble sparkle.

From the ratio definition, the gain reduction in dB for an input level x_{dB} above the threshold T_{dB} is GR_{dB} = \left(1 - \frac{1}{R}\right)(x_{dB} - T_{dB}), corresponding to a multiplicative gain factor of 10^{-GR_{dB}/20} applied to the signal. Analog compressors, often using tubes or transistors as gain elements, introduce subtle harmonic distortion that imparts a characteristic "warmth" to the audio, while digital implementations provide precise control and features like look-ahead but risk artifacts if not properly oversampled.
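Putting these parameters together, the sketch below implements a simplified feed-forward digital compressor with threshold, ratio, attack, release, and makeup gain. It follows a common textbook structure (instantaneous level detection and one-pole gain smoothing) rather than any particular hardware or plugin design; a production implementation would typically use an RMS or peak envelope detector and model the knee.

```python
import numpy as np

def compress(x, fs, threshold_db=-20.0, ratio=4.0,
             attack_ms=5.0, release_ms=100.0, makeup_db=0.0):
    """Simplified feed-forward compressor for a mono float signal."""
    eps = 1e-10
    level_db = 20 * np.log10(np.abs(x) + eps)        # instantaneous level in dB

    # Static gain computer: desired gain reduction (negative dB) above the threshold
    over = np.maximum(level_db - threshold_db, 0.0)
    gain_db = -over * (1.0 - 1.0 / ratio)

    # Smooth the gain with separate attack and release time constants
    a_att = np.exp(-1.0 / (fs * attack_ms / 1000.0))
    a_rel = np.exp(-1.0 / (fs * release_ms / 1000.0))
    smoothed = np.zeros_like(gain_db)
    g = 0.0
    for n, target in enumerate(gain_db):
        coeff = a_att if target < g else a_rel        # falling gain = attack phase
        g = coeff * g + (1.0 - coeff) * target
        smoothed[n] = g

    return x * 10 ** ((smoothed + makeup_db) / 20.0)

fs = 44_100
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 220 * t) * np.where(t < 0.5, 1.0, 0.1)  # loud half, quiet half
out = compress(signal, fs, threshold_db=-12.0, ratio=4.0)
print(out[:fs // 2].max(), out[fs // 2:].max())   # loud half attenuated, quiet half nearly untouched
```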

Applications of Dynamic Range Compression

In music production, dynamic range compression is widely used to achieve consistent levels across tracks and elements within a mix, often referred to as "gluing" disparate sounds together. For instance, a gentle 2:1 ratio applied to vocals helps maintain audibility and emotional delivery without squashing nuances, ensuring the performance sits evenly in the mix. Bus compression on instrument groups, such as applying it to the drum bus with a moderate ratio and a fast attack, adds punch and cohesion by controlling transients while preserving impact, a technique common in louder genres where aggressive compression (ratios around 4:1 or more) tightens the overall sound for energy and drive. In contrast, classical music production favors subtle or no compression to retain natural dynamics, relying instead on manual fader riding to gently tame extremes and preserve the genre's expressive range.

In broadcasting and podcasting, compression levels speech and program material to comply with international loudness standards, ensuring uniform loudness across transmissions and preventing overmodulation that could interfere with radio signals. The European Broadcasting Union (EBU) R128 standard mandates an integrated program loudness of -23 LUFS, with true peak levels not exceeding -1 dBTP, often achieved through compression to balance dialogue dynamics while maintaining perceptual consistency for viewers. This approach avoids abrupt volume shifts between segments, such as ads and program content, enhancing the listener experience in both television and podcast formats.

For live sound reinforcement, dynamic range compression protects public address (PA) systems from damaging peaks and manages performer variability in real-time environments. High-ratio compression, such as 10:1 on main PA outputs, limits sudden transients from instruments or vocals to safeguard amplifiers and speakers while leaving headroom for the overall mix. On vocals, moderate compression controls dynamics for singers with inconsistent projection, ensuring clarity in noisy venues without feedback issues.

During mastering, compression plays a central role in the "loudness wars," where aggressive processing has progressively reduced the average dynamic range of commercial recordings to maximize perceived volume on playback systems. From the early CD era, when typical dynamic range values hovered around 12-14 dB, they declined to about 6-8 dB by the 2010s through multi-stage compression and brickwall limiting (effectively infinite ratios) that clips peaks near 0 dBFS, boosting average levels for radio and streaming competitiveness. Brickwall limiters further enforce this by preventing digital overs, though at the cost of transient detail.

In consumer applications, dynamic range compression enables seamless playback across devices and platforms, such as auto-gain features in smartphones that normalize incoming audio to prevent jarring level changes during calls or media playback. Streaming services such as Spotify employ loudness normalization targeting -14 LUFS, where compression applied in mastering helps tracks meet this target without additional platform-side processing that could degrade quality, ensuring consistent volume between songs regardless of their original levels. However, excessive dynamic range compression can lead to listener fatigue by eliminating natural ebb and flow, resulting in a monotonous presentation that strains attention over extended sessions and diminishes emotional impact. Metrics from the Dynamic Range Database highlight this trend, showing hyper-compressed albums with DR values below 7 correlating with reduced replay value and complaints of auditory exhaustion.
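The arithmetic behind platform loudness normalization is simple: the service measures a track's integrated loudness and applies a fixed gain offset toward its target, so extra limiting buys no additional playback volume. The sketch below assumes a -14 LUFS target for illustration; measuring LUFS itself follows ITU-R BS.1770 and is not implemented here.

```python
def normalization_gain_db(measured_lufs, target_lufs=-14.0):
    """Gain offset a streaming platform would apply to hit its loudness target."""
    return target_lufs - measured_lufs

# A heavily limited master at -7 LUFS is simply turned down by 7 dB,
# so the loudness-war-style processing gained no playback volume.
print(normalization_gain_db(-7.0))    # -7.0 dB
# A more dynamic master at -16 LUFS is turned up instead.
print(normalization_gain_db(-16.0))   # +2.0 dB
```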
Emerging trends incorporate AI-assisted processing in digital audio workstations (DAWs), such as iZotope Ozone's Unlimiter module, which uses machine learning to adaptively restore transients and expand dynamic range in over-compressed sources, offering suggestions tailored to the source material and intent for more nuanced control.
