
Lossy compression

Lossy compression is a data compression method that reduces the size of data by irreversibly discarding portions of the original information deemed less essential, allowing for a reconstructed version that approximates but does not exactly match the source. This approach achieves significantly higher compression ratios than lossless methods, often 6:1 to 100:1 depending on the data type and quality requirements, by exploiting human perceptual limitations or application-specific tolerances for error. Unlike lossless compression, which preserves all data exactly, lossy techniques prioritize storage and transmission efficiency for bandwidth-constrained scenarios.

The core principle of lossy compression revolves around the rate-distortion tradeoff, where the goal is to minimize distortion (measured by metrics such as mean squared error or perceptual difference scores) for a given bit rate, or equivalently, maximize compression while keeping perceptible quality high. It relies on models of statistical redundancy and perceptual irrelevance; for instance, in visual or auditory signals, subtle details below human sensory thresholds can be removed without noticeable degradation. Quantization plays a central role, mapping continuous or high-precision values to a finite set of discrete levels, often combined with entropy coding for further efficiency. Common techniques include transform coding, such as the discrete cosine transform (DCT) used to concentrate energy in low-frequency components that are prioritized during quantization, and predictive coding like differential pulse code modulation (DPCM) that exploits temporal or spatial correlations. Vector quantization groups data into clusters represented by codebook entries, while subband or wavelet coding decomposes signals into frequency bands for selective compression.

Notable examples are the JPEG standard for images, which applies DCT and quantization to achieve ratios up to 20:1 with acceptable quality, and MP3 for audio, employing perceptual coding to mask inaudible frequencies. Video formats like MPEG extend these by adding motion compensation across frames.

Lossy compression finds widespread applications in digital media, including streaming services, mobile devices, and storage systems, where it enables efficient handling of large volumes of images, audio, and video without prohibitive bandwidth or space demands. Standards such as the ITU-T H.26x family for video conferencing and ISO MPEG-4 for versatile delivery incorporate lossy methods to support scalable coding and progressive refinement. While it introduces irreversible artifacts at high compression levels, its benefits in enabling efficient transmission and broad accessibility outweigh drawbacks for most consumer and professional uses.

Fundamentals

Definition and Principles

Lossy compression is a data compression technique that achieves reduced file sizes by permanently discarding redundant or perceptually irrelevant information from the original data, making exact reconstruction impossible upon decompression. This approach contrasts with lossless methods by prioritizing significant size reduction over perfect fidelity, often achieving compression ratios several times higher while maintaining acceptable perceptual quality for human observers. The core principle underlying lossy compression involves exploiting models of human perception to identify and eliminate data that contributes minimally to the subjective experience, such as subtle details below sensory thresholds.

Key principles of lossy compression include the use of psychoacoustic models for audio data, which account for auditory masking and frequency sensitivity to discard inaudible components, and psychovisual models for visual data, which leverage characteristics like reduced sensitivity to high spatial frequencies or color differences. These principles are grounded in rate-distortion theory, developed by Claude Shannon in 1948, which formalizes the tradeoff between data rate and allowable distortion. The compression process typically unfolds in stages: initial analysis to model perceptual irrelevance, quantization to approximate values with fewer bits, and entropy encoding to further compact the representation. These stages ensure that the discarded information does not substantially impair the perceived quality, guided by empirical studies of human sensory limits.

Foundational concepts and early practical applications of lossy compression emerged in the 1970s, coinciding with the development of early standards, notably adaptive differential pulse-code modulation (ADPCM) for speech audio introduced by researchers at Bell Laboratories. ADPCM exemplified lossy techniques by adaptively quantizing prediction errors in audio signals, achieving efficient bit-rate reduction for telephony applications while introducing controlled distortion.

A basic pipeline for lossy compression can be described as follows: raw input data is transformed into a representation that concentrates energy (e.g., frequency coefficients), quantized to lower precision levels based on perceptual models, and then subjected to entropy coding to generate the final bitstream. For example, in audio processing, lossy compression often removes high-frequency components beyond the typical human hearing limit of about 20 kHz, as these are imperceptible and contribute disproportionately to data volume. Transform coding serves as a prevalent implementation in the transformation stage, reorganizing data to facilitate efficient quantization of less perceptible elements.
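
To make the three-stage pipeline concrete, the following sketch (a toy illustration in Python using NumPy and SciPy, with an arbitrary signal and step size rather than any standardized codec) transforms a correlated signal, quantizes the coefficients, and estimates the coded size from the entropy of the quantized symbols:

```python
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(0)
# Synthetic "smooth" signal with correlated samples, standing in for audio or an image row.
signal = np.cumsum(rng.normal(size=256))

# 1) Analysis/transform: concentrate energy in a few coefficients.
coeffs = dct(signal, type=2, norm="ortho")

# 2) Quantization: map coefficients to discrete levels (the lossy step).
step = 2.0                              # arbitrary step size; larger means more loss
quantized = np.round(coeffs / step)

# 3) Entropy estimate: bits needed for the quantized symbols (stand-in for entropy coding).
_, counts = np.unique(quantized, return_counts=True)
p = counts / counts.sum()
estimated_bits = -(p * np.log2(p)).sum() * quantized.size

# Reconstruct and measure the distortion introduced by quantization.
reconstruction = idct(quantized * step, type=2, norm="ortho")
mse = np.mean((signal - reconstruction) ** 2)
print(f"estimated {estimated_bits:.0f} bits vs {signal.size * 64} bits raw, MSE {mse:.4f}")
```

Increasing the step size lowers the estimated bit count at the cost of a larger reconstruction error, which is the rate-distortion tradeoff in miniature.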

Advantages Over Lossless

Lossy compression achieves significantly higher compression ratios compared to lossless methods, often exceeding 10:1 for multimedia data, which substantially reduces storage requirements and enhances transmission efficiency over networks. This efficiency is particularly beneficial in bandwidth-constrained scenarios, where lossy techniques enable faster transfer without requiring excessive network resources. In practical applications, lossy compression excels in web media delivery, mobile devices, and streaming, where maintaining perceptual quality for human viewers or listeners is sufficient, allowing content providers to serve large audiences with limited bandwidth. For instance, streaming services rely on lossy formats to optimize playback in real-time environments with variable network conditions. While lossy compression introduces irreversible information loss as a necessary compromise for these gains, the discarded data typically falls below human perceptual thresholds, preserving acceptable quality in most use cases. This trade-off also yields energy savings in storage systems, as smaller file sizes decrease the power consumption associated with data handling and retention. A representative quantitative example is in audio, where the MP3 format achieves an approximately 11:1 ratio from uncompressed CD-quality audio, compared to the roughly 2:1 ratio of lossless FLAC, often without noticeable quality degradation for typical listeners. Furthermore, by minimizing data volumes, lossy compression contributes to reduced energy use in data centers, lowering the overall environmental impact through decreased electricity demands for storage and transmission.
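
The MP3 figure follows directly from the bitrates involved; a quick back-of-the-envelope check using the standard CD audio parameters:

```python
# Uncompressed CD audio: 44,100 samples/s x 16 bits/sample x 2 channels.
cd_bitrate = 44_100 * 16 * 2      # 1,411,200 bits per second
mp3_bitrate = 128_000             # a common MP3 setting, bits per second

ratio = cd_bitrate / mp3_bitrate
print(f"compression ratio roughly {ratio:.1f}:1")   # ~11.0:1
```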

Core Techniques

Transform Coding

Transform coding is a foundational technique in lossy compression that converts input data from the spatial or time domain into the frequency domain using a reversible mathematical transform. This process exploits the statistical properties of signals, such as their tendency to have correlated samples, by representing the data as a set of frequency coefficients. A key benefit is the concentration of signal energy into a small number of low-frequency coefficients, while high-frequency components carry less energy and can often be discarded or approximated with minimal perceptual impact. This energy compaction property arises because transforms like the Karhunen-Loève transform (KLT) optimally diagonalize the signal's covariance matrix, decorrelating the coefficients and enabling efficient subsequent processing.

Among the most widely used transforms, the discrete cosine transform (DCT) is prevalent for image and video compression due to its excellent energy compaction for correlated data, closely approximating the performance of the optimal KLT with lower computational complexity. Introduced by Ahmed, Natarajan, and Rao in 1974, the DCT expresses a sequence of N real numbers as a sum of cosine functions oscillating at different frequencies. The one-dimensional type-II DCT, commonly employed in block-based coding, is defined as:

X_k = \sum_{n=0}^{N-1} x_n \cos\left[\frac{\pi}{N} \left(n + \frac{1}{2}\right) k \right], \quad k = 0, 1, \dots, N-1

where x_n are the input samples and X_k are the DCT coefficients. For audio compression, the Modified Discrete Cosine Transform (MDCT), developed by Princen and Bradley in 1987, is favored for its perfect reconstruction capabilities in critically sampled filter banks and overlap-add structures that reduce aliasing artifacts. The MDCT builds on the DCT-IV by incorporating time-domain aliasing cancellation, making it suitable for time-varying signals.

The typical workflow in transform coding begins with applying the forward transform to blocks of input samples to generate coefficients, followed by coefficient selection—often prioritizing low-frequency terms based on their energy content—and then applying an inverse transform at the decoder to reconstruct the signal. This selection step facilitates targeted information loss by focusing coding effort on perceptually significant components. Quantization often follows as the primary lossy mechanism to further reduce coefficient precision. The decorrelation achieved by these transforms simplifies quantization and entropy coding, as independent coefficients require less bitrate for representation compared to the original correlated samples, leading to higher compression ratios without spreading distortion uniformly across the signal.

Historically, the DCT gained prominence through its adoption in the JPEG still image compression standard, finalized in 1992, where it enabled efficient lossy coding of continuous-tone images by processing 8x8 blocks. This standardization demonstrated the practical efficacy of transform coding, influencing subsequent formats in image and video compression.
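
The type-II DCT above can be implemented directly from its definition; the short NumPy sketch below (unnormalized, matching the form given, with an arbitrary smooth test signal) also illustrates the energy-compaction property that motivates transform coding:

```python
import numpy as np

def dct2_type2(x):
    """Unnormalized type-II DCT computed straight from the definition."""
    N = len(x)
    n = np.arange(N)
    return np.array([np.sum(x * np.cos(np.pi / N * (n + 0.5) * k)) for k in range(N)])

# A smooth, correlated input: most energy should land in low-frequency coefficients.
rng = np.random.default_rng(1)
x = np.cos(np.linspace(0, np.pi, 32)) + 0.05 * rng.normal(size=32)

X = dct2_type2(x)
energy = X ** 2
print(f"energy in the 8 lowest of 32 coefficients: {energy[:8].sum() / energy.sum():.3f}")
```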

Quantization and Prediction

Quantization serves as the primary mechanism for introducing controlled loss in lossy compression by mapping continuous or high-precision input values to a finite set of discrete levels, thereby reducing the data representation to fewer bits while inevitably producing errors. This process partitions the input range into intervals, assigning each interval a representative value, known as the reconstruction level, which introduces a quantization error defined as e = x - \hat{x}, where x is the original input and \hat{x} is the quantized output. The error arises because the exact value x is replaced by the nearest level, and its magnitude depends on the interval size and input distribution; for instance, in high-rate approximations, the distortion is roughly proportional to the signal variance times 2^{-2R}, where R is the rate in bits per symbol.

Uniform quantization employs equal-sized intervals across the input range, simplifying implementation and suiting signals with uniform distributions or high signal-to-noise ratios, but it can inefficiently allocate levels for non-uniform signals like speech or images. In contrast, non-uniform quantization uses varying interval sizes, often finer in regions of high perceptual importance—such as low-amplitude signals in audio—to minimize subjective distortion through perceptual weighting, as seen in companding techniques that compress the signal's dynamic range before uniform quantization and expand it afterward. These approaches optimize the quantizer design, such as via the Lloyd-Max algorithm, which iteratively refines reconstruction levels and decision boundaries to minimize mean squared error for a given number of levels.

Prediction enhances lossy compression by exploiting statistical redundancies in data, estimating subsequent values from prior ones to encode only the residuals, which are then quantized to introduce loss. Intra-frame prediction operates spatially within a single frame, using neighboring samples to forecast a current value, such as in image coding where a pixel is predicted from adjacent pixels via linear filters; the residual is quantized and transmitted, reducing the variance of the encoded signal. Inter-frame prediction extends this temporally across frames, predicting from previous reconstructed frames to capture motion or evolution in sequences like video, again quantizing the difference to balance compression and fidelity; this yields prediction gains, for example, up to 1/(1 - r^2) for first-order Markov processes with correlation coefficient r. A classic example is Differential Pulse Code Modulation (DPCM), widely used in speech coding, where a linear predictor estimates the next sample from past ones, quantizes the prediction error, and reconstructs the signal at the decoder, achieving significant bit-rate savings over direct pulse code modulation while introducing granular noise as the primary distortion.

Vector quantization extends scalar methods by treating groups of input samples as multidimensional vectors, mapping them jointly to codebook entries—predefined representative vectors—to exploit inter-sample correlations for greater efficiency. The codebook, a finite set of representative vectors, is designed via clustering algorithms like k-means to minimize average distortion, such as mean squared error, enabling lower rates for equivalent quality compared to scalar quantization. This technique is particularly effective for correlated data like speech parameters or image blocks, though it requires larger codebooks and more complex searches, often mitigated by tree-structured approximations.

Rate-distortion optimization integrates quantization and prediction by systematically balancing the trade-off between distortion D (e.g., mean squared error) and rate R (bits required), ensuring efficient allocation of resources across coding units. This is typically formulated as minimizing the Lagrangian cost J = D + \lambda R, where \lambda is a Lagrange multiplier corresponding to the slope of the convex hull of feasible rate-distortion points, allowing adaptation to constraints like target bit budgets in image or video coding. In practice, it guides decisions such as quantizer selection or mode choice, as applied in standards like the H.26x and MPEG families, to achieve operating points that maximize quality per bit while respecting perceptual or objective fidelity limits.
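
A minimal DPCM loop illustrates how prediction and quantization interact; the sketch below is a simplified first-order example (the predictor coefficient and step size are arbitrary choices for illustration, not from any standard) in which the encoder predicts each sample from the previously reconstructed value so that the decoder can track it exactly:

```python
import numpy as np

def dpcm_encode_decode(x, step=0.5, a=0.95):
    """First-order DPCM: predict from the previous *reconstructed* sample,
    quantize the residual uniformly, and reconstruct as the decoder would."""
    recon = np.zeros_like(x)
    prev = 0.0
    indices = []
    for i, sample in enumerate(x):
        pred = a * prev                            # linear prediction
        q = int(np.round((sample - pred) / step))  # quantize the residual (lossy step)
        indices.append(q)
        prev = pred + q * step                     # decoder-side reconstruction
        recon[i] = prev
    return np.array(indices), recon

rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(scale=0.2, size=200))     # slowly varying, correlated signal
idx, recon = dpcm_encode_decode(x)
print("residual index range:", idx.min(), "to", idx.max())
print("MSE:", np.mean((x - recon) ** 2))
```

Because the residual indices occupy a much smaller range than the raw samples, they can be entropy coded with far fewer bits, which is the prediction gain the text describes.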

Media Applications

Image Compression

Lossy image compression techniques exploit the limitations of human vision to discard data that has minimal impact on perceived quality, enabling significant file size reductions for still images. These methods prioritize preserving luminance detail over color detail and high-frequency spatial content, to which the eye is less sensitive. Quantization serves as the primary mechanism for introducing controlled loss in these processes.

The JPEG standard, formalized in ISO/IEC 10918-1:1994, represents a foundational approach to lossy image compression using discrete cosine transform (DCT) coding. In the encoding pipeline, input images are first converted from RGB to the YCbCr color space to separate luminance (Y) from chrominance (Cb, Cr) components, allowing coarser quantization of color data. The image is divided into 8x8 pixel blocks, each undergoing a forward DCT to transform spatial data into frequency-domain coefficients, emphasizing low-frequency components that carry most visual energy. These coefficients are then quantized using application-defined tables, followed by zigzag scanning to reorder them from low to high frequency for efficient entropy coding. Finally, Huffman coding is applied to the scanned coefficients, with DC values encoded differentially across blocks and AC coefficients using run-length and amplitude coding. JPEG supports baseline sequential mode for straightforward single-scan encoding of 8-bit images with 1-4 components, and progressive mode for multi-scan transmission that refines image quality gradually by spectral selection (grouping frequency bands) and successive approximation (bit-plane refinement). Compression levels are controlled via a quality factor on a 1-100 scale, where higher values reduce quantization scaling to retain more detail, while lower values increase it for greater compression—though even at 100, minor rounding losses occur. Typical quality settings of 75-90 balance file size and visual fidelity for photographic images.

Common artifacts in JPEG-compressed images include blocking, visible as grid-like discontinuities at block boundaries due to independent quantization, and ringing, oscillatory distortions around sharp edges from the truncation of high-frequency coefficients in the inverse DCT. These are exacerbated at low bit rates, degrading perceived quality in smooth or high-contrast regions. Mitigation often involves post-processing with deblocking filters that adaptively smooth block edges based on local variance, or deringing filters that suppress high-frequency oscillations while preserving edges; such techniques can improve PSNR by 1-3 dB without altering the core bitstream.

Subsequent standards build on these principles for enhanced efficiency. WebP, introduced by Google in 2010 and based on the VP8 format described in RFC 6386, employs intra-frame coding for lossy compression, using block prediction from neighboring pixels, DCT on residuals, and arithmetic coding to achieve 25-34% smaller files than JPEG at equivalent quality. HEIF (High Efficiency Image Format), defined in ISO/IEC 23008-12:2017, uses HEVC (H.265) intra-frame encoding within an ISO base media file format container, supporting features like layered images and transparency for up to 50% better compression than JPEG. More recently, AVIF (AV1 Image File Format), specified by the Alliance for Open Media in 2019 and registered as image/avif by IANA, leverages AV1 video codec intra-frames in a HEIF container, offering superior web efficiency with 20-50% size reductions over JPEG and growing adoption in browsers since 2020 for HDR and wide-color-gamut images. JPEG XL, standardized as ISO/IEC 18181-1:2022 by the Joint Photographic Experts Group, introduces a modern royalty-free format supporting both lossy and lossless compression with improved efficiency over JPEG (up to 60% size reduction at similar quality) and features like HDR, animation, and lossless recompression of legacy JPEG files. It uses a transform-based coding pipeline with tools such as variable-size DCTs and adaptive quantization, and as of 2025 it sees growing use in professional imaging, though browser support remains partial.
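
The per-block JPEG-style processing can be sketched compactly; the example below (Python with SciPy, luminance only, a single 8x8 block, and the widely cited example luminance quantization table scaled by an arbitrary factor) performs the level shift, forward DCT, quantization, and reconstruction, omitting zigzag ordering and Huffman coding:

```python
import numpy as np
from scipy.fft import dctn, idctn

# Example 8x8 luminance quantization table (as published in the JPEG specification's
# informative annex); the exact entries are not critical for this illustration.
Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99]], dtype=float)

def jpeg_like_block(block, scale=1.0):
    """Forward DCT -> quantize -> dequantize -> inverse DCT on one 8x8 block."""
    shifted = block.astype(float) - 128.0           # level shift as in JPEG
    coeffs = dctn(shifted, type=2, norm="ortho")    # 2-D DCT-II
    quantized = np.round(coeffs / (Q * scale))      # the lossy step
    recon = idctn(quantized * (Q * scale), type=2, norm="ortho") + 128.0
    return quantized, np.clip(np.round(recon), 0, 255)

block = np.tile(np.linspace(60, 200, 8), (8, 1))    # smooth gradient block
q, recon = jpeg_like_block(block, scale=2.0)        # larger scale = lower quality
print("nonzero coefficients kept:", np.count_nonzero(q), "of 64")
print("max pixel error:", np.abs(block - recon).max())
```

On smooth content most quantized coefficients become zero, which is exactly what makes the subsequent run-length and Huffman stages effective.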

Video Compression

Video compression is a cornerstone of lossy compression techniques applied to moving images, exploiting both spatial redundancy within frames and temporal redundancy across frames to achieve significant data reduction while maintaining perceptual quality. Unlike still image compression, which operates on individual pictures, video codecs incorporate inter-frame prediction to model motion, allowing for bitrates as low as 1-4 Mbps for high-definition content in modern standards. This approach is essential for streaming, broadcasting, and storage, where uncompressed raw video can exceed 100 Mbps per stream.

The MPEG family of standards, developed by the Moving Picture Experts Group under ISO/IEC, forms the backbone of video compression. MPEG-2, standardized in 1995, enabled digital television and DVD with ratios up to 50:1 for standard-definition video, supporting bitrates from 1.5 to 15 Mbps. H.264/AVC (Advanced Video Coding), jointly developed by ITU-T VCEG and MPEG and finalized in 2003, introduced more efficient tools like variable block sizes and improved entropy coding, achieving 50% bitrate savings over MPEG-2 at equivalent quality, and is widely used in Blu-ray, streaming, and mobile video. H.265/HEVC (High Efficiency Video Coding), released in 2013, further advances this with larger coding tree units and advanced motion vector prediction, offering up to 50% better compression than H.264 for 4K and 8K resolutions, though at higher computational cost. Complementing these, AV1, developed by the Alliance for Open Media and released in 2018, is a royalty-free alternative that rivals HEVC's efficiency with up to 30% bitrate reduction over H.264, gaining adoption in web video platforms like YouTube due to its open-source nature. Versatile Video Coding (VVC/H.266), standardized by ITU-T and ISO/IEC in 2020, builds on HEVC with enhanced tools for higher resolutions and immersive media, providing up to 50% bitrate savings over HEVC at equivalent quality for 8K and beyond, though requiring significantly more encoding/decoding power. As of 2025, VVC sees increasing deployment in professional broadcasting, streaming, and hardware like set-top boxes, supported by profiles targeting low-latency and high-fidelity use cases.

Central to these standards is motion estimation and compensation, which predict frame content from previous or future frames to minimize residual data. Block matching divides frames into macroblocks (typically 16x16 pixels) and searches for the best-matching block in a reference frame, formalized as minimizing the sum of absolute differences (SAD):

\min_{(mv_x, mv_y)} \sum_{(x, y)} \left| f_t(x, y) - f_{t-1}(x + mv_x, y + mv_y) \right|

where f_t is the current frame at time t, f_{t-1} is the reference frame, and (mv_x, mv_y) is the motion vector. This process, often refined with quarter-pixel accuracy in H.264 and later, captures object movement efficiently. Frames are classified as I-frames (intra-coded, self-contained like still images), P-frames (predicted from prior frames), and B-frames (bi-directionally predicted from past and future frames), with B-frames providing the highest compression by referencing multiple reference frames but increasing decoding latency. Transform coding and quantization, similar to intra-frame methods, are applied to residuals after motion compensation.

Standards define profiles and levels to balance complexity and performance. The Baseline profile in H.264 suits low-latency applications like video conferencing with simpler entropy coding and no B-frames, while the High profile adds features like 8x8 transforms and CABAC for superior efficiency in broadcast and storage. HEVC extends this with Main and Main 10 profiles for 10-bit support, and later codecs offer similar tiers for progressive enhancement. Bitrate control mechanisms further optimize delivery: constant bitrate (CBR) maintains steady output for live streaming to avoid buffering, whereas variable bitrate (VBR) allocates more bits to complex scenes for consistent quality, often using two-pass encoding where the first pass analyzes the video and the second encodes accordingly.
Despite these advances, lossy video compression introduces visible artifacts. Motion blur arises from inaccurate motion estimation in fast-moving scenes, smearing details across frames, while mosquito noise manifests as ringing or halos around edges due to quantization of high-frequency components in motion-compensated residuals. These artifacts are more pronounced at low bitrates, prompting perceptual models in modern codecs to prioritize the regions to which human vision is most sensitive.
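
The block-matching search formalized above can be implemented as a brute-force SAD minimization; the sketch below (NumPy, with illustrative block size and search range, and a synthetic shifted frame standing in for real motion) shows the basic search that practical encoders accelerate with hierarchical and sub-pixel refinements:

```python
import numpy as np

def block_match(cur, ref, bx, by, bs=16, search=8):
    """Exhaustive block matching: find the motion vector minimizing the sum of
    absolute differences (SAD) between a block in the current frame and
    candidate blocks in the reference frame."""
    block = cur[by:by + bs, bx:bx + bs]
    best_mv, best_sad = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + bs > ref.shape[1] or y + bs > ref.shape[0]:
                continue
            sad = np.abs(block - ref[y:y + bs, x:x + bs]).sum()
            if sad < best_sad:
                best_sad, best_mv = sad, (dx, dy)
    return best_mv, best_sad

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, (64, 64)).astype(float)
# Shift the content so that the true displacement of the block is (dx=+2, dy=-3).
cur = np.roll(ref, shift=(3, -2), axis=(0, 1))
mv, sad = block_match(cur, ref, bx=24, by=24)
print("estimated motion vector (dx, dy):", mv, "with SAD:", sad)
```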

Audio Compression

Lossy audio compression leverages the limitations of human auditory perception, particularly through psychoacoustic principles that allow the removal of inaudible signal components while preserving perceived quality. Central to this approach are critical bands, which represent frequency ranges where the ear's resolution is roughly constant, modeled on scales like the Bark or Equivalent Rectangular Bandwidth (ERB) scales. These bands, numbering about 24 for audible frequencies, enable efficient encoding by grouping spectral energy and focusing compression on perceptually relevant details. Psychoacoustic models analyze the input signal to identify masked regions, ensuring quantization noise falls below auditory thresholds.

Frequency masking, or simultaneous masking, occurs when a louder sound (masker) at frequency f_m renders quieter sounds (maskees) nearby inaudible within the same or adjacent bands due to the ear's limited frequency selectivity. The masking effect spreads asymmetrically: stronger toward higher frequencies (upward spread) and weaker downward, quantified by a spreading function that raises the threshold of hearing. In MPEG standards, this is approximated on the Bark scale z, where the masking threshold T_q(z) for a maskee at z due to a masker at z_m follows a form like T_q(z) = a \cdot 10^{b(z - z_m)} for z > z_m, with parameters a and b derived from empirical data (the threshold decays by roughly 10-15 dB per Bark above the masker and more steeply, roughly 25-27 dB per Bark, below it). This allows bit allocation to prioritize unmasked frequencies.

Temporal masking complements frequency masking by exploiting the ear's sluggish response to rapid changes: a loud sound elevates the hearing threshold for subsequent (post-masking, up to 200 ms) or preceding (pre-masking, 5-20 ms) quieter sounds in the same frequency range. The temporal spreading function models this decay, often exponentially, as T_t(t) = T_m \cdot e^{-t / \tau}, where \tau is a time constant varying with signal level (longer for sustained tones). Combined, these masking effects guide noise shaping in perceptual coders, confining quantization errors to imperceptible regions.

The MPEG-1 Audio Layer III (MP3) standard, finalized in 1993, exemplifies these principles in a widely adopted format for general audio. It employs a hybrid filterbank: a 32-subband polyphase filterbank followed by a modified discrete cosine transform (MDCT), yielding 576 or 192 spectral lines and providing fine frequency resolution (down to about 41.67 Hz) alongside short blocks for transient handling and pre-echo avoidance. The psychoacoustic model (Model 1 or 2) computes masking thresholds via FFT analysis, identifies tonal and noise-like maskers, and allocates bits to subbands based on signal-to-masking ratios (SMR), keeping quantization noise below thresholds using Huffman-coded quantized MDCT coefficients. Bit rates are dynamically adjusted via a bit reservoir mechanism.

For stereo audio, MP3 incorporates joint stereo coding to exploit inter-channel redundancies. Intensity stereo encodes high-frequency bands with a single mono signal modulated by channel-specific intensity factors, preserving spatial cues without full separation. Mid-side (M/S) coding transforms left-right channels into sum (mid) and difference (side) signals, quantizing the often low-energy side channel more coarsely while reconstructing the stereo image at decoding. These techniques reduce bitrate needs by 20-30% at low rates without perceptual loss.

Advanced Audio Coding (AAC), defined in ISO/IEC 14496-3 (MPEG-4 Part 3) as MP3's successor, enhances these methods for better efficiency at low bitrates. AAC uses a pure MDCT filterbank with longer windows (1024-2048 samples) for improved frequency resolution, a more sophisticated psychoacoustic model incorporating temporal masking, and tools like perceptual noise substitution for noisy signals. It supports multichannel audio and variable rates, achieving transparent quality at lower bitrates than MP3. Opus, standardized as RFC 6716 by the IETF in 2012, is a versatile royalty-free codec for both speech and music, using a hybrid SILK (linear prediction, for speech) and CELT (MDCT, for music) structure with dynamic switching based on content. It supports bitrates from 6 to 510 kbit/s, frame sizes as low as 2.5 ms for low latency, and features like in-band forward error correction and packet loss concealment, outperforming older codecs such as MP3 in quality at bitrates below 128 kbit/s; it is widely adopted in WebRTC, VoIP, and streaming services as of 2025. Typical bitrates for near-CD quality (44.1 kHz, 16-bit stereo) in these formats are around 128 kbps, where artifacts are minimal for most listeners, balancing file size and fidelity; higher rates like 192-256 kbps approach transparency.
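
The exponential post-masking model above can be evaluated directly; the toy calculation below simply plugs illustrative numbers (the masker level, time constant, and hearing floor are assumptions, not calibrated psychoacoustic data) into T_t(t) = T_m e^{-t/\tau} to estimate how long a quieter sound remains masked:

```python
import numpy as np

def post_masking_threshold(t_ms, masker_level_db, tau_ms=50.0, floor_db=20.0):
    """Exponential decay of the masking threshold after a masker stops,
    clipped at an assumed absolute threshold of hearing (floor)."""
    decayed = masker_level_db * np.exp(-t_ms / tau_ms)
    return np.maximum(decayed, floor_db)

t = np.arange(0, 201, 10.0)                      # milliseconds after the masker ends
threshold = post_masking_threshold(t, masker_level_db=80.0)
quiet_sound_db = 40.0                            # level of a following quiet sound
audible = t[threshold < quiet_sound_db]
print("quiet sound stays masked for about", int(audible[0]) if audible.size else 200, "ms")
```

Real encoders combine such temporal estimates with the frequency-domain spreading function to decide, per band and per block, how much quantization noise can be hidden.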

Specialized Applications

Speech and 3D Graphics

Lossy compression for speech signals primarily relies on parametric models that synthesize voice based on vocal tract characteristics rather than directly encoding waveforms, enabling efficient representation at low bitrates suitable for real-time transmission. Linear Predictive Coding (LPC) forms the foundation of many such techniques by modeling the spectral envelope of speech through a linear prediction filter that estimates current samples from past ones, capturing formants essential to speech intelligibility. The core LPC synthesis equation is given by \hat{s}(n) = \sum_{k=1}^p a_k s(n-k) + G u(n), where \hat{s}(n) is the predicted speech sample, a_k are the predictor coefficients, p is the prediction order (typically 10-12 for speech), G is the gain, and u(n) is the excitation signal. This approach discards fine waveform details in favor of parameters that can be quantized, achieving compression ratios far beyond waveform coders while preserving perceived quality.

Building on LPC, code-excited linear prediction (CELP) enhances synthesis by using a codebook to select an optimal excitation sequence that minimizes prediction error through analysis-by-synthesis optimization, allowing high-quality speech at bitrates as low as 4.8 kbps. Modern standards like Opus, standardized in 2012, integrate linear-prediction-based methods (via its SILK component) for speech bandwidths up to 20 kHz, operating effectively in the 6-24 kbps range for voice applications, and support hybrid modes for both speech and music. Similarly, the Enhanced Voice Services (EVS) codec, developed by 3GPP in 2014, employs a CELP core with super-wideband extension up to 20 kHz, targeting 5.9-24 kbps for conversational quality in mobile networks, with quantization applied to LPC parameters and codebook indices to balance bitrate and quality. These trade-offs prioritize intelligibility over exact reproduction, as minor parameter distortions remain imperceptible in voiced segments but can introduce artifacts at very low bitrates. Quantization of these parameters further reduces data by mapping continuous values to discrete levels, typically using vector quantization for efficiency. Such speech compression techniques find primary application in Voice over IP (VoIP) systems, where low-latency encoding at constrained bitrates ensures reliable transmission over packet-switched networks without excessive bandwidth demands.

For 3D graphics, lossy compression addresses the high storage and transmission costs of polygonal meshes and associated textures by approximating geometry and visuals while maintaining interactive rendering quality. Mesh simplification reduces vertex count through edge-collapse operations guided by error metrics like distance to the original surface, enabling progressive transmission and level-of-detail adjustments with minimal perceptual loss in complex models. Texture compression employs block-based methods, such as BC7, which partitions 4x4 texel blocks into subsets and uses endpoint interpolation with indices for high-fidelity RGB/RGBA encoding at 8 bits per texel, supporting Direct3D 11 and later APIs for real-time graphics. Google's Draco library, released in 2017, combines predictive geometry encoding with entropy coding for meshes and point clouds, achieving up to 90% size reduction for typical assets while preserving visual fidelity through edgebreaker traversal and quantization of positions and attributes.
Trade-offs in 3D compression emphasize visual fidelity over geometric precision, as small vertex perturbations or texture approximations are often imperceptible in rendered scenes, particularly under shading and lighting; for instance, Draco balances compression speed and ratio via tunable quantization levels. These methods are critical for virtual reality (VR) and augmented reality (AR) applications, where compressed 3D models enable efficient streaming of immersive environments over bandwidth-limited connections.
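
To make the LPC formulation concrete, the sketch below (plain NumPy, autocorrelation method solved with a direct linear solve rather than the usual Levinson-Durbin recursion; the frame length and order are arbitrary choices) fits predictor coefficients to a synthetic voiced-like frame and shows how much smaller the residual energy is than the signal energy, which is what makes quantizing only the parameters and residual so efficient:

```python
import numpy as np

def lpc_coefficients(frame, order=10):
    """Autocorrelation-method LPC: solve the normal equations R a = r."""
    r = np.array([np.dot(frame[:len(frame) - k], frame[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:order + 1])

def lpc_residual(frame, a):
    """Prediction residual e(n) = s(n) - sum_k a_k s(n-k)."""
    order = len(a)
    pred = np.zeros_like(frame)
    for k in range(1, order + 1):
        pred[order:] += a[k - 1] * frame[order - k:len(frame) - k]
    return frame[order:] - pred[order:]

# Synthetic voiced-like frame: a decaying resonance plus a little noise.
rng = np.random.default_rng(0)
n = np.arange(240)
frame = np.sin(2 * np.pi * 0.03 * n) * np.exp(-n / 400) + 0.01 * rng.normal(size=n.size)

a = lpc_coefficients(frame, order=10)
e = lpc_residual(frame, a)
print("residual/signal energy ratio: %.4f" % (np.sum(e ** 2) / np.sum(frame[10:] ** 2)))
```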

Scientific and Other Data

Lossy compression plays a crucial role in managing the vast volumes of numerical data generated by scientific simulations and observations, where storage and transmission constraints are severe, yet fidelity to underlying physical phenomena must be maintained. Applications span climate modeling, genomics, and astronomical data, often employing error-bounded techniques to ensure that compression-induced errors do not compromise downstream analyses. For instance, in climate modeling, lossy compression reduces data volumes from high-resolution simulations while preserving key statistical properties like means and variability patterns. Similarly, in genomics, it targets quality scores in sequencing data to enable efficient storage without significantly affecting variant calling accuracy. For astronomical data, such methods compress images while safeguarding photometry results essential for source detection. The SZ compressor exemplifies error-controlled approaches, providing pointwise absolute or relative error bounds for floating-point scientific datasets across simulations and instruments.

Key techniques include autoencoders for dimensionality reduction, which learn compact latent representations of high-dimensional scientific arrays, and floating-point quantization with user-specified tolerances to approximate values within acceptable error margins. Autoencoders, particularly hierarchical variants, achieve substantial compression for large-scale simulation outputs by reconstructing data with controlled distortion. Quantization methods, such as block floating-point schemes, scale and round values to lower precision levels, ensuring errors remain below predefined thresholds suitable for numerical analysis. Standards like ZFP, a library for compressed floating-point arrays, support high-throughput compression with fixed-rate or error-bounded modes optimized for spatially correlated data from physics simulations. MGARD, a multigrid-based framework, enables multilevel refactoring and decomposition with guaranteed error control, applicable to structured and unstructured meshes in scientific workflows.

Evaluation metrics emphasize relative error, defined as \frac{|x - \hat{x}|}{|x|} < \epsilon for original value x and reconstruction \hat{x}, ensuring proportional accuracy across data scales common in scientific domains. This metric underpins relative-error-bounded compressors like SZ, which adapt to data magnitudes for consistent quality. In the 2020s, efforts have intensified around exascale computing, where lossy methods address I/O bottlenecks in petabyte-scale simulations by integrating with HPC workflows for in-situ compression. A primary challenge lies in balancing scientific accuracy with aggressive size reduction, as ratios up to 100:1 can be achieved—such as with ZFP on correlated floating-point fields—but require careful error tuning to avoid altering physical insights or statistical validity. Prediction-based techniques for time-series data, such as those applied to successive simulation outputs, can further enhance ratios by exploiting temporal correlations within error bounds.
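
A toy relative-error-bounded quantizer shows how the bound above can be enforced per value; the sketch below uses logarithmic quantization under an assumed bound epsilon (real compressors such as SZ add prediction and entropy coding on top of a step like this):

```python
import numpy as np

def rel_error_quantize(x, eps=1e-2):
    """Quantize magnitudes on a logarithmic grid so that |x - x_hat| / |x| <= eps
    for all nonzero x; zeros are passed through unchanged."""
    step = np.log1p(eps)                       # log-domain step derived from the bound
    sign = np.sign(x)
    mag = np.abs(x)
    idx = np.round(np.log(np.where(mag > 0, mag, 1.0)) / step).astype(int)
    x_hat = np.where(mag > 0, sign * np.exp(idx * step), 0.0)
    return idx, x_hat

rng = np.random.default_rng(0)
data = rng.lognormal(mean=0.0, sigma=3.0, size=100_000)   # values spanning many scales
idx, recon = rel_error_quantize(data, eps=1e-2)
rel_err = np.abs(data - recon) / np.abs(data)
print("max relative error:", rel_err.max(), "(requested bound 1e-2)")
print("distinct quantization bins used:", np.unique(idx).size)
```

Because the grid is logarithmic, large and small values receive the same proportional accuracy, which is exactly the property the relative-error metric rewards.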

Evaluation

Information Loss and Transparency

In lossy compression, information loss primarily occurs through the irreversible removal of elements that are imperceptible to human sensory perception, such as subtle spatial variations in images or inaudible frequency components in audio signals. This approach exploits perceptual redundancies, discarding details below the thresholds of human vision or hearing while preserving essential structural and semantic content. The discarded information cannot be recovered upon decompression, distinguishing lossy methods from lossless ones, but the loss is engineered to minimize noticeable degradation.

A key concern is the accumulation of loss across multiple compression-decompression cycles, known as generation loss, where artifacts from initial encoding propagate and amplify in subsequent generations. This compounding effect arises because each cycle introduces additional quantization errors or approximations, leading to progressive distortion that becomes more perceptible over iterations, particularly in formats like JPEG for images or MP3 for audio.

Transparency refers to the bitrate or quality level at which the compressed output is perceptually indistinguishable from the original, meaning no audible or visible differences can be detected under typical conditions. For example, in audio coding, Advanced Audio Coding (AAC) achieves transparency at approximately 192 kbps for stereo signals in many listening scenarios, balancing file size with fidelity. This threshold varies by content and listener but represents the "transparent bitrate" where further increases yield diminishing perceptual returns.

Objective metrics quantify information loss by comparing the original and reconstructed signals. The peak signal-to-noise ratio (PSNR) measures fidelity as the ratio of the maximum signal power to the power of corrupting noise, calculated as \text{PSNR} = 10 \log_{10} \left( \frac{\text{MAX}^2}{\text{MSE}} \right), where MAX is the maximum possible signal value and MSE is the mean squared error between original and compressed versions; higher PSNR values indicate less loss, with typical ranges of 30–50 dB for acceptable quality in images and video. Another metric, the Structural Similarity Index (SSIM), evaluates perceived changes in luminance, contrast, and structure, defined as \text{SSIM}(x, y) = [l(x, y)] \cdot [c(x, y)] \cdot [s(x, y)], with l(x, y) = \frac{2\mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}, c(x, y) = \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}, and s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3}, where \mu denotes means, \sigma variances and covariance, and C stabilization constants; SSIM values near 1 signify high structural similarity.

Perceptual models guide loss minimization by incorporating human visual or auditory sensitivities, particularly through Just Noticeable Difference (JND) thresholds, which define the minimum distortion level undetectable by observers. JND-based approaches, such as those modeling contrast masking or luminance adaptation, allow compressors to allocate bits preferentially to perceptible regions, enabling up to 15–20% bitrate savings without quality loss in image and video applications. Subjective evaluation complements objective metrics via the mean opinion score (MOS), a standardized scale from 1 (bad) to 5 (excellent) derived from human listener or viewer ratings in controlled tests. MOS assesses overall perceptual quality, accounting for nuances like fatigue or context that metrics like PSNR overlook, and is integral to validating transparency in audio and video compression standards.
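
Both metrics translate directly into code; the sketch below computes PSNR and a simplified single-window SSIM from global statistics (standard SSIM averages the index over local windows, so this is an approximation for illustration only):

```python
import numpy as np

def psnr(orig, recon, max_val=255.0):
    """Peak signal-to-noise ratio in dB for 8-bit images."""
    mse = np.mean((orig.astype(float) - recon.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(max_val ** 2 / mse)

def ssim_global(x, y, max_val=255.0):
    """Single-window SSIM using global means/variances (a simplification)."""
    x, y = x.astype(float), y.astype(float)
    C1, C2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + C1) * (2 * cov + C2)) / ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))

rng = np.random.default_rng(0)
orig = rng.integers(0, 256, (64, 64)).astype(float)
recon = np.clip(orig + rng.normal(scale=5.0, size=orig.shape), 0, 255)
print(f"PSNR: {psnr(orig, recon):.1f} dB, SSIM (global): {ssim_global(orig, recon):.3f}")
```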

Compression Ratios and Efficiency

The compression ratio in lossy compression is defined as the ratio of the original data size to the compressed data size, quantifying the reduction in storage or transmission requirements. For images, typical ratios range from 10:1 to 50:1 depending on quality settings and content, as seen in JPEG where medium-quality encoding often achieves around 10:1 to 20:1 without severe degradation. In video, ratios can extend to 20:1 to 200:1 for standards like MPEG-4, balancing bitrate and perceptible quality.

Efficiency is commonly measured using bits per pixel (BPP) for images, which represents the average number of bits needed to represent each pixel after compression; lower BPP values indicate higher efficiency, such as reducing from 24 BPP in uncompressed RGB to 1-4 BPP in compressed formats like JPEG. For video codecs, the Bjøntegaard Delta rate (BD-rate) provides a standardized comparison by integrating rate-distortion curves to compute average bitrate savings at equivalent quality levels, often expressed as a percentage improvement over a reference codec.

Compared to lossless compression, which typically yields 2:1 to 5:1 ratios for media data while preserving all information, lossy methods achieve 5-50 times higher ratios by discarding perceptually irrelevant details, though this introduces irreversible loss. Efficiency varies with content dependency; smooth gradients in images or low-motion videos compress more effectively (higher ratios) than noisy or high-detail content due to better predictability of the signal. Computational complexity also influences practical efficiency, as advanced codecs like HEVC (H.265) offer about 50% bitrate savings over H.264 at similar quality but require 2-10 times more encoding time owing to larger block sizes and more prediction modes. Recent benchmarks highlight AV1's gains, delivering approximately 30% better compression efficiency than HEVC (negative 30% BD-rate) across diverse content from 2018 to 2025 evaluations, while VVC (Versatile Video Coding, H.266) provides additional 20-40% efficiency improvements over HEVC as of 2025, often outperforming AV1 in high-resolution scenarios. These ratios are optimized near transparency thresholds where further compression yields diminishing returns in quality preservation.
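
The ratio and bits-per-pixel figures come straight from the sizes involved, as in this quick calculation (the compressed size is an assumed example value):

```python
width, height = 1920, 1080
raw_bytes = width * height * 3            # 24-bit RGB, uncompressed
jpeg_bytes = 310_000                      # assumed example compressed file size

ratio = raw_bytes / jpeg_bytes
bpp = jpeg_bytes * 8 / (width * height)
print(f"compression ratio ~{ratio:.0f}:1, {bpp:.2f} bits per pixel (down from 24 BPP raw)")
```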

Practical Challenges

Editing and Transcoding

Editing lossy compressed media introduces significant challenges due to the need for decoding and subsequent re-encoding, which exacerbates compression artifacts through a process known as generational loss. In image editing, for instance, operations such as cropping or resizing a JPEG file require decompressing the image, applying modifications, and recompressing it, often at the same or lower quality level. This re-compression amplifies visible artifacts like blocking, where pixelation appears along 8x8 block boundaries, as the quantization errors from the initial compression interact with new transformations. Additionally, editing can lead to color shifts, particularly in regions with subtle gradients, where chroma subsampling and requantization introduce inaccuracies that propagate across cycles. Similar issues arise in audio editing; altering pitch in lossy compressed files, such as those using MP3 or AAC, can amplify quantization noise, as frequency-domain modifications redistribute errors across the spectrum, making subtle distortions more audible in the altered signal.

Transcoding, the conversion of media from one lossy format to another, compounds these problems by necessitating a full decode-encode cycle, which introduces cumulative distortions. For video, converting from one compressed format to another involves decoding the source stream and re-encoding it, leading to drift accumulation where prediction errors from motion compensation and residual quantization propagate across frames, causing temporal inconsistencies like blurring or ghosting in motion-heavy scenes. This drift arises because the decoder's reconstructed frames deviate from the original, and subsequent encoding builds predictions on these imperfect references, resulting in error buildup over multiple generations. In cascaded compression scenarios, such as repeated transcoding for distribution across platforms, these effects intensify, reducing overall quality even if the target bitrate remains constant.

To mitigate generational loss, workflows often incorporate non-destructive techniques, where modifications are stored as metadata or layered adjustments without altering the underlying compressed data until final export. Working with uncompressed or lossless intermediate formats during editing preserves original quality, avoiding intermediate re-compressions. Proxy workflows further address these issues by generating low-resolution, lightweight versions of high-quality source files for editing; these proxies undergo any necessary re-compressions without affecting the originals, which are linked and substituted only during final rendering to minimize artifact accumulation.
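
Generational loss is straightforward to reproduce; the sketch below (Python with the Pillow library assumed available, an arbitrary quality setting, and a synthetic gradient image) repeatedly re-encodes the same picture and reports the error against the original:

```python
import io
import numpy as np
from PIL import Image

# Synthetic RGB image with smooth gradients, where re-compression artifacts show up.
x = np.linspace(0, 255, 256, dtype=np.uint8)
img = np.stack([np.tile(x, (256, 1)),                  # gradient along width
                np.tile(x[:, None], (1, 256)),         # gradient along height
                np.full((256, 256), 128, np.uint8)],   # flat blue channel
               axis=-1)
original = img.astype(float)

current = Image.fromarray(img, "RGB")
for generation in range(1, 11):
    buf = io.BytesIO()
    current.save(buf, format="JPEG", quality=75)       # one decode/re-encode cycle
    buf.seek(0)
    current = Image.open(buf).convert("RGB")
    if generation in (1, 5, 10):
        err = np.mean((np.asarray(current, dtype=float) - original) ** 2)
        print(f"generation {generation}: MSE vs original = {err:.2f}")
```

In practice the degradation is worst when edits such as resizing or color adjustments occur between re-encodes; re-saving an unchanged image at a fixed quality tends to stabilize after the first few generations.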

Scalability and Resolution Adjustment

Lossy compression techniques often incorporate scalability to allow adaptation of the compressed data to varying network conditions, device capabilities, or user preferences without requiring complete re-encoding. This is achieved through layered bitstream structures that enable partial decoding for lower resolutions, frame rates, or quality levels. In image compression, Progressive JPEG exemplifies this by organizing the DCT coefficients into multiple scans, permitting a coarse approximation of the image to be displayed first, with successive scans refining the detail.

For video, Scalable Video Coding (SVC), an extension to H.264/AVC defined in Annex G of ITU-T Recommendation H.264, introduces spatial, temporal, and quality (SNR) scalability through a base layer and enhancement layers. The base layer provides a low-resolution or low-quality version compatible with standard H.264 decoders, while enhancement layers add higher resolution (spatial scalability, e.g., from quarter to full size), higher frame rates (temporal scalability via hierarchical B-frames), or reduced quantization noise (SNR scalability). This layered approach allows extraction of subsets of the bitstream for targeted decoding, reducing bandwidth needs by up to 50% in adaptive scenarios compared to simulcasting multiple independent streams. The Scalable High Efficiency Video Coding (SHVC) extension to HEVC (H.265), specified in Annexes F and G of ITU-T Recommendation H.265, builds on this with improved efficiency, supporting spatial scaling ratios of 1.5x or 2x between layers and SNR scalability through medium-grain or coarse-grain quality refinement. Enhancement layers in SHVC use inter-layer prediction, such as upsampling the base layer via dedicated filters specified in the standard for spatial alignment, to minimize redundancy while preserving compression gains of 30-50% over non-scalable HEVC for multi-resolution delivery.

These scalability features enable dynamic adjustment, such as downsampling resolution by halving width and height to fit lower bitrates, followed by upscaling at the decoder using interpolation methods to approximate higher quality. In practice, adaptive streaming protocols like Dynamic Adaptive Streaming over HTTP (DASH), standardized in ISO/IEC 23009-1, leverage scalable bitstreams to switch layers in real time based on available bandwidth, ensuring seamless playback across devices. For instance, DASH segments can include multiple representations, allowing clients to select appropriate scalability layers without re-encoding. However, non-scalably designed coders can introduce mismatch artifacts, such as drift between encoder and decoder predictions, leading to accumulating errors in enhancement layers if inter-layer references misalign. This drift, exacerbated in SNR scalability, can manifest as blocking or blurring artifacts, requiring careful mode decisions to limit overhead to under 10% in SVC/SHVC. Modern codecs like AV1 support spatial and temporal scalability through multi-layer tiling and temporal sublayers, but SNR scalability remains limited, often relying on external enhancements rather than native fine-grained layers.
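
A simplified version of the representation selection that adaptive streaming clients perform is sketched below; the bitrate ladder, names, and safety margin are hypothetical values for illustration, not taken from any standard manifest:

```python
from dataclasses import dataclass

@dataclass
class Representation:
    name: str
    bitrate_kbps: int     # average encoded bitrate of this layer/representation
    height: int           # spatial resolution

# Hypothetical ladder of representations (scalable layers or independent encodings).
ladder = [
    Representation("240p", 400, 240),
    Representation("480p", 1200, 480),
    Representation("720p", 2800, 720),
    Representation("1080p", 5000, 1080),
]

def select(ladder, measured_kbps, safety=0.8):
    """Pick the highest-quality representation whose bitrate fits the measured
    throughput, with a safety margin to absorb short-term fluctuations."""
    budget = measured_kbps * safety
    candidates = [r for r in ladder if r.bitrate_kbps <= budget]
    return max(candidates, key=lambda r: r.bitrate_kbps) if candidates else ladder[0]

for throughput in (500, 2000, 8000):     # kbps measured over recent segments
    r = select(ladder, throughput)
    print(f"{throughput} kbps available -> {r.name} ({r.bitrate_kbps} kbps)")
```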

AI-Driven Methods

Recent advancements in lossy compression have leveraged machine learning, particularly deep learning techniques, to surpass traditional methods in rate-distortion performance and perceptual quality. Neural autoencoders form the core of many end-to-end learned compression systems, where an encoder maps input data to a compact latent representation, followed by quantization and decoding to reconstruct the output. These models are trained jointly to minimize a rate-distortion loss, enabling the network to learn data-specific transformations that capture essential features more efficiently than hand-engineered transforms like the DCT.

Generative adversarial networks (GANs) have been integrated to mitigate compression artifacts, such as blocking or blurring, by training a discriminator to distinguish real from reconstructed images, forcing the generator (decoder) to produce more realistic outputs. This adversarial training enhances perceptual fidelity beyond pixel-wise metrics like PSNR. For instance, a fully convolutional residual network trained adversarially can effectively remove compression artifacts, improving visual quality at low bitrates.

Prominent examples include Google's neural image compression framework, introduced in 2018, which employs variational autoencoders with a scale hyperprior to model spatial dependencies in the latent representation. Models by Ballé et al. demonstrate superior rate-distortion curves compared to BPG on standard datasets while maintaining similar PSNR levels. These systems often incorporate learned perceptual losses, such as those based on LPIPS, which align better with human judgments than traditional distortion measures, leading to visually preferable reconstructions at equivalent rates. A key benefit of these AI-driven approaches is support for variable-rate compression through manipulation of the latent representation, allowing dynamic adjustment of quality without retraining.

By 2025, extensions to scientific data compression have emerged, such as error-bounded methods using neural autoencoders like AE-SZ, which ensure reconstruction errors stay within user-defined thresholds while achieving 100%-800% higher compression ratios than traditional compressors like SZ on multidimensional simulation data. DeepSZ applies similar principles to compress deep neural network weights with guaranteed accuracy loss bounds, facilitating efficient storage and deployment of AI models themselves.

Despite these gains, AI-driven methods face challenges, including the need for large, diverse training datasets to generalize across data types, which can introduce biases if not representative. Additionally, the computational overhead during encoding and decoding remains high, often requiring specialized hardware to match the throughput of classical codecs, though ongoing optimizations aim to address this.
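
A deliberately tiny end-to-end sketch in PyTorch (assumed available) shows the ingredients: a convolutional encoder and decoder, additive uniform noise as a differentiable stand-in for quantization during training, and a loss of the form D + \lambda R. The L1 magnitude of the latents is used here as a crude rate proxy in place of the learned entropy models of published systems, so this is an illustration of the training setup rather than a faithful reproduction of any specific model:

```python
import torch
import torch.nn as nn

class TinyImageCodec(nn.Module):
    """Minimal learned codec: conv encoder -> (noisy) quantization -> conv decoder."""
    def __init__(self, latent_channels=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, latent_channels, 4, stride=2, padding=1),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(latent_channels, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        y = self.encoder(x)
        # Additive uniform noise approximates rounding during training (a common
        # trick in learned compression); hard rounding is used at inference time.
        y_hat = y + (torch.rand_like(y) - 0.5) if self.training else torch.round(y)
        return self.decoder(y_hat), y_hat

model = TinyImageCodec()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
lam = 0.01                                     # rate-distortion tradeoff weight

x = torch.rand(4, 3, 64, 64)                   # stand-in batch of images
for step in range(5):                          # a few illustrative training steps
    recon, y_hat = model(x)
    distortion = torch.mean((recon - x) ** 2)
    rate_proxy = torch.mean(torch.abs(y_hat))  # crude proxy for coded bits
    loss = distortion + lam * rate_proxy       # J = D + lambda * R
    opt.zero_grad(); loss.backward(); opt.step()
    print(f"step {step}: D={distortion.item():.4f}, R~{rate_proxy.item():.4f}")
```

Sweeping lam traces out an operating curve in the rate-distortion plane, mirroring how quality settings work in conventional codecs.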

Hardware Acceleration

Hardware acceleration plays a crucial role in enabling real-time lossy compression for applications demanding high throughput, such as video streaming and scientific data processing, by leveraging specialized processors to offload computationally intensive tasks from general-purpose CPUs. Application-specific integrated circuits (ASICs) are commonly used in hardware encoders to optimize fixed-function operations in standards like H.264, where Intel Quick Sync Video integrates dedicated encoding hardware directly into the CPU die for efficient video compression. This approach achieves speeds exceeding 300 frames per second on modern Intel processors, significantly reducing encoding latency compared to software-only implementations.

Graphics Processing Units (GPUs) excel in parallelizing transforms essential to lossy compression algorithms, such as the discrete cosine transform (DCT) in image and video encoding. NVIDIA's NVENC, a dedicated ASIC on RTX GPUs, accelerates video compression using codecs like H.264, HEVC, and AV1, delivering up to 4x faster export times in video editing tools while maintaining comparable quality to CPU encoding. For JPEG decoding, advanced GPU-accelerated decoders extending the nvJPEG library achieve throughputs that outperform CPU-based libjpeg-turbo by up to 51x on high-end GPUs like the A100. In scientific data contexts, the CuSZ framework further demonstrates GPU potential, providing error-bounded lossy compression up to 370x faster than a single CPU core and 13x faster than multi-core CPU setups on datasets like those from high-performance computing simulations.

These technologies yield substantial gains, including 10-100x speedups in compression throughput and improved power efficiency, particularly beneficial for mobile and edge devices where power constrains processing. For instance, NVENC's AV1 support on RTX 50-series GPUs offers 43% better efficiency than H.264 at equivalent bitrates, enabling high-quality video at lower bandwidths. AI-driven lossy compression methods, such as neural autoencoders, also benefit from acceleration on platforms like Apple's Neural Engine, which provides up to 26x peak throughput improvements for transformer-based models since the A11 Bionic in 2017.

By 2025, developments in Field-Programmable Gate Arrays (FPGAs) have advanced custom scientific compressors, such as FPGA-enhanced implementations of hyperspectral lossy algorithms like HyperLCA, which adaptively control distortion for real-time data from remote-sensing instruments. These FPGA designs achieve high-speed processing tailored to onboard resource constraints, outperforming general-purpose hardware in specialized error-bounded scenarios. Edge AI chips further reduce latency in compression tasks, enabling on-device processing for latency-sensitive applications with minimal data transmission delays, as seen in 2025 market advancements emphasizing energy-efficient inference. Despite these benefits, hardware acceleration involves trade-offs between fixed-function ASICs, which offer high efficiency for specific tasks but lack flexibility, and programmable options like GPUs or FPGAs, which support diverse algorithms at the cost of higher power consumption and design complexity.

    In this paper, we performed a survey on current 3D mesh compression techniques ... Kobbelt, Simplification and compression of 3D meshes, in: Tutorials on ...
  52. [52]
    BC7 Format - Win32 apps - Microsoft Learn
    Dec 14, 2022 · The BC7 format is a texture compression format used for high-quality compression of RGB and RGBA data. For info about the block modes of the BC ...Missing: AMD | Show results with:AMD
  53. [53]
    Draco 3D Graphics Compression - Google
    Draco is an open-source library for compressing and decompressing 3D geometric meshes and point clouds. It is intended to improve the storage and transmission ...
  54. [54]
    Introducing Draco: compression for 3D graphics
    Jan 13, 2017 · Draco can be used to compress meshes and point-cloud data. It also supports compressing points, connectivity information, texture coordinates, ...
  55. [55]
    Google's Draco for Mixed Reality Applications: Compression Test
    Jan 8, 2025 · The description of Draco from the GitHub repo reads “Draco is a library for compressing and decompressing 3D geometric meshes and point clouds.
  56. [56]
    Evaluating lossy data compression on climate simulation ... - GMD
    We find that applying lossy data compression to climate model data effectively reduces data volumes with minimal effect on scientific results. We apply lossy ...
  57. [57]
    Performance evaluation of lossy quality compression algorithms for ...
    Jul 20, 2020 · Lossy genomic data compression, especially of the base quality values of sequencing data, is emerging as an efficient way to handle this ...
  58. [58]
    Lossy Compression of Integer Astronomical Images Preserving ...
    Nov 14, 2024 · This paper presents a novel lossy compression technique that is able to preserve the results of photometry analysis with high fidelity
  59. [59]
    [PDF] Fast Error-bounded Lossy HPC Data Compression with SZ
    The method linearizes data, uses curve-fitting models for predictable data, and lossy compression for unpredictable data, with a compression ratio of 3.3/1 - ...
  60. [60]
    zfp | Computing - Lawrence Livermore National Laboratory
    zfp is a BSD licensed open-source library for compressed floating-point and integer arrays that support high throughput read and write random access.
  61. [61]
    MGARD: A multigrid framework for high-performance, error ... - arXiv
    Jan 11, 2024 · With exceptional data compression capability and precise error control, MGARD addresses a wide range of requirements, including storage ...
  62. [62]
  63. [63]
    Floating Point Compression: Lossless and Lossy Solutions
    Our zfp compressor for floating-point and integer data often achieves compression ratios on the order of 100:1, i.e., to less than 1 bit per value of compressed ...
  64. [64]
  65. [65]
  66. [66]
    [PDF] On Perceptual Lossy Compression - arXiv
    Jun 5, 2021 · Abstract. Lossy compression algorithms are typically de- signed to achieve the lowest possible distortion at a given bit rate.
  67. [67]
  68. [68]
    MPEG-4 scalable lossless audio transparent bitrate and its application
    **Summary of Transparent Bitrate for AAC Audio:**
  69. [69]
    [PDF] On the Computation of PSNR for a Set of Images or Video - arXiv
    Apr 30, 2021 · This paper investigates different approaches to computing PSNR for sets of images, single video, and sets of video and the relation between them ...
  70. [70]
    Image quality assessment: from error visibility to structural similarity
    We introduce an alternative complementary framework for quality assessment based on the degradation of structural information.
  71. [71]
    [PDF] A Survey of Visual Just Noticeable Difference Estimation
    The JND threshold reveals the visual redundancy, and thus is useful for perception oriented visual signal processing, e.g., perceptual signal compression, image ...<|separator|>
  72. [72]
    Lossy Data Compression: JPEG - Stanford Computer Science
    The baseline algorithm, which is capable of compressing continuous tone images to less that 10% of their original size without visible degradation of the image ...<|separator|>
  73. [73]
    Image and Video Processing
    However, lossy compression methods such as MPEG-4 result in compression ratios from 20 to 200 depending on the video stream. Even MPEG-4 is the most ...Missing: typical | Show results with:typical
  74. [74]
    Understand the concept of "Bpp" and "Mbps" to define your ... - intoPIX
    Sep 30, 2020 · The "bits per pixel" (bpp) refers to the sum of the "number of bits per color channel" i.e. the total number of bits required to code the color ...The BPP "bits-per-pixel" concept · Chroma subsampling to...
  75. [75]
    Bjøntegaard Delta (BD): A Tutorial Overview of the Metric, Evolution ...
    Jan 8, 2024 · The Bjøntegaard Delta (BD) method proposed in 2001 has become a popular tool for comparing video codec compression efficiency.
  76. [76]
    [PDF] Image compression overview - arXiv
    Sep 14, 2014 · For lossless methods, we can get the average of 3-4 times smaller files than the original ones. With lossy methods, we can obtain ratios up to.Missing: numerical | Show results with:numerical
  77. [77]
    Lossy compression of x-ray diffraction images
    For these images, the "theoretical maximum compression ratio" ranged from 1.2 to 4.8 with mean 2.7 and standard deviation 0.7. The values for Huffmann encoding ...
  78. [78]
    AV1 vs HEVC: Which Codec is Best for You? - Gumlet
    May 3, 2023 · AV1 offers 30% better performance than HEVC. Cons: The AV1 codec is one of the slowest in terms of encoding/decoding efficiencies and ...Difference Between AV1 vs... · AV1 vs. HEVC: Which One is...
  79. [79]
    Understanding The Effectiveness of Lossy Compression in Machine ...
    Mar 23, 2024 · Data compression is generally divided into two categories: lossless compression and lossy compression. Lossy compression methods can now be ...
  80. [80]
    Transients + Noise Audio Representation for Data Compression and ...
    The purpose of this paper is to demonstrate a low bitrate audio ... Transients + Noise Audio Representation for Data Compression and Time / Pitch Scale Modi ...<|separator|>
  81. [81]
    [PDF] Drift Compensation for Reduced Spatial Resolution Transcoding
    Aug 1, 2002 · This paper discusses the problem of reduced-resolution transcoding of compressed video bitstreams. An anal- ysis of drift errors is provided ...
  82. [82]
    (PDF) Drift compensation for reduced spatial resolution transcoding
    Aug 5, 2025 · This paper discusses the problem of reduced-resolution transcoding of compressed video bitstreams. An analysis of drift errors is provided ...
  83. [83]
    Edit faster with the proxy workflow in Premiere Pro
    Sep 19, 2024 · Adobe Premiere Pro logo. Craft the perfect story with Premiere Pro Find the best-in-class video-editing tools all in one place.
  84. [84]
    Reimagining the Possibilities of Proxy Workflows for Media Production
    Aug 24, 2022 · By using a highly compressed alternative, work can continue on the substitute material that is more appropriate for the remote circumstances.Missing: generational lossy
  85. [85]
  86. [86]
    Dynamic adaptive streaming over HTTP (DASH) — Part 1 ... - ISO
    This document primarily specifies formats for the Media Presentation Description and Segments for dynamic adaptive streaming delivery of MPEG media over HTTP.
  87. [87]
    [PDF] Overview of the Scalable Video Coding Extension of the H.264/AVC ...
    Differences between these prediction loops lead to a “drift” that can accumulate over time and produce annoying artifacts. However, the scalability bit stream ...<|control11|><|separator|>
  88. [88]
    Variational image compression with a scale hyperprior - arXiv
    Feb 1, 2018 · We describe an end-to-end trainable model for image compression based on variational autoencoders. The model incorporates a hyperprior to effectively capture ...
  89. [89]
    Deep Generative Adversarial Compression Artifact Removal - arXiv
    Apr 8, 2017 · We present a feed-forward fully convolutional residual network model trained using a generative adversarial framework.
  90. [90]
    [1912.08771] Computationally Efficient Neural Image Compression
    Dec 18, 2019 · We apply automatic network optimization techniques to reduce the computational complexity of a popular architecture used in neural image compression.
  91. [91]
    DeepSZ: A Novel Framework to Compress Deep Neural Networks ...
    In this paper, we propose DeepSZ: an accuracy-loss expected neural network compression framework, which involves four key steps: network pruning, error bound ...
  92. [92]
    [PDF] Computationally-Efficient Neural Image Compression with Shallow ...
    Neural image compression methods have seen increas- ingly strong performance in recent years. However, they suffer orders of magnitude higher computational ...
  93. [93]
    Intel GPU | Jellyfin
    This tutorial guides you on setting up full video hardware acceleration on Intel integrated GPUs and ARC discrete GPUs via QSV and VA-API.
  94. [94]
    Oh yeah, 380 fps transcoding (via intel Quicksync)
    Mar 15, 2023 · qsv transcode sometimes hits 400fps, it's speed is varying (I guess read/write speed of disk?), but it's constantly above 365 fps, seems to average around 380 ...
  95. [95]
    NVIDIA NVENC Obs Guide | GeForce News
    Jan 30, 2025 · The latest AV1 codec on NVIDIA GeForce RTX 50 series is 5% more efficient than the previous generation and ~43% more efficient than H.264. This ...
  96. [96]
    Export up to 4X faster with hardware encoding (NVENC) in Premiere ...
    Mar 25, 2021 · And this results in HUGE time-saving differences! Use the NVIDIA encoder or NVENC ... video is sponsored by NVIDIA #PremierePro #NVENC #NVIDIA.
  97. [97]
    [PDF] Accelerating JPEG Decompression on GPUs - arXiv
    Nov 17, 2021 · For GPU-accelerated computer vision and deep learning tasks, such as the training of image classification models, efficient JPEG decoding is ...
  98. [98]
    [PDF] CuSZ: An Efficient GPU-Based Error-Bounded Lossy Compression ...
    Sep 30, 2020 · CuSZ is an optimized GPU-based error-bounded lossy compression framework for scientific data, the first of its kind on GPUs.
  99. [99]
    Deploying Transformers on the Apple Neural Engine
    Jun 6, 2022 · The 16-core Neural Engine on the A15 Bionic chip on iPhone 13 Pro has a peak throughput of 15.8 teraflops, an increase of 26 times that of ...Missing: 2020s | Show results with:2020s
  100. [100]
    (PDF) FPGA-Based Hyperspectral Lossy Compressor With Adaptive ...
    Aug 27, 2025 · In this paper, a transform-based lossy compressor, HyperLCA, has been extended to include a run-time adaptive distortion feature that brings ...
  101. [101]
    Edge AI in Embedded Devices: What's New in 2025 for IoT and EVs
    Sep 26, 2025 · The 2025 Edge AI Technology Report highlights that edge AI is central to minimizing data transmission, reducing latency, and cutting energy ...
  102. [102]
    How "exactly" are AI-accelerator chip ASICs built differently than ...
    Jan 10, 2023 · AI ASICs have fixed-function for specific tasks, like image recognition, while GPUs are general-purpose. ASICs are faster for neural networks, ...<|separator|>