
Arithmetic coding

Arithmetic coding is a form of entropy encoding used in lossless data compression, in which a sequence of symbols drawn from a finite alphabet is represented as a single fractional number within the unit interval [0, 1), obtained by recursively subdividing the interval according to the cumulative probabilities of the symbols. This method allows compression rates that approach the theoretical Shannon entropy limit, outperforming fixed-length codes and many variable-length codes such as Huffman coding by avoiding the overhead of integer-length codeword boundaries. The foundational ideas for arithmetic coding trace back to Claude Shannon's 1948 paper on information theory, which introduced the concept of encoding messages as points in a probability space. Practical algorithmic developments emerged in the 1970s: Richard Pasco's 1976 PhD thesis at Stanford University proposed efficient source coding algorithms based on first-in-first-out (FIFO) interval partitioning for fast compression, while Jorma Rissanen formalized the generalized Kraft inequality for arithmetic codes in a 1976 IBM paper, enabling last-in-first-out (LIFO) implementations and proving their optimality under certain conditions. Glen G. Langdon and Jorma Rissanen further advanced the technique in 1979, introducing approximations that eliminate floating-point multiplications and improve computational efficiency for practical use.

At a high level, arithmetic coding operates by initializing a coding interval to [0, 1) and iteratively narrowing it for each symbol in the message: the interval is scaled and shifted based on the symbol's probability range within the current cumulative distribution, yielding a subinterval whose length approximates the probability of the message. Any number within the final subinterval serves as the code, which is then quantized to a binary string for transmission; decoding reverses this by expanding intervals and selecting the symbols whose subintervals contain the received value. To mitigate precision issues in finite-precision arithmetic, implementations typically use approximations such as renormalization (interval rescaling) to keep the interval width within predefined bounds.

Key advantages of arithmetic coding include its adaptability to dynamic symbol probabilities in a single pass, support for non-stationary sources, and separation of modeling from coding, allowing integration with predictive models for enhanced performance. It has been widely adopted in international standards, including the MQ-coder in JPEG 2000 for image compression, context-based adaptive binary arithmetic coding (CABAC) in H.264/AVC and HEVC for video, and arithmetic entropy coding in JBIG2 for bi-level images.

Fundamentals

Definition and Principles

Arithmetic coding is an entropy encoding technique used for lossless data compression, in which an entire sequence of symbols from a source is represented by a single fractional number within the unit interval [0, 1). The method maps the message to a subinterval of [0, 1) by successively narrowing an initial coding interval based on the conditional probabilities of the symbols. Each symbol is assigned a subinterval whose width is proportional to its probability, ensuring that any value within the final interval uniquely identifies the message while minimizing the required precision. The core principle is to iteratively shrink the current coding interval as each symbol is processed, with the final interval's length approximating the probability of the message under the underlying probability model. More probable symbols produce larger subintervals and thus less reduction in interval size, allowing the code length to approach the theoretical minimum number of bits needed for representation. This cumulative narrowing converges such that any number within the final interval can serve as the code, enabling near-optimal compression without the need to encode codeword boundaries explicitly. Unlike symbol-by-symbol prefix coders such as Huffman coding, which assign integral-length codewords to individual symbols and may waste up to one bit per symbol due to the integer-length constraint, arithmetic coding achieves compression arbitrarily close to the entropy limit for any alphabet size by treating the entire message as a single entity. It requires a probability model to determine the interval sizes for each symbol, which estimates symbol likelihoods based on the source statistics.

Probability Models

In arithmetic coding, a probability model defines a probability distribution over the alphabet of symbols to be encoded, mapping each symbol to a probability value such that the probabilities sum to 1. This distribution determines the relative sizes of subintervals within the unit interval [0, 1), with more probable symbols assigned larger intervals to enable efficient compression. Probability models can be static or dynamic. Static models employ fixed probabilities that do not change during encoding, often derived from prior knowledge of the data's statistics, which simplifies implementation at the cost of adaptability to varying input characteristics. In contrast, dynamic models adjust probabilities based on observed data, offering greater adaptability for non-stationary sources. Cumulative probability tables are constructed from the model's probabilities to facilitate interval subdivision. For a symbol s_i with probability p(s_i), the subinterval bounds are set as low = cumulative probability up to the previous symbol (i.e., \sum_{k=1}^{i-1} p(s_k)) and high = low + p(s_i). These tables, often implemented as arrays of partial sums, allow quick lookup of bounds during encoding. The accuracy of the probability model is crucial for compression performance, as it enables the encoded bitstream length to approach the Shannon entropy H = -\sum_s p(s) \log_2 p(s), the theoretical lower bound for lossless encoding. Inaccurate models lead to suboptimal interval assignments, resulting in longer outputs that exceed the entropy limit.
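As an illustration, the following Python sketch builds such a table of partial sums from raw symbol counts; the dictionary-based representation and the function name are illustrative choices made here, not part of any particular standard or implementation.

```python
from itertools import accumulate

def cumulative_bounds(counts):
    """Map each symbol to its (cum_freq_prev, cum_freq) bounds.

    `counts` is a dict of symbol -> integer frequency; dividing the bounds by
    the returned total gives the subinterval [low, high) that the symbol
    occupies inside the current coding interval.
    """
    symbols = list(counts)
    running = list(accumulate(counts[s] for s in symbols))
    total = running[-1]
    bounds = {}
    prev = 0
    for s, cum in zip(symbols, running):
        bounds[s] = (prev, cum)
        prev = cum
    return bounds, total

# Example: counts A:3, B:1 give A -> [0, 3)/4 = [0, 0.75) and B -> [3, 4)/4 = [0.75, 1).
table, total = cumulative_bounds({"A": 3, "B": 1})
```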

Encoding and Decoding

Overview

Arithmetic coding encodes a message by representing it as a single fractional value within the interval [0, 1), achieving efficient compression through cumulative probability assignments to symbols. The encoding process begins with the full interval [0, 1). For each symbol in the input message, the current interval is subdivided proportionally according to the probabilities of the possible symbols from a predefined model, and the subinterval corresponding to the actual symbol is selected as the new current interval. This subdivision narrows the interval iteratively with each symbol processed, refining the representation of the message. Upon completing the message, any number within the final narrowed interval serves as the encoded output, often chosen as the lower end or midpoint for simplicity of transmission. Decoding reverses this process, starting with the received encoded value, which lies within the initial [0, 1) interval. For each successive symbol position up to the known message length, the decoder examines the current interval and identifies the symbol whose subinterval contains the encoded value, based on the same probability model used for encoding. The interval is then updated to that subinterval, and the process repeats, reconstructing the original message symbol by symbol until the full length is decoded. Synchronization between the encoder and decoder is maintained through a shared probability model, which must be identical on both sides to subdivide intervals consistently, along with a prefixed message length that tells the decoder exactly how many symbols to decode. This ensures unambiguous decoding without additional termination symbols in static models.

Detailed Encoding Process

The arithmetic encoding process represents a sequence of symbols from a source by iteratively narrowing an initial probability interval based on the symbols' probabilities, producing a binary code that corresponds to a point within the final subinterval. This approach achieves compression by assigning code lengths approximately equal to the negative logarithm of the symbol probabilities, leveraging the full precision of the interval rather than fixed-length bins. The algorithm operates on a probabilistic model that provides cumulative probabilities for each symbol, where the cumulative probability up to but not including symbol s, denoted C(s), is the sum of the probabilities of all symbols preceding s in the alphabet.

The process initializes with a code interval [\text{low}, \text{high}) = [0, 1) and an empty output bitstream. This interval represents the full range of possible messages under the model, with length 1 corresponding to the total probability mass. For each symbol s_i in the input message, the interval is updated to the subinterval proportional to the probability of s_i given the current model. Specifically, the new bounds are computed as

\text{new\_low} = \text{low} + (\text{high} - \text{low}) \cdot C(s_i)

\text{new\_high} = \text{low} + (\text{high} - \text{low}) \cdot \big( C(s_i) + P(s_i) \big)

where P(s_i) is the probability of symbol s_i. The interval is then set to [\text{new\_low}, \text{new\_high}), narrowing it so that its length becomes (\text{high} - \text{low}) \cdot P(s_i) and preserving the proportional representation of the message's probability. This update is repeated for all symbols, resulting in a final interval whose length equals the probability of the entire message under the model.

The interval update formula follows from the goal of mapping the message to a contiguous subinterval of the unit interval, where the position encodes the cumulative probability and the length encodes the message probability. Starting from the initial interval [0, 1), the first symbol s_1 maps to [C(s_1), C(s_1) + P(s_1)), which has length P(s_1). For the second symbol s_2, the subinterval is scaled within the current bounds: the offset C(s_2) shifts the start, and the scaling factor P(s_2) determines the length, yielding the affine transformation above. Iterating, the final interval [L, L + P), where P = \prod_i P(s_i), contains exactly the points that uniquely identify the message, distinguishing it from the adjacent intervals assigned to other messages.

To avoid precision loss in finite arithmetic, the formula is often implemented in integer form to prevent underflow as the interval shrinks. Using fixed-precision integers (e.g., 16-bit words), the interval is represented as [\text{low}, \text{high}] with range r = \text{high} - \text{low} + 1 and updated as

\text{new\_low} = \text{low} + \left\lfloor \frac{r \cdot \text{cum\_freq}}{\text{total\_freq}} \right\rfloor

\text{new\_high} = \text{low} + \left\lfloor \frac{r \cdot (\text{cum\_freq} + \text{freq})}{\text{total\_freq}} \right\rfloor - 1

where \text{cum\_freq} is the integer cumulative frequency of all symbols strictly preceding s, \text{freq} is the frequency of s, and \text{total\_freq} normalizes the frequencies to probabilities. The floor operations and the -1 adjustment keep the bounds integral while approximating the fractional scaling, maintaining near-equivalent subinterval lengths without floating-point operations.
This integer derivation scales the computations to the available precision, outputting bits periodically when the interval falls entirely into the lower or upper half of the range (e.g., when \text{low} \geq 0.5 or \text{high} < 0.5 in normalized terms) so the interval can be recentered and expanded, preventing underflow. Upon processing all symbols, a final code value c is selected such that \text{low} \leq c < \text{high}, typically by taking the integer part of \text{low} scaled to the precision or by using the accumulated output bits from the scaling steps. To ensure unambiguity, additional termination bits (often 1-2 bits) are appended to distinguish the final interval from adjacent ones, such as outputting "0" if the interval lies in the lower half or "1" otherwise, followed by a second bit if needed. In practice, the renormalization outputs during encoding provide most of the bits, with termination ensuring the decoder can resolve the exact point.

To signal the end of the message, an end-of-file (EOF) symbol can be encoded as the final symbol using the same interval update, narrowing the interval further by the EOF probability (typically fixed at a small value such as 10^{-6} in adaptive models). This EOF symbol ensures the decoder halts after reconstructing it. For handling unseen symbols in adaptive models, an escape symbol is incorporated into the probability distribution with probability proportional to the estimated mass of unseen symbols, such as \frac{k}{k + n}, where k is a constant (e.g., 1) and n is the number of symbols seen so far. When encoding an unseen symbol u, the escape symbol is first encoded via the interval update, prompting the decoder to switch to a fallback model (e.g., uniform over the full alphabet); then u is encoded under this fallback, often as its index in binary arithmetic-coded form. This mechanism allows dynamic alphabet extension without model failure.
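The real-valued update rule at the heart of this process can be sketched in a few lines of Python. The sketch is purely illustrative: floating-point rounding and underflow make it unusable for long messages, which is exactly what the integer form and the renormalization discussed above and in the following sections address. The model format, a mapping from each symbol to its (C(s), P(s)) pair, is an assumption made for the example.

```python
def encode_interval(message, model):
    """Narrow [low, high) once per symbol using new_low/new_high from the text.

    `model` maps each symbol s to (C(s), P(s)): its cumulative probability and
    its probability.  Floating point only, purely illustrative.
    """
    low, high = 0.0, 1.0
    for s in message:
        c, p = model[s]
        width = high - low
        high = low + width * (c + p)   # new_high, computed from the old low and width
        low = low + width * c          # new_low
        # the interval length is now width * P(s)
    return low, high                   # any value in [low, high) identifies the message
```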

Detailed Decoding Process

The decoding process in arithmetic coding reverses the encoding by recovering the original sequence of symbols from the transmitted code value c \in [0, 1), using the same probability model as the encoder to partition the unit interval. The decoder initializes the current interval as [\text{low}, \text{high}) = [0, 1). It also requires prior knowledge of the message length n to determine the number of symbols to decode.

For each symbol position i = 1 to n, the decoder computes the relative position of c within the current interval as r = \frac{c - \text{low}}{\text{high} - \text{low}}. It then identifies the symbol s satisfying \text{cum\_prob\_prev}(s) \leq r < \text{cum\_prob}(s), where \text{cum\_prob}(s) = \sum_{t \leq s} p(t) and \text{cum\_prob\_prev}(s) = \sum_{t < s} p(t), with p(t) denoting the probability of symbol t from the shared model. The symbol s is output as the i-th decoded symbol. The interval is then updated to the subinterval corresponding to s that contains c, using the width w = \text{high} - \text{low} of the current interval:

\text{low} \leftarrow \text{low} + w \cdot \text{cum\_prob\_prev}(s)

\text{high} \leftarrow \text{low} + w \cdot p(s)

where the new \text{high} is computed from the updated \text{low} and the old width w. This update mirrors the encoder's narrowing and keeps c within the interval. Upon reaching the n-th symbol, decoding terminates. The final interval [\text{low}, \text{high}) is verified to contain c, confirming the integrity of the recovered message under the assumed model.
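A matching decoder sketch, under the same assumptions about the model format as the encoder sketch above, selects at each step the symbol whose subinterval contains the code value and then narrows the interval in the same way.

```python
def decode_value(code, n, model):
    """Recover n symbols from a code value in [0, 1).

    `model` maps each symbol s to (cum_prob_prev(s), p(s)), as in the encoder
    sketch above; symbols are assumed to be one-character strings.  Floating
    point only, purely illustrative.
    """
    low, high = 0.0, 1.0
    out = []
    for _ in range(n):
        r = (code - low) / (high - low)       # relative position of the code
        for s, (c, p) in model.items():
            if c <= r < c + p:                # subinterval that contains r
                out.append(s)
                width = high - low
                high = low + width * (c + p)
                low = low + width * c
                break
    return "".join(out)
```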

Illustrative Example

To illustrate the encoding and decoding processes, consider the message "WIKI" consisting of four symbols from the alphabet {W, I, K}, where each symbol has an equal probability of p = \frac{1}{3}. Assume the order W, I, K with cumulative probabilities before each symbol: C(W) = 0, C(I) = \frac{1}{3}, C(K) = \frac{2}{3}. The encoding begins with the initial interval [0, 1). For the first symbol W, the interval narrows to [0, \frac{1}{3}). For the second symbol I, it becomes [\frac{1}{9}, \frac{2}{9}). For the third symbol K, the interval updates to [\frac{5}{27}, \frac{6}{27}). For the final symbol I, the interval is further refined to [\frac{16}{81}, \frac{17}{81}), of length \frac{1}{81} \approx 0.0123. A code value within this final interval, such as 0.20 (exactly \frac{16.2}{81}), is selected to represent the entire message. A binary fraction identifying this interval can be transmitted in roughly 7 bits, close to the message's self-information of \log_2(81) \approx 6.34 bits. Decoding starts with the code 0.20 and the initial interval [0, 1). Since 0 \leq 0.20 < \frac{1}{3}, the first symbol is W; the interval updates to [0, \frac{1}{3}), and the code is rescaled to r = 0.20 / \frac{1}{3} = 0.60. For the second symbol, 0.60 falls in [\frac{1}{3}, \frac{2}{3}), yielding I; rescaled, r = (0.60 - \frac{1}{3}) / \frac{1}{3} = 0.80. For the third, 0.80 \geq \frac{2}{3}, so K; rescaled, r = (0.80 - \frac{2}{3}) / \frac{1}{3} = 0.40. For the fourth, \frac{1}{3} \leq 0.40 < \frac{2}{3}, so I, recovering the original message "WIKI".
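The figures in this example can be checked mechanically with exact rational arithmetic; the short sketch below, which assumes the same symbol order and probabilities, reproduces the final interval [16/81, 17/81) and recovers "WIKI" from the code value 0.20.

```python
from fractions import Fraction

third = Fraction(1, 3)
model = {"W": (0 * third, third), "I": (1 * third, third), "K": (2 * third, third)}

# Encoding "WIKI": narrow [0, 1) once per symbol.
low, high = Fraction(0), Fraction(1)
for s in "WIKI":
    c, p = model[s]
    width = high - low
    low, high = low + width * c, low + width * (c + p)
print(low, high)              # 16/81 17/81

# Decoding the value 0.20 by repeatedly rescaling the relative position r.
r = Fraction(1, 5)            # the transmitted code value 0.20
decoded = ""
for _ in range(4):            # the message length is known to be 4
    for s, (c, p) in model.items():
        if c <= r < c + p:
            decoded += s
            r = (r - c) / p   # rescale within the chosen subinterval
            break
print(decoded)                # WIKI
```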

Implementation Aspects

Precision Management and Renormalization

In fixed-point implementations of arithmetic coding, the coding interval [low, high) narrows progressively as symbols are encoded, and with finite precision (e.g., k-bit words) the interval width can shrink below the representable resolution of 2^{-k}, causing underflow in which further subdivisions become indistinguishable and numerical instability arises. To mitigate this, renormalization rescales the interval to restore sufficient precision before underflow occurs. Whenever the interval lies entirely within the lower half [0, 0.5) or the upper half [0.5, 1), its most significant bit (MSB) is already determined: the encoder outputs that bit b (b = 0 if high < 0.5, b = 1 if low \geq 0.5), then doubles the interval about the corresponding half by setting low = 2 \times (low - b \times 0.5) and high = 2 \times (high - b \times 0.5), effectively discarding the settled MSB and scaling up the remaining bits. When the interval straddles the midpoint but is confined to [0.25, 0.75), the next bit is not yet determined; implementations defer it by recording a pending "follow" bit and expanding about the midpoint. This process repeats as needed after each interval update, keeping the width large enough for accurate probability subdivisions using only shifts and additions rather than multiplications.

During encoding, the bits output via renormalization form the compressed bitstream; at termination, after encoding an end-of-file (EOF) symbol, the encoder flushes enough further bits (together with any pending follow bits) to specify a point c that lies unambiguously within the final [low, high). In decoding, bits are read sequentially to reconstruct an approximation of c, and symbols are decoded until the EOF symbol is reached, at which point decoding halts because the code point falls within the final interval.

The choice of precision, such as 32-bit integers for low and the interval width (scaled to represent fractions of the unit interval), balances computational efficiency and accuracy; with this choice, renormalization maintains stability provided the minimum symbol probability exceeds roughly 2^{-32}, and keeping the interval length above the representable resolution guarantees unique decoding by preventing overlap with adjacent intervals. Error bounds from finite precision are negligible if k is sufficiently large relative to the message length and entropy, and the redundancy introduced by termination is typically at most about 2 bits per message.
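One common way to realize this scheme is the style of integer coder popularized by Witten, Neal, and Cleary; the sketch below follows that pattern, with the 32-bit register width, the list-of-bits output, and the model interface (symbol mapped to its cumulative frequency and frequency) chosen here purely for illustration.

```python
TOP = 1 << 32                 # 32-bit interval registers (an arbitrary choice)
HALF, QUARTER = TOP >> 1, TOP >> 2

def encode(symbols, model, total):
    """Integer arithmetic encoder with renormalization (illustrative sketch).

    `model` maps a symbol to (cum_freq, freq); `total` is the sum of all
    frequencies and must stay much smaller than QUARTER so that subintervals
    never collapse.  Returns the compressed output as a list of 0/1 integers.
    """
    low, high, pending, bits = 0, TOP - 1, 0, []

    def emit(bit):
        nonlocal pending
        bits.append(bit)
        bits.extend([1 - bit] * pending)    # resolve deferred "follow" bits
        pending = 0

    for s in symbols:
        cum, freq = model[s]
        r = high - low + 1
        high = low + (r * (cum + freq)) // total - 1
        low = low + (r * cum) // total
        while True:
            if high < HALF:                 # interval in lower half: MSB is 0
                emit(0)
            elif low >= HALF:               # interval in upper half: MSB is 1
                emit(1)
                low -= HALF
                high -= HALF
            elif low >= QUARTER and high < 3 * QUARTER:
                pending += 1                # straddles the midpoint: defer the bit
                low -= QUARTER
                high -= QUARTER
            else:
                break
            low, high = 2 * low, 2 * high + 1   # expand the interval
    pending += 1                            # termination: two disambiguating bits
    emit(0 if low < QUARTER else 1)
    return bits

# e.g. encode("WIKI", {"W": (0, 1), "I": (1, 1), "K": (2, 1)}, total=3)
```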

Adaptive Models

In adaptive arithmetic coding, the probability model is updated dynamically during encoding and decoding to reflect the emerging statistics of the input data, enabling effective compression of sources with unknown or varying symbol probabilities. This approach contrasts with static models by allowing the coder to learn the data distribution on the fly, typically starting from uniform initial probabilities represented by setting the frequency count of every symbol in the alphabet to 1. As each symbol is processed, the model adapts by incrementing the count for the observed symbol, refining the probability estimates from empirical frequencies without requiring a separate modeling pass. This method was pioneered in the context of binary image compression, where bit-level adaptation proved essential for handling non-stationary patterns.

The core update rule recalculates probabilities as the ratio of a symbol's count to the total accumulated counts across all symbols, formulated as p_s = \frac{c_s}{\sum_i c_i}, where c_s is the count for symbol s and \sum_i c_i is the total count. To maintain computational efficiency and fixed-size storage for the counts, scaling is applied periodically: when the total count reaches a threshold such as 2^{16}, all counts are halved (with rounding up to keep them positive integers), downweighting early observations, emphasizing recent data, and exploiting temporal locality in the source. This scaling avoids the overhead of rebuilding the model after every symbol and keeps it responsive to changes. Alternative update strategies, such as exponential smoothing with a retention factor like 0.96 applied to the prior probabilities, can also be used to balance adaptation speed and stability.

Synchronization between the encoder and decoder is achieved through identical parallel updates: after each symbol is encoded on the sender side, the same increment and scaling operations are performed on the receiver side after decoding it, keeping the two models in lockstep without any explicit model transmission. This ensures reliable reconstruction even for streaming data, as both sides converge to the same probability estimates over the sequence. For non-stationary sources, such adaptation yields compression ratios that approach the instantaneous entropy, as the model tightens intervals more effectively once frequencies stabilize; for instance, in a binary sequence beginning with equal probabilities (0.5 each), a streak of one symbol shifts its estimate toward 0.9 or higher, reducing code lengths for subsequent matches and outperforming fixed models on evolving data.
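A minimal adaptive model along these lines might look as follows; the class name, the 2^16 threshold, and the linear scan in bounds() are illustrative simplifications (production coders typically maintain cumulative frequencies with tree structures, as discussed under Technical Characteristics).

```python
class AdaptiveModel:
    """Adaptive frequency model: start uniform, then learn counts on the fly.

    Counts start at 1 per symbol and are halved (rounding up) whenever the
    total reaches `max_total`, following the scheme described above.
    """

    def __init__(self, alphabet, max_total=1 << 16):
        self.counts = {s: 1 for s in alphabet}
        self.max_total = max_total

    def bounds(self, symbol):
        """Return (cum_freq, freq, total) for use by the arithmetic coder."""
        cum = 0
        for s, c in self.counts.items():
            if s == symbol:
                return cum, c, sum(self.counts.values())
            cum += c
        raise KeyError(symbol)

    def update(self, symbol):
        """The same call on encoder and decoder keeps both models in lockstep."""
        self.counts[symbol] += 1
        if sum(self.counts.values()) >= self.max_total:
            self.counts = {s: (c + 1) // 2 for s, c in self.counts.items()}
```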

Sources of Inefficiency

Arithmetic coding implementations, while approaching theoretical entropy limits, incur practical inefficiencies due to the computational demands of interval updates. The core encoding and decoding processes rely on multiplications to scale the current interval by symbol probabilities, such as updating the lower bound with low = low + range \times cum, where cum is the cumulative probability and range is the interval length. These operations, often requiring floating-point or fixed-point multiplications, are slower than the integer additions and bit manipulations typical of Huffman coding, leading to reduced encoding/decoding speeds, especially on hardware without optimized multipliers or floating-point units.

To mitigate multiplication overhead, many implementations approximate these operations using precomputed table lookups for probability scaling, avoiding on-the-fly multiplications at the cost of introducing quantization errors. For instance, tables store rounded or quantized values of range \times p_i, where p_i is the symbol probability, which simplifies hardware or software execution but results in slight deviations from exact interval partitioning. This approximation amounts to a small distortion of the effective probability model, with the resulting quantization error bounded by the table precision; for example, 12-bit approximations can limit the maximum coding inefficiency to less than 4% of the file size for skewed distributions. Such errors accumulate over symbols, reducing overall compression efficiency compared to ideal arithmetic coding.

Additional overhead arises from the startup and termination procedures needed to synchronize the encoder and decoder and ensure unambiguous decoding. At startup, the initial interval is nominally [0, 1), but practical integer arithmetic requires scaling it to a large fixed precision (e.g., 2^b - 1), which can introduce a redundant bit in the code stream. Termination involves emitting enough bits to specify the final interval uniquely after an end-of-file symbol, often adding 2 bits in basic schemes or up to 9 bits when blocking is used to bound message lengths, which particularly affects efficiency for messages shorter than about 2^{14} bytes. These fixed costs dilute the per-symbol efficiency, though their relative impact diminishes for longer sequences.

Collectively, these inefficiencies, arising from arithmetic operations, approximations, and boundary handling, mean that practical arithmetic coders typically achieve compression within 0.25% to 1% of the theoretical entropy for messages of 10^5 bytes or more, with losses staying under 4% even in low-precision approximations. Despite this gap, arithmetic coding generally remains superior to Huffman coding, whose per-symbol redundancy of up to one bit is most damaging for highly skewed distributions and small alphabets. Renormalization techniques, which rescale intervals during processing, help manage the precision-related aspects of these inefficiencies but do not eliminate the underlying computational costs.

Theoretical Perspectives

Generalized Radix Change Interpretation

Arithmetic coding can be theoretically interpreted as a generalized radix change, analogous to representing a number in a mixed radix numeral system where the radix (base) varies across positions based on symbol probabilities. In this framework, each symbol in the message corresponds to selecting a "digit" within a position whose effective base is determined by the inverse of the symbol's probability, i.e., base = 1/p(symbol). This varying-base system allows the encoding to adapt to the source's probability distribution, unifying arithmetic coding with positional numeral systems like factorial or primorial numbering, but tailored to probabilistic constraints.

The encoding process constructs the final code value through a cumulative product of interval lengths, reflecting the progressive narrowing of the probability interval. For a sequence of symbols s_1, s_2, \dots, s_n with probabilities p(s_i), the lower bound L of the final interval is given by L = \sum_{i=1}^n \alpha_i \prod_{j=1}^{i-1} p(s_j), where \alpha_i is the cumulative probability offset for symbol s_i relative to preceding symbols in the alphabet at step i (i.e., the sum of probabilities of symbols lexicographically before s_i). The upper bound U is then L + \prod_{i=1}^n p(s_i). Any value in [L, U) serves as the code, typically chosen as the shortest binary fraction within the interval. This formulation arises from the interval subdivision process, where each symbol refines the current subinterval proportionally to its probability.

Mathematically, this code value represents a fractional expansion in a non-uniform radix system, where the place values are the cumulative products of the symbol probabilities rather than fixed powers of a constant base. Equivalently, L corresponds to the value of the probability integral transform applied to the message sequence under the source distribution, mapping the discrete sequence to a point in the unit interval [0, 1) that preserves the probabilistic ordering. This equivalence highlights arithmetic coding's foundation in information theory, as the code length approximates the self-information of the message. This radix change perspective explains arithmetic coding's efficiency in achieving entropy rate without the prefix constraints of discrete codes like Huffman, as the continuous interval representation avoids integer boundary issues and allows fractional bit allocations proportional to -\log_2 p(s_i). By treating the message as a point in a probability-ordered continuum, it enables optimal packing of multiple messages into a shared code space, limited only by the total probability mass.
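The closed-form expression for L can be evaluated directly, which also confirms that it agrees with the iterative interval narrowing; the sketch below, reusing the probabilities from the earlier "WIKI" example, is illustrative only.

```python
from fractions import Fraction

def lower_bound(message, model):
    """Closed-form L = sum_i alpha_i * prod_{j<i} p(s_j) from the radix view.

    `model` maps each symbol to (alpha, p): its cumulative offset and probability.
    Returns the final interval [L, L + prod_i p(s_i)).
    """
    L = Fraction(0)
    scale = Fraction(1)              # running product of preceding probabilities
    for s in message:
        alpha, p = model[s]
        L += alpha * scale           # the place value shrinks by p at each position
        scale *= p
    return L, L + scale

third = Fraction(1, 3)
model = {"W": (0 * third, third), "I": (1 * third, third), "K": (2 * third, third)}
print(lower_bound("WIKI", model))    # (Fraction(16, 81), Fraction(17, 81))
```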

Compression Limits and Asymptotic Equipartition

Arithmetic coding achieves compression rates that approach the fundamental limit established by Shannon's source coding theorem, which states that no lossless encoding scheme can compress a discrete memoryless source to fewer than H(X) bits per symbol on average, where H(X) is the entropy of the source, and that this limit is attainable asymptotically for block codes of increasing length. This theorem implies that the minimal average code length L satisfies H(X) \leq L < H(X) + 1 for uniquely decodable codes, with equality approachable in the limit. In arithmetic coding, the encoded output length is determined by the negative logarithm base 2 of the final interval length, which equals the self-information -\log_2 P(m) of the message m, allowing the code to match the exact information content without the integral bit constraints of fixed-length or prefix codes.

The asymptotic equipartition property (AEP) provides the theoretical foundation for why arithmetic coding attains this limit for long sequences from independent and identically distributed (i.i.d.) sources. According to the AEP, for a sequence of n i.i.d. symbols X_1, X_2, \dots, X_n drawn from distribution p(x), the normalized negative log-probability -\frac{1}{n} \log_2 p(X_1^n) converges in probability to the entropy H(X), so P(X_1^n) \approx 2^{-n H(X)} for typical sequences, which comprise nearly all the probability mass (over 1 - \epsilon for any \epsilon > 0 and sufficiently large n). In arithmetic coding, the cumulative interval narrowing for a typical sequence thus yields a final interval of length approximately 2^{-n H(X)}, necessitating roughly n H(X) bits to specify a point within it, achieving the Shannon limit with vanishing overhead per symbol as n grows. The number of typical sequences is about 2^{n H(X)}, confirming that n H(X) bits suffice to index them uniquely, aligning the coding process with the source's intrinsic redundancy.

For i.i.d. sources, unique decodability in arithmetic coding holds provided the arithmetic precision exceeds the message entropy, ensuring that the final intervals for distinct messages remain disjoint and non-overlapping within the unit interval. This condition guarantees that any point selected from the final interval unambiguously identifies the original sequence, as the encoding maps each message to an exclusive subinterval whose length matches its probability, preventing ambiguity even under finite-precision approximations when the bit depth surpasses n H(X). Thus, arithmetic coding realizes the theoretical minimum bits required, -\log_2 of the final interval length, which asymptotically equals the message self-information and converges to n H(X) for typical inputs.
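The convergence asserted by the AEP is easy to observe numerically; the following sketch, with an arbitrarily chosen Bernoulli(0.2) source and sequence length, compares the ideal per-symbol code length -\frac{1}{n} \log_2 P(X_1^n) with the entropy H(X).

```python
import math
import random

def aep_demo(p1=0.2, n=100_000, seed=0):
    """Compare -log2 P(X^n)/n with H(X) for an i.i.d. binary source."""
    random.seed(seed)
    H = -(p1 * math.log2(p1) + (1 - p1) * math.log2(1 - p1))
    ones = sum(random.random() < p1 for _ in range(n))    # empirical count of 1s
    neg_log_p = -(ones * math.log2(p1) + (n - ones) * math.log2(1 - p1))
    return neg_log_p / n, H

print(aep_demo())   # two nearly equal values around 0.72 bits per symbol
```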

Comparison with Huffman Coding

Huffman coding, a prefix-code-based method, assigns integer-length binary codewords to symbols based on their probabilities, ensuring that no codeword is a prefix of another so that decoding is instantaneous. This prefix condition, governed by the Kraft inequality, forces the optimal code lengths (the negative log probabilities) to be rounded up to integers, which introduces inefficiency when probabilities are not dyadic (negative powers of 2). As a result, Huffman coding can waste up to nearly 1 bit per symbol in the worst case, particularly for symbols with probabilities close to 1, where the ideal fractional code length approaches zero but at least one whole bit must be spent. The cumulative bit waste is most damaging for highly skewed distributions and small alphabets, where the integer-length constraint leaves the compressed output noticeably farther from the entropy limit.

In contrast, arithmetic coding overcomes these issues by representing an entire message as a single fractional interval within [0, 1), effectively spending fractional numbers of bits per symbol without blocking or prefix restrictions. This enables arithmetic coding to approach the entropy bound closely for arbitrary probability distributions, achieving near-optimal compression regardless of whether probabilities are dyadic. On average, arithmetic coding provides 5-15% better compression ratios than Huffman coding for typical data sources such as text or images with skewed probabilities, as demonstrated in evaluations on standard corpora, with typical reductions of approximately 0.05-0.2 bits per character for text.

However, this efficiency comes with trade-offs: Huffman coding is generally faster because encoding and decoding reduce to precomputed lookup tables, making it simpler to implement and suitable for speed-critical applications. Arithmetic coding, which involves iterative interval arithmetic and precision management, is more computationally complex, though modern implementations mitigate this through optimizations such as table-driven, multiplication-free approximations. Huffman coding suffices, and is often preferred, for alphabets of moderate size (e.g., up to 256 symbols) in speed-critical scenarios such as embedded systems, where its table-driven approach minimizes overhead without significant loss relative to the entropy limit.
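The size of the worst-case gap is easy to quantify for a concrete distribution; the sketch below uses an arbitrarily chosen binary source with probabilities 0.9 and 0.1, for which Huffman coding must spend a full bit per symbol while the entropy is well under half a bit.

```python
import math

# Binary source with P(a) = 0.9, P(b) = 0.1: Huffman coding must spend one
# whole bit per symbol, while the entropy (and the asymptotic per-symbol cost
# of arithmetic coding under the same model) is much lower.
p = 0.9
entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
print(entropy)        # ~0.469 bits per symbol
print(1 - entropy)    # ~0.531 bits per symbol wasted by the integer-length constraint
```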

Connections to Other Entropy Coders

Range coding serves as an integer-based approximation of arithmetic coding, avoiding floating-point operations by maintaining and updating integer intervals that represent probability subranges. This approach scales [0, 1) to a large integer range, typically 2^{32} or larger, and performs renormalization by shifting and outputting digits (often whole bytes) when the range falls below a threshold, thereby approximating the fractional arithmetic of traditional arithmetic coding with fixed-precision integer arithmetic. Modern implementations, such as the context-based adaptive binary arithmetic coding (CABAC) engine in H.264/AVC video compression, employ table-driven integer variants of this idea to achieve efficient binary entropy coding while remaining computationally cheap in hardware.

Asymmetric numeral systems (ANS) represent a table-based method that achieves compression rates comparable to arithmetic coding while offering superior speed, particularly through streamlined state updates and renormalization. In ANS, symbols are encoded by mapping a state value to a new state via precomputed tables derived from the symbol probabilities, maintaining theoretical ties to arithmetic coding's interval subdivision but using a single integer state rather than a pair of range bounds. This results in overheads as low as 0.001 bits per symbol, similar to arithmetic coding, but with decoding speeds reported up to 50% faster than arithmetic coding for large alphabets, making ANS suitable for high-throughput applications.

Arithmetic coding is frequently integrated into hybrid compression pipelines, such as those combining it with the Burrows-Wheeler transform (BWT) for preprocessing text data. The BWT rearranges input strings to cluster similar characters, enhancing local predictability, after which arithmetic coding exploits the resulting skewed distributions, often after a move-to-front transformation, for the final entropy-coding stage, yielding compression of around 2.34-2.52 bits per byte on benchmark corpora. Similarly, archivers such as PAQ employ arithmetic coding as the backend entropy stage, paired with advanced context-mixing models and dictionary-based predictors that blend multiple probability estimates, achieving state-of-the-art lossless compression on diverse data types.

In a broader theoretical view, arithmetic coding, range coding, and ANS can all be seen as mechanisms for encoding sequences according to the conditional probabilities supplied by a predictive model, generating codewords whose lengths approximate the negative log-probability of the data under that model. This unified perspective highlights their shared goal of approaching the Shannon entropy limit by adaptively partitioning the code space based on conditional probabilities.

Applications

In Data Compression Standards

Arithmetic coding plays a significant role in various general-purpose data compression standards and archivers, where it serves as an efficient entropy-encoding stage, often integrated to improve compression ratios in archival formats. The PAQ series of archivers employs adaptive arithmetic coding within a context-mixing framework, blending multiple predictive models to estimate symbol probabilities before encoding, resulting in top-tier compression ratios for text and general archival files on benchmarks such as the Large Text Compression Benchmark. Similarly, cmix, a high-performance context-mixing compressor, integrates arithmetic coding to combine predictions from diverse models, consistently achieving leading compression ratios on diverse datasets at the cost of encoding speed. In the JPEG XL standard, a finite-precision entropy coder closely related to arithmetic coding is applied in the lossless JPEG recompression mode to encode quantized transform coefficients and other residuals, reducing legacy JPEG files by a typical 16-22% without loss. Arithmetic coding is also commonly paired with LZ77 as a post-prediction coder in hybrid schemes, where it encodes the literals, match lengths, and distances output by the dictionary-matching stage, approaching the theoretical entropy limit for the predicted symbols.

In Multimedia and Image Codecs

Arithmetic coding plays a pivotal role in modern multimedia codecs for video, images, and audio, where it enables efficient encoding of transformed and quantized data to minimize bitrate while preserving quality. In video coding standards, variants such as context-adaptive binary arithmetic coding (CABAC) adapt probability models based on context derived from previously coded symbols, achieving significant gains over simpler entropy-coding methods. This is crucial for handling the statistical non-stationarity of multimedia signals, such as motion vectors and transform coefficients in video frames or wavelet subbands in images.

In the H.264/AVC and HEVC video coding standards, CABAC serves as the primary entropy coder, binarizing input symbols into binary sequences before applying arithmetic coding with context-specific probability estimates. In H.264/AVC, CABAC is offered as an alternative to the baseline context-adaptive variable-length coding (CAVLC), providing bitrate reductions of 10-20% for typical high-definition content at equivalent video quality. In HEVC, CABAC is mandatory and refined with more contexts and parallel decoding modes to support higher resolutions, contributing to the standard's overall roughly 50% bitrate savings over H.264/AVC. These enhancements make CABAC suitable for streaming and broadcast applications.

For image compression, the JPEG 2000 standard employs the MQ-coder, a fixed-point binary arithmetic coder, to entropy-encode quantized wavelet coefficients after the discrete wavelet transform. The MQ-coder processes coefficient bits in a context-dependent manner, using 19 predefined context models to exploit spatial and spectral redundancies in the coefficients, enabling progressive transmission and scalable bit-rate control essential for professional imaging workflows such as medical imaging and digital cinema.

In lossless audio compression, the MPEG-4 Audio Lossless Coding (ALS) standard incorporates an adaptive arithmetic coder as an optional method for encoding prediction residuals alongside Golomb-Rice coding, allowing the encoder to select between the two based on signal characteristics for optimal efficiency. This flexibility supports compression ratios competitive with other lossless formats while maintaining perfect reconstruction. In contrast, the Free Lossless Audio Codec (FLAC) relies on adaptive Rice coding for residuals but shares the goal of exploiting audio predictability without loss.

Recent video codecs such as AV1 and Versatile Video Coding (VVC) further evolve arithmetic coding: AV1 uses a non-binary arithmetic coder for multi-symbol probabilities, yielding bitrate reductions of up to roughly 30% over HEVC in streaming scenarios, while VVC enhances CABAC with doubled context sets and improved probability updates, supporting its 30-50% efficiency gains for ultra-high-definition content.

History and Developments

Origins and Key Inventors

The principle of arithmetic coding emerged from early theoretical explorations in information theory during the 1960s, particularly through Peter Elias's work on interval algorithms for encoding sequences. Elias developed the concept of representing symbol sequences by iteratively narrowing intervals on the unit interval [0, 1), laying the groundwork for mapping probabilistic events to fractional values, though his ideas were not formally published at the time and were later credited in subsequent literature. The modern formulation of arithmetic coding was proposed in 1976 by Jorma Rissanen at IBM and Richard Pasco at Stanford University, marking the transition from theory to practical coding via interval subdivision. In his seminal paper, Rissanen described encoding algorithms that assign codewords based on cumulative probabilities within shrinking intervals, achieving lengths close to the entropy bound and generalizing the Kraft inequality for decodability. Concurrently, Pasco's dissertation introduced source coding techniques using similar interval-based methods to enable fast compression of data streams, emphasizing finite-precision approximations for real-world application.

A key refinement came in 1979 from Glen G. Langdon and Jorma Rissanen, who formalized arithmetic coding as a versatile encoding framework adaptable to various alphabets and probability models. Their work highlighted the method's ability to encode entire sequences into a single fractional number, outperforming fixed-length codes by avoiding integer boundaries and enabling seamless integration with predictive modeling. Early implementations of arithmetic coding were pursued during the 1980s within IBM's research labs, focusing on adaptations for efficiency in constrained environments such as image processing. This effort led to the Q-Coder in 1988, an adaptive binary arithmetic coder that employed renormalization and table-driven probability estimation to support high-speed encoding and decoding without multiplications, paving the way for its use in standards-compliant systems.

The development of arithmetic coding in the 1980s was closely tied to several key patents held by IBM, particularly those related to the Q-Coder, an adaptive binary arithmetic coding technique pioneered by researchers including Glen G. Langdon Jr. A foundational patent, US Patent 4,905,297, filed in November 1988 and granted in February 1990 to inventors William B. Pennebaker, Joan L. Mitchell, Glenn L. Kaplan, and Langdon, covered an arithmetic coding encoder and decoder system optimized for binary data compression, forming the basis for efficient implementations in hardware and software. This patent, along with earlier related filings such as US Patent 4,652,856 (filed February 4, 1986, granted 1987), protected core aspects of probability estimation and renormalization in arithmetic coding. Under pre-1995 U.S. patent law these patents expired 17 years after issuance, with US 4,905,297 lapsing in February 2007, thereby allowing royalty-free adoption in open-source and standards-based applications. Mitsubishi Electric contributed refinements to the Q-Coder, leading to the QM-coder used in the JBIG standard for bi-level image compression, finalized in the early 1990s. Patents such as Japanese Patent 2,128,110 (filed 1989) covered enhancements to renormalization and probability adaptation, held alongside related patents from IBM and AT&T (later Lucent Technologies).
These patents were made available for licensing for JBIG compliance, but broader patent concerns influenced hybrid compression approaches; for instance, the Unisys LZW patent (US 4,558,302, granted 1985, expired 2003) created licensing barriers for dictionary-based methods, indirectly encouraging arithmetic coding variants in some standards to avoid such entanglements. No major litigation directly targeted arithmetic coding hybrids, and the resolution of LZW-related disputes by the early 2000s, including cross-licensing agreements, left combined coders free for unrestricted experimentation. Following the expiration of the core IBM and Mitsubishi patents around 2007, arithmetic coding saw widespread integration into open standards without royalty concerns for the foundational techniques. The H.264/AVC video standard, released in 2003, incorporated context-adaptive binary arithmetic coding (CABAC) as an optional entropy coder, drawing on Q-Coder principles but featuring adaptations developed by Fraunhofer HHI and partners. Specific CABAC implementations were covered by patents in the H.264 pool, such as those filed in the early 2000s, with the last of them expiring by August 2025 (for example, European Patent EP 1 709 801 expired in January 2025). As of 2025, this expiration has enabled fully royalty-free use of CABAC in devices and software. Post-2020 patent activity on arithmetic coding remains limited, with no dominant new claims on the core algorithms; filings primarily address niche optimizations, such as machine-learning-enhanced probability models for genomic data compression (e.g., WO 2022/008311, filed 2020). These are rare and focused on specific implementations rather than fundamental mechanics, reflecting the maturity of the technology and the effectively public-domain status of its standard uses in multimedia and databases.

Benchmarks and Performance

Technical Characteristics

Arithmetic coding exhibits linear time complexity, requiring O(n) time to encode or decode a sequence of n symbols, though practical performance is influenced by a relatively high constant factor arising from arithmetic operations such as multiplications for interval scaling and bit shifts for renormalization. These operations, including up to two multiplications and several additions or shifts per symbol, contribute to the overhead, particularly in fixed-precision implementations that approximate divisions to avoid costly computations. Memory usage varies by model type: static models employ compact probability tables proportional to the alphabet size, typically requiring on the order of a few kilobytes for common alphabets such as the 256 symbols of byte-oriented coding. In contrast, adaptive models demand more memory for maintaining counters or frequency trees, such as Fenwick trees that use approximately n words for an n-symbol alphabet to support dynamic cumulative-frequency updates in O(log n) time per symbol. The algorithm scales effectively to large symbol alphabets, such as 256-symbol byte streams, by working directly with cumulative probabilities, avoiding the exponential codeword-table growth seen in blocked extensions of Huffman coding. For even greater efficiency, variants such as context-based adaptive binary arithmetic coding (CABAC) binarize inputs to reduce the alphabet to two symbols, enabling faster, multiplication-free processing while preserving most of the compression gain. Arithmetic coding is sensitive to bit errors, as a single transmission error can desynchronize the encoder and decoder intervals, propagating decoding failures through the rest of the stream. Implementations therefore rely on careful codeword termination and rescaling mechanisms that maintain precision and prevent interval overlap, and error-resilient variants reserve unused "leakage" regions of the interval to help detect corruption.
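For reference, a Fenwick (binary indexed) tree supporting the cumulative-frequency queries and count updates mentioned above can be sketched as follows; indexing symbols as integers 0..n-1 and starting every count at 1 are assumptions made for the example.

```python
class FenwickCounts:
    """Fenwick (binary indexed) tree over symbol frequencies.

    Supports the two operations an adaptive arithmetic coder needs, each in
    O(log n): bump a symbol's count and query a cumulative frequency.
    """

    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)       # 1-based internal indexing
        for i in range(n):
            self.add(i, 1)              # start with count 1 per symbol

    def add(self, symbol, delta):
        i = symbol + 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i

    def cum_freq(self, symbol):
        """Total count of symbols with index strictly less than `symbol`."""
        i, total = symbol, 0            # prefix sum over indices 0..symbol-1
        while i > 0:
            total += self.tree[i]
            i -= i & -i
        return total
```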

Benchmark Results

Arithmetic coding generally achieves compression ratios 10-20% better than Huffman coding on standard text and data corpora, such as English text and binary files, owing to its ability to spend fractional bits per symbol. For instance, on datasets including English text, arithmetic coding yields an average 12.3% improvement in compression ratio over Huffman coding, for example compressing a 100 KB text sample to about 45 KB versus roughly 50 KB with Huffman coding. In high-performance archivers like PAQ8px, which combines arithmetic coding with advanced context mixing, compression of the enwik8 benchmark (a 100 MB Wikipedia text extract) reaches approximately 1.36 bits per byte, demonstrating near-optimal entropy approximation for large-scale text data.

As of 2014, on contemporary hardware, software encoding speeds for arithmetic coding ranged from roughly 120-160 MB/s on the book1 corpus, slower than Huffman coding's 250-280 MB/s but comparable to some asymmetric numeral systems (ANS) variants such as tANS at about 160 MB/s. Decoding speeds were 70-80 MB/s for arithmetic coding versus roughly 300 MB/s for Huffman coding and 280 MB/s for tANS, highlighting a trade-off in which arithmetic coding prioritizes compression ratio over throughput. More recent optimizations achieve over 500 MB/s, and in some cases multiple GB/s, on modern multi-core CPUs.

In applications from the 2020s, arithmetic coding and closely related coders deliver significant gains; for example, JPEG XL's modular entropy coding provides 20-50% bitrate reductions over JPEG-LS in lossless mode across image datasets, including roughly 11% on photographs and 26% on scanned books, enabling smaller files without quality loss. Similarly, AV1's multi-symbol arithmetic coding supports hardware-accelerated encoding, with NVIDIA's Ada-generation hardware encoders reported to achieve up to 40% greater efficiency than H.264 equivalents at equivalent quality, often enabling real-time 8K60 encoding in optimized pipelines.

Key performance factors include model adaptation and alphabet size: adaptive arithmetic coding outperforms static variants by dynamically updating probabilities to match evolving data statistics, yielding 5-15% better ratios on non-stationary sources, while even for small alphabets efficiency remains within roughly 1-5% of the theoretical limit, and larger alphabets (e.g., 256 symbols) can come even closer to optimality by reducing per-symbol overhead.
Benchmark | Arithmetic Coding | Huffman Coding | Notes/Source
Compression ratio (avg. improvement) | 12.3% better | Baseline | Text/binary datasets
enwik8 (bits per byte, lower is better) | 1.36 | n/a | PAQ8px v191
Encoding speed (MB/s, book1, as of 2014) | 120 | 254 | Software on ~2014 CPU
Recent encoding speed (MB/s, as of 2025) | >500 | n/a | Optimized implementations on modern CPUs
JPEG XL vs. JPEG-LS (books, % bitrate savings) | 26% | n/a | Lossless images
