Fact-checked by Grok 2 weeks ago

Singleton bound

In , the Singleton bound provides an upper limit on the size of a with specified length and minimum distance, stating that for a q-ary code C of length n and minimum distance d, the number of codewords |C| satisfies |C| \leq q^{n-d+1}. Named after American mathematician Richard C. Singleton, who introduced it in , the bound applies to both linear and nonlinear codes and represents one of the earliest and simplest fundamental limits on code performance. For linear codes over a \mathbb{F}_q with dimension k, the bound equivalently implies that the minimum distance d satisfies d \leq n - k + 1. The proof of the Singleton bound relies on the : consider the projection of the code onto any n - d + 1 coordinates; if |C| > q^{n-d+1}, then at least two distinct codewords must agree on those coordinates, implying their is at most d-1, which contradicts the minimum assumption. Codes achieving equality in the Singleton bound are known as maximum separable (MDS) codes, a class that includes trivial examples like repetition codes and the full space, as well as nontrivial constructions such as Reed-Solomon codes when the alphabet size q is sufficiently large relative to n. While the bound is relatively loose for small alphabets like codes (where stronger bounds like the Hamming or Plotkin bounds apply), it is asymptotically tight for large q and plays a key role in understanding the trade-offs between rate, , and error-correcting capability in applications such as , communications, and .

Fundamentals

Statement of the Bound

In , a over an of size q is a C \subseteq \Sigma^n, where \Sigma is the with |\Sigma| = q and n is the code length, consisting of M = |C| codewords, each a sequence of n symbols from \Sigma. The dimension k of the code is defined such that M = q^k, representing the number of information symbols encoded. The minimum ( d is the smallest number of positions in which any two distinct codewords differ. The Singleton bound states that for any such q-ary of length n and minimum distance d, the size satisfies M \leq q^{n - d + 1}. Equivalently, in terms of , d \leq n - k + 1. This bound holds for arbitrary codes, independent of the alphabet size q beyond the basic setup. This inequality implies that a code with minimum distance d can detect up to d-1 s and correct up to t = \lfloor (d-1)/2 \rfloor errors in a received word, as the separation between codewords limits the in decoding. Codes achieving in the bound, where d = n - k + 1, are called maximum distance separable (MDS) codes and represent optimal performance in terms of error correction capability for given parameters. In the asymptotic regime, as block length n \to \infty, the rate R = k/n and relative distance \delta = d/n satisfy R \leq 1 - \delta. This provides a fundamental tradeoff between information rate and error resilience for large-scale codes.

Proof of the Bound

The Singleton bound establishes an upper limit on the minimum distance d of a block code over an alphabet of size q with length n and q^k codewords, showing that d \leq n - k + 1. To derive this, consider the projection of the code onto any n - d + 1 coordinate positions, which effectively punctures (deletes) the remaining d - 1 positions to form a shortened code of length n - d + 1. Suppose two distinct codewords agree on these n - d + 1 positions; then they can differ only in the punctured d - 1 positions, yielding a of at most d - 1. This contradicts the minimum distance d of the , so the projections must all be distinct. Since there are at most q^{n - d + 1} possible distinct tuples over the of size q, the can have at most q^{n - d + 1} codewords. Thus, q^k \leq q^{n - d + 1}, implying k \leq n - d + 1 or equivalently d \leq n - k + 1. This argument relies on an injective mapping from codewords to their projections on the n - d + 1 positions, ensuring no collisions and bounding the code size directly. It holds for arbitrary (nonlinear) without assuming any linear structure, as the applies generally to the finite alphabet. To see the bound's tightness in the equality case, assume d = n - k + 1; then the projection onto any k positions must be injective, allowing exactly q^k codewords if the code achieves the bound.

Linear Codes

Application to Linear Codes

A linear [n, k, d]_q code C is defined as a k-dimensional subspace of the vector space \mathbb{F}_q^n, where \mathbb{F}_q is the finite field with q elements, n is the code length, k is the dimension, and d is the minimum Hamming weight (or distance) of its nonzero codewords. The codewords are all linear combinations of k linearly independent basis vectors, and the total number of codewords is q^k. The Singleton bound specializes naturally to linear codes, stating that d \leq n - k + 1. This follows immediately from the general bound applied to block codes, as the code size M = q^k implies \log_q M = k, so the inequality M \leq q^{n - d + 1} yields the same relation. A direct proof exploiting the linear structure uses the parity-check matrix H, an (n-k) \times n matrix of rank n-k such that C = \{ \mathbf{c} \in \mathbb{F}_q^n \mid H \mathbf{c}^T = \mathbf{0} \}. Since the column space of H has dimension n-k, any collection of n-k+1 columns of H must be linearly dependent. Thus, there exists a nonzero vector \boldsymbol{\lambda} \in \mathbb{F}_q^{n-k+1} such that the linear combination of those columns with coefficients from \boldsymbol{\lambda} is the zero vector, implying H \boldsymbol{\lambda}^T = \mathbf{0}. The vector \boldsymbol{\lambda}^T \in \mathbb{F}_q^n (with zeros elsewhere) is then a nonzero codeword of weight at most n-k+1, so d \leq n - k + 1. This bound is crude in the sense that the actual minimum distance d is the size of the smallest linearly dependent set of columns of H, which may be strictly less than n-k+1 if lower-weight dependencies exist. An alternative proof leverages the G, a k \times n matrix whose rows form a basis for C. Without loss of generality, assume G is in systematic form [I_k \mid P], where I_k is the k \times k and P is k \times (n-k). Codewords are then of the form \mathbf{c} = ( \mathbf{u}, \mathbf{u} P ) for \mathbf{u} \in \mathbb{F}_q^k. To establish the bound, consider puncturing C by deleting any d-1 coordinates; the resulting map is a linear injection from the k-dimensional space C to \mathbb{F}_q^{n-d+1}, because if the first n-d+1 symbols of a codeword are zero, its weight is at most d-1 < d, implying it is the zero codeword. The image is thus a k-dimensional subspace of \mathbb{F}_q^{n-d+1}, requiring k \leq n - d + 1, or equivalently d \leq n - k + 1. The rank properties of G further highlight the bound: the minimum distance condition ensures that every d-1 columns of G are linearly independent, and the overall rank k limits the possible independence in larger sets, aligning with the Singleton limit via dimension arguments. Examples of linear codes achieving the Singleton bound include trivial cases. The full space \mathbb{F}_q^n forms an [n, n, 1]_q code with d = 1 = n - n + 1. The repetition code, generated by the all-ones vector repeated n times, is a [n, 1, n]_q code with d = n = n - 1 + 1, where every nonzero codeword has full weight n. These illustrate how the bound is tight for extreme dimension values, though nontrivial achieving codes () exist beyond these.

Relation to Other Bounds

The Singleton bound offers a general upper limit on the size of a code C with length n, minimum distance d, and alphabet size q, stating |C| \leq q^{n-d+1}, which applies universally regardless of q or n. In comparison, the Hamming bound, derived from sphere-packing considerations around each codeword to ensure non-overlapping balls of radius t = \lfloor (d-1)/2 \rfloor, provides a tighter constraint for many practical cases, particularly when the code can correct up to t errors, but it depends on the specific error-correcting capability and alphabet geometry. For binary codes at moderate relative distances \delta = d/n, the Hamming bound often yields a lower upper bound on the rate than the Singleton bound, though the latter remains valid without such assumptions. The Plotkin bound further refines upper limits for codes with high rates or \delta > 1/2, especially over small alphabets like , where it exploits pairwise distance properties to cap the code size more stringently than the Singleton bound. Since the Singleton bound's asymptotic form does not incorporate q, it becomes relatively loose for large q, allowing potentially larger codes, whereas the Plotkin bound leverages alphabet-specific inequalities to perform better in high-rate regimes over fixed small q. For instance, in settings, the Plotkin bound can exclude rates above $1 - 2\delta for \delta > 1/2, improving on Singleton's $1 - \delta. Complementing these upper bounds, the Gilbert-Varshamov bound serves as a lower bound on achievable code sizes, guaranteeing the existence of codes with at least $1 - h_q(\delta), where h_q is the q-ary function, thus contrasting the bound's restriction on maximum distance for a given . This highlights the gap between upper and lower limits, with Gilbert-Varshamov showing that rates approaching its curve are attainable, often surpassing what fixed-alphabet constraints under might suggest. The Singleton bound achieves tightness for codes with small dimension k (low ) or large d (high \delta), where few codewords are possible, but for linear codes, the Hamming and Plotkin bounds frequently offer stricter estimates, as seen in applications like error-correcting codes over GF(2). Asymptotically, the Singleton bound yields R + \delta \leq 1, a q-independent linear that bounds the fundamental tradeoff between and relative distance across all code families.

MDS Codes

Definition and Properties

Maximum distance separable (MDS) codes are a class of error-correcting codes that achieve equality in the Singleton bound, attaining the maximum possible minimum distance for given code length n, k, and alphabet size q. Specifically, an [n, k, d]_q code is MDS if its minimum distance satisfies d = n - k + 1. This optimality makes MDS codes particularly valuable in applications requiring efficient error correction and rate-distortion trade-offs. A key algebraic property of linear MDS codes is that the dual code is also MDS. For an [n, k]_q linear MDS code with generator matrix G, any k columns of G are linearly over the \mathbb{F}_q. Equivalently, in the parity-check matrix H (of (n-k) \times n), any n-k columns are linearly . These properties ensure the code's separability, allowing systematic encoding in any choice of k positions. The error-correcting capability of an MDS code follows directly from its minimum : it can correct up to t = \lfloor (d-1)/2 \rfloor = \lfloor (n-k)/2 \rfloor errors, which is optimal for the given rate k/n. The Singleton defect, defined as s = n - k + 1 - d, measures how far a code is from MDS optimality; for MDS codes, s = 0, while s > 0 indicates suboptimality. Trivial MDS codes always exist and include the repetition code [n, 1, n]_q (which repeats a symbol n times), the even-weight (or parity-check) code [n, n-1, 2]_q, and the full space [n, n, 1]_q (all possible vectors). Nontrivial MDS codes exist only for specific parameter sets, such as when n \leq q + 1 over \mathbb{F}_q, though their beyond trivial cases is limited by the MDS conjecture.

Constructions and Examples

One of the simplest constructions of maximum distance separable (MDS) codes are the trivial examples, which achieve the Singleton bound for extreme dimension parameters. These include the repetition code [n, 1, n]_q, where all symbols are identical and the minimum distance equals the length; the parity-check code [n, n-1, 2]_q, which adds one overall parity symbol to detect single errors; and the full space code [n, n, 1]_q, consisting of all possible vectors with minimum distance 1. These codes exist over any \mathbb{F}_q for arbitrary length n and are MDS by definition, though they offer limited practical utility due to their degenerate error-correcting capabilities. A foundational non-trivial construction is the Reed-Solomon code, introduced by Reed and Solomon in 1960. These are linear MDS codes over \mathbb{F}_q of length n \leq q, dimension k, and minimum d = n - k + 1, obtained by evaluating all polynomials of degree at most k-1 in \mathbb{F}_q at n distinct points in \mathbb{F}_q. The generator matrix has rows corresponding to powers of these evaluation points, ensuring the Vandermonde guarantees the MDS . Reed-Solomon codes are widely used in applications like digital communications and due to their optimal trade-off between rate and error correction. To extend the length beyond q, the extended Reed-Solomon code appends an additional coordinate corresponding to evaluation at the "point at infinity," yielding an MDS code of length n = q + 1, k, and d = n - k + 1 over \mathbb{F}_q. This construction involves modifying the parity-check matrix to include a row of ones for the infinity point, preserving the MDS property while increasing the code length by one. Extended Reed-Solomon codes maintain efficient encoding and decoding algorithms similar to their non-extended counterparts. Beyond algebraic constructions like Reed-Solomon codes, derived from curves provide MDS examples for longer lengths relative to the field size. A notable instance is the Hermitian code, constructed from the Hermitian curve x^{q+1} + y^{q+1} + z^{q+1} = 0 over \mathbb{F}_{q^2}, where q is a . These codes have length n = q^3 + 1, and for appropriate choices of the Riemann-Roch space (e.g., multiples of the point at infinity), they achieve MDS parameters such as dimension k up to roughly q^2 and distance d = n - k + 1. Hermitian codes thus surpass the length limit of Reed-Solomon codes, enabling stronger error correction in scenarios requiring n > q + 1. The existence of non-trivial MDS codes is constrained by the MDS for q-ary linear codes, which posits that the maximum length n for a code of k (with $2 \leq k \leq q-1) is at most q + 1, except in specific cases: when q is even and k = 3 or k = q - 1, where n \leq q + 2. This highlights the rarity of long MDS codes and guides efforts, with Reed-Solomon and extended variants saturating the bound for most parameters. The following table summarizes known maximum lengths n for non-trivial MDS codes over small fields \mathbb{F}_q with $2 \leq k \leq q-1, based on exhaustive classifications; entries indicate the largest n achieving MDS for some k in that range.
qMaximum n for MDS codes
46
56
78
810
910
1112
1314
1618
1718
1920
3234
These values align with the MDS conjecture, with exceptions only for even q in the specified k cases.

History

Discovery by Singleton

Richard C. Singleton (1928–2007) was an American mathematician and computer scientist whose work advanced information theory and coding. He received BS and MS degrees in electrical engineering from MIT in 1949 and 1950, respectively, followed by an MBA from Stanford University in 1952 and a PhD in theoretical statistics from Stanford in 1960. Singleton joined SRI International (formerly Stanford Research Institute) in 1952 as a junior research engineer and remained there for 44 years, rising to staff scientist in the Information Technologies Practice before transferring to SRI Consulting in 1996. In 1964, formulated the Singleton bound in his paper "Maximum distance q-nary codes," published in the IEEE Transactions on . Although the bound was first proved by H. F. in 1952 for orthogonal arrays, Singleton reformulated it for error-correcting codes. The research was motivated by the exploration of maximum-distance separable (M.D.S.) codes to construct constant-weight codes with large size and distance, applicable as superimposed codes. While drawing on contemporary interests in error-correcting codes, including references to Peterson's work and Reed-Solomon codes, the paper centered on over finite alphabets of size q. Singleton's bound applies to general q-nary codes with N = q^k codewords of length n = k + r, establishing that the minimum d satisfies d ≤ r + 1; this result holds independently of q (whether prime or ) and encompasses both linear and nonlinear codes. This formulation emphasized the bound's generality for nonlinear codes, providing a straightforward upper limit on achievable without reliance on the specific field structure. The bound offered a simple, field-independent upper limit on code performance at a time when linear coding theory was gaining prominence through algebraic constructions. Its introduction marked a key early contribution to bounding techniques in block coding, facilitating subsequent analysis of code optimality.

Subsequent Developments

The concept of maximum distance separable (MDS) codes, which achieve equality in the Singleton bound, was introduced by Singleton in his 1964 paper. Reed-Solomon codes, introduced earlier in 1960 by Irving S. Reed and Gustave Solomon, served as a seminal example of such codes, though their connection to the Singleton bound was recognized retrospectively. These developments shifted focus from the bound itself to explicit constructions and properties of optimal codes, laying the groundwork for broader applications in error correction. In the 1980s, the MDS conjecture was formalized within , positing that over a GF(q), a non-trivial [n, k, d = n - k + 1]_q MDS code satisfies n ≤ q + 1 except for specific cases where q is even and k = 3 or k = q - 1, in which case n ≤ q + 2. This conjecture, rooted in geometric interpretations from the 1950s but adapted to codes, spurred geometric constructions using projective planes, where arcs in projective geometries yield MDS codes with lengths up to q + 2 under certain conditions. These constructions, explored by J. W. P. Hirschfeld and others, highlighted the interplay between finite geometries and coding optimality. The 1990s and 2000s extended MDS code applications beyond classical error correction to emerging fields like network coding and . In network coding, MDS codes enabled efficient schemes, as demonstrated by the operator channel framework introduced by Ralf Koetter and Frank R. Kschischang, where they achieve capacity bounds in lossy networks. In , variants of the incorporated MDS codes for enhanced security in code-based , leveraging their error-correcting properties for post-quantum schemes. Concurrently, quantum MDS codes emerged, with early constructions using CSS codes over finite fields to attain the quantum Singleton bound, providing optimal for qubit stabilization. Post-2010 advancements generalized the Singleton bound to specialized settings, including lattice-based schemes for modulation in communication systems. A lattice Singleton bound was derived for infinite constellations, bounding the dimension and minimum distance in lattice codes used for high-rate modulation, with implications for wireless channels. Similarly, locally recoverable codes (LRCs) adopted a generalized Singleton bound, with optimal constructions achieving d = n - k - \lceil k/r \rceil + 2. These extensions addressed practical scalability in big data and cloud environments. Despite progress, open problems persist, particularly proofs of the MDS conjecture for large field sizes q and the asymptotic existence of MDS codes with lengths approaching q + 1 for arbitrary dimensions k. These challenges drive ongoing research in and combinatorial constructions to resolve the conjecture's full scope.

References

  1. [1]
  2. [2]
    [PDF] Notes 4: Elementary bounds on codes 1 Singleton bound
    For code families over a fixed alphabet such as binary codes, substantial improvements to the Singleton bound are possible. We turn to such bounds next. 2 The ...
  3. [3]
    [PDF] Lecture 4 1 Review of last lecture 2 Singleton bound 3 Hamming ...
    Feb 19, 2008 · Today we are going to discuss limitations of codes. More specifically, we will see rate upper bounds of codes, including Singleton bound, ...
  4. [4]
    [PDF] Bounds on Code Parameters - Henry D. Pfister
    Jan 16, 2025 · The family of codes that meet the Singleton bound is called maximum dis- tance separable (MDS) codes. We will later see an algebraic family of ...
  5. [5]
    [PDF] Notes 2: Gilbert-Varshamov bound 1 Asymptotically good codes and ...
    The Gilbert-Varshamov bound is the maximal size of a q-ary code of block length n and distance d, satisfying Aq(n, d) ≥ qn Volq(n, d − 1) = qn Pd−1.
  6. [6]
    Codes of Small Defect | Designs, Codes and Cryptography
    The code C is called MDS, or maximum distance separable, if the minimum distance d meets the Singleton bound, ie d = n-k+1.
  7. [7]
    [PDF] ON ADDITIVE MDS CODES OVER SMALL FIELDS
    Dec 11, 2020 · An MDS code over A with d = 1 is Ak and with k = 1 is the repetition code, so these are trivial. For d = 2, an MDS code is equivalent to a ...
  8. [8]
    [PDF] Reed-Solomon Codes
    Extended RS Codes. • An (n,k) RS code over GF(q) with n = q-1 can be extended twice to a (q+1,k) MDS code. • There is a technique for constructing such codes.
  9. [9]
    [PDF] TWISTED HERMITIAN CODES 1. Introduction Reed-Solomon and ...
    The Hermitian code determined by α is the algebraic geometry code C(αP∞) = ev (L(αP∞)). Notice that C(αP∞) is a code of length q3, dimension at least α +1−g, ...
  10. [10]
    [math/0211107] On Near-MDS Elliptic Codes - arXiv
    Nov 6, 2002 · The main conjecture on maximum distance separable (MDS) codes states that, execpt for some special cases, the maximum length of a q-ary linear MDS code is q+1.
  11. [11]
    [PDF] Classification of 2 and 3 dimensional MDS codes for 4 ≤ q ≤ 32
    Tables 6–9 are analogous to the previous Tables 1–4 for weak equivalence. Let ν(n, k, q) denote the number of weakly non-equivalent linear MDS codes of length n ...
  12. [12]
    [PDF] SRI Alumni Association Newsletter • August 2007
    Richard (Dick) C. Singleton, a distin- guished Mathematical Statistician and. Computer Scientist, passed away at his daughter's home on Easter Sunday, April 8 ...
  13. [13]
    [PDF] Maximum Distance Q-Nary Codes - of Luca Giuzzi
    In this paper, the class of codes with d = r + 1 and N = qk is considered. These codes will be called maximum-distance separable (M.D.S.)Missing: URL | Show results with:URL
  14. [14]
    [PDF] Maximum distance q -nary codes - Semantic Scholar
    Maximum distance q -nary codes · R. Singleton · Published in IEEE Transactions on… 1 April 1964 · Mathematics, Computer Science.Missing: original | Show results with:original
  15. [15]
    [1301.6456] A Singleton Bound for Lattice Schemes - arXiv
    Jan 28, 2013 · In this paper, we derive a Singleton bound for lattice schemes and obtain Singleton bounds known for binary codes and subspace codes as special ...