
4B5B

4B/5B is a block coding scheme used in data communications that maps every group of 4 data bits (a nibble) into a predefined 5-bit code group, thereby expanding the bit stream by 25% to ensure reliable transmission over the physical medium. This encoding method addresses challenges in clock recovery and signal synchronization by guaranteeing a minimum number of transitions in the transmitted signal: each 5-bit data code contains at least two ones (each a transition under NRZI signaling), and the encoded data stream never contains more than three consecutive zeros, which prevents the long runs of identical bits that could lead to timing errors at a receiver that recovers its clock from the data. The scheme operates at 80% efficiency, as only 4 of every 5 transmitted bits carry data. Additionally, 4B/5B supports special non-data symbols, such as Idle (I), Start-of-Stream (J/K), and End-of-Stream (T/R), which are essential for frame delimiting, error detection, and link management in networked environments.

Introduced as part of standards for high-speed local area networks, 4B/5B is prominently featured in the Physical Coding Sublayer (PCS) of Fast Ethernet, defined by the IEEE 802.3u-1995 standard, where it enables 100 Mbps data rates by transmitting at a line rate of 125 Mbaud. In 100BASE-TX over Category 5 twisted-pair cabling, the 4B/5B-encoded bits are scrambled to reduce electromagnetic interference before further processing with Multi-Level Transmit-3 (MLT-3) line coding, while 100BASE-FX over fiber optic uses Non-Return-to-Zero Inverted (NRZI) encoding on the 4B/5B output. The technique also forms a core component of the Fiber Distributed Data Interface (FDDI), an ANSI X3.166 standard for 100 Mbps token-passing networks over multimode fiber, where it combines with NRZI to support ring topologies in backbone applications.

Although largely superseded by more advanced encodings in modern Gigabit and higher Ethernet variants, 4B/5B remains influential for its role in bridging the gap from 10 Mbps to faster speeds while maintaining compatibility with existing physical layer principles.

Fundamentals

Definition and Encoding Principle

4B5B is a block coding line code used in data communications that maps each group of 4 data bits, known as a nibble, to a unique 5-bit code group for transmission over a physical medium. This encoding scheme introduces a 25% overhead, as the 5-bit symbols require a higher signaling rate than the original data rate; for example, to achieve 100 Mbps of data throughput, the physical layer must signal at 125 Mbaud (a 125 MHz bit clock). The selection of 5-bit symbols from the 32 possible combinations ensures properties beneficial for reliable transmission, such as sufficient signal transitions for clock synchronization.

The core encoding principle involves dividing the incoming data stream into 4-bit nibbles, either taken directly from a nibble-wide input or formed by grouping serial bits in fours, and then substituting each nibble with a predefined 5-bit code group via a lookup table. Each of the 16 possible 4-bit values (from 0000 to 1111) is assigned one of 16 carefully chosen 5-bit symbols designed to limit long runs of identical bits and keep the signal reasonably balanced. Of the 16 remaining 5-bit patterns, some are reserved for control purposes and the rest are left unused, so the encoded stream can be decoded unambiguously at the receiver.

In operation, the encoder aggregates input bits into nibbles and applies the mapping table to generate the 5-bit output stream, which is then typically further encoded (e.g., using NRZI) for the physical medium. The decoder, conversely, identifies valid 5-bit symbols in the incoming stream, maps them back to the original 4-bit nibbles using the inverse table, and reassembles the data. For example, the nibble 1010 (hex A) is encoded as the 5-bit symbol 10110, which provides the necessary transitions for reliable detection. This mechanism was originally specified in the ANSI FDDI standard and later incorporated into IEEE 802.3u for Fast Ethernet.
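The table lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `DATA_CODES`, `encode_nibble`, and `decode_group` are hypothetical, and code groups are held as strings for readability rather than as packed bits.

```python
# 5-bit code groups for the 16 data nibbles, indexed by nibble value (0x0..0xF).
DATA_CODES = [
    "11110", "01001", "10100", "10101",
    "01010", "01011", "01110", "01111",
    "10010", "10011", "10110", "10111",
    "11010", "11011", "11100", "11101",
]

# Inverse table used on the receive side.
DECODE = {code: value for value, code in enumerate(DATA_CODES)}


def encode_nibble(nibble: int) -> str:
    """Return the 5-bit code group for a 4-bit value."""
    if not 0 <= nibble <= 0xF:
        raise ValueError("nibble must be in the range 0..15")
    return DATA_CODES[nibble]


def decode_group(code: str) -> int:
    """Return the nibble for a valid data code group; reject anything else."""
    if code not in DECODE:
        raise ValueError(f"not a data code group: {code}")
    return DECODE[code]


print(encode_nibble(0b1010))  # 10110
print(decode_group("10110"))  # 10
```

Because the 16 data code groups are distinct, the inverse dictionary is well defined and every nibble survives an encode/decode round trip.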

Key Properties and Benefits

The 4B5B encoding scheme incorporates a run-length limited (RLL) constraint, ensuring no more than three consecutive zeros in the encoded data stream, which guarantees at least one bit transition every five bits when each 1 is signaled as a transition (as in NRZI). This design facilitates reliable clock recovery from the data stream without requiring a separate clock signal, as the frequent transitions allow phase-locked loops (PLLs) to synchronize effectively. By limiting long runs of identical bits, 4B5B enhances signal integrity in both optical and electrical transmission media, reducing the risk of timing errors and improving overall system performance.

Regarding DC balance, the selected 5-bit symbols provide a bounded disparity: each data symbol contains two, three, or four 1s, so the ones density of a single symbol ranges from 40% to 80% and averages about three 1s per symbol across multiple encoded groups. This partial balancing reduces baseline wander in AC-coupled systems, such as those using capacitors or transformers, thereby supporting stable long-distance transmission without excessive low-frequency distortion. The scheme's 16 valid data symbols out of 32 possible 5-bit combinations enable basic error detection, as invalid patterns signal potential transmission errors. The detection is not complete, however: some single-bit errors turn one valid data symbol into another (for example, 11110 into 11100) and pass unnoticed.

Key benefits of 4B5B include improved bandwidth efficiency, with only 25% overhead (transmitting 5 bits for every 4 data bits), compared to the 100% overhead of Manchester encoding, which doubles the signaling rate to achieve self-clocking. This allows higher effective data rates over constrained media, while the self-clocking nature eliminates the need for dedicated clock lines, simplifying hardware design. Compared with 8B10B, 4B5B employs fixed mappings without running disparity management, reducing encoding/decoding complexity at the cost of less stringent DC-balance control.
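How much error detection the 16-of-32 code set actually provides can be probed with a short sketch. The code below flips each bit of each data symbol in turn and records the cases where the result is still a valid data symbol, which a decoder could not flag. It is an illustration under simplifying assumptions: only the 16 data code groups are treated as valid (control symbols are ignored), and the helper names are hypothetical.

```python
# 5-bit code groups for data nibbles 0x0..0xF.
DATA_CODES = [
    "11110", "01001", "10100", "10101",
    "01010", "01011", "01110", "01111",
    "10010", "10011", "10110", "10111",
    "11010", "11011", "11100", "11101",
]
VALID = set(DATA_CODES)


def flip(code: str, index: int) -> str:
    """Return the code group with the bit at `index` inverted."""
    bit = "1" if code[index] == "0" else "0"
    return code[:index] + bit + code[index + 1:]


def undetected_single_bit_errors() -> list:
    """List (sent, received) pairs where one flipped bit yields another data symbol."""
    return [
        (code, flip(code, i))
        for code in DATA_CODES
        for i in range(5)
        if flip(code, i) in VALID
    ]


missed = undetected_single_bit_errors()
print(f"{len(missed)} of {len(DATA_CODES) * 5} single-bit flips stay within the data set")
print(("11110", "11100") in missed)  # True: symbol 0 can silently become symbol E
```

The takeaway is that invalid-symbol checking is a cheap first line of defence, which is why frame-level checks in the higher layers are still relied on.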

Encoding Details

Data Symbols

In 4B5B encoding, the data symbols represent the core mechanism for transmitting payload data, where each group of 4 bits (a nibble) from the input is mapped to a specific 5-bit code group. This mapping ensures reliable transmission over the physical medium by guaranteeing sufficient signal transitions for clock recovery while maintaining a line rate of 125 Mbaud for a 100 Mbps data rate. The 16 possible 4-bit data values, ranging from 0000 (hex 0) to 1111 (hex F), are encoded into predefined 5-bit patterns selected from the 32 possible 5-bit combinations to meet encoding constraints. The complete mapping for the 16 data symbols is shown in the following table, including binary representations and hexadecimal equivalents. Symbol names use the conventional hexadecimal notation for the 4-bit value (e.g., 0 for 0000), as defined in IEEE 802.3u for Fast Ethernet; FDDI uses similar mappings but assigns additional meanings to some codes for signaling.
| 4-bit Data (Binary / Hex) | 5-bit Code (Binary / Hex) | Symbol Name |
|---|---|---|
| 0000 / 0 | 11110 / 1E | 0 |
| 0001 / 1 | 01001 / 09 | 1 |
| 0010 / 2 | 10100 / 14 | 2 |
| 0011 / 3 | 10101 / 15 | 3 |
| 0100 / 4 | 01010 / 0A | 4 |
| 0101 / 5 | 01011 / 0B | 5 |
| 0110 / 6 | 01110 / 0E | 6 |
| 0111 / 7 | 01111 / 0F | 7 |
| 1000 / 8 | 10010 / 12 | 8 |
| 1001 / 9 | 10011 / 13 | 9 |
| 1010 / A | 10110 / 16 | A |
| 1011 / B | 10111 / 17 | B |
| 1100 / C | 11010 / 1A | C |
| 1101 / D | 11011 / 1B | D |
| 1110 / E | 11100 / 1C | E |
| 1111 / F | 11101 / 1D | F |
The selection of these 5-bit code groups adheres to specific criteria to optimize transmission characteristics. Each code group is chosen to contain no more than one leading zero and no more than two trailing zeros, so the encoded data stream never contains four consecutive zeros: the worst case, a code group ending in two zeros followed by one starting with a zero, yields a run of exactly three. This constraint, known as a (0,3) run-length limited (RLL) constraint, ensures frequent transitions for reliable clock recovery without excessive run lengths of identical bits. Additionally, the codes mix even and odd weights (numbers of 1s), averaging about three 1s per code group, which contributes to DC balance by limiting the accumulation of a long-term voltage offset on the line, though full balance often relies on complementary techniques such as scrambling in certain implementations.

These data symbols encode the payload bits in the transmitted stream, with the 4-bit nibbles processed sequentially as they arrive from the MAC (e.g., over the MII in Ethernet). After encoding, the 5-bit code groups are serialized into a continuous bit stream at the line rate and further modulated (e.g., via NRZI for 100BASE-FX). For instance, consider an 8-bit byte with value 0x2A (binary 00101010), split into nibbles 0010 (hex 2) and 1010 (hex A). Taking the high nibble first for illustration, the first nibble encodes to 10100 (2) and the second to 10110 (A), resulting in the 10-bit sequence 1010010110 transmitted over the medium. (On the Ethernet MII, the low-order nibble of each byte is presented first.) This process repeats for the entire data frame, excluding control symbols used for framing.

At the receiver, decoding involves synchronizing to the 5-bit code group boundaries and validating each received 5-bit pattern against the predefined mapping table. Valid patterns are directly mapped back to the corresponding 4-bit nibble, reconstructing the original data stream. If an invalid 5-bit pattern is detected (e.g., due to noise or bit errors), it is flagged as a decoding error (denoted as symbol V), and the receiver may signal a receive error or initiate error handling, such as frame discard, without attempting to recover the intended 4 bits. This error detection capability enhances reliability in high-speed links.
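The byte example and the receiver's validation step can be reproduced with a small sketch. It assumes, as in the 0x2A example, that the high nibble of each byte is encoded first and that the bit string handed to the decoder is already aligned to code group boundaries; `encode_bytes` and `decode_bits` are hypothetical names.

```python
# 5-bit code groups for data nibbles 0x0..0xF.
DATA_CODES = [
    "11110", "01001", "10100", "10101",
    "01010", "01011", "01110", "01111",
    "10010", "10011", "10110", "10111",
    "11010", "11011", "11100", "11101",
]
DECODE = {code: value for value, code in enumerate(DATA_CODES)}


def encode_bytes(data: bytes) -> str:
    """Encode each byte as two code groups, high nibble first (illustrative order)."""
    groups = []
    for byte in data:
        groups.append(DATA_CODES[byte >> 4])
        groups.append(DATA_CODES[byte & 0x0F])
    return "".join(groups)


def decode_bits(bits: str) -> list:
    """Split an aligned bit string into 5-bit groups; invalid groups become 'V'."""
    if len(bits) % 5 != 0:
        raise ValueError("bit string is not a whole number of code groups")
    return [DECODE.get(bits[i:i + 5], "V") for i in range(0, len(bits), 5)]


print(encode_bytes(b"\x2a"))      # 1010010110
print(decode_bits("1010010110"))  # [2, 10]
print(decode_bits("1010000000"))  # [2, 'V']
```

Returning a `'V'` marker instead of guessing a nibble mirrors the behaviour described above: the decoder reports the violation and leaves recovery to the layers above it.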

Control and Command Symbols

In 4B5B encoding, six special symbols (H, I, J, K, R, and T) are defined outside the standard 16 data symbols to handle framing, synchronization, idle periods, termination, and error signaling. These symbols are assigned 5-bit patterns taken from the 16 code groups that are not used for data, so they can never be interpreted as data and are easily detected by the decoder. Several of them, such as J (11000), H (00100), and R (00111), also break the leading-zero or trailing-zero limits that every data symbol obeys. This design allows the Physical Coding Sublayer (PCS) to insert control information transparently without ambiguity. Additionally, any received 5-bit pattern not matching a valid data or control symbol is categorized as an invalid symbol, V. The specific mappings for these symbols are as follows:
| Symbol | 5-Bit Binary Code | Hex | Equivalent 4-Bit Input (if applicable) | Role |
|---|---|---|---|---|
| I | 11111 | 1F | N/A | Idle (line state during no transmission) |
| J | 11000 | 18 | 0101 | First part of start delimiter |
| K | 10001 | 11 | 0101 | Second part of start delimiter |
| H | 00100 | 04 | 1000 | Error propagation |
| T | 01101 | 0D | 0000 | First part of end/terminate delimiter |
| R | 00111 | 07 | 0000 | Second part of end/terminate delimiter |
| V | Various (unused 5-bit patterns, e.g., 00000, 00001) | N/A | N/A | Invalid or error-indicating received pattern |
These mappings ensure the symbols cannot be decoded as valid data nibbles. For instance, J and K each stand in for one 0101 preamble nibble, yet neither uses the data code group for 0101 (01011); their distinct 5-bit patterns together form the start delimiter pair. The I symbol fills idle periods on the link, ensuring continuous transitions for clock recovery.

The primary roles of these symbols center on frame delimiting and link management in standards like Fast Ethernet (100BASE-X). The JK pair functions as the start-of-stream delimiter (SSD), replacing the initial two nibbles of the MAC frame preamble to signal the onset of data transmission and synchronize the receiver. The T symbol, paired with an R symbol, forms the end-of-stream delimiter (ESD) to mark frame termination, allowing the receiver to detect the end of valid data. The H symbol is inserted by the transmitter to propagate error conditions, such as those indicated by the MAC layer's transmit error signal, ensuring errors are visible to the receiver without corrupting data interpretation. The I symbol maintains the link during idle times by providing a continuous pattern with transitions. The V category flags transmission errors or noise at the receiver. Unlike data symbols, which carry payload, these control symbols enable robust synchronization and error handling by providing out-of-band signaling.

Control symbols are generated and inserted by the PCS transmit process, typically based on MAC layer primitives, as part of 4B5B encoding and before the subsequent line coding (e.g., NRZI). On the receive side, the PCS decoder identifies these unique patterns immediately after line decoding, converting them back to special signals (e.g., SSD or ESD events) for the MAC layer while stripping them from the data stream. This integration ensures frame boundaries are clearly delimited, prevents data symbols from being mistaken for control information, and keeps transitions available for clock recovery during idle periods filled with continuous I symbols.
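The framing roles described above can be sketched as a transmit-side wrapper and a receive-side classifier. This is a simplified illustration rather than the PCS state machines of any standard: `build_stream` and `classify` are hypothetical names, the frame is given as a list of nibbles that starts with preamble nibbles, and the number of idle symbols is arbitrary.

```python
# 5-bit code groups for data nibbles 0x0..0xF.
DATA_CODES = [
    "11110", "01001", "10100", "10101",
    "01010", "01011", "01110", "01111",
    "10010", "10011", "10110", "10111",
    "11010", "11011", "11100", "11101",
]
# Control code groups, taken from the table above.
CONTROL = {"I": "11111", "J": "11000", "K": "10001",
           "H": "00100", "T": "01101", "R": "00111"}

# Receive-side lookup: code group -> symbol name (control letter or hex digit).
NAMES = {code: name for name, code in CONTROL.items()}
NAMES.update({code: format(value, "X") for value, code in enumerate(DATA_CODES)})


def build_stream(frame_nibbles: list, idles: int = 2) -> list:
    """Wrap a frame in idle symbols, an SSD (J, K) and an ESD (T, R)."""
    if len(frame_nibbles) < 2:
        raise ValueError("frame must start with at least two preamble nibbles")
    groups = [CONTROL["I"]] * idles
    groups += [CONTROL["J"], CONTROL["K"]]  # SSD replaces the first two preamble nibbles
    groups += [DATA_CODES[n] for n in frame_nibbles[2:]]
    groups += [CONTROL["T"], CONTROL["R"]]  # ESD marks the end of the frame
    groups += [CONTROL["I"]] * idles
    return groups


def classify(groups: list) -> list:
    """Name each received code group; anything unrecognised is reported as V."""
    return [NAMES.get(group, "V") for group in groups]


stream = build_stream([0x5, 0x5, 0x5, 0x5, 0x2, 0xA], idles=1)
print(" ".join(classify(stream)))    # I J K 5 5 2 A T R I
print(classify(["00000", "11000"]))  # ['V', 'J']
```

Because the control code groups do not overlap the data code groups, a single dictionary is enough for the receiver to separate payload from delimiters.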

Implementation

Clock Recovery

In 4B5B encoding, the selection of 5-bit code groups ensures no more than three consecutive zeros across adjacent data symbols, guaranteeing at least one logic transition every five bits in the serial data stream when each 1 is signaled as a transition. This run-length limitation provides sufficient signal edges for clock recovery circuits, such as phase-locked loops (PLLs) or delay-locked loops (DLLs), to extract and synchronize the bit clock from the incoming data without requiring a dedicated clock channel.

The encoding process transmits 5-bit symbols serially at a fixed rate, for example, 125 MHz to achieve 100 Mbps effective data throughput in Fast Ethernet implementations. At the receiver, edge-detection circuits identify transitions in the encoded stream, enabling a PLL to phase-align the local clock to these edges and recover the precise bit timing, ultimately delineating symbol boundaries for decoding back to the original 4-bit data. This self-clocking approach simplifies cabling by eliminating the need for separate clock lines, offering advantages over unencoded non-return-to-zero (NRZ) schemes that can suffer from long runs of identical bits leading to ambiguous synchronization. When combined with non-return-to-zero inverted (NRZI) modulation, where a transition occurs for each '1' bit, the 4B5B codes further enhance transition density, ensuring reliable clock extraction even in noisy environments.

Clock recovery in 4B5B systems must address challenges like timing jitter from transmission impairments or slight frequency mismatches between transmitter and receiver clocks. Elastic buffers at the receiver absorb these variations by temporarily storing incoming symbols and adjusting the output rate to match the local clock domain, preventing data loss or corruption. In Fast Ethernet (100BASE-TX and 100BASE-FX), 4B5B encoding precedes NRZI modulation, with the recovered 125 MHz clock feeding into downstream processing after elastic buffering to maintain synchronization.
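The run-length claim can be checked exhaustively, and the effect of NRZI on the encoded bits can be shown, with a brief sketch. Checking every ordered pair of data symbols is sufficient because each symbol contains at least two 1s, so no run of zeros can span more than two symbols. The function names are hypothetical, and the NRZI model assumes an initial line level of 0.

```python
from itertools import product

# 5-bit code groups for data nibbles 0x0..0xF.
DATA_CODES = [
    "11110", "01001", "10100", "10101",
    "01010", "01011", "01110", "01111",
    "10010", "10011", "10110", "10111",
    "11010", "11011", "11100", "11101",
]


def longest_zero_run(bits: str) -> int:
    """Length of the longest run of consecutive zeros in a bit string."""
    return max(len(run) for run in bits.split("1"))


def nrzi(bits: str, level: int = 0) -> str:
    """NRZI line levels: toggle on every 1, hold on every 0."""
    out = []
    for bit in bits:
        if bit == "1":
            level ^= 1
        out.append(str(level))
    return "".join(out)


worst = max(longest_zero_run(a + b) for a, b in product(DATA_CODES, repeat=2))
print(worst)               # 3
print(nrzi("1010010110"))  # 1100011011
```

With at most three zeros between 1s, the NRZI output never holds one level for more than four bit times, which is the property the receiver's PLL depends on.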

Signal Integrity and DC Balance

In 4B5B encoding, DC balance is maintained statistically through the selection of 5-bit symbols that exhibit limited individual disparity, with each symbol containing two, three, or four 1s, yielding an average of approximately 3 ones per symbol (61% ones). Unlike more advanced codes such as 8B/10B, 4B5B employs fixed mappings from 4 bits to 5 bits without running disparity tracking or inversion, which provides adequate balance for short transmission bursts typical in protocols like FDDI where cumulative wander remains minimal. This approach enhances signal integrity by reducing low-frequency spectral components in the transmitted waveform, thereby limiting baseline shift or wander in AC-coupled receivers and transformers that could otherwise degrade eye opening. Additionally, the guaranteed minimum of two transitions per symbol ensures sufficient high-frequency content for reliable signal recovery without excessive equalization demands.

For error handling, the receiver monitors incoming 5-bit groups against the valid set; detection of invalid codes, such as those not assigned to data or control functions, triggers error flags, often manifesting as code violations that halt processing and alert higher layers. Optional disparity monitoring can provide further detection of accumulated imbalances, though it is not mandated in standard 4B5B implementations.

In practical optical applications, such as FDDI networks using LED or laser transceivers, the code's partial DC balance eases driver design by minimizing low-frequency distortions that could cause nonlinear response or clipping. For longer runs where statistical balance alone may prove insufficient, 4B5B can be paired with additional scrambling to further randomize the bit stream and constrain disparity growth. Short-term DC variations can reach 40-80% ones in worst-case sequences.
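The ones-density figures quoted above follow directly from the data symbol table, as this short sketch shows. It assumes all 16 nibble values are equally likely, which real traffic need not satisfy, and it counts 1s in the code groups themselves rather than modelling the line levels after NRZI or MLT-3.

```python
# 5-bit code groups for data nibbles 0x0..0xF.
DATA_CODES = [
    "11110", "01001", "10100", "10101",
    "01010", "01011", "01110", "01111",
    "10010", "10011", "10110", "10111",
    "11010", "11011", "11100", "11101",
]

# Number of 1s in each data symbol.
ones = [code.count("1") for code in DATA_CODES]

mean_ones = sum(ones) / len(ones)        # average 1s per symbol
density = sum(ones) / (5 * len(ones))    # fraction of transmitted bits that are 1

print(min(ones), max(ones))  # 2 4
print(mean_ones)             # 3.0625
print(f"{density:.2%}")      # 61.25%
print(f"{min(ones) / 5:.0%} to {max(ones) / 5:.0%}")  # 40% to 80%
```

The spread from 40% to 80% per symbol is the reason the balance is described as statistical: nothing in the fixed mapping compensates for a long run of heavy or light symbols.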

History and Applications

Development and Adoption Timeline

The development of 4B5B encoding emerged in the early 1980s amid the rapid expansion of fiber-optic technologies for high-speed data networks, driven by the need for reliable line codes to support emerging optical transmission standards. In October 1982, the American National Standards Institute (ANSI) chartered its Accredited Standards Committee X3T9.5 to create a high-performance fiber-optic networking specification, which laid the groundwork for incorporating block coding techniques like 4B5B to ensure signal synchronization and DC balance in fiber environments. This effort built on prior block coding methods used in telecommunications, adapting them for the demands of 100 Mbps fiber rings.

4B5B gained prominence through its integration into the Fiber Distributed Data Interface (FDDI) standard, where it served as the core encoding for the physical layer to map 4-bit data nibbles into 5-bit symbols, enabling efficient transmission over multimode fiber. The FDDI Media Access Control (MAC) layer was approved by ANSI X3T9.5 as X3.139-1987 on November 5, 1986, with the physical layer protocols following shortly thereafter in 1988, marking the first major use of 4B5B for commercial fiber-optic LANs. This adoption was fueled by the surge in fiber optics, as utilities and service providers deployed optical infrastructure for higher bandwidth needs.

In 1989, the Audio Engineering Society (AES) began developing a multichannel audio digital interface, leading to the adoption of 4B5B for serial transmission in what became the Multichannel Audio Digital Interface (MADI) standard (AES10), finalized in 1991; this extended the code's utility to applications requiring low-latency, balanced signaling over coaxial or fiber-optic links. By 1995, 4B5B was incorporated into Fast Ethernet variants under IEEE 802.3u, particularly for 100BASE-TX and 100BASE-FX, where its 25% overhead bought clock recovery and error detection on twisted-pair and fiber media, accelerating its use in networking. The IEEE 802.3u standard was officially approved that year, placing 4B5B in both ANSI and IEEE frameworks.

After 1995, the same block-coding approach appeared at higher speeds in the form of 8B10B, which maps 8-bit data to 10-bit symbols and was adopted for Gigabit Ethernet (IEEE 802.3z, 1998) to handle higher rates while adding disparity control and maintaining transition density. However, with the shift to faster interfaces such as 10 Gigabit Ethernet using 64B/66B, 4B5B saw no significant updates after 2000, transitioning to legacy status in standards bodies such as ANSI, IEEE, and AES, as fiber-optic demands evolved toward greater efficiency.

Primary Uses in Standards

4B5B encoding is primarily employed in the physical layer of several legacy networking and audio standards to facilitate reliable serialization, DC balance, and clock recovery over optical or electrical media. It maps 4-bit symbols into 5-bit groups, expanding the 100 Mbps data rate to a 125 Mbaud line rate for transmission, often in conjunction with NRZI signaling. This scheme ensures frequent transitions in the signal stream, aiding clock recovery without excessive bandwidth overhead.

In the Fiber Distributed Data Interface (FDDI) standard, developed for token-passing ring networks, 4B5B is integral to the physical layer, encoding data at 100 Mbps over fiber-optic cables before NRZI modulation, resulting in a 125 Mbaud line rate. This application supports dual-ring topologies for redundancy in local area networks, with the encoding ensuring no more than three consecutive zeros to maintain synchronization. FDDI's adoption in enterprise backbones leveraged 4B5B for its efficiency in multimode fiber environments up to 2 km.

The IEEE 802.3u specification for Fast Ethernet incorporates 4B5B in the 100BASE-FX variant, enabling 100 Mbps full-duplex transmission over multimode fiber optic cables. Here, 4B5B encodes frame data into 5-bit symbols at the PCS, paired with NRZI for optical signaling, to achieve an 80% encoding efficiency and support segment lengths up to 2 km. This fiber-based implementation avoids the scrambling and MLT-3 coding used in the twisted-pair 100BASE-TX counterpart, focusing instead on direct serialization for low-latency campus networking.

In the AES10 standard for the Multichannel Audio Digital Interface (MADI), 4B5B supports the serial transmission of up to 64 channels of digital audio at a 100 Mbps data rate, expanded to 125 Mbaud via encoding, over coaxial or fiber-optic links. It handles 32-bit channel words compliant with AES3, including audio samples, validity bits, and metadata, with sync symbols inserted periodically for frame alignment. This configuration allows operation at sampling rates from 32 kHz to 96 kHz, making MADI suitable for professional audio routing in broadcast and studio environments.

Legacy applications of 4B5B appear in certain Asynchronous Transfer Mode (ATM) adaptations, such as physical layer interfaces operating at 25.6 Mbps using the block code to achieve a 32 Mbaud symbol rate. Similarly, early mappings of FDDI or ATM traffic onto SONET/SDH OC-3 (155 Mbps) frames utilized 4B5B-encoded streams within the payload, though these have been largely supplanted by more efficient schemes like 64B/66B in modern Ethernet standards beyond 100 Mbps. Across all implementations, 4B5B operates at the physical layer for bit serialization and is occasionally combined with scramblers to further randomize the signal and mitigate electromagnetic interference.
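The line rates quoted in this section all follow from the same 5/4 expansion, which a few lines of arithmetic make explicit. The sketch uses exact fractions to avoid rounding noise; `line_rate_mbaud` is a hypothetical helper, and the data rates are the ones stated in the text.

```python
from fractions import Fraction

# Every 4 data bits become 5 line bits.
CODE_RATIO = Fraction(5, 4)


def line_rate_mbaud(data_rate_mbps: str) -> Fraction:
    """Line rate in Mbaud needed to carry the given data rate (in Mbps)."""
    return Fraction(data_rate_mbps) * CODE_RATIO


for name, rate in [("FDDI, 100BASE-X, MADI", "100"), ("ATM 25.6", "25.6")]:
    print(f"{name}: {rate} Mbps -> {float(line_rate_mbaud(rate))} Mbaud")
# FDDI, 100BASE-X, MADI: 100 Mbps -> 125.0 Mbaud
# ATM 25.6: 25.6 Mbps -> 32.0 Mbaud

print(float(1 / CODE_RATIO))  # 0.8
```

The last value is the 80% coding efficiency: the fraction of line bits that carry data.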
