
Head-of-line blocking

Head-of-line blocking (HOL blocking) is a performance-limiting phenomenon in computer networking in which a delay or loss affecting the first packet or request in a queue prevents subsequent packets or requests from being processed, even if they could otherwise proceed independently. This issue arises in various queuing systems, including network switches, transport protocols, and application-layer communications, leading to increased latency and reduced throughput. In packet-switched networks, HOL blocking commonly occurs at the input ports of switches or routers when packets destined for a congested output port block packets bound for other, uncongested outputs on the same input queue, resulting in throughput loss or added delay on otherwise idle paths. This effect is particularly pronounced in switched LAN environments, where verifying the absence of HOL blocking is a key benchmarking metric for device performance. At the transport layer, protocols like TCP exacerbate HOL blocking due to their ordered delivery semantics: a lost packet requires retransmission before any subsequent data in the stream can be delivered, stalling the entire stream regardless of the independence of the affected data. In contrast, the Stream Control Transmission Protocol (SCTP) mitigates this by supporting multiple independent streams within a single association, allowing delivery of data from unaffected streams while a head packet on one stream is delayed. Application-layer protocols such as HTTP/1.1 suffer from HOL blocking during request pipelining, where a slow or stalled response on a shared connection blocks subsequent responses, prompting clients to open multiple parallel connections as a workaround. HTTP/2 addresses this through stream multiplexing over a single TCP connection, enabling interleaved, independent processing of multiple requests and responses to avoid blocking across streams. Similarly, the QUIC transport protocol, built on UDP, further reduces HOL blocking by isolating loss effects to specific streams within a multiplexed connection, using independent per-stream ordering and acknowledgments to ensure only affected data on that stream awaits retransmission.

Fundamentals

Definition and Causes

Head-of-line (HOL) blocking occurs when a packet or request positioned at the front of a queue cannot be immediately processed or forwarded due to contention or unavailability at its intended destination, thereby delaying all subsequent items in the queue despite those items potentially being ready for service by available downstream resources. The primary causes of HOL blocking stem from the strict first-in, first-out (FIFO) queuing discipline commonly employed in packet-switched systems, which enforces sequential service regardless of individual packet destinations or resource states. This discipline leads to inefficiencies when the head packet is stalled by a busy output port or link, as seen in input-queued switch architectures where a single shared FIFO queue per input exacerbates the issue. Additional triggers include inherent dependencies in ordered data streams that mandate in-sequence delivery and resource-contention scenarios, such as when the leading item monopolizes shared buffers or transmission links, preventing access for trailing packets that could otherwise proceed. A basic illustration of HOL blocking appears in a router's input queue operating under FIFO: if the foremost packet is destined for a congested output port, it holds up packets immediately behind it that are targeted at unoccupied output ports, even though those latter packets could be transmitted without delay. This effect limits overall system throughput, as subsequent packets remain idle despite available capacity elsewhere. In queuing theory, such blocking contributes to increased average waiting times, consistent with Little's law relating queue length to arrival rate and delay.
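
The router illustration above can be sketched in a few lines of Python (a minimal model with a hypothetical `forward_fifo` helper, assuming one forwarding opportunity per time slot): under strict FIFO, a free output sits idle simply because the head packet targets a busy one.

```python
from collections import deque

def forward_fifo(queue, busy_ports):
    """Forward at most one packet per time slot under strict FIFO:
    if the head packet's output port is busy, nothing is sent,
    even when a later packet targets a free port."""
    if queue and queue[0] not in busy_ports:
        return queue.popleft()
    return None  # head is blocked, so everything behind it waits

# Input queue: head targets port 0 (congested), next targets port 1 (idle).
q = deque([0, 1])
sent = forward_fifo(q, busy_ports={0})
print(sent)      # None: the packet for idle port 1 is HOL-blocked
print(list(q))   # [0, 1]: both packets remain queued
```

Reversing the queue order (`deque([1, 0])`) lets the packet for port 1 depart immediately, which is exactly the reordering freedom that FIFO denies.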

Mechanisms in Queuing Systems

Head-of-line (HOL) blocking prominently affects input queuing systems in packet switches, where each input port maintains a single FIFO queue for all outgoing packets regardless of their destination output port. In such architectures, a packet at the head of the queue destined for a congested output port can block subsequent packets in the same queue that are destined for available output ports, preventing them from being forwarded even though the switch fabric could otherwise support their transmission. This phenomenon arises because the switch can typically transfer only one packet per input per time slot, leading to inefficient utilization of the internal fabric. In contrast, output queuing places dedicated queues at each output port, eliminating HOL blocking within the queuing discipline since packets are buffered based on their destination, allowing independent processing without intra-queue dependencies. Virtual output queues (VOQs) represent a mitigation strategy in input queuing designs, where each input port maintains separate queues for packets destined to each possible output port, thereby preventing the classic HOL blocking across different destinations from the same input. With VOQs, a blocked head packet in one output-specific queue does not impede packets in other VOQs at the same input, enabling higher fabric utilization. However, VOQs do not fully eliminate vulnerability to HOL blocking, as contention among multiple inputs for the same output can still cause delays if scheduling algorithms fail to resolve conflicts efficiently, potentially leading to suboptimal matching in crossbar fabrics. The propagation of HOL blocking occurs as the head packet monopolizes critical resources, such as buffer space or link bandwidth, thereby starving tail packets that could otherwise proceed immediately. In a FIFO queue, this manifests as the head packet's service time extending the wait for all subsequent packets, regardless of their individual service requirements or destination availability.
The severity of this blocking can be assessed by the proportion of a packet's total queuing time spent idle due to the head packet's delay, highlighting the inefficiency introduced by strict ordering. For instance, in a simple single-server FIFO queue, the delay experienced by the nth arriving packet includes the cumulative service times of all preceding n-1 packets, even if the nth packet's service could be expedited under a different discipline. To illustrate this mathematically, consider an M/M/1 model as a basic abstraction for a queuing system prone to HOL effects, where arrivals follow a Poisson process with rate λ and service times are exponentially distributed with rate μ > λ. The average queuing delay W for a packet is given by the sum of the residual service time of the packet in service upon arrival and the service times of packets already queued ahead, yielding:

W = \frac{\lambda}{\mu(\mu - \lambda)}

This formula captures how the head packet's service contributes to the wait of all followers, amplifying delays as the load ρ = λ/μ approaches 1, where HOL-like dependencies dominate the system's behavior. In this model, the blocking effect is inherent to the single-queue discipline, mirroring the resource starvation in more complex switch queues. In multi-stage queuing systems, such as those employing wormhole routing in interconnection networks, HOL blocking can cascade across stages, creating dependency chains that propagate delays backward and amplify congestion. Under wormhole routing, packets are transmitted as continuous streams of flits, and if the head flit of a packet is blocked at an intermediate stage due to contention for a downstream link, the entire packet, including its tail flits, remains stalled, occupying buffers and channels across multiple stages.
This leads to higher-order HOL blocking, where a single blockage at one stage induces cascading effects in upstream queues, particularly under nonuniform traffic patterns, resulting in severe throughput degradation and potential network saturation. Such chains exacerbate the initial blocking by creating feedback loops that affect unrelated flows sharing the path.
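
The M/M/1 waiting-time formula above is easy to evaluate numerically; the following sketch (plain Python, no external dependencies) shows how W grows without bound as utilization approaches 1:

```python
def mm1_wait(lam, mu):
    """Average queuing delay W (time spent waiting before service) in an
    M/M/1 queue: W = lambda / (mu * (mu - lambda)), valid for lam < mu."""
    if lam >= mu:
        raise ValueError("queue is unstable when lambda >= mu")
    return lam / (mu * (mu - lam))

# Delay explodes as utilization rho = lam/mu approaches 1.
for lam in (0.5, 0.9, 0.99):
    print(f"rho={lam:.2f}  W={mm1_wait(lam, 1.0):.2f}")
# rho=0.50  W=1.00
# rho=0.90  W=9.00
# rho=0.99  W=99.00
```

With the service rate normalized to μ = 1, W is expressed in multiples of the mean service time, making the head packet's contribution to every follower's wait directly visible.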

Impacts in Networking

Performance Degradation

Head-of-line (HOL) blocking significantly reduces link utilization in networking systems, particularly in input-queued switches employing FIFO queues. Under uniform random traffic assumptions, HOL blocking limits the maximum achievable throughput to approximately 58.6% of the physical link capacity, resulting in a loss of up to roughly 41% in efficiency due to packets being stalled behind those destined for busy outputs. HOL blocking also amplifies latency, particularly tail latency, where 99th-percentile delays spike dramatically as bursty head packets hold up queues. In large-scale networks, this effect exacerbates congestion by promoting congestion spreading, where localized bottlenecks affect unrelated flows, contributing to bufferbloat, the excessive queuing that inflates end-to-end delays across the system. Techniques to mitigate HOL blocking, such as virtual output queuing, are essential to curb these tail latency amplifications in distributed environments.
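
The 58.6% figure is the closed-form saturation throughput 2 − √2 derived for FIFO input queuing under uniform traffic (Karol, Hluchyj, and Samadi's input-versus-output queuing analysis), which a short computation confirms:

```python
import math

# Saturation throughput of a FIFO input-queued switch under uniform
# traffic, as N grows large: 2 - sqrt(2).
max_throughput = 2 - math.sqrt(2)
print(f"{max_throughput:.4f}")                    # 0.5858, i.e. ~58.6%
print(f"efficiency loss: {1 - max_throughput:.1%}")  # 41.4%
```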

Out-of-Order Delivery Effects

In systems that mandate in-order delivery, such as those employing sequence numbers for reassembly, head-of-line (HOL) blocking exacerbates delays when packets arrive out of order. A blocked head packet forces subsequent out-of-order packets to remain queued in resequencing buffers until the missing head arrives, thereby amplifying the overall head-of-line wait time for the entire stream. This interaction arises because reordering, often due to variable path latencies, prevents immediate delivery, turning a simple HOL delay into a compounded stall that affects all pending packets behind the gap. The presence of HOL blocking further strains resequencing buffers, as out-of-order packets accumulate while awaiting the head, potentially leading to overflow if the fixed buffer capacity is exceeded. In such scenarios, excess packets may be dropped, necessitating retransmissions that worsen congestion and delay. This overflow risk is particularly acute in bandwidth-asymmetric environments, where faster paths deliver packets more rapidly than slower ones, filling buffers disproportionately and stalling the entire flow until reassembly can proceed. A representative example occurs in multipath routing protocols, where traffic is distributed across multiple heterogeneous paths to exploit available bandwidth. If the head packet on a delayed path lags behind packets from faster paths, those subsequent out-of-order packets cannot be delivered until the sequence gap is filled, blocking the receiver's progress and increasing end-to-end latency, sometimes by hundreds of milliseconds in variable-delay networks like WiFi-LTE aggregates. The scale of this issue is quantified by the reordering buffer size requirement, which must accommodate the maximum out-of-order gap (the largest sequence number difference between consecutive in-order deliverable packets) multiplied by the average packet size to prevent drops.
This demand escalates with the frequency of HOL incidents, as persistent blocking widens gaps and necessitates larger buffer allocations to maintain reliable delivery in ordered systems.
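
The buffer-sizing rule can be sketched by replaying an arrival order and tracking how many out-of-order packets must be held while waiting for the in-order head (a minimal model with a hypothetical `required_reorder_buffer` helper; the numbers are illustrative):

```python
def required_reorder_buffer(arrivals, avg_packet_bytes):
    """Replay a sequence-number arrival order and return the peak number
    of out-of-order packets buffered while waiting for the in-order head,
    plus the byte allocation that peak implies."""
    buffered, next_expected, peak = set(), 0, 0
    for seq in arrivals:
        buffered.add(seq)
        while next_expected in buffered:   # deliver any in-order run
            buffered.remove(next_expected)
            next_expected += 1
        peak = max(peak, len(buffered))
    return peak, peak * avg_packet_bytes

# A fast path delivers packets 1-4 while packet 0 lags on a slow path:
print(required_reorder_buffer([1, 2, 3, 4, 0], 1500))  # (4, 6000)
```

Here the gap of four packets behind the missing head demands 4 × 1500 = 6000 bytes of resequencing buffer; wider gaps scale the requirement linearly.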

Applications in Network Devices

Switches and Buffers

Head-of-line (HOL) blocking manifests prominently in switch architectures that rely on input queuing, particularly shared memory switches where packets from multiple input ports are funneled into a central shared buffer for processing and forwarding. In these designs, each input port maintains a FIFO queue, but the head packet at an input can block subsequent packets destined for different outputs if the target output is congested, leading to inefficient buffer utilization across the switch fabric. This issue arises because the shared memory must arbitrate access synchronously, causing idle cycles when the HOL packet cannot proceed. Crosspoint buffered switches, which incorporate small buffers at the crosspoints of the switching fabric to enable non-blocking operation, mitigate input HOL blocking through distributed buffering. Buffer dynamics in these switches amplify HOL blocking, as the head packet from a congested input can stall packets from other inputs that have available paths to their destinations. For instance, in a scenario with multiple inputs sharing a central buffer, a single HOL packet awaiting a busy output prevents the switch from forwarding packets from underutilized inputs, resulting in widespread underutilization of the fabric and increased latency for non-blocked flows. Historical observations of HOL blocking in Ethernet switches, which often used simple input-queued fabrics, highlighted these inefficiencies, prompting the widespread adoption of virtual output queuing (VOQ) in the 2000s to segregate queues by destination and eliminate input-side HOL contention.
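
The VOQ idea can be modeled in miniature (a hypothetical `VOQInput` class, not any vendor's implementation): keeping one queue per destination lets traffic for idle outputs bypass a head packet stuck behind a congested one.

```python
from collections import defaultdict, deque

class VOQInput:
    """One input port with virtual output queues: a separate FIFO per
    destination, so a blocked head for one output cannot stall traffic
    bound for other outputs."""
    def __init__(self):
        self.voqs = defaultdict(deque)

    def enqueue(self, dest, packet):
        self.voqs[dest].append(packet)

    def dequeue_any(self, free_outputs):
        """Serve the head of any VOQ whose output is currently free."""
        for dest in free_outputs:
            if self.voqs[dest]:
                return dest, self.voqs[dest].popleft()
        return None

port = VOQInput()
port.enqueue(0, "A")   # output 0 is congested
port.enqueue(1, "B")   # output 1 is idle
print(port.dequeue_any(free_outputs=[1]))  # (1, 'B'): B departs while A waits
```

With a single shared FIFO, packet "B" would have been stuck behind "A"; with VOQs it departs immediately, which is exactly the input-side HOL contention VOQ adoption eliminated.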

Routers and Forwarding Planes

In input-queued routers, head-of-line (HOL) blocking occurs when the packet at the front of an input queue is delayed, often due to contention for the switch fabric or output port, preventing subsequent packets in the same queue from advancing, even if their destinations are available. This phenomenon is exacerbated during longest-prefix match (LPM) operations, where the head packet's destination address lookup consumes processing resources on the line card, serializing access for trailing packets and reducing overall throughput. Under uniform traffic patterns, HOL blocking limits the maximum throughput of input-queued switches to approximately 58.6% of capacity without mitigations like virtual output queuing (VOQ). Output-queued routers address HOL blocking by placing buffers directly at the output ports, allowing multiple input ports to forward packets to a given output without input interference. However, this design necessitates an internal speedup factor, typically equal to the number of input ports, to prevent bottlenecks when multiple packets arrive simultaneously for the same output, ensuring full throughput utilization. Such architectures, while effective, increase hardware complexity and cost, particularly in high-port-count systems. In the forwarding plane, HOL blocking impacts packet processing pipelines where the head packet's route computation or quality-of-service (QoS) classification delays the entire line card queue, as lookups and modifications (e.g., TTL decrement or header checksumming) are often serialized per packet. For instance, in routers employing MPLS, delays in label imposition or swapping for the head packet can propagate blocking to subsequent packets sharing the queue during transit forwarding. Modern routers mitigate these effects through VOQ implementations in hardware, which dedicate virtual queues per output to isolate traffic and eliminate cross-port HOL interference in the switch fabric.
In software-defined networking (SDN) routers emerging post-2010, centralized control planes handle route computation, but distributed data planes in programmable switches continue to encounter HOL blocking within input queues during high-speed forwarding. Techniques such as per-flow queuing in programmable data planes help alleviate this by parallelizing processing and reducing serialization, though residual HOL risks persist in congested scenarios without sufficient queue granularity.

Applications in Transport Protocols

Reliable Byte Streams

Reliable byte stream protocols, such as TCP, abstract application data as a continuous, ordered sequence of bytes, ensuring reliable delivery through mechanisms like sequence numbering and retransmissions. This semantic requires the receiver to reassemble segments in strict order before passing data to the application, leading to head-of-line (HOL) blocking when a lost or delayed segment at the front of the stream prevents delivery of subsequent, correctly received segments. In TCP, acknowledgments (ACKs) are cumulative, confirming all bytes up to a specific sequence number, while the sender's congestion window limits outstanding unacknowledged data. A loss of the head-of-line segment causes the sender's window to stall, as further transmissions are blocked until the missing segment is retransmitted and acknowledged, even if later segments arrive out of order and are buffered at the receiver. This exacerbates HOL blocking across the connection, halting progress regardless of path diversity or subsequent packet success. For instance, in TCP Reno implementations prevalent in the 1990s, a single packet loss could impose delays of up to 400 ms at RTTs of 100-300 ms and low loss rates (≤2%), depending on whether loss was detected via duplicate ACKs or timeouts, as observed in studies of TCP performance for delay-sensitive applications. The additional stream delay due to HOL blocking in such protocols can be approximated as

\text{RTT} \times \left(1 + \frac{\text{loss\_rate}}{1 - \text{loss\_rate}}\right)

under cumulative ACK semantics, reflecting the geometric expectation of retransmission attempts for the head segment before successful acknowledgment.
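
The approximation can be evaluated directly (a hypothetical `hol_extra_delay` helper; the inputs are representative, not measured values):

```python
def hol_extra_delay(rtt, loss_rate):
    """Approximate extra stream delay under cumulative-ACK semantics:
    RTT * (1 + p / (1 - p)), the geometric expectation of retransmission
    rounds needed before the head segment is finally acknowledged."""
    return rtt * (1 + loss_rate / (1 - loss_rate))

# At a 100 ms RTT with 2% loss, the head segment adds about 102 ms of
# delay to the whole stream:
print(round(hol_extra_delay(rtt=0.100, loss_rate=0.02), 4))  # 0.1020
```

As the loss rate grows, the p/(1 − p) term dominates, which is why HOL stalls become severe on lossy paths even when most segments arrive intact.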

Connection-Oriented Protocols

In connection-oriented protocols, head-of-line (HOL) blocking arises from the need to maintain ordered delivery and manage shared resources across streams or paths, leading to stalls that propagate beyond individual flows. While these protocols establish persistent associations to ensure reliability, mechanisms like sequence numbering and congestion control can inadvertently stall progress when one component experiences loss or delay. This issue is particularly pronounced in multi-stream and multi-path extensions, where independence between flows is incomplete. In the Transmission Control Protocol (TCP), HOL blocking manifests during connection setup when a SYN packet is lost or delayed, preventing the three-way handshake from completing and thereby blocking subsequent data transmission on that endpoint until retransmission succeeds or times out. This delay affects the entire connection, as no application data can be exchanged until the association is established, potentially queuing multiple pending handshakes from the same source if resources are constrained. The Stream Control Transmission Protocol (SCTP), defined in RFC 4960, introduces multi-streaming and multi-homing to mitigate TCP's single-stream limitations, allowing multiple independent streams within an association and across multiple paths for fault tolerance. Congestion control is applied per path, but delays or losses on one path, such as during multi-homed failover from a primary to a secondary address, can indirectly cause HOL blocking that affects delivery across streams by requiring path switching and potentially limiting overall throughput during the transition. Within each stream, ordered delivery requires reassembly of out-of-order chunks, holding subsequent chunks until gaps are filled, though unordered delivery (via the U-bit flag) avoids this per stream. The effective blocking in multi-homed SCTP associations can thus scale with the maximum path delay, as shared control mechanisms propagate slowdowns from the slowest component.
Multipath TCP (MPTCP), as specified in RFC 8684, extends TCP to aggregate bandwidth across multiple subflows while preserving a single ordered byte stream at the connection level. HOL blocking in one subflow, due to loss or higher latency, propagates to the main connection because data from all subflows must be reassembled in sequence using 64-bit data sequence numbers (DSNs), stalling delivery of subsequent data until the missing segment is retransmitted, potentially on a different subflow. Receivers mitigate this with buffers sized to the maximum round-trip time (RTT) multiplied by the aggregate bandwidth, but persistent subflow issues can still degrade overall throughput by forcing reliance on slower paths. QUIC, standardized in RFC 9000 (developed post-2012), addresses these challenges through stream multiplexing over UDP, enabling multiple independent streams per connection with per-stream flow control and no inter-stream ordering dependencies. This design avoids HOL blocking across streams, as a lost packet impacts only the streams carrying data within it, allowing others to proceed without delay. However, within a single stream, losses still require reassembly before delivery, and if multiple streams share a lost packet, their progress is jointly blocked until retransmission, providing only a partial fix compared to fully isolated flows.
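
The contrast between one shared ordered stream and QUIC-style independent streams can be sketched with a contiguous-delivery helper (a hypothetical `deliverable` function; integer offsets stand in for bytes):

```python
def deliverable(received):
    """How much an ordered stream can hand to the application: the
    length of the contiguous run of offsets starting at 0.  One missing
    offset stalls everything after it."""
    n = 0
    while n in received:
        n += 1
    return n

# Single multiplexed ordered stream (MPTCP/TCP-like): a hole at offset 2
# blocks offsets 3-5 even though they have arrived.
print(deliverable({0, 1, 3, 4, 5}))   # 2

# QUIC-like independent streams: the loss only touches stream "a";
# stream "b" keeps delivering in full.
streams = {"a": {0, 1, 3}, "b": {0, 1, 2}}
print({s: deliverable(offs) for s, offs in streams.items()})  # {'a': 2, 'b': 3}
```

The same missing data costs the single-stream model all later arrivals, while the per-stream model confines the stall to the stream that actually lost data.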

Applications in Web Protocols

HTTP/1.x Limitations

In HTTP/1.0 and HTTP/1.1, multiple requests over a single connection follow a strictly sequential request-response model, which inherently introduces head-of-line (HOL) blocking. When a client issues requests for webpage resources such as HTML, CSS, images, and scripts, each response must be fully received before the next can be processed on the same connection, as the protocol lacks true multiplexing. A slow or delayed response, such as for a large image or a server-intensive CSS file, blocks subsequent resources, even if they are ready to transmit, leading to unnecessary latency in resource delivery. To mitigate this, HTTP/1.1 introduced pipelining in RFC 2616 (1999), allowing clients to send multiple requests without waiting for prior responses over a persistent connection. However, pipelining requires servers to deliver responses in the exact order of requests, amplifying HOL blocking when response times vary due to network conditions or server processing. This issue, combined with inconsistent server implementations, intermediary bugs, and resource exhaustion risks, led to widespread non-adoption by browsers; major browsers eventually disabled pipelining by default, favoring domain sharding and multiple connections instead. A practical example occurs when loading a modern webpage with around 10 resources, where a delayed CSS file at the head of the queue can block JavaScript execution and rendering, adding 200-500 milliseconds of latency depending on network variability and resource sizes. In such cases, the HOL delay quantifies as the cumulative response times of blocked requests minus the time achievable through parallel connections, effectively turning a potentially concurrent fetch into a serialized wait. This metric highlights how HOL blocking undermines the performance gains from persistent connections, often resulting in stalled page loads until the slowest response resolves.
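
The serialized-wait arithmetic can be sketched as follows (a hypothetical `serialized_load` helper; the timings are illustrative, not measured):

```python
def serialized_load(response_times):
    """HTTP/1.1 pipelining: responses return in request order, so each
    resource's finish time includes everything queued ahead of it."""
    finish, t = [], 0.0
    for rt in response_times:
        t += rt
        finish.append(t)
    return finish

# A slow CSS file (400 ms) at the head of the line delays three fast
# 50 ms resources behind it:
times = [0.400, 0.050, 0.050, 0.050]
print([round(f, 3) for f in serialized_load(times)])  # [0.4, 0.45, 0.5, 0.55]
print(max(times))  # 0.4: the ideal finish time with fully parallel fetches
```

The gap between the serialized finish times and `max(times)` is precisely the HOL penalty that parallel connections (and later, multiplexing) were meant to eliminate.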

HTTP/2 and HTTP/3 Improvements

HTTP/2, as defined in RFC 7540 published in May 2015, introduces a binary framing layer that breaks HTTP messages into independent frames, enabling multiplexing over a single TCP connection. This design allows multiple request-response streams to interleave and process concurrently, each identified by a unique stream ID, thereby eliminating head-of-line (HOL) blocking at the HTTP protocol level that plagued earlier versions like HTTP/1.x. However, since HTTP/2 still relies on TCP for transport, it inherits TCP's HOL blocking, where a single lost packet can delay delivery across all multiplexed streams until retransmitted. In practice, HTTP/2's multiplexing enables efficient parallel loading of resources; for instance, a web page requiring simultaneous fetches of images, stylesheets, and JavaScript can process these streams independently over one connection, avoiding the sequential delays and multiple TCP handshakes of HTTP/1.x. Laboratory benchmarks demonstrate that HTTP/2 reduces page load times by 20-50% compared to HTTP/1.1, primarily through decreased connection overhead and better resource utilization. HTTP/3, standardized in RFC 9114 in June 2022, addresses HTTP/2's remaining limitations by mapping HTTP semantics over QUIC, a UDP-based transport protocol. QUIC incorporates TLS 1.3 encryption, congestion control, and per-stream reliable delivery, allowing independent loss recovery for each stream without impacting others, thus mitigating transport-level HOL blocking. This decoupling ensures that packet loss or reordering on one stream, common in mobile or lossy networks, does not stall unrelated streams, enhancing overall protocol resilience. For example, in HTTP/3, a webpage with multiplexed resources can continue delivering content from unaffected streams even if packets for a specific image are lost, preventing the widespread stalls that would occur in HTTP/2 over TCP.
Deployments in mobile environments, such as ride-sharing applications, report additional performance gains of 10-30% in tail-end latencies over HTTP/2, particularly under cellular conditions with variable connectivity. As of 2024, HTTP/3 accounted for roughly 20.5% of global web requests, according to web-traffic measurement data, indicating growing adoption.

Mitigation Techniques

General Strategies

Detection of head-of-line (HOL) blocking typically involves queue performance metrics, such as the ratio of wait times for packets at the head of the queue compared to those deeper in the queue, which highlights delays caused by a stalled head packet affecting subsequent ones. In practice, tools like the Linux traffic control (tc) utility enable the collection of queue statistics, including backlog sizes and packet drop rates, to identify patterns indicative of HOL conditions where queues build up without proportional drops. To reduce HOL blocking, scheduling algorithms such as priority queuing (PQ) allow higher-priority packets to preempt or bypass blocked lower-priority heads, ensuring critical traffic progresses despite congestion at the front. Weighted fair queuing (WFQ) extends this by apportioning bandwidth proportionally among flows based on weights, preventing a single blocked flow from indefinitely delaying others. Additionally, buffer slicing per flow, often implemented as per-flow or per-destination queuing, allocates dedicated buffer space to individual flows or destinations, confining HOL effects to within a specific flow rather than the entire queue. Hardware implementations in application-specific integrated circuits (ASICs) for switches commonly employ virtual output queues (VOQs), where each input maintains separate queues for every possible output port, thereby eliminating HOL blocking across different output destinations. On the software side, extended Berkeley Packet Filter (eBPF) programs facilitate dynamic packet reordering within the kernel networking stack, allowing real-time adjustments to queue order and reducing HOL delays in software-defined environments. A representative example is Cisco's deficit round robin (DRR) scheduler, which approximates WFQ by cycling through queues with a deficit counter to track service credits, thereby mitigating the unfair service that exacerbates HOL blocking in shared queues.
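
A simplified single round of the DRR idea (a hypothetical `drr_round` helper, not Cisco's actual implementation) shows how deficit counters keep a queue of large packets from starving a queue of small ones:

```python
from collections import deque

def drr_round(queues, quanta, deficits):
    """One deficit round robin pass: each backlogged queue earns its
    quantum, then sends head packets while the deficit covers their
    size, so no single queue can monopolize service."""
    sent = []
    for name, q in queues.items():
        if not q:
            deficits[name] = 0        # idle queues forfeit accumulated credit
            continue
        deficits[name] += quanta[name]
        while q and q[0] <= deficits[name]:
            size = q.popleft()
            deficits[name] -= size
            sent.append((name, size))
    return sent

queues = {"bulk": deque([1500, 1500]), "voice": deque([200, 200])}
deficits = {"bulk": 0, "voice": 0}
print(drr_round(queues, {"bulk": 500, "voice": 500}, deficits))
# [('voice', 200), ('voice', 200)]: voice drains both small packets this
# round, while bulk must accumulate credit across rounds for its 1500-byte head
```

Over successive rounds the bulk queue's deficit grows by its quantum until a 1500-byte packet fits, so long-term service still tracks the configured weights.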

Protocol Evolutions

The evolution of transport protocols has addressed head-of-line (HOL) blocking primarily through the introduction of multi-streaming capabilities, departing from TCP's strict in-order delivery requirement that causes a single lost packet to stall the entire connection. The Stream Control Transmission Protocol (SCTP), standardized in RFC 2960 in 2000, was designed for transporting telephony signaling messages over IP while supporting multiple independent streams within a single association, allowing data from different streams to be delivered without blocking others even if packets are lost or reordered. This multi-streaming feature decouples stream-specific ordering from overall transmission, preventing the HOL delays inherent in TCP, which is particularly beneficial in error-prone networks carrying SS7 signaling. Similarly, QUIC, defined in RFC 9000 in 2021, builds on UDP to enable independent streams multiplexed over a single connection, where packet loss affects only the streams carrying data in that packet, explicitly avoiding TCP-like HOL blocking across streams. At the application layer, protocol shifts have leveraged these transport advancements to minimize HOL impacts in service-oriented architectures. gRPC, built over HTTP/2, employs multiplexing to handle multiple remote procedure calls (RPCs) as independent streams on a single connection, avoiding the sequential request-response HOL issues common in RESTful HTTP/1.1 designs where a delayed response blocks subsequent ones. WebSockets, introduced in RFC 6455 in 2011, establish persistent, full-duplex connections for bidirectional messaging, reducing the HOL risks from repeated connection setups in traditional HTTP polling and enabling continuous data flow without per-message handshakes, though still subject to the underlying TCP stream's HOL blocking. Awareness of TCP's HOL blocking emerged in the 1990s alongside the protocol's widespread adoption for web traffic, prompting research into alternatives; this led to multiplexing standards in the 2010s, such as HTTP/2 (RFC 7540, 2015), and culminated in QUIC's standardization.
By 2025, QUIC has achieved widespread adoption, driven by its integration into HTTP/3, with measurements reporting it carrying a large majority of repeat fetches in some client populations and a substantial minority in others. These evolutions have significantly reduced HOL-induced delays compared to TCP in high-loss environments, enabling faster page loads and lower latency in mobile and satellite networks.

References

  1. [1]
    RFC 2285 - Benchmarking Terminology for LAN Switching Devices
    RFC 2285 Benchmarking Terminology February 1998 ; 3.7.3 Head of line blocking ...
  2. [2]
  3. [3]
  4. [4]
    RFC 2960 - Stream Control Transmission Protocol - IETF Datatracker
    In both of these cases the head-of-line blocking offered by TCP causes unnecessary delay. -- The stream-oriented nature of TCP is often an inconvenience ...
  5. [5]
  6. [6]
  7. [7]
    [PDF] The iSLIP scheduling algorithm for input-queued switches
    There is a popular perception that input- queued switches suffer from inherently low performance due to head-of-line (HOL) blocking. HOL blocking arises when.
  8. [8]
    [PDF] Avoiding Head of Line Blocking in Directional Antenna
    Thus, existing implementations which use a single FIFO queue potentially leads to Head of Line blocking if the medium is busy in the direction of the packet at ...
  9. [9]
  10. [10]
    [PDF] Input Versus Output Queueing on a Space-Division Pack& Switch
    Abstract-Two simple models of queueing on an N X N space-division packet switch are examined. The switch operates synchronously with.
  11. [11]
    [PDF] An Optimal Solution to Head-of-Line Blocking - Microsoft
    Al- though this seems high, it is actually correct since a single packet loss causes a large number of packets to be delayed due to head-of-line blocking.
  12. [12]
    [PDF] Revisiting Congestion Control for Lossless Ethernet - USENIX
    Apr 18, 2024 · Testbed and large-scale simulations demonstrate that ACC ameliorates fundamental issues in lossless Ethernet. (e.g., congestion spreading, HoL ...
  13. [13]
    [PDF] The tail at scale - Luiz André Barroso
    We explore how these techniques allow sys- tem utilization to be driven higher with- out lengthening the latency tail, thus avoiding wasteful overprovisioning.
  14. [14]
    [PDF] Tackling the Challenge of Bufferbloat in Multi-Path Transport over ...
    Receive window limitation and head of line (HOL) blocking are the two main factors impact- ing performance. Both are shortly introduced here. In order to ...<|control11|><|separator|>
  15. [15]
    Low Delay Random Linear Coding and Scheduling Over Multiple ...
    Jul 30, 2015 · When the delay on one or more of the paths is variable, as is commonly the case, out of order arrivals are frequent and head of line blocking ...Missing: effects | Show results with:effects
  16. [16]
    [PDF] Input Versus Output Queueing on a Space-Division Pack& Switch
    Abstract-Two simple models of queueing on an N X N space-division packet switch are examined. The switch operates synchronously with fixed-length packets ...
  17. [17]
    [PDF] Output-Based Shared-Memory Crosspoint-Buffered Packet Switch ...
    In these switches, an input has N virtual output queues to avoid head-of-line blocking [2]. The crosspoint buffers in CICB switches can provide call splitting ...
  18. [18]
    Advanced switch memory architectures improve network performance
    Oct 28, 2010 · To get around this blocking issue, also known as head of line (HOL) blocking, chip architects include virtual output queues at every switch ...
  19. [19]
    6500 line card and 'Head of line' blocking. - Cisco Community
    Feb 5, 2007 · Head of Line (HOL) blocking uses interface buffers, not shared ones. Disabling it can cause more packet loss on the port, but moves drops to ...Missing: ratio | Show results with:ratio
  20. [20]
    Cisco Nexus 5548P Switch Architecture
    Sep 23, 2010 · The Cisco Nexus 5548P is a one-rack-unit (1RU), 1 and 10 Gigabit Ethernet and FCoE access-layer switch built to provide 960 Gbps of throughput with very low ...
  21. [21]
  22. [22]
    [PDF] 15-441 Computer Networking Lecture 14: Router Design
    Oct 16, 2010 · Head-of-Line Blocking. Problem: The packet at the front of the ... Longest-prefix match (not exact). 2. Tables are large and growing. 3 ...
  23. [23]
    Understand Virtual Output Queues | Junos OS - Juniper Networks
    VOQ architecture eliminates head-of-line blocking (HOLB) issues. On non-VOQ devices, HOLB occurs when congestion at an egress port affects a different egress ...
  24. [24]
    [PDF] Relaxing state-access constraints in stateful programmable data ...
    an event may generate head-of-line blocking, where all packets in a queue are held by the first one. The problem can be alleviated by adding more queues ...
  25. [25]
    [PDF] Institute of Communication Networks and Computer Engineering
    We analyze how the impact of head-of-line blocking can be mitigated by using several parallel TCP connections, SCTP multistreaming, or SCTP unordered mode, ...
  26. [26]
    [PDF] The Delay-Friendliness of TCP - CS@Columbia
    TCP uses a retransmission to recover the lost packet, which in turn yields head-of-line blocking delay at the receiver. The receipt of a packet loss ...
  27. [27]
    Building Blocks of TCP - High Performance Browser Networking
    This effect is known as TCP head-of-line (HOL) blocking. The delay imposed by head-of-line blocking allows our applications to avoid having to deal with packet ...
  28. [28]
  29. [29]
  30. [30]
  31. [31]
  32. [32]
  33. [33]
  34. [34]
  35. [35]
  36. [36]
    HTTP/1.X - High Performance Browser Networking (O'Reilly)
    What if the first request hangs indefinitely or simply takes a very long time to generate on the server? With HTTP/1.1, all requests behind it are blocked and ...
  37. [37]
  38. [38]
    Connection management in HTTP/1.x - MDN Web Docs
    Jul 4, 2025 · HTTP pipelining therefore brings a marginal improvement in most cases only. Pipelining is subject to the head-of-line blocking. For these ...
  39. [39]
  40. [40]
    HTTP/2 - High Performance Browser Networking (O'Reilly)
    We have eliminated head-of-line blocking from HTTP, but there is still head-of-line blocking at the TCP level (see Head-of-Line Blocking). Effects of ...
  41. [41]
    RFC 9114: HTTP/3
    The QUIC transport protocol incorporates stream multiplexing and per-stream ... head-of-line blocking can be caused by compression. This allows an encoder ...
  42. [42]
    HTTP/3 and QUIC — prioritization and head-of-line blocking
    Nov 30, 2022 · QUIC relieves HOL blocking during loss as it is stream-aware such that it knows which specific streams have been affected. In high (random) loss ...
  43. [43]
    tc(8) - Linux manual page - man7.org
    Tc is used to configure Traffic Control in the Linux kernel. Traffic Control ... It has been added for hardware that wishes to avoid head-of-line blocking.
  44. [44]
  45. [45]
    [PDF] Matching Output Queueing with a Combined Input Output Queued ...
    It is well-known that if each input maintains a single FIFO, then HOL blocking can limit the throughput to just 58.6% [5]. ... WFQ and Strict Priority queueing.
  46. [46]
    Queuing Mechanisms in Modern Switches - ipSpace.net blog
    May 27, 2014 · Virtual output queues solve the head-of-line (HoL) blocking between input ports (traffic received on one port cannot block traffic received on ...
  47. [47]
    Cisco Silicon One Q200 (Cisco Catalyst 9500X and 9600X) QoS ...
    Mar 17, 2023 · VoQ is a technique to address the HoL block phenomenon with ingress buffering, as explained above. VoQ is a technique in switch architecture ...
  48. [48]
    [PDF] Efficient Fair Queuing Using Deficit Round-Robin - Stanford University
    A number of readers have conjectured that DRR should reduce to BR when the quantum size is one bit. This is not true. The two schemes have radically ...
  49. [49]
    RFC 9000 - QUIC: A UDP-Based Multiplexed and Secure Transport
    ... head-of-line blocking across multiple streams. When a packet loss occurs, only streams with data in that packet are blocked waiting for a retransmission to ...
  50. [50]
    gRPC on HTTP/2 Engineering a Robust, High-performance Protocol
    Aug 20, 2018 · In this article, we'll look at how gRPC builds on HTTP/2's long-lived connections to create a performant, robust platform for inter-service communication.
  51. [51]
    HTTP/2 and GRPC: The De Facto for Microservices Communication
    Apr 4, 2022 · The interleaved requests and responses can run in parallel without blocking the messages behind them, a process called multiplexing.
  52. [52]
    [PDF] Does QUIC make the Web faster ? - Department of Computer Science
    We find QUIC to perform better overall under poor network conditions (low bandwidth, high latency and high loss), for e.g. more than 90% of synthetic pages ...
  53. [53]
    A QUIC progress report - APNIC Blog
    Jun 17, 2025 · Of interest in Table 2 is the 45% rate of Safari clients who do not use QUIC, as compared to the 11% rate of Chrome clients who do not use QUIC.
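References [16] and [45] cite the classic result that a single FIFO per input limits an input-queued switch's throughput to roughly 58.6% under uniform saturated traffic, because a head-of-line packet losing output arbitration stalls everything queued behind it. A minimal saturation-mode simulation can reproduce the effect; `simulate_fifo_switch` and its parameters are illustrative names, not taken from any cited source:

```python
import random

def simulate_fifo_switch(n=16, slots=20000, seed=1):
    """Saturated n x n input-queued switch with one FIFO per input.

    Every input always has a packet; destinations are uniform random.
    Each time slot, every output accepts at most one contending
    head-of-line (HOL) packet; the losers stay at the head of their
    queues, blocking independent traffic behind them.
    Returns the measured per-input throughput.
    """
    rng = random.Random(seed)
    # Destination of the current HOL packet at each input.
    hol = [rng.randrange(n) for _ in range(n)]
    delivered = 0
    for _ in range(slots):
        contenders = {}
        for i, dest in enumerate(hol):
            contenders.setdefault(dest, []).append(i)
        for dest, inputs in contenders.items():
            winner = rng.choice(inputs)      # output serves one HOL packet
            delivered += 1
            hol[winner] = rng.randrange(n)   # next queued packet reaches HOL
    return delivered / (slots * n)
```

With n = 16 the measured per-input throughput lands near 0.60, approaching the 2 − √2 ≈ 0.586 asymptote as n grows; virtual output queues (references [17]–[19], [23], [46], [47]) remove this limit by giving each input a separate queue per output.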