
Head-of-line blocking

Head-of-line blocking (HOL blocking) is a performance-limiting phenomenon in computer networking in which a delay or loss affecting the first packet or request in a queue prevents subsequent packets or requests from being processed, even if they could otherwise proceed independently. This issue arises in various queuing systems, including network switches, transport protocols, and application-layer communications, leading to increased latency and reduced throughput. In packet-switched networks, HOL blocking commonly occurs at the input ports of switches or routers when packets destined for a congested output port block packets bound for other, uncongested outputs on the same input queue, resulting in throughput loss or added delay on otherwise idle paths. This effect is particularly pronounced in switched LAN environments, where verifying the absence of HOL blocking is a key benchmarking metric for device performance. At the transport layer, protocols like TCP exacerbate HOL blocking due to their ordered delivery semantics: a lost packet requires retransmission before any subsequent data in the stream can be delivered, stalling the entire stream regardless of the independence of the affected data. In contrast, the Stream Control Transmission Protocol (SCTP) mitigates this by supporting multiple independent streams within a single association, allowing delivery of data from unaffected streams while a head packet on one stream is delayed. Application-layer protocols such as HTTP/1.1 suffer from HOL blocking during request pipelining, where a slow or stalled response on a shared connection blocks subsequent responses, prompting clients to open multiple parallel connections as a workaround. HTTP/2 addresses this through stream multiplexing over a single TCP connection, enabling interleaved, independent processing of multiple requests and responses to avoid blocking across streams. Similarly, the QUIC transport protocol, built on UDP, further reduces HOL blocking by isolating loss effects to specific streams within a multiplexed connection, using independent per-stream ordering and acknowledgments to ensure only affected data on that stream awaits retransmission.

Fundamentals

Definition and Causes

Head-of-line (HOL) blocking occurs when a packet or request positioned at the front of a queue cannot be immediately processed or forwarded due to contention or unavailability at its intended destination, thereby delaying all subsequent items in the queue despite those items potentially being ready for service by available downstream resources. The primary causes of HOL blocking stem from the strict first-in, first-out (FIFO) queuing discipline commonly employed in packet-switched systems, which enforces sequential service regardless of individual packet destinations or resource states. This discipline leads to inefficiencies when the head packet is stalled by a busy output port or link, as seen in input-queued switch architectures where a single shared FIFO queue per input exacerbates the issue. Additional triggers include inherent dependencies in ordered data streams that mandate in-sequence delivery and resource-contention scenarios, such as when the leading item monopolizes shared buffers or transmission links, preventing access for trailing packets that could otherwise proceed. A basic illustration of HOL blocking appears in a router's input queue operating under FIFO: if the foremost packet is destined for a congested output port, it holds up packets immediately behind it that are targeted at unoccupied output ports, even though those latter packets could be transmitted without delay. This effect limits overall system throughput, as subsequent packets remain idle despite available capacity elsewhere. In queuing theory, such blocking contributes to increased average waiting times, consistent with Little's law relating queue length to arrival rate and delay.
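
The router illustration above can be sketched in a few lines of Python (a minimal model with a hypothetical `forward_fifo` helper, assuming one forwarding opportunity per time slot): under strict FIFO, a free output sits idle simply because the head packet targets a busy one.

```python
from collections import deque

def forward_fifo(queue, busy_ports):
    """Forward at most one packet per time slot under strict FIFO:
    if the head packet's output port is busy, nothing is sent,
    even when a later packet targets a free port."""
    if queue and queue[0] not in busy_ports:
        return queue.popleft()
    return None  # head is blocked, so everything behind it waits

# Input queue: head targets port 0 (congested), next targets port 1 (idle).
q = deque([0, 1])
sent = forward_fifo(q, busy_ports={0})
print(sent)      # None: the packet for idle port 1 is HOL-blocked
print(list(q))   # [0, 1]: both packets remain queued
```

Reversing the queue order (`deque([1, 0])`) lets the packet for port 1 depart immediately, which is exactly the reordering freedom that FIFO denies.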

Mechanisms in Queuing Systems

Head-of-line (HOL) blocking prominently affects input queuing systems in packet switches, where each input port maintains a single FIFO queue for all outgoing packets regardless of their destination output port. In such architectures, a packet at the head of the queue destined for a congested output port can block subsequent packets in the same queue that are destined for available output ports, preventing them from being forwarded even though the switch fabric could otherwise support their transmission. This phenomenon arises because the switch can typically transfer only one packet per input per time slot, leading to inefficient utilization of the internal fabric. In contrast, output queuing places dedicated queues at each output port, eliminating HOL blocking within the queuing discipline since packets are buffered based on their destination, allowing independent processing without intra-queue dependencies. Virtual output queues (VOQs) represent a mitigation strategy in input queuing designs, where each input port maintains separate queues for packets destined to each possible output port, thereby preventing the classic HOL blocking across different destinations from the same input. With VOQs, a blocked head packet in one output-specific queue does not impede packets in other VOQs at the same input, enabling higher fabric utilization. However, VOQs do not fully eliminate vulnerability to HOL blocking, as contention among multiple inputs for the same output can still cause delays if scheduling algorithms fail to resolve conflicts efficiently, potentially leading to suboptimal matching in crossbar fabrics. The propagation of HOL blocking occurs as the head packet monopolizes critical resources, such as buffer space or link bandwidth, thereby starving tail packets that could otherwise proceed immediately. In a FIFO queue, this manifests as the head packet's service time extending the wait for all subsequent packets, regardless of their individual service requirements or destination availability.
The severity of this blocking can be assessed by the proportion of a packet's total queuing time spent idle due to the head packet's delay, highlighting the inefficiency introduced by strict ordering. For instance, in a simple single-server FIFO queue, the delay experienced by the nth arriving packet includes the cumulative service times of all preceding n-1 packets, even if the nth packet's service could be expedited under a different discipline. To illustrate this mathematically, consider an M/M/1 model as a basic abstraction for a queuing system prone to HOL effects, where arrivals follow a Poisson process with rate λ and service times are exponentially distributed with rate μ > λ. The average queuing delay W for a packet is given by the sum of the residual service time of the packet in service upon arrival and the service times of packets already queued ahead, yielding:

W = \frac{\lambda}{\mu(\mu - \lambda)}

This formula captures how the head packet's service contributes to the wait of all followers, amplifying delays as the load ρ = λ/μ approaches 1, where HOL-like dependencies dominate the system's behavior. In this model, the blocking effect is inherent to the single-queue discipline, mirroring the resource starvation in more complex switch queues. In multi-stage queuing systems, such as those employing wormhole routing in interconnection networks, HOL blocking can cascade across stages, creating dependency chains that propagate delays backward and amplify congestion. Under wormhole routing, packets are transmitted as continuous streams of flits, and if the head flit of a packet is blocked at an intermediate stage due to contention for a downstream link, the entire packet, including its tail flits, remains stalled, occupying buffers and channels across multiple stages.
This leads to higher-order HOL blocking, where a single blockage at one stage induces cascading effects in upstream queues, particularly under nonuniform traffic patterns, resulting in severe throughput degradation and potential network saturation. Such chains exacerbate the initial blocking by creating feedback loops that affect unrelated flows sharing the path.
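
The M/M/1 waiting-time formula above is easy to evaluate numerically; the following sketch (plain Python, no external dependencies) shows how W grows without bound as utilization approaches 1:

```python
def mm1_wait(lam, mu):
    """Average queuing delay W (time spent waiting before service) in an
    M/M/1 queue: W = lambda / (mu * (mu - lambda)), valid for lam < mu."""
    if lam >= mu:
        raise ValueError("queue is unstable when lambda >= mu")
    return lam / (mu * (mu - lam))

# Delay explodes as utilization rho = lam/mu approaches 1.
for lam in (0.5, 0.9, 0.99):
    print(f"rho={lam:.2f}  W={mm1_wait(lam, 1.0):.2f}")
# rho=0.50  W=1.00
# rho=0.90  W=9.00
# rho=0.99  W=99.00
```

With the service rate normalized to μ = 1, W is expressed in multiples of the mean service time, making the head packet's contribution to every follower's wait directly visible.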

Impacts in Networking

Performance Degradation

Head-of-line (HOL) blocking significantly reduces link utilization in networking systems, particularly in input-queued switches employing FIFO queues. Under uniform random traffic assumptions, HOL blocking limits the maximum achievable throughput to approximately 58.6% of the physical link capacity, resulting in a loss of up to roughly 41% in efficiency due to packets being stalled behind those destined for busy outputs. HOL blocking also amplifies latency, particularly tail latency, where 99th-percentile delays spike dramatically as bursty head packets hold up queues. In large-scale networks, this effect exacerbates congestion by promoting congestion spreading, where localized bottlenecks affect unrelated flows, contributing to bufferbloat, the excessive queuing that inflates end-to-end delays across the system. Techniques to mitigate HOL blocking, such as virtual output queuing, are essential to curb these tail latency amplifications in distributed environments.
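
The 58.6% figure is the closed-form saturation throughput 2 − √2 derived for FIFO input queuing under uniform traffic (Karol, Hluchyj, and Samadi's input-versus-output queuing analysis), which a short computation confirms:

```python
import math

# Saturation throughput of a FIFO input-queued switch under uniform
# traffic, as N grows large: 2 - sqrt(2).
max_throughput = 2 - math.sqrt(2)
print(f"{max_throughput:.4f}")                    # 0.5858, i.e. ~58.6%
print(f"efficiency loss: {1 - max_throughput:.1%}")  # 41.4%
```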

Out-of-Order Delivery Effects

In systems that mandate in-order delivery, such as those employing sequence numbers for reassembly, head-of-line (HOL) blocking exacerbates delays when packets arrive out of order. A blocked head packet forces subsequent out-of-order packets to remain queued in resequencing buffers until the missing head arrives, thereby amplifying the overall head-of-line wait time for the entire stream. This interaction arises because reordering, often due to variable path latencies, prevents immediate delivery, turning a simple HOL delay into a compounded stall that affects all pending packets behind the gap. The presence of HOL blocking further strains resequencing buffers, as out-of-order packets accumulate while awaiting the head, potentially leading to overflow if the fixed buffer capacity is exceeded. In such scenarios, excess packets may be dropped, necessitating retransmissions that worsen congestion and delay. This overflow risk is particularly acute in bandwidth-asymmetric environments, where faster paths deliver packets more rapidly than slower ones, filling buffers disproportionately and stalling the entire flow until reassembly can proceed. A representative example occurs in multipath routing protocols, where traffic is distributed across multiple heterogeneous paths to exploit available bandwidth. If the head packet on a delayed path lags behind packets from faster paths, those subsequent out-of-order packets cannot be delivered until the sequence gap is filled, blocking the receiver's progress and increasing end-to-end latency, sometimes by hundreds of milliseconds in variable-delay networks like WiFi-LTE aggregates. The scale of this issue is quantified by the reordering buffer size requirement, which must accommodate the maximum out-of-order gap (the largest sequence number difference between consecutive in-order deliverable packets) multiplied by the average packet size to prevent drops.
This demand escalates with the frequency of HOL incidents, as persistent blocking widens gaps and necessitates larger buffer allocations to maintain reliable delivery in ordered systems.
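
The buffer-sizing rule can be sketched by replaying an arrival order and tracking how many out-of-order packets must be held while waiting for the in-order head (a minimal model with a hypothetical `required_reorder_buffer` helper; the numbers are illustrative):

```python
def required_reorder_buffer(arrivals, avg_packet_bytes):
    """Replay a sequence-number arrival order and return the peak number
    of out-of-order packets buffered while waiting for the in-order head,
    plus the byte allocation that peak implies."""
    buffered, next_expected, peak = set(), 0, 0
    for seq in arrivals:
        buffered.add(seq)
        while next_expected in buffered:   # deliver any in-order run
            buffered.remove(next_expected)
            next_expected += 1
        peak = max(peak, len(buffered))
    return peak, peak * avg_packet_bytes

# A fast path delivers packets 1-4 while packet 0 lags on a slow path:
print(required_reorder_buffer([1, 2, 3, 4, 0], 1500))  # (4, 6000)
```

Here the gap of four packets behind the missing head demands 4 × 1500 = 6000 bytes of resequencing buffer; wider gaps scale the requirement linearly.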

Applications in Network Devices

Switches and Buffers

Head-of-line (HOL) blocking manifests prominently in switch architectures that rely on input queuing, particularly shared memory switches where packets from multiple input ports are funneled into a central shared buffer for processing and forwarding. In these designs, each input port maintains a FIFO queue, but the head packet at an input can block subsequent packets destined for different outputs if the target output is congested, leading to inefficient buffer utilization across the switch fabric. This issue arises because the shared memory must arbitrate access synchronously, causing idle cycles when the HOL packet cannot proceed. Crosspoint buffered switches, which incorporate small buffers at the crosspoints of the switching fabric to enable non-blocking operation, mitigate input HOL blocking through distributed buffering. Buffer dynamics in these switches amplify HOL blocking, as the head packet from a congested input can stall packets from other inputs that have available paths to their destinations. For instance, in a scenario with multiple inputs sharing a central buffer, a single HOL packet awaiting a busy output prevents the switch from forwarding packets from underutilized inputs, resulting in widespread underutilization of the fabric and increased latency for non-blocked flows. Historical observations of HOL blocking in Ethernet switches, which often used simple input-queued fabrics, highlighted these inefficiencies, prompting the widespread adoption of virtual output queuing (VOQ) in the 2000s to segregate queues by destination and eliminate input-side HOL contention.
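
The VOQ idea can be modeled in miniature (a hypothetical `VOQInput` class, not any vendor's implementation): keeping one queue per destination lets traffic for idle outputs bypass a head packet stuck behind a congested one.

```python
from collections import defaultdict, deque

class VOQInput:
    """One input port with virtual output queues: a separate FIFO per
    destination, so a blocked head for one output cannot stall traffic
    bound for other outputs."""
    def __init__(self):
        self.voqs = defaultdict(deque)

    def enqueue(self, dest, packet):
        self.voqs[dest].append(packet)

    def dequeue_any(self, free_outputs):
        """Serve the head of any VOQ whose output is currently free."""
        for dest in free_outputs:
            if self.voqs[dest]:
                return dest, self.voqs[dest].popleft()
        return None

port = VOQInput()
port.enqueue(0, "A")   # output 0 is congested
port.enqueue(1, "B")   # output 1 is idle
print(port.dequeue_any(free_outputs=[1]))  # (1, 'B'): B departs while A waits
```

With a single shared FIFO, packet "B" would have been stuck behind "A"; with VOQs it departs immediately, which is exactly the input-side HOL contention VOQ adoption eliminated.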

Routers and Forwarding Planes

In input-queued routers, head-of-line (HOL) blocking occurs when the packet at the front of an input queue is delayed, often due to contention for the switch fabric or output port, preventing subsequent packets in the same queue from advancing, even if their destinations are available. This phenomenon is exacerbated during longest-prefix match (LPM) operations, where the head packet's destination address lookup consumes processing resources on the line card, serializing access for trailing packets and reducing overall throughput. Under uniform traffic patterns, HOL blocking limits the maximum throughput of input-queued switches to approximately 58.6% of capacity without mitigations like virtual output queuing (VOQ). Output-queued routers address HOL blocking by placing buffers directly at the output ports, allowing multiple input ports to forward packets to a given output without input interference. However, this design necessitates an internal speedup factor, typically equal to the number of input ports, to prevent bottlenecks when multiple packets arrive simultaneously for the same output, ensuring full throughput utilization. Such architectures, while effective, increase hardware complexity and cost, particularly in high-port-count systems. In the forwarding plane, HOL blocking impacts packet processing pipelines where the head packet's route computation or quality-of-service (QoS) classification delays the entire line card queue, as lookups and modifications (e.g., TTL decrement or header checksumming) are often serialized per packet. For instance, in routers employing MPLS, delays in label imposition or swapping for the head packet can propagate blocking to subsequent packets sharing the queue during transit forwarding. Modern routers mitigate these effects through VOQ implementations in hardware, which dedicate virtual queues per output to isolate traffic and eliminate cross-port HOL interference in the switch fabric.
In software-defined networking (SDN) routers emerging post-2010, centralized control planes handle route computation, but distributed data planes in programmable switches continue to encounter HOL blocking within input queues during high-speed forwarding. Techniques such as per-flow queuing in programmable data planes help alleviate this by parallelizing processing and reducing serialization, though residual HOL risks persist in congested scenarios without sufficient queue granularity.

Applications in Transport Protocols

Reliable Byte Streams

Reliable byte stream protocols, such as TCP, abstract application data as a continuous, ordered sequence of bytes, ensuring reliable delivery through mechanisms like sequence numbering and retransmissions. This semantic requires the receiver to reassemble segments in strict order before passing data to the application, leading to head-of-line (HOL) blocking when a lost or delayed segment at the front of the stream prevents delivery of subsequent, correctly received segments. In TCP, acknowledgments (ACKs) are cumulative, confirming all bytes up to a specific sequence number, while the sender's congestion window limits outstanding unacknowledged data. A loss of the head-of-line segment causes the sender's window to stall, as further transmissions are blocked until the missing segment is retransmitted and acknowledged, even if later segments arrive out of order and are buffered at the receiver. This exacerbates HOL blocking across the connection, halting progress regardless of path diversity or subsequent packet success. For instance, in TCP Reno implementations prevalent in the 1990s, a single packet loss could impose delays of up to 400 ms at RTTs of 100-300 ms and low loss rates (≤2%), depending on whether loss was detected via duplicate ACKs or timeouts, as observed in studies of TCP performance for delay-sensitive applications. The additional stream delay due to HOL blocking in such protocols can be approximated as

\text{RTT} \times \left(1 + \frac{\text{loss\_rate}}{1 - \text{loss\_rate}}\right)

under cumulative ACK semantics, reflecting the geometric expectation of retransmission attempts for the head segment before successful acknowledgment.
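
The approximation can be evaluated directly (a hypothetical `hol_extra_delay` helper; the inputs are representative, not measured values):

```python
def hol_extra_delay(rtt, loss_rate):
    """Approximate extra stream delay under cumulative-ACK semantics:
    RTT * (1 + p / (1 - p)), the geometric expectation of retransmission
    rounds needed before the head segment is finally acknowledged."""
    return rtt * (1 + loss_rate / (1 - loss_rate))

# At a 100 ms RTT with 2% loss, the head segment adds about 102 ms of
# delay to the whole stream:
print(round(hol_extra_delay(rtt=0.100, loss_rate=0.02), 4))  # 0.1020
```

As the loss rate grows, the p/(1 − p) term dominates, which is why HOL stalls become severe on lossy paths even when most segments arrive intact.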

Connection-Oriented Protocols

In connection-oriented protocols, head-of-line (HOL) blocking arises from the need to maintain ordered delivery and manage shared resources across streams or paths, leading to stalls that propagate beyond individual flows. While these protocols establish persistent associations to ensure reliability, mechanisms like sequence numbering and congestion control can inadvertently stall progress when one component experiences loss or delay. This issue is particularly pronounced in multi-stream and multi-path extensions, where independence between flows is incomplete. In the Transmission Control Protocol (TCP), HOL blocking manifests during connection setup when a SYN packet is lost or delayed, preventing the three-way handshake from completing and thereby blocking subsequent data transmission on that endpoint until retransmission succeeds or times out. This delay affects the entire connection, as no application data can be exchanged until the association is established, potentially queuing multiple pending handshakes from the same source if resources are constrained. The Stream Control Transmission Protocol (SCTP), defined in RFC 4960, introduces multi-streaming and multi-homing to mitigate TCP's single-stream limitations, allowing multiple independent streams within an association and across multiple paths for fault tolerance. Congestion control is applied per path, but delays or losses on one path, such as during multi-homed failover from a primary to a secondary address, can indirectly cause HOL blocking that affects delivery across streams by requiring path switching and potentially limiting overall throughput during the transition. Within each stream, ordered delivery requires reassembly of out-of-order chunks, holding subsequent chunks until gaps are filled, though unordered delivery (via the U-bit flag) avoids this per stream. The effective blocking in multi-homed SCTP associations can thus scale with the maximum path delay, as shared control mechanisms propagate slowdowns from the slowest component.
Multipath TCP (MPTCP), as specified in RFC 8684, extends TCP to aggregate bandwidth across multiple subflows while preserving a single ordered byte stream at the connection level. HOL blocking in one subflow, due to loss or higher latency, propagates to the main connection because data from all subflows must be reassembled in sequence using 64-bit data sequence numbers (DSNs), stalling delivery of subsequent data until the missing segment is retransmitted, potentially on a different subflow. Receivers mitigate this with buffers sized to the maximum round-trip time (RTT) multiplied by the aggregate bandwidth, but persistent subflow issues can still degrade overall throughput by forcing reliance on slower paths. QUIC, standardized in RFC 9000 (developed post-2012), addresses these challenges through stream multiplexing over UDP, enabling multiple independent streams per connection with per-stream flow control and no inter-stream ordering dependencies. This design avoids HOL blocking across streams, as a lost packet impacts only the streams carrying data within it, allowing others to proceed without delay. However, within a single stream, losses still require reassembly before delivery, and if multiple streams share a lost packet, their progress is jointly blocked until retransmission, providing only a partial fix compared to fully isolated flows.
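
The contrast between one shared ordered stream and QUIC-style independent streams can be sketched with a contiguous-delivery helper (a hypothetical `deliverable` function; integer offsets stand in for bytes):

```python
def deliverable(received):
    """How much an ordered stream can hand to the application: the
    length of the contiguous run of offsets starting at 0.  One missing
    offset stalls everything after it."""
    n = 0
    while n in received:
        n += 1
    return n

# Single multiplexed ordered stream (MPTCP/TCP-like): a hole at offset 2
# blocks offsets 3-5 even though they have arrived.
print(deliverable({0, 1, 3, 4, 5}))   # 2

# QUIC-like independent streams: the loss only touches stream "a";
# stream "b" keeps delivering in full.
streams = {"a": {0, 1, 3}, "b": {0, 1, 2}}
print({s: deliverable(offs) for s, offs in streams.items()})  # {'a': 2, 'b': 3}
```

The same missing data costs the single-stream model all later arrivals, while the per-stream model confines the stall to the stream that actually lost data.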

Applications in Web Protocols

HTTP/1.x Limitations

In HTTP/1.0 and HTTP/1.1, multiple requests over a single connection follow a strictly sequential request-response model, which inherently introduces head-of-line (HOL) blocking. When a client issues requests for webpage resources such as HTML, CSS, images, and scripts, each response must be fully received before the next can be processed on the same connection, as the protocol lacks true multiplexing. A slow or delayed response, such as for a large image or a server-intensive CSS file, blocks subsequent resources, even if they are ready to transmit, leading to unnecessary latency in resource delivery. To mitigate this, HTTP/1.1 introduced pipelining in RFC 2616 (1999), allowing clients to send multiple requests without waiting for prior responses over a persistent connection. However, pipelining requires servers to deliver responses in the exact order of requests, amplifying HOL blocking when response times vary due to network conditions or server processing. This issue, combined with inconsistent server implementations, intermediary bugs, and resource exhaustion risks, led to widespread non-adoption by browsers; major browsers eventually disabled pipelining by default, favoring domain sharding and multiple connections instead. A practical example occurs when loading a modern webpage with around 10 resources, where a delayed CSS file at the head of the queue can block JavaScript execution and rendering, adding 200-500 milliseconds of latency depending on network variability and resource sizes. In such cases, the HOL delay quantifies as the cumulative response times of blocked requests minus the time achievable through parallel connections, effectively turning a potentially concurrent fetch into a serialized wait. This metric highlights how HOL blocking undermines the performance gains from persistent connections, often resulting in stalled page loads until the slowest response resolves.
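
The serialized-wait arithmetic can be sketched as follows (a hypothetical `serialized_load` helper; the timings are illustrative, not measured):

```python
def serialized_load(response_times):
    """HTTP/1.1 pipelining: responses return in request order, so each
    resource's finish time includes everything queued ahead of it."""
    finish, t = [], 0.0
    for rt in response_times:
        t += rt
        finish.append(t)
    return finish

# A slow CSS file (400 ms) at the head of the line delays three fast
# 50 ms resources behind it:
times = [0.400, 0.050, 0.050, 0.050]
print([round(f, 3) for f in serialized_load(times)])  # [0.4, 0.45, 0.5, 0.55]
print(max(times))  # 0.4: the ideal finish time with fully parallel fetches
```

The gap between the serialized finish times and `max(times)` is precisely the HOL penalty that parallel connections (and later, multiplexing) were meant to eliminate.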

HTTP/2 and HTTP/3 Improvements

HTTP/2, as defined in RFC 7540 published in May 2015, introduces a binary framing layer that breaks HTTP messages into independent frames, enabling multiplexing over a single TCP connection. This design allows multiple request-response streams to interleave and process concurrently, each identified by a unique stream ID, thereby eliminating head-of-line (HOL) blocking at the HTTP protocol level that plagued earlier versions like HTTP/1.x. However, since HTTP/2 still relies on TCP for transport, it inherits TCP's HOL blocking, where a single lost packet can delay delivery across all multiplexed streams until retransmitted. In practice, HTTP/2's multiplexing enables efficient parallel loading of resources; for instance, a web page requiring simultaneous fetches of images, stylesheets, and JavaScript can process these streams independently over one connection, avoiding the sequential delays and multiple TCP handshakes of HTTP/1.x. Laboratory benchmarks demonstrate that HTTP/2 reduces page load times by 20-50% compared to HTTP/1.1, primarily through decreased connection overhead and better resource utilization. HTTP/3, standardized in RFC 9114 in June 2022, addresses HTTP/2's remaining limitations by mapping HTTP semantics over QUIC, a UDP-based transport protocol. QUIC incorporates TLS 1.3 encryption, congestion control, and per-stream reliable delivery, allowing independent loss recovery for each stream without impacting others, thus mitigating transport-level HOL blocking. This decoupling ensures that packet loss or reordering on one stream, common in mobile or lossy networks, does not stall unrelated streams, enhancing overall protocol resilience. For example, in HTTP/3, a webpage with multiplexed resources can continue delivering content from unaffected streams even if packets for a specific image are lost, preventing the widespread stalls that would occur in HTTP/2 over TCP.
Deployments in mobile environments, such as ride-sharing applications, report additional performance gains of 10-30% in tail-end latencies over HTTP/2, particularly under cellular conditions with variable connectivity. As of 2024, HTTP/3 accounted for roughly 20.5% of global web requests, according to web-traffic measurement data, indicating growing adoption.

Mitigation Techniques

General Strategies

Detection of head-of-line (HOL) blocking typically involves queue performance metrics, such as the ratio of wait times for packets at the head of the queue compared to those deeper in the queue, which highlights delays caused by a stalled head packet affecting subsequent ones. In practice, tools like the Linux traffic control (tc) utility enable the collection of queue statistics, including backlog sizes and packet drop rates, to identify patterns indicative of HOL conditions where queues build up without proportional drops. To reduce HOL blocking, scheduling algorithms such as priority queuing (PQ) allow higher-priority packets to preempt or bypass blocked lower-priority heads, ensuring critical traffic progresses despite congestion at the front. Weighted fair queuing (WFQ) extends this by apportioning bandwidth proportionally among flows based on weights, preventing a single blocked flow from indefinitely delaying others. Additionally, buffer slicing per flow, often implemented as per-flow or per-destination queuing, allocates dedicated buffer space to individual flows or destinations, confining HOL effects to within a specific flow rather than the entire queue. Hardware implementations in application-specific integrated circuits (ASICs) for switches commonly employ virtual output queues (VOQs), where each input maintains separate queues for every possible output port, thereby eliminating HOL blocking across different output destinations. On the software side, extended Berkeley Packet Filter (eBPF) programs facilitate dynamic packet reordering within the kernel networking stack, allowing real-time adjustments to queue order and reducing HOL delays in software-defined environments. A representative example is Cisco's deficit round robin (DRR) scheduler, which approximates WFQ by cycling through queues with a deficit counter to track service credits, thereby mitigating the unfair service that exacerbates HOL blocking in shared queues.
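
A simplified single round of the DRR idea (a hypothetical `drr_round` helper, not Cisco's actual implementation) shows how deficit counters keep a queue of large packets from starving a queue of small ones:

```python
from collections import deque

def drr_round(queues, quanta, deficits):
    """One deficit round robin pass: each backlogged queue earns its
    quantum, then sends head packets while the deficit covers their
    size, so no single queue can monopolize service."""
    sent = []
    for name, q in queues.items():
        if not q:
            deficits[name] = 0        # idle queues forfeit accumulated credit
            continue
        deficits[name] += quanta[name]
        while q and q[0] <= deficits[name]:
            size = q.popleft()
            deficits[name] -= size
            sent.append((name, size))
    return sent

queues = {"bulk": deque([1500, 1500]), "voice": deque([200, 200])}
deficits = {"bulk": 0, "voice": 0}
print(drr_round(queues, {"bulk": 500, "voice": 500}, deficits))
# [('voice', 200), ('voice', 200)]: voice drains both small packets this
# round, while bulk must accumulate credit across rounds for its 1500-byte head
```

Over successive rounds the bulk queue's deficit grows by its quantum until a 1500-byte packet fits, so long-term service still tracks the configured weights.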

Protocol Evolutions

The evolution of transport protocols has addressed head-of-line (HOL) blocking primarily through the introduction of multi-streaming capabilities, departing from TCP's strict in-order delivery requirement that causes a single lost packet to stall the entire connection. The Stream Control Transmission Protocol (SCTP), standardized in RFC 2960 in 2000, was designed for transporting telephony signaling messages over IP while supporting multiple independent streams within a single association, allowing data from different streams to be delivered without blocking others even if packets are lost or reordered. This multi-streaming feature decouples stream-specific ordering from overall transmission, preventing the HOL delays inherent in TCP, which is particularly beneficial in error-prone networks carrying SS7 signaling. Similarly, QUIC, defined in RFC 9000 in 2021, builds on UDP to enable independent streams multiplexed over a single connection, where packet loss affects only the streams carrying data in that packet, explicitly avoiding TCP-like HOL blocking across streams. At the application layer, protocol shifts have leveraged these transport advancements to minimize HOL impacts in service-oriented architectures. gRPC, built over HTTP/2, employs multiplexing to handle multiple remote procedure calls (RPCs) as independent streams on a single connection, avoiding the sequential request-response HOL issues common in RESTful HTTP/1.1 designs where a delayed response blocks subsequent ones. WebSockets, introduced in RFC 6455 in 2011, establish persistent, full-duplex connections for bidirectional messaging, reducing the HOL risks from repeated connection setups in traditional HTTP polling and enabling continuous data flow without per-message handshakes, though still subject to the underlying TCP stream's HOL blocking. Awareness of TCP's HOL blocking emerged in the 1990s alongside the protocol's widespread adoption for web traffic, prompting research into alternatives; this led to multiplexing standards in the 2010s, such as HTTP/2 (RFC 7540, 2015), and culminated in QUIC's standardization.
By 2025, QUIC has achieved widespread adoption, driven by its integration into HTTP/3, with measurements reporting it carrying a large majority of repeat fetches in some client populations and a substantial minority in others. These evolutions have significantly reduced HOL-induced delays compared to TCP in high-loss environments, enabling faster page loads and lower latency in mobile and satellite networks.

References

  1. [1]
    RFC 2285 - Benchmarking Terminology for LAN Switching Devices
    RFC 2285 Benchmarking Terminology February 1998 ; 3.7.3 Head of line blocking ...
  2. [2]
  3. [3]
  4. [4]
    RFC 2960 - Stream Control Transmission Protocol - IETF Datatracker
    In both of these cases the head-of-line blocking offered by TCP causes unnecessary delay. -- The stream-oriented nature of TCP is often an inconvenience ...
  5. [5]
  6. [6]
  7. [7]
    [PDF] The iSLIP scheduling algorithm for input-queued switches
    There is a popular perception that input- queued switches suffer from inherently low performance due to head-of-line (HOL) blocking. HOL blocking arises when.
  8. [8]
    [PDF] Avoiding Head of Line Blocking in Directional Antenna
    Thus, existing implementations which use a single FIFO queue potentially leads to Head of Line blocking if the medium is busy in the direction of the packet at ...
  9. [9]
  10. [10]
    [PDF] Input Versus Output Queueing on a Space-Division Pack& Switch
    Abstract-Two simple models of queueing on an N X N space-division packet switch are examined. The switch operates synchronously with.
  11. [11]
    [PDF] An Optimal Solution to Head-of-Line Blocking - Microsoft
    Al- though this seems high, it is actually correct since a single packet loss causes a large number of packets to be delayed due to head-of-line blocking.
  12. [12]
    [PDF] Revisiting Congestion Control for Lossless Ethernet - USENIX
    Apr 18, 2024 · Testbed and large-scale simulations demonstrate that ACC ameliorates fundamental issues in lossless Ethernet. (e.g., congestion spreading, HoL ...
  13. [13]
    [PDF] The tail at scale - Luiz André Barroso
    We explore how these techniques allow sys- tem utilization to be driven higher with- out lengthening the latency tail, thus avoiding wasteful overprovisioning.
  14. [14]
    [PDF] Tackling the Challenge of Bufferbloat in Multi-Path Transport over ...
    Receive window limitation and head of line (HOL) blocking are the two main factors impact- ing performance. Both are shortly introduced here. In order to ...<|control11|><|separator|>
  15. [15]
    Low Delay Random Linear Coding and Scheduling Over Multiple ...
    Jul 30, 2015 · When the delay on one or more of the paths is variable, as is commonly the case, out of order arrivals are frequent and head of line blocking ...Missing: effects | Show results with:effects
  16. [16]
    [PDF] Input Versus Output Queueing on a Space-Division Pack& Switch
    Abstract-Two simple models of queueing on an N X N space-division packet switch are examined. The switch operates synchronously with fixed-length packets ...
  17. [17]
    [PDF] Output-Based Shared-Memory Crosspoint-Buffered Packet Switch ...
    In these switches, an input has N virtual output queues to avoid head-of-line blocking [2]. The crosspoint buffers in CICB switches can provide call splitting ...
  18. [18]
    Advanced switch memory architectures improve network performance
    Oct 28, 2010 · To get around this blocking issue, also known as head of line (HOL) blocking, chip architects include virtual output queues at every switch ...
  19. [19]
    6500 line card and 'Head of line' blocking. - Cisco Community
    Feb 5, 2007 · Head of Line (HOL) blocking uses interface buffers, not shared ones. Disabling it can cause more packet loss on the port, but moves drops to ...Missing: ratio | Show results with:ratio
  20. [20]
    Cisco Nexus 5548P Switch Architecture
    Sep 23, 2010 · The Cisco Nexus 5548P is a one-rack-unit (1RU), 1 and 10 Gigabit Ethernet and FCoE access-layer switch built to provide 960 Gbps of throughput with very low ...
  21. [21]
  22. [22]
    [PDF] 15-441 Computer Networking Lecture 14: Router Design
    Oct 16, 2010 · Head-of-Line Blocking. Problem: The packet at the front of the ... Longest-prefix match (not exact). 2. Tables are large and growing. 3 ...
  23. [23]
    Understand Virtual Output Queues | Junos OS - Juniper Networks
    VOQ architecture eliminates head-of-line blocking (HOLB) issues. On non-VOQ devices, HOLB occurs when congestion at an egress port affects a different egress ...
  24. [24]
    [PDF] Relaxing state-access constraints in stateful programmable data ...
    an event may generate head-of-line blocking, where all packets in a queue are held by the first one. The problem can be alleviated by adding more queues ...
  25. [25]
    [PDF] Institute of Communication Networks and Computer Engineering
    We analyze how the impact of head-of-line blocking can be mitigated by using several parallel TCP connections, SCTP multistreaming, or SCTP unordered mode, ...
  26. [26]
    [PDF] The Delay-Friendliness of TCP - CS@Columbia
    TCP uses a retransmission to recover the lost packet, which in turn yields head-of-line blocking delay at the receiver. The receipt of a packet loss ...
  27. [27]
    Building Blocks of TCP - High Performance Browser Networking
    This effect is known as TCP head-of-line (HOL) blocking. The delay imposed by head-of-line blocking allows our applications to avoid having to deal with packet ...
  28. [28]
  29. [29]
  30. [30]
  31. [31]
  32. [32]
  33. [33]
  34. [34]
  35. [35]
  36. [36]
    HTTP/1.X - High Performance Browser Networking (O'Reilly)
    What if the first request hangs indefinitely or simply takes a very long time to generate on the server? With HTTP/1.1, all requests behind it are blocked and ...
  37. [37]
  38. [38]
    Connection management in HTTP/1.x - MDN Web Docs
    Jul 4, 2025 · HTTP pipelining therefore brings a marginal improvement in most cases only. Pipelining is subject to the head-of-line blocking. For these ...
  39. [39]
  40. [40]
    HTTP/2 - High Performance Browser Networking (O'Reilly)
    We have eliminated head-of-line blocking from HTTP, but there is still head-of-line blocking at the TCP level (see Head-of-Line Blocking). Effects of ...
  41. [41]
    RFC 9114: HTTP/3
    The QUIC transport protocol incorporates stream multiplexing and per-stream ... head-of-line blocking can be caused by compression. This allows an encoder ...
  42. [42]
    HTTP/3 and QUIC — prioritization and head-of-line blocking
    Nov 30, 2022 · QUIC relieves HOL blocking during loss as it is stream-aware such that it knows which specific streams have been affected. In high (random) loss ...
  43. [43]
    tc(8) - Linux manual page - man7.org
    Tc is used to configure Traffic Control in the Linux kernel. Traffic Control ... It has been added for hardware that wishes to avoid head-of-line blocking.
  44. [44]
  45. [45]
    [PDF] Matching Output Queueing with a Combined Input Output Queued ...
    It is well-known that if each input maintains a single FIFO, then HOL blocking can limit the throughput to just 58.6% [5]. ... WFQ and Strict Priority queueing.
  46. [46]
    Queuing Mechanisms in Modern Switches - ipSpace.net blog
    May 27, 2014 · Virtual output queues solve the head-of-line (HoL) blocking between input ports (traffic received on one port cannot block traffic received on ...
  47. [47]
    Cisco Silicon One Q200 (Cisco Catalyst 9500X and 9600X) QoS ...
    Mar 17, 2023 · VoQ is a technique to address the HoL block phenomenon with ingress buffering, as explained above. VoQ is a technique in switch architecture ...
  48. [48]
    [PDF] Efficient Fair Queuing Using Deficit Round-Robin - Stanford University
    A number of readers have conjectured that DRR should reduce to BR when the quantum size is one bit. This is not true. The two schemes have radically ...
  49. [49]
    RFC 9000 - QUIC: A UDP-Based Multiplexed and Secure Transport
    ... head-of-line blocking across multiple streams. When a packet loss occurs, only streams with data in that packet are blocked waiting for a retransmission to ...
  50. [50]
    gRPC on HTTP/2 Engineering a Robust, High-performance Protocol
    Aug 20, 2018 · In this article, we'll look at how gRPC builds on HTTP/2's long-lived connections to create a performant, robust platform for inter-service communication.
  51. [51]
    HTTP/2 and GRPC: The De Facto for Microservices Communication
    Apr 4, 2022 · The interleaved requests and responses can run in parallel without blocking the messages behind them, a process called multiplexing.
  52. [52]
    [PDF] Does QUIC make the Web faster ? - Department of Computer Science
    We find QUIC to perform better overall under poor network conditions (low bandwidth, high latency and high loss), for e.g. more than 90% of synthetic pages ...
  53. [53]
    A QUIC progress report - APNIC Blog
    Jun 17, 2025 · Of interest in Table 2 is the 45% rate of Safari clients who do not use QUIC, as compared to the 11% rate of Chrome clients who do not use QUIC.
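References [16] and [45] cite the classic result that a single FIFO per input limits an input-queued switch's throughput to roughly 58.6% under uniform saturated traffic, because a head-of-line packet losing output arbitration stalls everything queued behind it. A minimal saturation-mode simulation can reproduce the effect; `simulate_fifo_switch` and its parameters are illustrative names, not taken from any cited source:

```python
import random

def simulate_fifo_switch(n=16, slots=20000, seed=1):
    """Saturated n x n input-queued switch with one FIFO per input.

    Every input always has a packet; destinations are uniform random.
    Each time slot, every output accepts at most one contending
    head-of-line (HOL) packet; the losers stay at the head of their
    queues, blocking independent traffic behind them.
    Returns the measured per-input throughput.
    """
    rng = random.Random(seed)
    # Destination of the current HOL packet at each input.
    hol = [rng.randrange(n) for _ in range(n)]
    delivered = 0
    for _ in range(slots):
        contenders = {}
        for i, dest in enumerate(hol):
            contenders.setdefault(dest, []).append(i)
        for dest, inputs in contenders.items():
            winner = rng.choice(inputs)      # output serves one HOL packet
            delivered += 1
            hol[winner] = rng.randrange(n)   # next queued packet reaches HOL
    return delivered / (slots * n)
```

With n = 16 the measured per-input throughput lands near 0.60, approaching the 2 − √2 ≈ 0.586 asymptote as n grows; virtual output queues (references [17]–[19], [23], [46], [47]) remove this limit by giving each input a separate queue per output.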