
Weighted fair queueing

Weighted fair queueing (WFQ) is a packet scheduling algorithm used in computer networks to allocate link bandwidth among multiple flows in a fair and weighted manner, approximating the idealized fluid model of weighted generalized processor sharing (GPS) by assigning each flow a share of the output link capacity proportional to its predefined weight. Introduced in 1989 by Alan Demers, Srinivasan Keshav, and Scott Shenker at the SIGCOMM conference, WFQ builds on earlier fair queueing concepts to address the limitations of first-in-first-out (FIFO) scheduling, which can lead to unfair bandwidth allocation when flows have varying packet sizes or misbehaving senders dominate the queue. In WFQ, packets from different flows are placed into separate queues, and a virtual finishing time is computed for each packet based on its size, the flow's weight, and a virtual time clock that advances according to the system's service rate; the packet with the earliest virtual finish time is selected for transmission next, ensuring long-term bandwidth guarantees and fairness between flows. This mechanism provides strong isolation from ill-behaved flows, prevents one flow from starving others, and supports quality-of-service (QoS) requirements by enforcing minimum bandwidth shares, making it suitable for traffic management in routers and switches. WFQ's theoretical foundation relies on the GPS model, where link capacity is divided continuously among active flows according to weights, and the algorithm's packetized approximation ensures worst-case delay bounds, as analyzed in the original work and subsequent studies like the Parekh-Gallager theorem for multiple-node networks. Variants such as worst-case fair weighted fair queueing (WF²Q) refine the approximation to GPS by serving packets only if their virtual start time does not exceed the current virtual time, improving fairness for short-term guarantees.
Widely implemented in network devices for traffic management, WFQ has influenced standards for congestion avoidance and quality of service in packet networks, though it incurs computational overhead due to per-flow bookkeeping, often mitigated by approximations like class-based WFQ (CBWFQ) that group flows into fewer classes.

Overview

Definition and Purpose

Weighted fair queueing (WFQ) is a packet-based scheduling algorithm used in computer networks to manage the transmission of data packets from multiple traffic flows sharing a common link. It extends the original fair queueing discipline by assigning configurable weights to individual flows, allowing them to receive bandwidth in proportion to these weights rather than equal shares. This mechanism enables differentiated service, where higher-priority or more demanding flows can be allocated greater portions of the available capacity. The primary purpose of WFQ is to ensure equitable and efficient allocation of link bandwidth among competing flows, mitigating issues like monopolization by aggressive or high-volume traffic sources. By approximating the idealized fluid-based Generalized Processor Sharing (GPS) model, WFQ provides bounded delays for regulated traffic while guaranteeing fair bandwidth shares for best-effort sessions, thereby supporting quality-of-service (QoS) requirements in diverse network environments such as routers and switches. This approach prevents latency-sensitive applications from being starved and promotes overall system stability. In operation, WFQ maintains separate queues for each flow and selects packets for transmission based on their virtual finishing times, adjusted by the flow's weight, to emulate proportional service rates. For instance, consider two backlogged flows with weights of 1 and 3 sharing a link; the first flow would receive about 25% of the bandwidth, while the second gets 75%, reflecting their relative priorities without fluid assumptions.

Key Concepts

In weighted fair queueing (WFQ), a flow is defined as a sequence of packets originating from a particular source and destined for a specific destination, often representing a single connection or conversation in a network. Each flow is associated with a dedicated queue, which serves as a per-flow buffer to hold incoming packets awaiting transmission, allowing independent management of traffic from different sources. The weight assigned to a flow acts as a relative share parameter, determining the proportion of the server's capacity allocated to that flow compared to others, enabling tunable bandwidth distribution among competing streams. Packet scheduling in network routers traditionally relies on first-in-first-out (FIFO) queuing, where packets from all flows are handled in arrival order within a single queue, which can lead to head-of-line (HOL) blocking: a bursty high-volume flow may delay packets from lower-volume flows even if the latter require minimal bandwidth. WFQ addresses this by maintaining separate queues for each flow and scheduling packets based on their weights and virtual finishing times, effectively isolating flows to prevent one from monopolizing resources and mitigating HOL blocking across diverse traffic classes. This promotes fairness, particularly by preventing excessive delay for low-bandwidth flows amid high-bandwidth competitors. The ideal fluid model for fair queueing conceptualizes network traffic as a continuous stream of infinitesimally small bits, served in a bit-by-bit fashion proportional to each flow's weight, ensuring instantaneous and precise sharing without discrete interruptions. In contrast, the packetized reality of actual networks involves discrete packets of varying sizes transmitted non-preemptively, which introduces challenges such as potential delays and deviations from ideal fairness due to the granularity of packet boundaries. WFQ approximates this model through packet-level mechanisms that emulate the proportional service while bounding the discrepancies caused by packetization.
Under WFQ, the service rate provided to a flow is proportional to its weight divided by the sum of all active flows' weights, guaranteeing a fair share of the link capacity relative to the configured priorities without allowing any flow to exceed its allocation at the expense of others.

Mathematical Foundations

Parametrization and Fairness

In weighted fair queueing (WFQ), the parametrization involves administrators assigning a weight w_i > 0 to each flow i to specify its relative share of the available bandwidth. These weights are configured based on service requirements, such as desired throughput proportions or quality-of-service policies, with higher weights granting larger shares. The fairness achieved by WFQ is characterized as weighted max-min fairness, a criterion under which the allocation maximizes the minimum weight-normalized rate across flows, such that no flow can increase its allocation without reducing that of another flow with a smaller normalized share. Under steady-state conditions with N active flows and link capacity R, the long-term data rate allocated to flow i is given by r_i = \frac{w_i}{\sum_{j=1}^N w_j} R, ensuring weights directly govern asymptotic shares regardless of arrival patterns or packet sizes. WFQ emphasizes long-term fairness over strict short-term equity, with the service received by any flow deviating from its weighted share by at most a bounded amount over time, typically normalized to the maximum packet size. This provides the isolation property, wherein the guaranteed service for compliant flows remains unaffected by the behavior or misbehavior of other flows, preventing one flow from starving others. Short-term unfairness may occur due to packet granularity, but the bound ensures convergence to the fair allocation as time progresses.
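The steady-state rate formula r_i = (w_i / Σ_j w_j) · R is straightforward to compute; a minimal sketch (the function name and flow labels are illustrative, not from any particular implementation):

```python
def wfq_rates(weights, link_capacity):
    """Long-term WFQ rate for each backlogged flow: r_i = (w_i / sum(w)) * R."""
    total = sum(weights.values())
    return {flow: (w / total) * link_capacity for flow, w in weights.items()}

# Example: three flows sharing a 100 Mbit/s link with weights 1, 3, and 4.
rates = wfq_rates({"A": 1, "B": 3, "C": 4}, link_capacity=100.0)
print(rates)  # {'A': 12.5, 'B': 37.5, 'C': 50.0}
```

Note that the shares depend only on the ratios of the weights, so scaling all weights by a constant leaves the allocation unchanged.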

Generalized Processor Sharing

Generalized Processor Sharing (GPS) is a theoretical scheduling discipline that serves as an idealized model for fair bandwidth sharing among multiple sessions sharing a common link. In GPS, the server's capacity is divided continuously and proportionally among the backlogged sessions according to their assigned weights, assuming traffic flows are infinitely divisible and can be served simultaneously without discrete packet boundaries. This hypothetical scheduler operates in a fluid manner, ensuring that each backlogged session receives service instantaneously proportional to its weight relative to the total weights of all backlogged sessions, while remaining work-conserving by utilizing the full link capacity whenever there is demand. A key property of GPS is its strong fairness, where no backlogged session experiences idle time while others are being served, as service is allocated continuously to all active sessions in proportion to their weights. This eliminates the possibility of one session monopolizing resources at the expense of others, providing isolation in terms of throughput and delay bounds that are independent of the behavior of non-conforming sessions. Additionally, GPS supports flexible allocation by adjusting weights dynamically if needed, and in the equal-weight case, it reduces to classical processor sharing, but its primary advantage lies in enabling predictable performance guarantees for diverse traffic classes without packetization overhead. The mathematical foundation of GPS defines the instantaneous service rate for a backlogged session i at time t as r_i(t) = \frac{w_i}{\sum_{j \in B(t)} w_j} C, where w_i is the weight of session i, B(t) is the set of backlogged sessions at time t, and C is the server's constant capacity. This allocation ensures that over any interval where a session remains continuously backlogged, the total service received matches the integral of this rate, bounding the session's output independently of others and providing worst-case delay guarantees proportional to its own arrival patterns.
GPS is ideal for analyzing fair scheduling because it achieves perfect proportionality and isolation in a fluid model, serving as a benchmark for performance bounds in integrated services networks. However, it is unimplementable in practical packet-based systems, as real traffic arrives and departs in discrete packets that cannot be divided or served simultaneously across sessions without violating atomic transmission requirements. This limitation necessitates discrete approximations that emulate GPS behavior while handling packet granularity.
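The instantaneous-rate formula r_i(t) = (w_i / Σ_{j ∈ B(t)} w_j) · C can be sketched directly; the function and session labels below are illustrative only:

```python
def gps_rates(weights, backlogged, capacity):
    """Instantaneous GPS rate r_i(t) = (w_i / sum of backlogged weights) * C
    for each session; sessions outside the backlogged set receive no service."""
    total = sum(weights[j] for j in backlogged)
    return {i: (weights[i] / total) * capacity if i in backlogged else 0.0
            for i in weights}

weights = {"A": 1, "B": 2, "C": 1}
# While only A and B are backlogged, A receives 1/3 and B 2/3 of the link.
print(gps_rates(weights, backlogged={"A", "B"}, capacity=1.0))
```

As sessions join or leave the backlogged set B(t), the rates of all remaining sessions change instantaneously, which is exactly the behavior a packet scheduler can only approximate.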

The WFQ Algorithm

Step-by-Step Operation

The Weighted Fair Queueing (WFQ) algorithm manages multiple flows by maintaining a separate first-in-first-out (FIFO) queue for each flow, where flows are typically classified based on attributes such as source-destination addresses, protocol type, or port numbers. Each flow is assigned a weight that determines its share of the link capacity, with higher weights receiving proportionally larger allocations. Packets from incoming flows are processed to emulate a generalized processor sharing model, ensuring that no flow receives less than its entitled share even if others are backlogged. The core operation involves three main steps executed upon relevant events: packet arrival, enqueuing with timestamp assignment, and packet selection for transmission. First, an arriving packet is classified into its corresponding flow and appended to the queue of that flow in FIFO order. Second, a virtual finish time is assigned to the packet, computed based on the current system virtual time, the virtual finish time of the previous packet in the same flow (if the queue was non-empty), and the packet's length scaled inversely by the flow's weight to reflect its service entitlement. This represents when the packet would complete service under an ideal weighted bit-by-bit discipline. Third, when the server completes transmitting a packet and is ready for the next, it scans the heads of all non-empty queues and selects the packet with the smallest virtual finish time for dequeuing and transmission; if multiple heads share the same finish time, ties are broken arbitrarily or by flow identifier. The scheduler operates in a work-conserving manner, transmitting continuously whenever packets are available. WFQ handles idle periods by keeping the system virtual time unchanged while no packets are present. Upon a new packet arrival to an idle system, the virtual finish time is computed using the current (unchanged) virtual time in the max operation with the previous finish time, via max(V(t_a), F_{p-1}), ensuring flows are not penalized for having been idle. Virtual time advances only when service is provided.
To illustrate, consider a link of unit capacity (C = 1) shared by two flows, A (weight 1) and B (weight 2). At time t = 0, flow A receives a 1-unit packet (F_A1 = max(0, 0) + 1/1 = 1) and flow B receives a 2-unit packet (F_B1 = max(0, 0) + 2/2 = 1). Both finish tags equal 1; the tie is broken in favor of A. The server transmits A's packet from t = 0 to t = 1, then B's packet from t = 1 to t = 3. In the fluid GPS reference, both sessions remain backlogged over [0, 3] (A served at rate 1/3, B at rate 2/3, both emptying at t = 3), so virtual time advances at rate 1/3 throughout that interval. Now suppose flow A receives another 1-unit packet at t = 1.5, midway through B's transmission: at that instant V = 1.5/3 = 0.5 and A's previous finish tag is 1, so F_A2 = max(0.5, 1) + 1/1 = 2. After B completes at t = 3 (with V = 1), A's second packet is the only one queued and is transmitted from t = 3 to t = 4; with A alone backlogged, virtual time advances at rate 1 and reaches V = 2 exactly as the packet finishes, matching F_A2. This demonstrates how WFQ approximates proportional sharing: under sustained backlog B receives twice the service of A, and the rates adjust automatically as the set of active flows changes.
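The finish-tag bookkeeping in the steps above can be sketched in a few lines. The sketch below assumes the simplest case, in which every packet is enqueued before service begins, so V(t_a) = 0 for all arrivals and the max() term drops out; the function name and flow labels are illustrative:

```python
import heapq

def wfq_order(flows):
    """Serve packets in increasing virtual finish time F_p = F_{p-1} + l_p / w_i.
    `flows` maps a flow id to (weight, list of packet lengths). Assumes all
    packets are present at t = 0, so V(arrival) = 0 for every packet."""
    heap = []
    for fid, (w, pkts) in flows.items():
        f = 0.0
        for seq, length in enumerate(pkts):
            f += length / w              # F_p = F_{p-1} + l_p / w_i
            heapq.heappush(heap, (f, fid, seq, length))
    order = []
    while heap:                          # repeatedly pick the smallest F_p
        _, fid, seq, _ = heapq.heappop(heap)
        order.append((fid, seq))
    return order

# Flow A (weight 1) and flow B (weight 2), each with two 1-unit packets:
# B's packets finish "virtually" twice as fast, so B is served twice as often.
print(wfq_order({"A": (1, [1, 1]), "B": (2, [1, 1])}))
```

With arrivals during the busy period, the same tags would be computed with max(V(t_a), F_{p-1}) and a running virtual clock, as in the worked example.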

Virtual Time Calculation

In weighted fair queueing (WFQ), virtual time V(t) functions as a normalized clock that measures the progress of service allocation across flows in a manner proportional to their weights, approximating the idealized generalized processor sharing (GPS) model. Virtual time is advanced incrementally during busy periods: while the backlogged set is unchanged, transmitting a packet of length l (with C = 1) advances V(t) by \frac{l}{\sum w_j}, where \sum w_j is the sum of weights over the backlogged flows during that service interval. This scaling reflects the proportional sharing, as the virtual clock ticks according to the capacity divided among active flows. If the set of backlogged flows changes during service, the rate of advance is updated accordingly; practical implementations often approximate these updates at packet boundaries. When the server is idle, V(t) remains unchanged until the next arrival. Upon arrival, the virtual time used for computation is the current value, with no explicit reset to zero. During busy periods, the incremental updates maintain consistency with the GPS fluid model. Central to WFQ's per-packet scheduling is the virtual finish time F_p, which determines the order of transmission. For the p-th packet of flow i, arriving at time t_a with length l_p, the virtual finish time is computed as F_p = \max\left( V(t_a), F_{p-1} \right) + \frac{l_p}{w_i}, where F_{p-1} is the virtual finish time of the preceding packet in flow i (initialized to 0 for the first packet). The \max operation handles cases where the flow was idle upon arrival, ensuring no reordering within the flow while aligning with the current system virtual time. The next packet to serve is then selected as the one with the smallest F_p among all flows' head-of-line packets, preserving finish-order fairness. Efficient management of virtual finish times is achieved using a priority queue (typically a binary heap) to store and retrieve the minimum F_p values.
Packet arrivals require inserting or updating an entry for the flow (computing the new F_p and propagating if necessary), while departures involve extracting the minimum and potentially updating V(t). This structure yields an O(\log N) time complexity per packet operation, where N is the number of active flows, making WFQ feasible for moderate-scale systems despite the logarithmic overhead.
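The tag update F_p = max(V(t_a), F_{p-1}) + l_p / w_i is the heart of the per-flow state; a minimal sketch, with illustrative names and units (virtual time in bit-times):

```python
class FlowState:
    """Per-flow WFQ state: weight and the last virtual finish time (0 when new)."""
    def __init__(self, weight):
        self.weight = weight
        self.last_finish = 0.0

def finish_tag(flow, packet_len, v_now):
    """F_p = max(V(t_a), F_{p-1}) + l_p / w_i -- the max() restarts an idle
    flow at the current virtual time instead of its stale finish tag."""
    flow.last_finish = max(v_now, flow.last_finish) + packet_len / flow.weight
    return flow.last_finish

f = FlowState(weight=2.0)
print(finish_tag(f, 1000, v_now=0.0))     # 500.0
print(finish_tag(f, 1000, v_now=0.0))     # 1000.0 (back-to-back packet)
print(finish_tag(f, 1000, v_now=5000.0))  # 5500.0 (flow was idle; realigned to V)
```

The third call shows why the max() matters: without it, a long-idle flow would carry a stale small tag and could unfairly burst ahead of currently active flows.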

Properties and Approximations

Approximation to GPS

Weighted fair queueing (WFQ), also known as packet-by-packet generalized processor sharing (PGPS), serves as a packet-based approximation to the idealized fluid model of generalized processor sharing (GPS). In GPS, capacity is allocated proportionally to weights in a continuous manner, but real networks transmit discrete packets; WFQ maps this fluid abstraction to packets by assigning virtual start and finish times to each packet based on the session's weight and the cumulative service received. Packets are then scheduled in increasing order of their virtual finish times, emulating the order in which they would complete under GPS. The approximation bound for WFQ ensures that any packet completes under WFQ at most one maximum packet transmission time later than it would under GPS. This bound arises because WFQ is a work-conserving discipline, meaning it transmits at full capacity whenever there is work to do, and the virtual time mechanism prevents any packet from finishing more than one maximum-sized packet later than in the fluid GPS model. A sketch of the proof relies on the virtual finish times: for a packet arriving to a session, its virtual finish time is computed as the maximum of the previous packet's finish time and the current virtual time, plus the packet length divided by the session's weight. Since packets are selected in order of smallest virtual finish time, and the system avoids idling when queues are non-empty, the actual finish time of a packet under WFQ deviates from its GPS counterpart by no more than the time to transmit the largest possible packet in the system. This preserves the relative service fairness across sessions in a packetized setting. In terms of complexity, WFQ achieves this using a priority queue to manage virtual finish times across sessions, resulting in O(log N) complexity per packet operation, where N is the number of active sessions, making it practical for high-speed routers.

Fairness Bounds and Delays

Weighted fair queueing (WFQ) provides rigorous performance guarantees, particularly in terms of bounded delays and fairness metrics, which emulate the idealized generalized processor sharing (GPS) scheduler while accounting for packetized traffic. For a flow i with allocated weight w_i in a system with total weight \sum w_j and link capacity R, and assuming leaky-bucket regulated traffic to bound bursts, the worst-case delay under WFQ is comparable to that under GPS plus the transmission time of a maximum-sized packet, l_{\max}/R. This bound adds the packetization overhead to the GPS baseline, where the GPS delay for flow i is at most the burst size divided by its allocated rate (w_i / \sum w_j) \cdot R. The worst-case fairness index (WFI) quantifies WFQ's deviation from proportional sharing, defined as the maximum difference between the actual service received by a flow and its GPS-equivalent over any interval, normalized by the flow's allocated rate and the link capacity. This index is ideally bounded by the transmission time of a single maximum-sized packet, l_{\max}/R, reflecting the limit imposed by discrete packets rather than fluid service; seminal analyses show that standard WFQ has an unbounded WFI due to potential long-term deviations, but variants like worst-case fair WFQ (WF²Q) tighten it to this packet-level bound for enhanced isolation. A key benefit of WFQ is its isolation property, whereby the worst-case delay for a well-behaved flow remains unaffected by the load or misbehavior of other flows, mirroring GPS guarantees as long as the aggregate utilization stays below capacity; this prevents malicious or overloaded sessions from inflating delays for compliant traffic beyond the inherent packetization overhead. In comparison to first-in-first-out (FIFO) scheduling, WFQ significantly reduces delay variance for interactive flows, such as those in real-time applications, by preventing head-of-line blocking and ensuring bandwidth-proportional service, thereby improving responsiveness without the unpredictable queuing delays common in FIFO under mixed traffic loads.
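The single-node delay bound described above (GPS delay from the burst divided by the guaranteed rate, plus one maximum-packet transmission time) is easy to evaluate numerically; the function name and figures below are illustrative:

```python
def wfq_delay_bound(sigma, w_i, weights_sum, link_rate, l_max):
    """Single-node worst-case WFQ delay for a leaky-bucket flow with burst
    `sigma`: sigma / g_i + l_max / R, where g_i = (w_i / sum w_j) * R is the
    flow's guaranteed rate. All sizes in bits, rates in bit/s, result in s."""
    g_i = (w_i / weights_sum) * link_rate
    return sigma / g_i + l_max / link_rate

# 10 kbit burst, a 25% share of a 1 Mbit/s link, 12 kbit maximum packet:
# 10_000 / 250_000 + 12_000 / 1_000_000 = 0.04 + 0.012 = 0.052 s.
print(wfq_delay_bound(sigma=10_000, w_i=1, weights_sum=4,
                      link_rate=1_000_000, l_max=12_000))
```

Note the isolation property in the formula: the bound depends only on the flow's own burst, its guaranteed rate, and l_max, not on the other flows' traffic.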

Variants

Worst-Case Fair WFQ

Worst-Case Fair Weighted Fair Queueing (WF²Q) is an enhanced packet scheduling algorithm that refines Weighted Fair Queueing (WFQ) to achieve a closer approximation to the fluid Generalized Processor Sharing (GPS) model, particularly by mitigating scenarios where WFQ allows packets from intermittently backlogged flows to be served excessively ahead of their GPS-equivalent start times, potentially compromising fairness. Developed to ensure worst-case guarantees on service allocation, WF²Q incorporates an eligibility criterion based on both virtual start and finish times, ensuring that only packets whose service would have commenced under GPS are selected for transmission. The core modification in WF²Q lies in its packet eligibility and selection rules. For each packet p in a flow i with weight w_i and length l_p, the virtual start time S_p is defined as S_p = \max\{ F_{p-1}, V(a_p) \}, where F_{p-1} is the virtual finish time of the previous packet in the flow, V(a_p) is the virtual time at packet arrival, and the virtual finish time is F_p = S_p + \frac{l_p}{w_i}. A packet is deemed eligible at current virtual time V(t) if S_p \leq V(t). Among all eligible backlogged packets, WF²Q selects and serves the one with the smallest F_p using a smallest eligible finish time first (SEFF) policy. This approach prevents the selection of packets that have not yet reached their GPS start time, addressing WFQ's vulnerability to unfair bursts from non-continuously backlogged sessions. WF²Q yields significant improvements in fairness and delay bounds relative to WFQ. It guarantees that the difference in service received by any flow under WF²Q and GPS is at most the transmission time of a single maximum-sized packet, providing the tightest known packet-level approximation to GPS among fair queueing variants and enhancing short-term fairness by limiting instantaneous service deviations.
These properties ensure bounded end-to-end delays for real-time sessions even under adversarial traffic patterns, with the algorithm proven to minimize worst-case fairness index values across packet fair queueing disciplines. In terms of implementation, WF²Q retains WFQ's per-packet time complexity of O(\log N), where N is the number of flows, by leveraging a priority queue ordered by virtual finish times; however, it necessitates additional eligibility verification—typically via a linear scan of head-of-line packets or an augmented data structure—to identify the minimum F_p among eligible candidates, slightly increasing practical overhead without altering the asymptotic bound.
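The SEFF selection rule can be sketched as a filter-then-minimize step over head-of-line packets; the function name and the (S_p, F_p) figures below are illustrative:

```python
def seff_select(head_packets, v_now):
    """WF2Q smallest-eligible-finish-time-first: among head-of-line packets,
    keep only those with virtual start S_p <= V(t), then pick the minimum F_p.
    `head_packets` is a list of (flow_id, S_p, F_p) tuples; returns a flow id
    or None when no packet is eligible."""
    eligible = [(f, flow) for flow, s, f in head_packets if s <= v_now]
    return min(eligible)[1] if eligible else None

heads = [("A", 0.0, 1.0),   # eligible at V(t) = 0.5, finishes at 1.0
         ("B", 0.6, 0.9),   # smallest F_p, but not yet started under GPS
         ("C", 0.2, 1.5)]   # eligible, finishes later
print(seff_select(heads, v_now=0.5))  # 'A' -- plain WFQ would have picked B
```

The comparison with plain WFQ is the whole point: WFQ would serve B (smallest F_p) ahead of its GPS start time, while WF²Q defers it until V(t) reaches 0.6.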

Deficit Round Robin

Deficit Round Robin (DRR) is a packet scheduling algorithm designed as an efficient approximation to weighted fair queueing, particularly suited for high-speed implementations where computational overhead must be minimized. Introduced by Shreedhar and Varghese, DRR achieves proportional bandwidth allocation among flows while maintaining constant-time processing per packet, making it viable for wire-speed operations in routers. By replacing the virtual time tracking of weighted fair queueing with a round-robin framework and credit-based adjustments, DRR ensures that no flow is starved and bandwidth is shared according to assigned weights, though with bounds on precision due to packet-level granularity. The mechanism of DRR revolves around assigning each queue (corresponding to a flow) a fixed quantum of service, \phi_i = w_i \cdot P_{\max}, where w_i is the weight of flow i and P_{\max} is the maximum packet size in the system. This quantum represents the nominal amount of data the flow can transmit per round, scaled to handle variable packet lengths without excessive unfairness. Each queue maintains a deficit counter, initialized to zero, which accumulates unused credits from previous rounds to allow partial quanta to carry over, preventing short packets from underutilizing their allocation. Queues are organized in a circular active list containing only non-empty queues, ensuring the scheduler cycles efficiently through contending flows. In operation, the scheduler processes queues in round-robin order from the active list. For the current queue, the deficit counter is incremented by its quantum \phi_i. Then, while the queue is non-empty and the deficit counter is at least as large as the size of the head-of-line packet, that packet is dequeued and transmitted, and the packet size is subtracted from the deficit counter. This allows multiple packets to be served in one visit if they fit within the updated deficit, up to the total quantum plus any prior carry-over.
If no further packets can be served (e.g., the head packet exceeds the current deficit), the scheduler moves to the next queue in the list, with the deficit preserved for the subsequent round. Upon dequeuing the last packet from a queue, the queue is removed from the active list and its deficit is reset; new arrivals reinsert the queue at the end of the list. This process repeats indefinitely, with the active list updated dynamically to skip empty queues. DRR's primary advantages stem from its simplicity and constant-time operation: each packet requires O(1) work—independent of the number of flows—enabling implementation in hardware or software at gigabit speeds without the logarithmic overhead of sorted-priority schedulers like WFQ. Fairness closely approximates ideal fluid models, with the difference in bytes transmitted between any two backlogged flows bounded in proportion to the maximum packet size P_{\max}, ensuring throughput proportionality within one packet's granularity. This makes DRR particularly effective for approximating the service order of weighted fair queueing in scenarios with moderate packet size variability. However, DRR has limitations in precision, especially with highly variable packet sizes. If packets exceed the quantum (though mitigated by setting \phi_i based on P_{\max}), a flow may experience delayed service until credits accumulate, leading to bursts and potential unfairness beyond the P_{\max} bound during transient periods. Compared to more refined variants, DRR offers weaker guarantees for latency-sensitive traffic, as worst-case delays can approach N \cdot P_{\max} for N flows, and it does not enforce strict eligibility times, resulting in less tight adherence to fluid fair queueing emulation than algorithms with virtual time enforcement.
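The quantum-and-deficit loop described above can be sketched compactly. This is a minimal illustration, not a production scheduler: queues, quanta (in bytes), and the bounded `rounds` parameter are all assumptions for the example.

```python
from collections import deque

def drr_schedule(queues, quanta, rounds):
    """Deficit Round Robin: each visit adds the queue's quantum to its deficit
    counter, then transmits head packets while they fit; leftover deficit
    carries over. `queues` maps flow -> deque of packet sizes (bytes).
    Returns the (flow, packet_size) transmit order over `rounds` visits."""
    deficit = {f: 0 for f in queues}
    active = deque(f for f in queues if queues[f])   # non-empty queues only
    sent = []
    for _ in range(rounds):
        if not active:
            break
        f = active.popleft()
        deficit[f] += quanta[f]
        while queues[f] and queues[f][0] <= deficit[f]:
            pkt = queues[f].popleft()
            deficit[f] -= pkt
            sent.append((f, pkt))
        if queues[f]:
            active.append(f)    # still backlogged: revisit next round
        else:
            deficit[f] = 0      # emptied queues forfeit leftover credit
    return sent

# Flow A (quantum 300) and flow B (quantum 600): B sends twice the bytes per round.
q = {"A": deque([300, 300]), "B": deque([600, 600])}
print(drr_schedule(q, quanta={"A": 300, "B": 600}, rounds=4))
```

Resetting the deficit of an emptied queue is the detail that keeps idle flows from hoarding credit, mirroring the max() realignment in WFQ's virtual finish times.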

Applications

Network Quality of Service

Weighted fair queueing (WFQ) plays a central role in providing quality-of-service (QoS) guarantees within wired networks and routers by enabling differentiated treatment of traffic streams based on assigned priorities. In particular, class-based WFQ (CBWFQ), a widely implemented variant, integrates seamlessly with QoS frameworks in devices such as enterprise routers, where administrators assign weights to predefined traffic classes—such as voice, video, and data—to allocate bandwidth proportionally during periods of congestion. This mechanism ensures that real-time applications like VoIP and video conferencing receive their designated shares, supporting scalable QoS deployment across enterprise and service provider networks. The primary benefits of WFQ in these environments include delivering low latency to high-priority flows through weighted scheduling, which minimizes delays for latency-sensitive traffic, while reserving bandwidth according to weights to prevent underutilization. Additionally, WFQ mitigates unfairness issues by isolating flows, thereby protecting responsive TCP-based applications from being starved by non-responsive flows, such as those in constant-bit-rate video streams, ensuring stable performance for diverse traffic mixes. These properties stem from WFQ's core fairness bounds, which approximate generalized processor sharing to bound delays proportionally to weights. In practical implementations, WFQ supports the Differentiated Services (DiffServ) architecture by realizing assured forwarding (AF) per-hop behaviors (PHBs), where packets marked with AF codepoints receive varying levels of drop precedence and forwarding assurance through weighted queue servicing. It is frequently combined with random early detection (RED) as a congestion avoidance technique, applying probabilistic packet drops based on queue thresholds to preemptively manage overload and avoid global synchronization in TCP flows, enhancing overall network stability.
Performance evaluations confirm WFQ's effectiveness in QoS scenarios, particularly for multimedia streaming, where it reduces end-to-end delay by prioritizing time-bound packets and achieves higher throughput fairness across heterogeneous flows compared to simpler queuing disciplines. For instance, modeling studies using Petri nets have shown that WFQ maintains bounded delays and low delay variance under mixed loads.

Wireless and Specialized Systems

In code-division multiple access (CDMA) cellular networks, weighted fair queueing (WFQ) adaptations incorporate weights based on energy and power costs to enable fair bandwidth allocation in spread-spectrum systems, where capacity is limited by the signal-to-interference-plus-noise ratio (SINR). These weights adjust service shares proportionally to account for location-dependent channel errors, ensuring flows lagging due to errors recover without excessive delays for others. For instance, the wireless worst-case fair weighted fair queueing (W2F2Q) algorithm extends WFQ by tracking virtual finish times and compensating for error-induced lags using bounded timestamps, achieving global fairness bounds even under varying error rates typical in CDMA environments. Similarly, downlink scheduling in CDMA data networks employs WFQ principles with power fractions derived from SINR and frame error rates, optimizing throughput while maintaining proportional fairness across users affected by multi-user interference. For dynamic channel allocation in CDMA systems, WFQ variants use power as a primary cost metric to determine weights, facilitating efficient sharing by prioritizing allocations that minimize overall system power consumption. In multicode CDMA setups, dynamic WFQ extensions decouple delay and bandwidth guarantees by solving for rate allocation, where weights reflect power costs to support diverse traffic classes without violating capacity limits imposed by spread-spectrum spreading factors. This approach enhances user satisfaction and network utilization compared to static methods, as power-aware weighting reduces the impact of interference on high-priority flows. Modern extensions of WFQ appear in non-traditional domains, such as smart grids for scheduling deferrable loads like electric vehicle charging or appliance cycling. In these systems, WFQ assigns weights based on power demand, wait time, or geographic region to fairly distribute limited grid capacity, using virtual finish times (F = l/w, where l is load size and w is weight) to prioritize activations without forecasting.
Variants like WFQ-Power (weights by wattage) increase serviced loads by approximately 60% over baseline algorithms while reducing deferral variance by up to 2.08 times, optimizing residual power usage in peak-demand scenarios with populations of around 300 loads and 25 MW capacity. Emerging post-2020 studies explore WFQ in software-defined networking (SDN) for dynamic flow control, where the controller adjusts bandwidth allocations in WFQ queues to handle variable traffic, improving fairness and throughput in programmable switches without explicit hardware modifications. Challenges in wireless WFQ implementations include handling mobility-induced handoffs, which disrupt flow continuity and require adaptive weight recalibration to preserve fairness bounds during base station transitions. In 5G networks, hybrid approaches combine WFQ with active queue management (AQM) techniques, such as proportional integral controller enhanced (PI2) or controlled delay (CoDel), to mitigate bufferbloat and ensure low-latency QoS for mobile users, addressing interference spikes and handoff delays through joint scheduling that maintains end-to-end delays under 1 ms in high-mobility scenarios.

History

Origins of Fair Queueing

The origins of fair queueing trace back to the congestion challenges faced by the early Internet in the 1980s, where first-in-first-out (FIFO) queueing in routers led to severe unfairness, with a single high-volume source often monopolizing bandwidth and causing widespread packet loss and delays for other users. These issues, exacerbated by the network's growth and the adoption of TCP/IP, motivated early proposals for more equitable scheduling to prevent greedy flows from starving others. In 1987, John Nagle proposed a foundational approach to address this unfairness in datagram networks by implementing round-robin service across separate queues for each source, ensuring that no single flow could dominate the output link. This per-source queueing discipline aimed to emulate a bit-by-bit round-robin service, promoting fairness by allocating bandwidth proportionally to the number of active sources rather than arrival order. Nagle's idea built directly on the ARPANET's observed problems, where FIFO exacerbated congestion collapse by allowing unresponsive sources to flood queues. The concept was formalized and analyzed in 1989 by Alan Demers, Srinivasan Keshav, and Scott Shenker in their seminal paper, which introduced a practical fair queueing (FQ) algorithm emulating bit-by-bit round-robin for packet networks. Their algorithm approximated an ideal fluid-flow model, where packets from different flows are served in a manner that minimizes disparities in service rates, even under varying packet sizes. Through simulations, they demonstrated that FQ significantly reduced delay variance compared to FIFO in scenarios with heterogeneous flows and effectively protected low-bandwidth flows from greedy ones, limiting the impact of a single dominant source to roughly its fair share. This unweighted FQ laid the groundwork for later extensions, including weighted variants to support differentiated service.

Development of Weighted Variants

The concept of weighted fair queueing emerged as an extension of the original fair queueing algorithm, which initially treated all flows equally but was adapted to allocate bandwidth proportionally based on assigned weights. In their seminal 1989 paper, Demers, Keshav, and Shenker proposed this weighted variant as a means to provide differentiated service while maintaining fairness, allowing flows to receive bandwidth in proportion to their weights rather than equal portions. This idea was formalized and analyzed in depth by Parekh and Gallager in 1993, who introduced Generalized Processor Sharing (GPS) as an idealized fluid model for weighted bandwidth sharing and developed Packet-by-Packet Generalized Processor Sharing (PGPS), also known as Weighted Fair Queueing (WFQ), as a practical packet-based approximation to GPS. Their work established key theoretical bounds on delays and fairness, demonstrating that PGPS closely emulates GPS service under realistic packetized conditions, with maximum delays differing by at most the maximum packet transmission time. Subsequent refinements addressed limitations in WFQ's fairness and efficiency. In 1996, Bennett and Zhang introduced Worst-case Fair Weighted Fair Queueing (WF2Q), which selects packets not only based on virtual finish times but also ensures they would have begun service under GPS by the current virtual time, achieving tighter worst-case fairness bounds by eliminating discrepancies of up to a packet length. Concurrently, Shreedhar and Varghese proposed Deficit Round Robin (DRR) in 1996 as an O(1) approximation to weighted round-robin scheduling, using a deficit counter to handle variable packet sizes without timestamps, thus improving efficiency for hardware implementation while approximating WFQ fairness within a small error bound proportional to the maximum packet size.
By the late 1990s, weighted fair queueing variants gained adoption in IETF standards for quality of service (QoS), notably in RFC 1633 (1994), which referenced WFQ-style scheduling for bandwidth allocation, and subsequent RFCs such as 2211 and 2212 (1997) that integrated it into RSVP-signaled per-flow guarantees in Integrated Services networks. Post-2000 research has focused on scalable approximations to handle core network statelessness and multi-resource sharing, such as core-stateless variants and extensions like Dominant Resource Fair Queueing, addressing high-speed router constraints while retaining fairness properties.