Deficit round robin
Deficit Round Robin (DRR) is a packet scheduling algorithm employed in computer networking to approximate fair queueing, ensuring equitable bandwidth allocation among multiple competing flows in devices such as routers and switches. Developed in 1995 by M. Shreedhar and George Varghese, DRR modifies traditional round-robin scheduling to account for variable packet sizes, achieving near-perfect fairness in throughput allocation while requiring only constant O(1) time per packet, which enables efficient hardware implementation in high-speed networks.[1]
The algorithm operates by maintaining a queue and a deficit counter for each flow. In each scheduling round, the algorithm serves queues in a cyclic round-robin order, incrementing the deficit counter by a predefined quantum of service (typically proportional to the flow's allocated bandwidth share). It then dequeues and transmits packets from the current queue as long as the packet size does not exceed the updated deficit counter; the counter is decremented by the size of each transmitted packet, with any remaining credit carried over to the next round for that queue. This carry-over mechanism compensates for discrepancies caused by larger packets in previous rounds, preventing flows with smaller average packet sizes from unfairly dominating bandwidth. An active list of non-empty queues optimizes the process by skipping idle ones.[1][2]
Compared to more computationally intensive methods like Weighted Fair Queuing (WFQ), which require per-packet timestamp calculations and O(log n) operations, DRR provides comparable fairness bounds—with a worst-case deviation in throughput bounded by the maximum packet size—while being simpler and more scalable. It has been extended in variants such as Weighted DRR (WDRR) for supporting traffic priorities and class-based queueing, and Modified DRR (MDRR) for reducing latency in real-time applications like VoIP. DRR is widely used in quality-of-service (QoS) frameworks for input-queued switches, wireless mesh networks, and bandwidth management protocols, promoting isolation of misbehaving flows without excessive overhead.[1][3]
Introduction
Definition and Purpose
Deficit Round Robin (DRR) is a packet-based scheduling algorithm designed as an efficient approximation to Generalized Processor Sharing (GPS), specifically tailored for managing variable-length packets in network routers and switches.[4] Unlike idealized fluid-based models like GPS, which assume divisible service, DRR operates on discrete packets by employing a round-robin service order augmented with per-flow deficit counters to track and compensate for unused bandwidth allocations.[4] This approach ensures that flows receive service in proportion to their assigned weights, even when packet sizes vary significantly, thereby mitigating the inefficiencies of basic round-robin scheduling where larger packets can unfairly dominate bandwidth.[4]
The primary purpose of DRR is to provide proportional bandwidth allocation among competing network flows while isolating misbehaving or high-bandwidth flows to prevent them from starving others.[4] By addressing the limitations of traditional round-robin methods—such as unfairness arising from packet length variations—DRR promotes equitable resource sharing in congested queues, which is essential for maintaining performance in shared-link environments like IP routers.[4] This makes it particularly suitable for scenarios where flows exhibit diverse traffic patterns, ensuring that each flow's throughput closely tracks its entitled share without requiring complex per-packet computations.[4]
Key benefits of DRR include its inherent simplicity, which allows for straightforward hardware implementation, and its low computational overhead of O(1) operations per packet, contrasting with more intricate alternatives.[4] It achieves strong fairness properties without the need for per-packet timestamps or virtual time calculations, reducing both complexity and potential sources of error in high-speed networks.[4] In DRR, the minimum service rate for flow i is given by \frac{Q_i}{\sum Q_j} \times R, where Q_i is the quantum assigned to flow i, \sum Q_j is the sum of quanta across all flows, and R is the link transmission rate; this formula underscores how quanta directly dictate proportional sharing.[4] As a simpler alternative to Weighted Fair Queuing (WFQ), DRR trades minor latency bounds for significantly reduced processing demands.[4]
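As a worked illustration of this formula (with numbers chosen purely for exposition), consider three flows with quanta Q_1 = 3000, Q_2 = 1500, and Q_3 = 1500 bytes sharing a link of rate R = 600 Mbit/s: flow 1 is guaranteed at least \frac{3000}{6000} \times 600 = 300 Mbit/s, while flows 2 and 3 each receive at least 150 Mbit/s, regardless of the packet sizes the individual flows send.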
Historical Development
The Deficit Round Robin (DRR) scheduling algorithm was invented in 1995 by M. Shreedhar and George Varghese, who presented it in their seminal paper "Efficient Fair Queuing using Deficit Round Robin" at the ACM SIGCOMM conference.
This development was driven by the shortcomings of prior round-robin-based schedulers, notably Weighted Round Robin (WRR), which struggled to ensure equitable bandwidth allocation in the presence of variable packet lengths: because WRR counts packets rather than bytes, it favors flows sending larger packets at the expense of those sending smaller ones.[5]
In the years following its introduction, DRR saw early adoption within academic and research settings as a practical, low-complexity method for approximating idealized fluid fair queueing disciplines, such as Generalized Processor Sharing (GPS), thereby enabling more efficient packet-level implementations of proportional fairness.
A significant milestone in DRR's evolution came with its incorporation into the Linux kernel's traffic control framework in 2008, where it was implemented by Patrick McHardy as the sch_drr module to support network traffic shaping and scheduling.
Algorithm Mechanics
Core Components
The core components of the Deficit Round Robin (DRR) scheduling algorithm consist of key data structures and parameters that facilitate fair packet servicing across multiple flows in a network scheduler. These elements, as defined in the original formulation, enable efficient traversal and credit-based allocation without requiring complex timestamp computations.[1]
The quantum, denoted as Q_i for each flow i, represents a fixed number of bytes allocated to that flow per scheduling round, determining its proportional share of bandwidth. For instance, to achieve weighted fairness, the quantum for one flow can be set as twice that of another to allocate double the bandwidth. This parameter is configured by the scheduler manager and remains constant across rounds for a given flow.[1]
The deficit counter, DC_i, is a per-flow variable initialized to 0 that tracks unused service credits from previous rounds. At the start of servicing flow i, the current quantum Q_i is added to DC_i, allowing the accumulation of any remainder bytes for carryover to subsequent rounds; the counter is then decremented by the size of each transmitted packet, ensuring no over-servicing occurs. This mechanism bounds the counter value between 0 and the maximum packet size, preventing excessive credit buildup.[1]
Queue management relies on an active list, a circular structure (such as a linked list) containing indices of only non-empty flow queues to enable efficient round-robin traversal. Upon packet arrival to an empty flow, its index is inserted into the active list with DC_i reset to 0; the list is updated dynamically to exclude emptied flows, avoiding unnecessary checks on idle queues.[1]
Packet eligibility is governed by a simple rule: during a flow's turn, packets are dequeued and transmitted only if their length L is less than or equal to the updated DC_i (after adding Q_i), with DC_i reduced by L for each eligible packet until the counter falls below the next packet's size or the queue empties. Any remaining positive value in DC_i carries over to the next round, promoting byte-level fairness without fragmenting packets.[1]
For handling empty queues, when a flow's queue becomes empty after servicing, its DC_i is explicitly reset to 0, and the flow's index is removed from the active list to skip it in future rounds until new packets arrive. This prevents unfair credit accumulation for idle flows while maintaining the scheduler's efficiency.[1]
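The components above amount to a small amount of per-flow state plus one service routine. The following minimal Python sketch (names such as FlowState and serve_turn are illustrative, not from the original paper; packets are modeled simply as their byte lengths) shows a single round-robin visit to one flow:

from collections import deque

class FlowState:
    def __init__(self, quantum):
        self.quantum = quantum   # Q_i: fixed byte allocation added each round
        self.deficit = 0         # DC_i: initialized to 0, carries unused credit
        self.queue = deque()     # FIFO queue of packet lengths in bytes

def serve_turn(flow, transmit):
    # One visit in the round-robin order: grant the quantum, then send
    # head-of-line packets for as long as each fits within the credit.
    flow.deficit += flow.quantum
    while flow.queue and flow.queue[0] <= flow.deficit:
        length = flow.queue.popleft()
        flow.deficit -= length
        transmit(length)
    if not flow.queue:
        flow.deficit = 0         # empty queue: reset to prevent credit buildup

flow = FlowState(quantum=1500)
flow.queue.extend([900, 900])
serve_turn(flow, lambda n: print(f"sent {n} bytes"))  # sends one packet, carries 600 bytes of credit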
Operational Steps
The Deficit Round Robin (DRR) algorithm begins with an initialization phase where separate queues are established for each flow, a fixed quantum Q_i is assigned to each queue i based on its desired bandwidth share, and the deficit counter DC_i for every queue is set to 0.[1] This setup ensures that all flows start with no accumulated credit, allowing the scheduler to track service allocation from the outset.
In the round traversal phase, the algorithm maintains a circular list of active (non-empty) queues and starts at the head of this list. For the current queue i, if it is non-empty, the deficit counter is incremented by the quantum Q_i, which represents the bytes of service credit allocated to that flow in the current round. The quantum plays a key role in this increment step by proportionally determining each flow's service opportunity relative to others.[1]
The dequeuing logic then activates for queue i: while DC_i is greater than or equal to the length L of the next packet in the queue and the queue remains non-empty, the packet is dequeued, transmitted, and L is subtracted from DC_i. This process allows multiple packets to be sent in a single visit if the accumulated credit suffices, up to the point where the remaining credit is insufficient for the next packet. If the queue becomes empty during this phase, DC_i is reset to 0.[1]
After servicing queue i, the algorithm advances to the next queue in the circular list. This traversal repeats for all active queues, completing a full round; the process continues across multiple rounds until all queues are empty or no further packets arrive. The carry-over of any remaining positive DC_i to the next round prevents short packets from being starved, as unused credit from oversized packets in prior visits accumulates to enable transmission of smaller ones later.[1]
A pseudocode representation of the core operational loop, adapted from the original description, illustrates this sequence:
while (there are non-empty queues) {
    i = next_queue();                      // Select next active queue in round-robin order
    if (queue_i.empty()) continue;         // Defensive check; the active list holds only non-empty queues
    DC_i += Q_i;                           // Add quantum to deficit counter
    while (!queue_i.empty() && DC_i >= queue_i.head().length) {  // Head packet fits within remaining credit
        packet = dequeue(queue_i);
        transmit(packet);
        DC_i -= packet.length;
    }
    if (queue_i.empty()) DC_i = 0;         // Reset if queue empties
    // Move to next queue; remaining DC_i carries over if positive
}
[1]
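A runnable Python rendering of the same loop, including the active-list handling described earlier, might look as follows (a sketch under the assumption that all packets are enqueued up front; the class and function names are illustrative):

from collections import deque

class Flow:
    def __init__(self, quantum, packets=()):
        self.quantum = quantum          # Q_i in bytes
        self.deficit = 0                # DC_i, carried between rounds
        self.queue = deque(packets)     # packet lengths in bytes

def drr_drain(flows):
    # Serve all flows in DRR order until every queue is empty,
    # yielding (flow_index, packet_length) in transmission order.
    active = deque(i for i, f in enumerate(flows) if f.queue)
    while active:
        i = active.popleft()            # next active queue, round-robin
        flow = flows[i]
        flow.deficit += flow.quantum    # add quantum to deficit counter
        while flow.queue and flow.queue[0] <= flow.deficit:
            length = flow.queue.popleft()
            flow.deficit -= length      # charge the transmitted bytes
            yield i, length
        if flow.queue:
            active.append(i)            # still backlogged: credit carries over
        else:
            flow.deficit = 0            # emptied: reset and leave active list

# Example: equal quanta, one flow with large packets and one with small ones.
flows = [Flow(1500, [1500, 1500, 1500]), Flow(1500, [500] * 9)]
for i, length in drr_drain(flows):
    print(f"flow {i}: {length} bytes")

Over this drain, both flows transmit 4,500 bytes despite their very different packet sizes, illustrating the byte-level fairness the deficit counter provides.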
Fairness Properties
Deficit Round Robin (DRR) achieves proportional fairness by allocating bandwidth to flows in proportion to their assigned quantum sizes Q_i, thereby approximating the max-min fairness provided by idealized Generalized Processor Sharing (GPS).[2] Under heavy traffic conditions, DRR ensures that the fairness index for any flow i equals 1, meaning the allocation closely matches the ideal share without undue favoritism toward any flow.[2] This property holds because DRR serves packets from each non-empty queue in a cyclic manner, adjusting service based on the quantum to prevent any flow from monopolizing resources.
A key worst-case fairness bound in DRR states that no backlogged flow receives less than its entitled share minus a small error term due to packet granularity, specifically bounded by the maximum packet size \max(L).[2] Over K rounds, the difference between the actual service \mathrm{sent}_{i,K} delivered to flow i and its ideal allocation K Q_i satisfies |\mathrm{sent}_{i,K} - K Q_i| \leq \max(L).[2] This bound arises from the mechanism's handling of variable-length packets, where the deficit counter carries over unused portions of the quantum to subsequent rounds, ensuring that long packets in one round do not unfairly diminish service in the next.[2] As a result, the deficit counter remains between 0 and \max(L), preventing cumulative unfairness.
The proof of these service bounds relies on analyzing the service provided to a flow within a single round. In a given round k, the total bytes served to flow i satisfy Q_i - \max(L) \leq \mathrm{bytes}_{i,k} + \mathrm{deficit}_{i,k} \leq Q_i + \max(L), where \mathrm{deficit}_{i,k} is the updated deficit counter.[2] This establishes that the deviation from the ideal GPS service is tightly controlled by packet size, independent of the number of flows or rounds.
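The K-round bound then follows by telescoping (assuming flow i remains backlogged throughout): in each round the bytes served equal the quantum plus the drop in the deficit counter, \mathrm{bytes}_{i,k} = Q_i + \mathrm{deficit}_{i,k-1} - \mathrm{deficit}_{i,k}, so summing over rounds 1 through K gives \mathrm{sent}_{i,K} = K Q_i + \mathrm{deficit}_{i,0} - \mathrm{deficit}_{i,K} = K Q_i - \mathrm{deficit}_{i,K}, since \mathrm{deficit}_{i,0} = 0. Because 0 \leq \mathrm{deficit}_{i,K} \leq \max(L), the deviation |\mathrm{sent}_{i,K} - K Q_i| can never exceed \max(L).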
Empirical simulations in the original DRR study demonstrate near-ideal fairness under bursty traffic conditions, with maximum throughput deviations of less than 0.3% for constant packet sizes, 0.4699% for random packet sizes, and 0.32% for bimodal packet sizes, all under Poisson arrivals, confirming robust performance across diverse input distributions.[2]
Complexity Measures
The time complexity of Deficit Round Robin (DRR) is O(1) per packet, provided each quantum Q_i is at least the maximum packet size \max(L): under that condition, every visit to a non-empty queue transmits at least one packet, so the scheduling work amortizes to constant time per packet. This efficiency stems from simple increment and decrement operations on the deficit counter and quantum values, without requiring complex timestamp calculations.
Space complexity for DRR is O(N), where N is the number of flows, primarily due to storing per-flow queues, deficit counters, and quanta; notably, no additional state is maintained per packet, keeping memory usage linear and modest. For instance, simulations demonstrate that DRR can manage 10,000 flows hashed into 1,000 buckets with an average of 0.1 probes per access, confirming its scalability to thousands of flows without logarithmic overhead.[2]
Compared to Weighted Fair Queuing (WFQ), which incurs O(log N) time per packet due to priority queue operations, DRR avoids such factors, enabling efficient handling of large numbers of flows. The overhead is minimal, involving only a few CPU instructions beyond First-Come-First-Served (FCFS) scheduling, making DRR suitable for high-speed links where low per-packet processing is essential. In simulations with N=1000 flows, DRR demonstrates effective packet processing while maintaining near-perfect fairness, outperforming Stochastic Fair Queuing (SFQ) in deterministic bandwidth allocation.
Latency Characteristics
Deficit Round Robin (DRR) provides bounded worst-case delays for packets, though these bounds are looser than those of ideal schedulers like Weighted Fair Queuing (WFQ). For a packet in flow i, the worst-case delay is upper-bounded by \frac{N \cdot \max(Q_j)}{R} + \frac{\max(L)}{R}, where N is the number of flows, Q_j is the quantum size for flow j, R is the link transmission rate, and L is the packet length; this accounts for the time to complete a full round of service for other flows plus the packet's own transmission time.[1]
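For a concrete sense of scale (with illustrative numbers only): with N = 10 flows, \max(Q_j) = 1500 bytes, \max(L) = 1500 bytes, and a link rate R = 100 Mbit/s, the bound evaluates to \frac{10 \times 1500 \times 8}{10^8} + \frac{1500 \times 8}{10^8} = 1.2 ms + 0.12 ms, or roughly 1.32 ms of worst-case delay.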
This latency bound reflects a key trade-off in DRR: while WFQ achieves an O(1) delay relative to the ideal Generalized Processor Sharing (GPS) due to its per-packet finish-time calculations, DRR's round-robin scanning introduces an O(N) dependency in practice, leading to higher per-packet waits under contention.[1] Network calculus provides a formal characterization of DRR's service behavior, with a strict service curve given by \beta(t) = \frac{Q_i}{P} t - B, where P is the period defined as the sum of all quanta \sum Q_j, and B represents the burst allowance influenced by maximum deficits and packet sizes. This curve enables precise delay computations for flows with token-bucket arrival constraints.[6]
The choice of quantum size Q_i significantly affects latency profiles, particularly in mixed-traffic scenarios. Larger quanta reduce the number of rounds needed for high-volume flows, lowering their average delays, but can increase latency for short flows or small packets: after a large packet exhausts a flow's deficit, even a minimal packet must wait a full round for the next quantum. Optimal tuning balances these effects, often setting Q_i proportional to the allocated bandwidth while ensuring Q_i \geq \max(L) to preserve O(1) complexity.[1]
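As a rough illustration of this trade-off (numbers chosen for exposition): with N = 8 flows, quanta of 15,000 bytes each, and a 1 Gbit/s link, one full round can transmit up to 8 \times 15000 \times 8 = 960{,}000 bits, so a small packet that just misses its flow's turn may wait close to 0.96 ms; halving the quanta roughly halves that wait, at the cost of more rounds for bulk flows.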
The original DRR study notes that under heavy load, DRR's delays are higher than ideal GPS primarily due to the bursty service pattern from round completions, though it maintains competitive performance for throughput fairness.[1]
Variants and Extensions
Weighted and Modified Versions
Weighted Deficit Round Robin (WDRR) extends the basic DRR algorithm by assigning different quanta to queues based on their relative weights, enabling prioritized bandwidth allocation among traffic classes. In WDRR, the quantum for queue i is defined as Q_i = w_i \times Q, where w_i is the weight of queue i (typically normalized such that \sum w_i = 1) and Q is a base quantum size, ensuring that higher-weighted queues receive proportionally more service opportunities while maintaining fairness within classes.[7] This modification allows network operators to implement class-based prioritization, such as allocating more bandwidth to voice or video traffic over best-effort data, without the complexity of timestamp-based schedulers like Weighted Fair Queuing.[8]
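The weighting rule itself is a one-line computation; the following small Python sketch (illustrative, not a vendor implementation) maps class weights to quanta:

def wdrr_quanta(weights, base_quantum):
    # Q_i = w_i * Q, normalizing in case the weights do not sum to 1.
    total = sum(weights)
    return [w / total * base_quantum for w in weights]

# Example: three classes weighted 0.5 / 0.3 / 0.2 with a 10,000-byte base quantum.
print(wdrr_quanta([0.5, 0.3, 0.2], 10000))   # [5000.0, 3000.0, 2000.0]

In practice, each resulting Q_i should also be at least the maximum packet size so that the O(1) service property of the underlying DRR loop is preserved.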
Smoothed WDRR (SWDRR) further refines WDRR by incorporating smoothing mechanisms into the quanta allocation process, specifically to mitigate jitter in flows with variable bit rates, such as multimedia streams. By adjusting the deficit counter application over multiple rounds—often through a shaped or averaged allocation—SWDRR reduces burstiness in packet transmission, providing more consistent latency bounds compared to standard WDRR, particularly under high load conditions with heterogeneous traffic.[9] This smoothing is achieved without increasing per-packet complexity beyond O(1), making it suitable for high-speed routers handling variable-rate applications.
Hybrid approaches integrate DRR with strict priority queuing to support real-time traffic requirements alongside fair sharing for non-critical flows. In such hybrids, high-priority queues (e.g., for low-latency real-time packets) are serviced first in each round, with DRR applied only to the remaining best-effort queues, effectively deferring lower-priority traffic when necessary.[10] This combination preserves the fairness of DRR for elastic flows while ensuring minimal delay for delay-sensitive ones, as demonstrated in packet-switched networks where priority queues are positioned post-DRR for reordering.[11]
Deficit adjustment variants address challenges with idle flows in DRR-based schedulers, where prolonged idleness can lead to accumulated deficits that disadvantage flows upon reactivation. In reset variants, the deficit counter is cleared to zero when a queue empties, preventing carryover penalties for idle periods; conversely, carry-forward variants retain the residual deficit to ensure exact bandwidth recovery after idleness.[12] These options allow fine-tuning for scenarios like bursty traffic, balancing fairness and responsiveness without altering core round-robin traversal.
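In code, the two policies differ only at the moment a queue drains; a minimal sketch (the policy flag is hypothetical, not taken from any particular implementation):

def on_queue_empty(flow, policy="reset"):
    # Reset variant: an idle flow forfeits its residual credit.
    if policy == "reset":
        flow.deficit = 0
    # Carry-forward variant ("carry"): leave flow.deficit untouched so the
    # flow recovers its exact byte entitlement once it becomes active again.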
Cisco's implementation in IOS, known as Modified Deficit Round Robin (MDRR), employs WDRR principles for QoS class scheduling, where weights are assigned to queues (often configured as percentages summing to 100 for relative bandwidth shares) to prioritize traffic classes like expedited forwarding or assured forwarding. This adaptation integrates with broader QoS policies, supporting up to eight queues per port with configurable weights to enforce service level agreements in enterprise and service provider networks.[13]
Recent Advancements
In 2022, researchers provided a refined network calculus analysis for Deficit Round Robin (DRR), introducing virtual delay curves to derive tighter bounds on worst-case delays compared to prior models. This approach enhances the theoretical understanding of DRR's performance guarantees in packet-switched networks by accounting for variable packet lengths more precisely, enabling better predictability in high-load scenarios.[14]
Building on weighted variants for greater adaptability, recent innovations have extended DRR to dynamic environments. In 2025, the Channel- and Queue-aware Deficit Round Robin (CQDRR) was proposed as a variant tailored for hybrid flows in wireless networks, particularly integrated Time-Sensitive Networking (TSN) and 5G setups. CQDRR dynamically adjusts the quantum allocation based on real-time channel state information and queue lengths, improving fairness and throughput for diverse traffic types such as real-time and best-effort flows while mitigating starvation in varying wireless conditions.[15]
Another 2025 advancement, Self-Clocked Round-Robin (SCRR), introduces a parameter-less scheduling mechanism that self-adjusts clock rates to prioritize short, latency-sensitive flows without manual tuning. SCRR reduces CPU overhead by up to 23% over traditional DRR implementations and achieves lower tail latencies for bursty traffic, making it suitable for modern datacenter and edge computing applications where flow characteristics evolve rapidly. Evaluations on physical testbeds demonstrated SCRR's ability to maintain fairness while improving completion times for small flows by adapting the round-robin cycle based on observed service rates.[16]
Also in 2025, a convexity-based optimization framework was developed for tuning DRR parameters in delay-constrained systems, proving the feasible set of quantum and deficit parameters to be convex for two-flow scenarios. This enables efficient solving of optimization problems to meet strict delay bounds, using standard convex solvers to balance fairness and latency without exhaustive search. The framework provides a scalable method for configuring DRR in resource-limited networks, with applications in embedded systems where parameter selection impacts overall system reliability.[17]
These theoretical and algorithmic advancements have found practical relevance in enhanced DRR deployments for 5G and emerging 6G network slicing, addressing fair resource allocation amid surging IoT traffic volumes. For instance, proportional time-deficit round robin extensions have been integrated into slicing architectures to ensure isolated performance for IoT services like e-health monitoring, preventing interference from high-volume data streams while maintaining low-latency guarantees.[18]
Practical Implementations
Software Deployments
The Deficit Round Robin (DRR) scheduling algorithm is prominently implemented in the Linux kernel's Traffic Control (tc) subsystem through the DRR module in the file net/sched/sch_drr.c. This implementation, authored and maintained by Patrick McHardy, was introduced in kernel version 2.6.28 via the commit "pkt_sched: add DRR scheduler" on November 20, 2008. The module operates under the GNU General Public License (GPL) version 2, consistent with Linux kernel licensing, and includes features such as flow classification via tc filters and statistics logging for monitoring queue performance and packet drops.[19]
Configuration of the DRR scheduler is handled through the tc utility in the iproute2 package; the quantum, which determines how many bytes a class may dequeue per round (defaulting to the interface MTU), is set on individual classes rather than on the qdisc itself. For instance, tc qdisc add dev eth0 root handle 1: drr attaches a DRR queue discipline to the eth0 interface, and tc class add dev eth0 parent 1: classid 1:1 drr quantum 10000 creates a class with a quantum of 10,000 bytes, with tc filters steering traffic into classes to enable class-based fair sharing.[19] This setup supports hierarchical queuing disciplines (qdiscs), making DRR suitable for multi-tenant environments such as virtualized servers or containerized applications, where iproute2 tools configure bandwidth allocation among multiple flows or tenants to prevent dominance by any single user.[20]
Hardware and Commercial Uses
Cisco Systems implements a modified form of Deficit Round Robin (MDRR) within its IOS and IOS XR operating systems for Quality of Service (QoS) management, particularly in ASR 9000 series routers used in service provider and enterprise wide area networks (WANs). This adaptation, known as Weighted DRR in some configurations, allocates bandwidth to traffic classes using a deficit counter to ensure fair servicing while supporting strict priority queues for low-latency applications.[21][22]
Juniper Networks incorporates DRR-based scheduling, including Shaped Deficit Weighted Round Robin (SDWRR), in its Junos OS for MX series routers, enabling efficient egress scheduling integrated with Multiprotocol Label Switching (MPLS) for traffic engineering in service provider environments. This implementation uses MDRR on enhanced queuing dense port concentrators (DPCs) to guarantee committed information rates (CIR) at interface sets, supporting up to four levels of strict priority alongside round-robin servicing for multiple queues.[23][24]
In data center environments, commercial switches such as those running Arista EOS employ prioritized variants of weighted round-robin scheduling, akin to DRR, to manage east-west traffic fairness by mixing strict priority and weighted queues on egress interfaces, ensuring balanced resource allocation for high-volume inter-server communications.[25] DRR principles are also applied in Internet Service Providers (ISPs) for bandwidth shaping, where a DRR-based traffic control scheme guarantees quality of service for conformant packets across varying time scales using token bucket meters.[26] In 5G core networks, DRR variants facilitate network slice isolation by allocating airtime resources to different service slices, as seen in Wi-Fi RAN integrations that support proportional time-based DRR for multi-tenant resource partitioning.[27]
A practical deployment example involves Linux-based routers like VyOS, used by small and medium-sized businesses (SMBs) for traffic policy enforcement: round-robin schedulers derived from DRR support classful queuing with up to 4,096 classes, enabling VoIP prioritization and low-latency performance for real-time communications.[28]