
Queuing delay

Queuing delay refers to the time a data packet spends waiting in a queue or buffer within a device, such as a router, before it is processed and transmitted onto the next link. This delay is a critical component of end-to-end latency and occurs when the arrival rate of packets exceeds the service rate of the device, leading to temporary buffering. In computer networks, queuing delay is analyzed using queuing theory, which models packet arrivals as stochastic processes and examines buffer behavior under various traffic conditions. For instance, in a single-server queue like the M/M/1 model, the average queuing delay is W_q = \frac{\rho}{\mu(1 - \rho)}, where \rho = \lambda / \mu is the traffic intensity, \lambda is the average packet arrival rate, and \mu is the service rate. Factors influencing queuing delay include traffic intensity, buffer size, and scheduling disciplines such as first-come-first-served (FCFS), which can exacerbate delays during bursts of high-volume traffic. Queuing delay contributes significantly to overall packet delay alongside processing, transmission, and propagation delays, and it is particularly pronounced in scenarios with variable bit rates or overloaded links. In practice, it can be mitigated through congestion control mechanisms like those in TCP, which adjust sending rates to reduce queue buildup, or by dimensioning buffers appropriately to balance delay and throughput. Understanding and predicting queuing delay is essential for network design, performance optimization, and ensuring quality of service in applications ranging from web browsing to video streaming.

Fundamentals

Definition

Queuing delay refers to the time a data packet spends waiting in a queue at a network device, such as a router or switch, before it can be processed and transmitted onto the next link. This waiting occurs in the device's buffer when incoming packets arrive faster than the device can service them, leading to temporary storage until transmission capacity becomes available. Unlike fixed components of network delay, such as propagation delay (which is determined by the physical distance between nodes and the speed of signal propagation in the medium), queuing delay is inherently variable and directly influenced by the degree of congestion in the network. Under low traffic conditions, it may be negligible, but during periods of high utilization, it can dominate the overall latency experienced by packets. The concept of queuing delay emerged in the foundational work on packet-switched networks during the late 1960s and early 1970s, particularly with the development of the ARPANET, where variable delays were first rigorously modeled using queuing theory to analyze message flow and network performance. Leonard Kleinrock's pioneering application of queuing theory to communication networks, starting with his 1961 proposal and 1964 book, provided the mathematical framework for understanding these delays in store-and-forward systems like the ARPANET. For instance, in a router handling bursty traffic, an arriving packet may join a queue of dozens or hundreds of others, waiting milliseconds or even longer if the output link is saturated, thereby significantly increasing the end-to-end delay for applications sensitive to latency, such as video streaming.

Components of Network Delay

In packet-switched networks, the total nodal delay for a packet is composed of four primary components: processing delay, queuing delay, transmission delay, and propagation delay. Processing delay refers to the time taken by a router or switch to analyze the packet header, perform lookups, and determine the next hop. Queuing delay is the duration a packet waits in the output buffer before transmission begins. Transmission delay is the time required to push all bits of the packet onto the physical link, calculated as the packet length L divided by the transmission rate R, or d_{trans} = \frac{L}{R}. Propagation delay is the time for the signal to traverse the physical medium from source to destination, given by the distance d divided by the propagation speed s, or d_{prop} = \frac{d}{s}. These components contribute to the nodal delay at each router, with processing and queuing delays occurring internally, followed by transmission and propagation on the outgoing link. Transmission and propagation delays are generally deterministic and fixed for a given packet size, link capacity, and physical path, allowing predictable planning in network design. In contrast, queuing delay is highly variable, depending on instantaneous traffic load and buffer occupancy, making it the dominant source of unpredictability in overall delay. This variability in queuing delay introduces jitter, defined as the variation in packet arrival times, which disrupts the timing-sensitive nature of real-time applications like Voice over IP (VoIP). In VoIP, even small jitter values (e.g., exceeding 30 ms) can degrade audio quality by causing out-of-order playback or gaps in conversation flow. Queuing delay interacts with processing delay within the router's forwarding pipeline, where packets undergo header inspection before entering the queue; if output buffers fill to capacity, incoming packets may be discarded post-processing, indirectly increasing retransmission loads and router overhead under sustained congestion.
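The fixed components can be computed directly; a minimal sketch with assumed illustrative values (1,500-byte packet, 10 Mbps link, 1,000 km fibre path; the queuing term is a placeholder, since it varies with load):

```python
# Nodal delay = processing + queuing + transmission + propagation.
# All values below are illustrative assumptions, not measurements.

L = 1500 * 8          # packet length in bits (1,500-byte packet)
R = 10e6              # link transmission rate in bits/s (10 Mbps)
d = 1_000_000         # path length in metres (1,000 km)
s = 2e8               # propagation speed in fibre, roughly 2/3 c (m/s)

d_proc = 20e-6        # assumed processing delay: 20 microseconds
d_queue = 500e-6      # assumed queuing delay: 500 microseconds (variable)
d_trans = L / R       # transmission delay: L / R
d_prop = d / s        # propagation delay: d / s

d_nodal = d_proc + d_queue + d_trans + d_prop
print(f"transmission: {d_trans*1e3:.3f} ms")   # 1.200 ms
print(f"propagation:  {d_prop*1e3:.3f} ms")    # 5.000 ms
print(f"total nodal:  {d_nodal*1e3:.3f} ms")
```

With these numbers propagation dominates, but doubling the offered load could easily make the (here fixed) queuing term the largest contributor.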

Causes and Mechanisms

Packet Arrival Processes

Packet arrival processes fundamentally influence queuing delay in communication networks by determining the rate and variability at which data packets enter queues. A classic model for packet arrivals is the Poisson process, which assumes that packets arrive randomly and independently over time, with inter-arrival times exponentially distributed at a mean rate λ (packets per unit time). This model, foundational to early analyses of packet-switched networks, implies that the probability of k arrivals in a fixed interval t follows a Poisson distribution, leading to moderate queuing delays under light to moderate loads when combined with appropriate service rates. In practice, however, network traffic frequently deviates from this ideal due to burstiness, where packets arrive in clusters rather than smoothly. Burstiness often arises from transport-layer protocols like TCP during its slow-start phase, in which the congestion window increases exponentially, causing a sudden surge of packets that can overwhelm router buffers and induce significant queuing delays or even packet losses. For instance, TCP's self-clocking mechanism, coupled with queueing effects, generates short-scale bursts that propagate through the network, exacerbating delay variability compared to steady-state arrivals. Arrival processes can also be classified as deterministic or stochastic, each impacting queuing delay differently. Deterministic arrivals, exemplified by constant bit rate (CBR) traffic such as circuit-emulated voice streams, feature regular, predictable intervals between packets, resulting in minimal queuing delays as buffers remain underutilized during off-peak patterns. In contrast, variable bit rate (VBR) traffic, common in compressed video or adaptive streaming applications, introduces irregular arrival bursts and lulls, leading to higher average queuing delays due to the accumulation of variable-length packets during peak periods.
Comparisons in multimedia transport show that VBR requires larger buffers to meet the same delay bounds as CBR, highlighting the queuing penalty of variability. Real-world traffic, particularly web traffic, often exhibits self-similar patterns characterized by long-range dependence, where correlation persists across multiple time scales rather than decaying quickly. This fractal-like structure, observed in empirical measurements of Ethernet and wide-area traffic, causes queues to build up more severely than under Poisson assumptions, with average queuing delays increasing by orders of magnitude in self-similar scenarios. Seminal analyses of Ethernet traces revealed Hurst parameters around 0.8–0.9, indicating strong long-range dependence that amplifies tail delays in network buffers. For web traffic specifically, the heavy-tailed file size distributions and user behavior drive this self-similarity, further elevating queuing delays in backbone routers.

Service and Scheduling Disciplines

Service and scheduling disciplines determine how packets waiting in a queue are selected for transmission by a router or switch, directly affecting the queuing delay experienced by each packet. These mechanisms manage the order and timing of packet departures, balancing simplicity, fairness, and efficiency under varying traffic conditions. Common disciplines range from basic non-prioritizing approaches to sophisticated weighted and probabilistic schemes, each with trade-offs in delay predictability and implementation complexity. First-In-First-Out (FIFO) is the simplest service discipline, where packets are enqueued and dequeued in the strict order of their arrival, treating all traffic uniformly without regard to type or importance. This approach minimizes implementation complexity and overhead, making it the default in many legacy and low-end network devices. However, FIFO is susceptible to head-of-line (HOL) blocking, where a single large or delayed packet at the front of the queue prevents subsequent packets, potentially from higher-value flows, from being served, leading to increased variability in queuing delays during bursts. Priority queuing addresses HOL blocking by classifying packets into multiple queues based on assigned priorities, serving higher-priority queues exhaustively before lower ones, often using a preemptive or non-preemptive strategy. For instance, control packets like routing updates may be assigned high priority to ensure low delay for control traffic, while bulk data receives lower priority.
In a non-preemptive priority system modeled as M/G/1, the mean queuing delay for packets in priority class i (with classes ordered from highest to lowest priority) is given by D_i = \frac{R}{(1 - \sum_{j=1}^{i-1} \rho_j)(1 - \sum_{j=1}^i \rho_j)}, where R = \frac{1}{2} \sum_{k=1}^K \lambda_k E[S_k^2] is the mean residual service time, \rho_j = \lambda_j E[S_j] is the utilization due to class j (with \lambda_j the arrival rate and E[S_j] the mean service time for class j), and \rho = \sum_{k=1}^K \rho_k < 1 is the total system utilization across all K classes; this formula captures the delay imposed by higher-priority traffic and residual service on class i. This discipline reduces delay for critical traffic but can starve lower-priority flows if high-priority loads are sustained, necessitating careful priority assignment to avoid unfairness. Weighted Fair Queuing (WFQ) extends fair queuing principles to provide proportional bandwidth allocation among flows or classes, approximating the idealized Generalized Processor Sharing (GPS) scheduler in packet networks. Each flow is assigned a weight reflecting its share of the link capacity, and packets are served based on virtual finishing times computed as if served under GPS, ensuring that no flow receives less than its entitled share even under contention. This promotes delay fairness for diverse traffic types, such as allocating more bandwidth to real-time video over email, but introduces computational overhead from maintaining per-flow timestamps and sorting, which can limit scalability in high-speed routers without hardware acceleration. Drop policies complement service disciplines by managing queue overflow, influencing queuing delay through proactive congestion control rather than reactive full-buffer drops. Tail drop, the traditional policy, discards arriving packets only when the buffer is exhausted, often triggering global synchronization in TCP flows due to correlated drops and exacerbating delay bursts from lock-out effects.
In contrast, Random Early Detection (RED) mitigates this by probabilistically dropping packets when the average queue length exceeds a threshold, using the drop probability to signal incipient congestion early and encourage senders to reduce rates, thereby keeping queues shorter and delays more stable without biasing specific flows. RED parameters, such as minimum and maximum thresholds, are tuned based on link speed and traffic mix to balance under- and over-dropping.
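The non-preemptive priority formula can be evaluated numerically; a sketch assuming three classes with exponential service times (so E[S_k^2] = 2E[S_k]^2; all rates are illustrative):

```python
# Mean queuing delay per class in a non-preemptive M/G/1 priority queue:
#   D_i = R / ((1 - sum_{j<i} rho_j) * (1 - sum_{j<=i} rho_j)),
#   R   = 0.5 * sum_k lambda_k * E[S_k^2].
# Illustrative parameters: 3 classes, class 1 = highest priority.

lams = [0.2, 0.3, 0.3]              # arrival rates per class
mean_s = [0.5, 0.5, 0.5]            # mean service times E[S_k]
es2 = [2 * m**2 for m in mean_s]    # E[S^2] for exponential service

rhos = [l * m for l, m in zip(lams, mean_s)]
assert sum(rhos) < 1, "system must be stable"

R = 0.5 * sum(l * s2 for l, s2 in zip(lams, es2))

delays = []
cum = 0.0
for rho in rhos:
    delays.append(R / ((1 - cum) * (1 - cum - rho)))
    cum += rho

for i, d in enumerate(delays, 1):
    print(f"class {i}: mean queuing delay = {d:.3f}")
```

The output shows delays increasing monotonically from the highest to the lowest class, which is exactly the starvation risk the text warns about when high-priority load is sustained.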

Modeling and Analysis

Basic Queueing Models

Basic queueing models provide foundational abstractions for analyzing queuing delay in communication networks, where packets arrive, wait if necessary, and are served by a transmission link modeled as a server. These models abstract arrival processes—often Poisson for random packet arrivals—and service times, typically exponential or deterministic, to predict delay behavior under steady-state conditions. A standard classification system, known as Kendall's notation, describes queueing systems compactly as A/B/c/K/N/D, where A denotes the arrival process distribution (e.g., M for Markovian or Poisson), B the service time distribution (e.g., M for exponential, D for deterministic), c the number of servers, K the system capacity (infinite if omitted), N the population size (infinite if omitted), and D the queue discipline (e.g., FCFS for first-come-first-served, assumed if omitted). The M/M/1 model represents a single-server queue with Poisson arrivals at rate \lambda and exponential service times at rate \mu, assuming an infinite buffer and FCFS discipline, making it a Markovian process suitable for modeling variable-rate network links like those affected by random traffic bursts. Key performance metrics include the utilization factor \rho = \lambda / \mu, which must be less than 1 for stability, indicating the fraction of time the server is busy; average queue length L_q = \rho^2 / (1 - \rho); and average waiting time in queue W_q = \rho / (\mu (1 - \rho)), derived from the balance between arrival and service rates. This model captures the essence of queuing delay as the time packets spend waiting due to contention, though it idealizes networks by assuming memoryless inter-arrival and service times. Basic queueing models like M/M/1 rely on assumptions such as infinite buffer capacity, which simplifies analysis but overlooks real-world packet drops from overflow, and independent arrivals and services, ignoring correlations in network traffic patterns like burstiness from upper-layer protocols.
These limitations mean the models provide upper bounds on delay in infinite-buffer scenarios but underpredict losses in finite-buffer systems, necessitating extensions for practical network design; for instance, they do not account for priority scheduling or multi-class traffic without additional modifications. An important extension is the M/D/1 model, which retains Poisson arrivals but assumes deterministic (constant) service times, relevant for fixed-rate links in networks like circuit-switched or constant-bit-rate channels where transmission delays are predictable. In this model, the variance in service time is zero, leading to lower queuing delays than M/M/1 for the same utilization—specifically, average waiting time W_q = \rho / (2\mu (1 - \rho))—highlighting how service regularity reduces contention buildup. This abstraction aids in evaluating delay in scenarios with uniform packet sizes and steady transmission speeds, bridging idealized theory to deterministic network elements.
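The closed-form waiting times can be compared directly; a small sketch (rates are illustrative) contrasting M/M/1 and M/D/1 at the same utilization:

```python
# Average queuing delay (waiting time in queue, excluding service):
#   M/M/1: W_q = rho / (mu * (1 - rho))
#   M/D/1: W_q = rho / (2 * mu * (1 - rho))   -- exactly half of M/M/1

def wq_mm1(lam, mu):
    rho = lam / mu
    assert rho < 1, "queue is unstable for rho >= 1"
    return rho / (mu * (1 - rho))

def wq_md1(lam, mu):
    rho = lam / mu
    assert rho < 1, "queue is unstable for rho >= 1"
    return rho / (2 * mu * (1 - rho))

lam, mu = 800.0, 1000.0   # packets/s, illustrative; rho = 0.8
print(f"M/M/1 W_q = {wq_mm1(lam, mu)*1e3:.2f} ms")   # 4.00 ms
print(f"M/D/1 W_q = {wq_md1(lam, mu)*1e3:.2f} ms")   # 2.00 ms
```

Halving the delay by removing service-time variance at identical load is the "service regularity" effect the text describes.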

Mathematical Formulation

Little's Law provides a fundamental relationship in queueing systems, stating that the long-run average number of customers in the system L equals the arrival rate \lambda times the average time a customer spends in the system W, expressed as L = \lambda W. This law holds under general conditions for stable systems with stationary and ergodic processes, independent of the specific arrival or service distributions. Applying Little's Law to the queue specifically (excluding service time), the average queue length L_q relates to the average queuing delay W_q by L_q = \lambda W_q, allowing computation of W_q = L_q / \lambda once L_q is known from a model. For the M/M/1 queue, where arrivals follow a Poisson process with rate \lambda and service times are exponential with rate \mu > \lambda, the utilization is \rho = \lambda / \mu < 1. The average queuing delay derives from the steady-state queue length L_q = \rho^2 / (1 - \rho), yielding W_q = L_q / \lambda = \rho / (\mu (1 - \rho)) via Little's Law. The waiting time distribution is a mixture: probability 1 - \rho of zero delay, and conditional on waiting, exponential with rate \mu - \lambda. The variance of W_q is \rho (2 - \rho) / [\mu (1 - \rho)]^2, capturing jitter variability useful for performance analysis. Kingman's approximation extends delay estimates to more general G/G/1 queues, where interarrival and service times have arbitrary distributions but finite variance. In heavy traffic (\rho \approx 1), the average queuing delay approximates W_q \approx \frac{c_a^2 + c_s^2}{2} \cdot \frac{\rho}{1 - \rho} \cdot \frac{1}{\mu}, with c_a^2 and c_s^2 as the squared coefficients of variation for arrival and service processes, respectively. This heavy-traffic limit arises from diffusion approximations, where the scaled queue length and workload converge to reflected Brownian motion, providing accurate bounds even for moderate loads.
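Kingman's formula is simple to evaluate; a sketch that also sanity-checks it against the exact M/M/1 result (parameters illustrative):

```python
# Kingman's heavy-traffic approximation for a G/G/1 queue:
#   W_q ~ ((c_a^2 + c_s^2) / 2) * (rho / (1 - rho)) * (1 / mu)

def kingman_wq(lam, mu, ca2, cs2):
    rho = lam / mu
    assert rho < 1, "queue is unstable for rho >= 1"
    return ((ca2 + cs2) / 2) * (rho / (1 - rho)) / mu

lam, mu = 0.9, 1.0

# Sanity check: with c_a^2 = c_s^2 = 1 (Poisson arrivals, exponential
# service) the approximation reproduces the exact M/M/1 W_q.
mm1_exact = (lam / mu) / (mu * (1 - lam / mu))
approx = kingman_wq(lam, mu, 1.0, 1.0)
print(approx, mm1_exact)            # both 9.0 time units at rho = 0.9

# Deterministic service (c_s^2 = 0) halves the prediction (M/D/1).
print(kingman_wq(lam, mu, 1.0, 0.0))
```

The squared coefficients of variation make explicit how both arrival and service variability feed into delay, which the pure M/M/1 formula hides.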
In deterministic networks, such as those analyzed via network calculus, worst-case queuing delay bounds focus on bursty traffic without probabilistic assumptions. For a flow with maximum burst size b (in bits) arriving at a link of rate R, the maximum delay is W_{q,\max} = b / R, assuming no shaping beyond the burst. For packetized systems, if the burst comprises n packets of fixed size L (in bits), this simplifies to W_{q,\max} = (n - 1) \frac{L}{R} for the last packet, accounting for serialization without interference from other flows in isolated analysis. These bounds ensure predictability in time-sensitive applications by enforcing arrival curve constraints.
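These deterministic bounds reduce to simple arithmetic; a sketch with assumed link and burst parameters:

```python
# Worst-case queuing delay bounds for a burst arriving at a link of
# rate R (deterministic analysis, no probabilistic assumptions):
#   fluid burst of b bits:     W_max = b / R
#   n packets of L bits each:  W_max = (n - 1) * L / R  (last packet)

R = 100e6            # link rate: 100 Mbps (illustrative)
b = 1_500_000        # burst size: 1.5 Mbit
n, L = 125, 12_000   # 125 packets of 1,500 bytes (12,000 bits) each

w_fluid = b / R
w_last_packet = (n - 1) * L / R
print(f"fluid bound:       {w_fluid*1e3:.2f} ms")     # 15.00 ms
print(f"last-packet bound: {w_last_packet*1e3:.2f} ms")
```

The packetized bound is slightly tighter than the fluid one because the last packet's own serialization time is counted as transmission rather than queuing delay.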

Impacts and Mitigation

Effects on Network Performance

Queuing delay significantly impacts latency-sensitive applications, such as online gaming, where even modest increases in end-to-end delay can degrade user experience and performance. In first-person shooter games, latencies exceeding 100 ms can lead to a sharp drop in player accuracy (on the order of 35%) due to reduced responsiveness in client-server architectures. The delay variability introduced by queuing during network congestion exacerbates the issue, as packets wait unpredictably at routers, pushing total delays beyond acceptable thresholds for interactive play. Queuing delay also induces jitter, which disrupts real-time UDP-based streams like video conferencing or VoIP, where variable inter-packet delays cause audio/video artifacts and synchronization issues. In UDP traffic, lacking built-in congestion control, queuing variations translate directly into inconsistent playback, often requiring additional buffering that further increases latency. For TCP flows, tail queuing delays, experienced by packets at the end of bursts, can trigger timeouts and retransmissions, reducing effective throughput and prolonging flow completion times on congested links. These effects collectively undermine Quality of Service (QoS) by amplifying delay variability across protocols. In data center environments employing fat-tree topologies, microsecond-scale queuing delays compound across multiple switch hops, leading to scalability challenges in high-throughput applications like distributed storage or machine learning training. For instance, in a standard fat-tree with 100 Gbps switches, maximum queuing delays can reach 12.6 μs per switch under load, accumulating into higher tail latencies that bottleneck microsecond-level RPCs and increase job completion times. This compounding is particularly acute on oversubscribed links, where bursty traffic from virtualized workloads amplifies queue buildup, limiting overall network efficiency.
Within 5G networks, queuing delay constrains ultra-reliable low-latency communication (URLLC) services, such as industrial automation or autonomous vehicles, by consuming critical portions of stringent end-to-end delay budgets. URLLC targets typically allocate 1 ms for user-plane latency, and queuing at base stations and network edges can account for a significant share of that budget in bursty traffic scenarios, jeopardizing reliability targets as strict as 99.999%. In these setups, end-to-end budgets are dynamically partitioned via Packet Delay Budgets (PDBs), but queuing variability can violate sub-millisecond requirements, necessitating careful resource allocation to maintain ultra-low latency.

Buffer Management Strategies

Buffer management strategies aim to optimize buffer sizes and behaviors in network devices to control queuing delay, balancing the need for high throughput against low latency. A foundational approach involves sizing buffers according to the bandwidth-delay product (BDP), defined as B = \text{RTT} \times C, where RTT is the round-trip time and C is the link capacity; this rule ensures full link utilization for a single TCP flow by accommodating packets in flight during congestion windows. For multiple flows, buffers can be reduced to approximately \frac{\text{BDP}}{\sqrt{n}}, where n is the number of flows, as statistical multiplexing allows smaller queues without significant underutilization. However, modern TCP variants like Cubic or BBR permit even smaller buffers, down to 0.25 BDP, while maintaining performance. Active queue management (AQM) techniques enhance buffer control by proactively dropping or marking packets to prevent excessive buildup, targeting low latency rather than just avoiding loss. The Controlled Delay (CoDel) algorithm monitors packet sojourn time (the delay from enqueue to dequeue) and drops packets if this exceeds a target threshold (typically 5 ms) for at least one interval (100 ms), using a control law in which successive drop intervals shrink in proportion to \frac{1}{\sqrt{n}} (n being the number of drops since entering the dropping state), ramping up the drop rate to signal congestion early. This design is parameter-light and adapts to varying link rates, reducing bufferbloat in consumer networks without starving bursts, as drops are avoided if the buffer holds less than one maximum transmission unit (MTU).
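CoDel's drop scheduling can be sketched in a few lines; this simplified model (real implementations also track per-packet sojourn times and exit the dropping state once delay falls below target) shows the \text{interval}/\sqrt{n} spacing:

```python
import math

# Assumed CoDel-style constants (the canonical defaults):
TARGET = 0.005     # 5 ms: dropping begins once sojourn time stays
INTERVAL = 0.100   # above TARGET for a full 100 ms INTERVAL.

def next_drop_times(start, n_drops):
    """Successive drop times after entering the dropping state:
    the k-th gap between drops is INTERVAL / sqrt(k)."""
    t, times = start, []
    for k in range(1, n_drops + 1):
        t += INTERVAL / math.sqrt(k)
        times.append(t)
    return times

times = next_drop_times(0.0, 4)
gaps = [b - a for a, b in zip([0.0] + times, times)]
print(gaps)  # shrinking gaps: 100 ms, ~70.7 ms, ~57.7 ms, 50 ms
```

The shrinking gaps are what make CoDel's congestion signal progressively more insistent while the queue's sojourn time stays above target.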
Similarly, the Proportional Integral controller Enhanced (PIE) AQM estimates queue delay using Little's law (\text{delay} = \frac{\text{queue length}}{\text{dequeue rate}}) and adjusts drop probability periodically with a self-tuning formula: p = p + \alpha (\text{current delay} - \text{target delay}) + \beta (\text{current delay} - \text{previous delay}), where \alpha and \beta scale based on congestion level (e.g., 0.125 Hz and 1.25 Hz when p > 10\%). PIE targets an average delay of 15 ms, allowing burst tolerance while controlling jitter for real-time applications. A practical extension is Flow Queue CoDel (FQ-CoDel), which integrates CoDel with per-flow queuing to ensure fairness and mitigate bufferbloat. By hashing flows into separate queues and applying CoDel independently to each, FQ-CoDel prevents a single flow from monopolizing the buffer, reducing latency for short flows amid long ones; it has been the default AQM in Linux kernels since version 3.11 (2013) and remains widely deployed as of 2025. Explicit Congestion Notification (ECN) complements AQM by marking packets with a congestion experienced (CE) codepoint in the IP header (setting the two ECN bits to '11') instead of dropping them when queues approach thresholds, enabling senders to reduce rates via TCP's congestion window without retransmissions. Endpoints negotiate ECN capability during connection setup, with receivers echoing marks via the ECN-Echo (ECE) flag in acknowledgments and senders confirming via Congestion Window Reduced (CWR); this avoids the delay spikes from loss recovery, particularly in AQM-integrated setups like FQ-CoDel. By signaling congestion proactively, ECN reduces queuing buildup and supports low-latency flows without sacrificing throughput. In quality-of-service (QoS) frameworks, hybrid approaches combine first-in-first-out (FIFO) queuing with traffic shaping and policing to bound delays for aggregated traffic classes.
Shaping delays excess packets into buffers to conform to a traffic profile (e.g., a token bucket), smoothing bursts before FIFO enqueueing, while policing discards or remarks non-conforming packets at ingress to enforce rates, preventing overload in downstream FIFO queues. The Differentiated Services (DiffServ) architecture applies these at network boundaries: classifiers mark packets with DiffServ codepoints, followed by shaping or policing to condition traffic, ensuring per-hop behaviors (PHBs) in core FIFO queues allocate resources predictably and cap delays for priority aggregates. These strategies involve inherent trade-offs, as larger buffers absorb bursts to minimize loss but exacerbate queuing delay, a phenomenon known as bufferbloat, where overprovisioning (e.g., 10x the BDP) can inflate latency by factors of 10 or more and slow TCP's congestion response. In home routers, bufferbloat manifests severely due to asymmetric links, with DSL upstream queues exceeding 600 ms and WiFi adding up to 3 seconds at low rates from oversized fixed buffers (e.g., 256 packets). Core networks, while less prone due to higher speeds, still suffer from "dark buffers" without AQM, leading to hidden delays; thus, strategies like CoDel or ECN are crucial to prioritize latency over mere loss avoidance in edge devices.
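Both shaping and policing rest on a token-bucket conformance check; a minimal policer sketch with illustrative rate and burst values (a shaper would queue non-conformant packets until tokens accrue, rather than dropping them as here):

```python
# Minimal token-bucket policer sketch: packets conforming to rate
# `rate` (bits/s) with burst allowance `burst` (bits) pass; others
# are dropped. All parameter values below are illustrative.

class TokenBucket:
    def __init__(self, rate, burst):
        self.rate = rate          # token refill rate (bits/s)
        self.burst = burst        # bucket depth (bits)
        self.tokens = burst       # bucket starts full
        self.last = 0.0           # time of last update (s)

    def conforms(self, now, pkt_bits):
        # Refill proportionally to elapsed time, capped at bucket depth.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if pkt_bits <= self.tokens:
            self.tokens -= pkt_bits
            return True           # conformant: forward (or enqueue)
        return False              # non-conformant: drop (or remark)

tb = TokenBucket(rate=1e6, burst=24_000)   # 1 Mbps, 3 KB burst depth
# A back-to-back burst of four 1,500-byte (12,000-bit) packets at t=0:
results = [tb.conforms(0.0, 12_000) for _ in range(4)]
print(results)   # [True, True, False, False]: bucket holds two packets
```

The same bucket state drives a shaper: instead of returning False, it would compute how long until enough tokens accrue and hold the packet in a FIFO for that duration, which is precisely where shaping's added queuing delay comes from.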
