
Queueing theory

Queueing theory is the mathematical study of waiting lines, or queues, that form when the demand for a service or resource exceeds immediate supply, employing probabilistic models to predict and optimize system performance measures such as waiting times and queue lengths. Pioneered by Danish engineer and mathematician Agner Krarup Erlang in 1909, queueing theory emerged from efforts to analyze telephone traffic congestion at the Copenhagen Telephone Exchange, where Erlang developed formulas to determine the minimum number of operators needed to handle calls efficiently. His work laid the foundation for stochastic processes in queueing, influencing subsequent developments in operations research during the mid-20th century. A typical queueing system consists of three core components: an arrival process describing how customers or jobs enter the system (often modeled as a Poisson process for random arrivals), a service mechanism specifying how they are processed (e.g., exponentially distributed service times with a constant average rate), and a queue discipline dictating the order of service, such as first-in-first-out (FIFO). Queueing models are classified using Kendall's notation, introduced by David G. Kendall in 1953, which specifies the arrival distribution (A), service distribution (B), number of servers (c), system capacity (K), population size (N), and discipline (D) in the format A/B/c/K/N/D; a common example is the M/M/1 model for a single-server queue with Markovian (Poisson) arrivals and exponential service times. One of the cornerstone results in queueing theory is Little's law, formulated by John D. C. Little in 1961, which states that under steady-state conditions, the average number of items in a queueing system (L) equals the average arrival rate (λ) multiplied by the average time an item spends in the system (W), expressed as L = λW. This law holds for a broad class of queueing systems regardless of specific distributions, provided the system is stable (i.e., arrival rate less than service rate), and it enables quick calculations of performance metrics without solving the full model.
Queueing theory finds wide application in optimizing real-world systems, including telecommunications for call center staffing, computer networks for packet routing and congestion control, healthcare for patient flow in hospitals, and manufacturing for assembly line efficiency. In modern contexts, extensions incorporate non-stationary arrivals and priority queues to handle dynamic workloads, with ongoing research integrating machine learning for adaptive modeling.

Fundamentals

Description

Queueing theory is the mathematical study of waiting lines, or queues, that form when the demand for a service or resource exceeds its immediate availability. It provides frameworks for modeling the dynamics of systems where entities, such as customers, jobs, or packets, arrive, wait if necessary, and are served, enabling the prediction of delays and the optimization of system performance. Central to this field are the arrival processes, which describe how entities enter the system; service times, which indicate the duration required to serve each entity; and queue disciplines, which determine the order of service, such as first-come, first-served. Key components of queueing systems include the arrival rate λ, representing the average number of entities entering per unit time; the service rate μ, the average number processed per unit time per server; the number of servers c; queue capacity, which may be finite or infinite; and population size, distinguishing between infinite sources (like arrivals from a large pool) and finite sources (like a closed system with a limited number of entities). These elements allow analysts to model real-world scenarios, such as checkout lines or packet routing, under varying conditions. Common assumptions underpin most models, including stationarity (constant rates over time), independence of interarrival and service times, and often an infinite queue length to simplify analysis, though finite capacities are considered for bounded systems. Performance in queueing systems is evaluated through metrics such as the average queue length L (number of entities waiting or in service), average waiting time W (time spent in the system), throughput (rate of completed services), and utilization ρ = λ / (c μ) (the proportion of time servers are busy on average, with stability requiring ρ < 1). A foundational insight, Little's law, relates these by stating that under steady-state conditions, the average number of entities in the system equals the arrival rate times the average time spent therein.
These measures help assess efficiency and guide improvements, like adding servers to reduce congestion. The field originated in 1909 with Danish engineer Agner Krarup Erlang's work on telephone traffic congestion at the Copenhagen Telephone Exchange, where he developed formulas to determine the minimum number of operators needed during peak hours. This practical foundation in telecommunications engineering laid the groundwork for broader applications. In modern contexts, queueing theory informs diverse areas, including traffic flow optimization at intersections, patient scheduling in healthcare facilities, network design in computing and telecommunications, and supply chain management in manufacturing.
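The basic metrics above can be computed directly from their definitions. A minimal sketch (the function names are illustrative, not from any standard library), using the utilization formula ρ = λ / (c μ) and Little's law rearranged as W = L / λ:

```python
def utilization(arrival_rate: float, service_rate: float, servers: int = 1) -> float:
    """Average fraction of time servers are busy; stability requires < 1."""
    return arrival_rate / (servers * service_rate)

def avg_time_in_system(avg_number_in_system: float, arrival_rate: float) -> float:
    """Little's law rearranged: W = L / lambda."""
    return avg_number_in_system / arrival_rate

# Example: 8 customers/hour arriving at one server that handles 10/hour.
rho = utilization(8.0, 10.0)        # 0.8 -> stable but heavily loaded
W = avg_time_in_system(4.0, 8.0)    # if L = 4 customers, W = 0.5 hours
```

Such back-of-the-envelope calculations are exactly what Little's law enables without solving a full stochastic model.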

History

Queueing theory emerged in the early 20th century from efforts to model probabilistic events in telecommunications, particularly the random arrival of telephone calls. In 1909, Agner Krarup Erlang, a Danish engineer at the Copenhagen Telephone Exchange, developed the first queueing models to optimize telephone switchboard operations, introducing formulas for traffic intensity and the probability of waiting in systems with multiple servers. These Erlang formulas, derived from birth-death processes, laid the groundwork for analyzing loss systems (the Erlang B formula) and delay systems (the Erlang C formula), both of which remain fundamental in teletraffic engineering. The field's theoretical foundations advanced in the 1920s and 1930s through work on single-server queues. In 1930, Felix Pollaczek derived the Pollaczek–Khinchine formula for the mean waiting time in an M/G/1 queue, which was recast in probabilistic terms by Aleksandr Khinchin in 1932, generalizing Erlang's results to non-exponential service times and establishing key results for queue length distributions. Khinchin's contributions during the 1930s further formalized the analysis of single-server queues under Poisson arrivals, proving stability conditions and asymptotic behaviors. Siméon Denis Poisson's earlier work on the Poisson distribution provided the probabilistic basis for modeling random arrivals, though its direct application to queues came only in the 20th century. Post-World War II, queueing theory experienced rapid growth, driven by operations research and computing needs. In 1953, David G. Kendall introduced his notation system for classifying queueing models, standardizing descriptions like M/M/1 for Markovian arrivals, Markovian service, and a single server. Paul J. Burke's 1956 theorem established that departures from Markovian queues in steady state are Poisson, enabling analysis of queueing networks. James R. Jackson's 1957 work on product-form solutions for open queueing networks extended these ideas, showing that under certain conditions, network performance decomposes into individual queue behaviors.
John Little formulated Little's law in 1961, a fundamental relation linking average queue length, arrival rate, and waiting time, applicable across diverse systems. The 1960s and 1970s saw queueing theory applied to computer systems and networks, with Leonard Kleinrock's pioneering work on packet switching for the ARPANET, detailed in his doctoral research (published in book form in 1964) and the subsequent books "Queueing Systems, Volume 1" (1975) and "Volume 2" (1976), which synthesized delay analysis and optimization techniques. F. P. Kelly's 1979 book "Reversibility and Stochastic Networks" advanced the theory of reversible and quasi-reversible networks, broadening the class of models with product-form solutions. In the late 20th and early 21st centuries, the discipline integrated with simulation methods, stochastic processes, and big data analytics, influencing fields like manufacturing and healthcare, while computational tools enabled modeling of large-scale systems.

Kendall's Notation

Kendall's notation provides a compact symbolic representation for specifying the characteristics of queueing systems, introduced by David G. Kendall in his 1953 paper on stochastic processes in queueing theory. Originally comprising three elements in the form A/B/c, it has become the standard shorthand for classifying single-node queueing models based on key stochastic assumptions. In the basic notation A/B/c, the symbol A denotes the interarrival-time distribution, B indicates the service-time distribution, and c represents the number of parallel servers. Common designations for A and B include M for Markovian (Poisson arrivals or exponential service times), D for deterministic (constant interarrival or service times), G for general independent distributions, and Ek for the Erlang-k distribution (a special case of the gamma distribution with integer shape parameter k). If omitted, c defaults to 1, implying a single-server system. The notation was later extended to the full form A/B/c/K/N/D to accommodate additional features, where K specifies the maximum system capacity (including queue and servers, defaulting to infinite), N denotes the finite population size from which customers are drawn (defaulting to infinite for open systems), and D describes the queueing discipline (defaulting to first-come, first-served, or FCFS). For instance, the M/M/1 queue describes a single-server system with Poisson arrivals and exponential service times under infinite capacity and population, while M/D/2/5 indicates two servers with deterministic service times, Poisson arrivals, and a total system capacity of 5. The G/G/1 queue applies to a single-server system with arbitrary independent arrival and service processes, highlighting the generality beyond parametric assumptions. Extensions to the notation incorporate more sophisticated distributions for modern applications, such as PH for phase-type (a class of distributions representable by finite Markov chains) and MAP for Markovian arrival processes (capturing correlated arrivals via phase transitions).
These allow notations like MAP/PH/c for multi-server queues with phase-based arrivals and services, enabling matrix-analytic solutions for non-exponential cases. Despite its ubiquity, Kendall's notation has limitations in describing complex systems, as it focuses on single nodes and assumes independence unless extended, failing to fully specify priorities, multi-class routing, or network interactions. For queueing networks, specialized frameworks like the BCMP notation supplement it by classifying node types (e.g., FCFS, processor sharing) that admit product-form stationary distributions.
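The notation and its defaults can be captured in a small parser. The following is a minimal sketch (the function name and field names are illustrative assumptions) that splits an A/B/c/K/N/D string and applies the defaults described above — c = 1, infinite K and N, FCFS discipline:

```python
import math

def parse_kendall(spec: str) -> dict:
    """Parse a Kendall's-notation string, filling in standard defaults."""
    fields = spec.split("/")
    names = ["arrival", "service", "servers", "capacity", "population", "discipline"]
    defaults = ["M", "M", 1, math.inf, math.inf, "FCFS"]
    parsed = dict(zip(names, defaults))
    for name, value in zip(names, fields):
        if name == "servers":
            parsed[name] = int(value)
        elif name in ("capacity", "population"):
            parsed[name] = math.inf if value in ("inf", "∞") else int(value)
        else:
            parsed[name] = value
    return parsed

parse_kendall("M/D/2/5")   # deterministic service, two servers, capacity 5
parse_kendall("M/M/1")     # capacity, population, discipline take defaults
```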

Single Queueing Nodes

Birth-Death Processes

Birth-death processes form a foundational class of continuous-time Markov chains used to model the dynamics of single queueing nodes, where the state of the system is defined by the number of customers present, denoted as n = 0, 1, 2, \dots. In this framework, transitions occur exclusively between adjacent states: an increase from state n to n+1 represents a "birth," corresponding to a customer arrival, while a decrease from n to n-1 represents a "death," corresponding to a customer departure or service completion. This structure captures queueing systems without overtaking, such as those operating under first-in-first-out discipline, where the population evolves through incremental changes only. The arrival rate in state n, denoted \lambda_n, governs the birth transitions, while the service (or death) rate \mu_n (with \mu_0 = 0) governs departures for n \geq 1. These rates may depend on the current state n, allowing flexibility in modeling state-dependent behaviors, such as congestion effects on arrivals or multiple servers. For instance, in the classic M/M/1 queue—denoting, in Kendall's notation, a Markovian arrival process, Markovian service times, and a single server—the rates are constant: \lambda_n = \lambda for all n and \mu_n = \mu for n \geq 1. This state-dependent formulation enables the representation of diverse queueing scenarios as birth-death processes. Under the assumption of ergodicity (requiring the traffic intensity \rho < 1 in constant-rate cases), the steady-state probabilities \pi_n satisfy the global balance equations. For n \geq 1, \lambda_{n-1} \pi_{n-1} + \mu_{n+1} \pi_{n+1} = (\lambda_n + \mu_n) \pi_n, and for the boundary state n = 0, \mu_1 \pi_1 = \lambda_0 \pi_0, with the normalization condition \sum_{n=0}^\infty \pi_n = 1. These equations derive from equating the flow into and out of each state, yielding a recursive solution for \pi_n in terms of \pi_0 and the rate ratios.
For constant rates as in the M/M/1 queue, the equations simplify to \lambda \pi_{n-1} + \mu \pi_{n+1} = (\lambda + \mu) \pi_n for n \geq 1. Birth-death processes find wide application in queueing theory, modeling systems like the infinite-server queue (M/M/∞), where \mu_n = n \mu reflects unlimited parallel service without queuing. They also accommodate finite-capacity queues by setting \lambda_n = 0 for n exceeding the capacity K, preventing overflow. Extensions include quasi-birth-death processes, which generalize the model to multi-dimensional state spaces with level-dependent transitions, enabling analysis of more complex systems like layered queues. These processes provide the probabilistic backbone for evaluating queue stability and long-run behavior in single-node settings.
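The recursive solution mentioned above — \pi_n = \pi_0 \prod_{k=1}^n (\lambda_{k-1}/\mu_k), then normalize — is straightforward to compute for a finite-capacity chain. A minimal sketch (the function name and the M/M/1/K example parameters are illustrative assumptions):

```python
def birth_death_stationary(birth, death, max_state):
    """Stationary distribution of a finite birth-death chain.

    birth[n]: rate of transition n -> n+1; death[n]: rate n -> n-1
    (death[0] is unused). Uses pi_n = pi_0 * prod(birth[k-1]/death[k]).
    """
    weights = [1.0]
    for n in range(1, max_state + 1):
        weights.append(weights[-1] * birth[n - 1] / death[n])
    total = sum(weights)
    return [w / total for w in weights]

# M/M/1/K example: constant lambda = 1, mu = 2, capacity K = 10.
K = 10
pi = birth_death_stationary([1.0] * K, [0.0] + [2.0] * K, K)
# For large K, pi[n] approaches the geometric form (1 - rho) * rho**n, rho = 0.5.
```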

M/M/1 Queue Analysis

The M/M/1 queue represents a fundamental model in queueing theory, characterizing a single-server system with Poisson-distributed customer arrivals at rate \lambda and exponentially distributed service times at rate \mu, assuming an infinite waiting room and first-come, first-served (FCFS) discipline. This setup captures many real-world scenarios, such as call centers or single-processor computing tasks, where interarrival and service durations exhibit memoryless properties. The model is ergodic and admits a steady-state distribution provided the traffic intensity \rho = \lambda / \mu < 1, ensuring that the arrival rate does not overwhelm the service capacity. As a continuous-time birth-death process with constant birth rate \lambda and death rate \mu, the steady-state behavior of the M/M/1 queue is derived from the balance equations for the number of customers N(t) in the system. For n=0, the equation is \lambda \pi_0 = \mu \pi_1, reflecting the flow from empty to occupied states. For n \geq 1, balancing the flow across the cut between states n-1 and n yields \lambda \pi_{n-1} = \mu \pi_n, which is consistent with the global balance condition \lambda \pi_{n-1} + \mu \pi_{n+1} = (\lambda + \mu) \pi_n. Solving recursively from the n=0 case gives \pi_n = \rho \pi_{n-1} for n \geq 1, with normalization \sum_{n=0}^\infty \pi_n = 1 imposing \pi_0 = 1 - \rho. Thus, the steady-state probability of n customers is \pi_n = (1 - \rho) \rho^n for n = 0,1,2,\dots, following a geometric distribution with parameter 1 - \rho. Key performance measures follow directly from these probabilities. The expected number of customers in the system is L = \sum_{n=0}^\infty n \pi_n = \rho / (1 - \rho), while the expected number in the queue (excluding the one in service) is L_q = \sum_{n=1}^\infty (n-1) \pi_n = \rho^2 / (1 - \rho). By Little's law, the mean response time (sojourn time) is W = L / \lambda = 1 / (\mu - \lambda), and the mean waiting time in queue is W_q = L_q / \lambda = \rho / (\mu - \lambda).
These relations highlight the queue's sensitivity to \rho: as \rho nears 1, L, L_q, W, and W_q diverge, signaling escalating congestion and potential instability even below the threshold. Transient analysis of the M/M/1 queue, addressing time-dependent probabilities P_n(t) = \Pr(N(t) = n), involves solving the Kolmogorov forward equations \frac{d}{dt} P_0(t) = -\lambda P_0(t) + \mu P_1(t) and, for n \geq 1, \frac{d}{dt} P_n(t) = \lambda P_{n-1}(t) + \mu P_{n+1}(t) - (\lambda + \mu) P_n(t), often via probability generating functions or matrix-exponential methods.
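The closed-form measures derived above are easy to package as a calculator. A minimal sketch, assuming the standard M/M/1 formulas from this section (the function name is illustrative):

```python
def mm1_metrics(lam: float, mu: float) -> dict:
    """Steady-state M/M/1 performance measures; requires lam < mu."""
    rho = lam / mu
    if rho >= 1:
        raise ValueError("unstable: requires lam < mu")
    L = rho / (1 - rho)           # mean number in system
    Lq = rho ** 2 / (1 - rho)     # mean number waiting in queue
    W = 1 / (mu - lam)            # mean sojourn time (via Little's law)
    Wq = rho / (mu - lam)         # mean wait before service begins
    return {"rho": rho, "L": L, "Lq": Lq, "W": W, "Wq": Wq}

m = mm1_metrics(lam=4.0, mu=5.0)  # rho = 0.8 -> L = 4, W = 1
```

Note how Little's law ties the outputs together: L = λW holds exactly (4 = 4 × 1 in this example).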

Little's Law and Basic Performance Measures

Little's Law, a fundamental theorem in queueing theory, states that in a stable queueing system, the long-run average number of customers in the system, denoted L, equals the long-run average arrival rate \lambda multiplied by the long-run average time a customer spends in the system, W. This relationship, L = \lambda W, holds under mild conditions, including a stationary system where arrival and service processes are independent of time in the long run. The theorem was originally proved by John Little in 1961 using a sample-path argument that relates the area under the cumulative arrival and departure curves. An extension of the law applies to the queue excluding the server, where the average number in the queue L_q equals the arrival rate \lambda times the average waiting time in the queue W_q, yielding L_q = \lambda W_q. A proof sketch via renewal-reward theory considers the system as a renewal-reward process, where arrivals renew the process, and the "reward" accumulates the number of customers present over time; by the renewal-reward theorem, the long-run average reward per renewal (time in system) times the renewal rate (\lambda) equals the long-run average number of customers. This approach demonstrates the law's generality beyond Markovian assumptions. Key performance measures derived using Little's Law include server utilization \rho, defined as the long-run fraction of time the server is busy, which for a single-server system equals \lambda times the mean service time; throughput, the long-run effective departure rate, which equals \lambda in stable systems without losses; and the probability of delay P(W > 0), the long-run proportion of customers who wait before beginning service. These metrics provide essential insights into system efficiency and congestion.
The law applies to general single-server queues, such as the G/G/1 queue with arbitrary arrival and service time distributions, as long as the system is stable (\lambda less than the service rate) and the stated conditions hold, allowing performance estimation without solving complex distributional equations. In practice, Little's Law enables quick assessment of system performance; for instance, at a checkout counter, observing an average of 5 customers in line with an arrival rate of 10 per hour implies an average wait of 30 minutes per customer, aiding capacity decisions without full modeling. Limitations of Little's Law include its reliance on long-run averages, making it inapplicable to transient behaviors, and the assumption that all arriving customers enter the system without balking (refusing to join upon arrival) or reneging (leaving before service completion), behaviors that distort the effective \lambda if present. These constraints highlight the need for complementary models in systems with customer impatience.
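The checkout example above reduces to a two-line computation. A minimal sketch using the numbers stated in the text (L = 5 customers, λ = 10 per hour):

```python
L = 5.0        # average number of customers in line
lam = 10.0     # arrival rate, customers per hour

W_hours = L / lam            # Little's law rearranged: W = L / lambda
W_minutes = W_hours * 60.0   # 0.5 hours -> 30 minutes per customer
```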

Service Disciplines

First-Come, First-Served

First-Come, First-Served (FCFS), also known as First-In, First-Out (FIFO), is a fundamental queueing discipline in which customers are processed strictly in the order of their arrival to the queue, prohibiting any overtaking by later arrivals. This approach ensures that the first customer to join the queue is the first to receive service once the server becomes available. FCFS is inherently work-conserving, meaning the server remains active whenever customers are present in the queue, avoiding any idle time due to scheduling decisions. It is also non-preemptive, as ongoing service for a customer is completed without interruption, regardless of subsequent arrivals. In systems with bursty arrival patterns, FCFS can result in clustered or bursty waiting times, where periods of low load alternate with extended delays during high-arrival bursts. The performance characteristics of FCFS vary by queue model. In the M/M/1 queue, the steady-state distribution of the number of customers in the system is insensitive to the specific work-conserving service discipline, including FCFS, due to the memoryless property of exponential service times. However, in the more general M/G/1 queue, while the mean waiting time remains the same across non-preemptive work-conserving disciplines, the variance of waiting time is influenced by the discipline; FCFS minimizes this variance compared to alternatives like last-come, first-served. The mean waiting time in an M/G/1 queue under FCFS is given by the Pollaczek-Khinchine formula: W_q = \frac{\lambda E[S^2]}{2(1 - \rho)} where \lambda is the arrival rate, E[S^2] is the second moment of the service time distribution, and \rho = \lambda E[S] is the traffic intensity (with E[S] as the mean service time). This formula highlights that the mean waiting time depends only on the first two moments of service time, independent of the full service time distribution beyond those moments.
FCFS offers advantages in terms of fairness, as it treats all customers equally based on arrival order without favoring any based on attributes like service requirement, and simplicity, requiring minimal overhead to implement. A key disadvantage is the potential for a single customer with an unusually long service time to delay all subsequent customers, an effect that can propagate through the queue and reduce overall responsiveness in systems with highly variable service times. A notable variant of FCFS is head-of-line (HOL) priority queueing, which extends the discipline to multiple customer classes by maintaining a separate FCFS queue per priority level but allowing higher-priority customers to access the server only when it becomes free, without preempting ongoing service. This preserves the non-preemptive nature of FCFS while introducing class-based selection at service initiation points.
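The Pollaczek-Khinchine formula quoted above takes only three inputs. A minimal sketch (the function name and example parameters are illustrative), which also shows the classic consequence that deterministic service halves the queueing delay relative to exponential service at the same load:

```python
def pollaczek_khinchine_wq(lam: float, mean_s: float, second_moment_s: float) -> float:
    """Mean M/G/1 FCFS queueing delay: Wq = lam * E[S^2] / (2 * (1 - rho))."""
    rho = lam * mean_s
    if rho >= 1:
        raise ValueError("unstable: rho must be < 1")
    return lam * second_moment_s / (2.0 * (1.0 - rho))

# At load rho = 0.8 with E[S] = 1:
# deterministic service has E[S^2] = E[S]^2 = 1,
# exponential service has E[S^2] = 2 * E[S]^2 = 2.
wq_det = pollaczek_khinchine_wq(0.8, 1.0, 1.0)   # 2.0
wq_exp = pollaczek_khinchine_wq(0.8, 1.0, 2.0)   # 4.0
```

The exponential case reproduces the M/M/1 result W_q = \rho / (\mu - \lambda) = 0.8 / 0.2 = 4, as expected.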

Priority and Preemptive Disciplines

Priority disciplines in queueing theory classify service rules where customers are ordered for service based on assigned priorities rather than strict arrival order, enabling preferential treatment for certain classes to optimize overall system performance or meet specific service requirements. These disciplines are broadly categorized into non-preemptive and preemptive types. In the non-preemptive (head-of-the-line) discipline, higher-priority customers are served first only when the server becomes available after completing the current service; an arriving high-priority customer does not interrupt an ongoing low-priority service. This approach avoids discarding work in progress but can delay high-priority customers if a long low-priority service is underway. In contrast, preemptive disciplines allow a higher-priority arrival to interrupt and displace a lower-priority customer's service. The preemptive-resume variant saves the interrupted service's state and resumes it later from the interruption point, while preemptive-restart discards the partial work and starts the service anew. Preemptive resume is widely analyzed for its fairness in preserving work done, though it requires mechanisms to handle interruptions. These disciplines are applied in multi-class queues, where customers belong to distinct classes with arrival rates \lambda_i and service rates \mu_i (or mean service times 1/\mu_i), ordered by decreasing priority (class 1 highest). The utilization for class i is \rho_i = \lambda_i / \mu_i, with total utilization \rho = \sum \rho_i < 1 for stability. A fundamental conservation law states that the weighted sum of mean waiting times, \sum \rho_i W_i, remains constant across work-conserving disciplines, equal to \rho W_0 / (1 - \rho) in M/M/1 systems, where W_0 is the mean residual service time; this law, due to Kleinrock, demonstrates that priorities merely redistribute delays without altering the total workload. Performance under priority disciplines varies significantly by class.
Higher-priority classes achieve waiting times comparable to a dedicated queue ignoring lower classes, benefiting from reduced interference, while lower-priority classes endure substantially longer delays and face potential starvation if the utilization of higher classes approaches 1. In the M/M/1 non-preemptive priority queue with identical service distributions across classes, the mean waiting time for class k is derived via busy-period analysis as W_k = \frac{\sum_i \rho_i W_0}{(1 - \sum_{i < k} \rho_i)(1 - \sum_{i \leq k} \rho_i)}, where W_0 = 1/\mu is the mean residual service time (since exponential service implies memoryless residuals equal to the mean service time); this formula, pioneered by Cobham, highlights how the denominators shrink for smaller k, shortening waits for high-priority classes. For preemptive resume in M/G/1 queues, the formula adjusts to use residuals only from higher and own classes, yielding W_k = R_k / [(1 - \sum_{i < k} \rho_i)(1 - \sum_{i \leq k} \rho_i)], where R_k = \sum_{i=1}^k \lambda_i E[S_i^2]/2, ensuring high-priority classes are unaffected by lower ones. A notable example is the shortest-job-first (SJF) discipline, a dynamic non-preemptive priority scheme assigning higher priority to customers with shorter expected service times. SJF minimizes the overall mean response time among non-preemptive policies, as proven optimal under the conservation law, though it increases waiting time variance, potentially harming fairness. Priority and preemptive disciplines find essential use in real-time systems, such as embedded computing and telecommunications, where critical high-priority tasks require bounded delays to prevent failures.
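The per-class waiting-time formula above can be evaluated class by class. A minimal sketch for the M/G/1 non-preemptive case (the function name is illustrative; classes are listed highest priority first, and R = \sum_i \lambda_i E[S_i^2]/2 is the mean residual work as in the text):

```python
def priority_waits(lams, mean_s, second_moment_s):
    """Mean waiting time per class, non-preemptive M/G/1 priority queue.

    W_k = R / ((1 - sum_{i<k} rho_i) * (1 - sum_{i<=k} rho_i)).
    """
    R = sum(l * s2 / 2.0 for l, s2 in zip(lams, second_moment_s))
    rhos = [l * s for l, s in zip(lams, mean_s)]
    waits, cum = [], 0.0
    for rho_k in rhos:
        waits.append(R / ((1.0 - cum) * (1.0 - cum - rho_k)))
        cum += rho_k
    return waits

# Two exponential classes with mu = 1 (so E[S^2] = 2) and lam = 0.3 each:
w = priority_waits([0.3, 0.3], [1.0, 1.0], [2.0, 2.0])
# w[0] < w[1]: the high-priority class waits less.
```

A useful sanity check is the conservation law: here \rho_1 W_1 + \rho_2 W_2 = 0.9, matching \rho times the FCFS mean wait (0.6 × 1.5) for the same aggregate traffic.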

Queueing Networks

Open and Closed Networks

In queueing theory, open networks model systems where customers enter from external sources and exit after completing service, typically assuming an infinite population of potential arrivals. These networks consist of multiple interconnected service nodes, with customers routed from node i to node j according to fixed probabilities p_{ij}, where \sum_j p_{ij} \leq 1 and the remainder represents departure from the system. External arrivals are often modeled as Poisson processes, and the structure allows for analysis of throughput, delays, and congestion across the entire network. A specific class of open networks, known as Jackson networks, assumes exponential service times at each node and Markovian routing, leading to a product-form stationary distribution under stability conditions. In these networks, the joint steady-state probability of the system state factorizes into the product of marginal distributions for each queue, behaving as if the queues were independent despite interconnections. Stability requires that the traffic intensity \rho_k = \lambda_k / \mu_k < 1 for each node k, where \lambda_k is the effective arrival rate and \mu_k is the service rate; this ensures ergodicity and finite queue lengths in steady state. Closed networks, in contrast, feature a fixed number N of customers circulating indefinitely among a finite set of nodes, with no external arrivals or departures. The system's throughput and performance measures, such as response times, depend directly on N, as larger populations increase contention for resources but also drive higher utilization. These networks are particularly useful for modeling finite-resource systems like computer operating systems or manufacturing lines with recirculating jobs. Due to the bounded customer population, closed networks are always ergodic for finite N, exhibiting a unique steady-state distribution, provided the network is irreducible.
The BCMP theorem extends product-form solutions to closed (and mixed open-closed) networks, accommodating more general service time distributions beyond exponentials, subject to specific service disciplines at each node: first-come-first-served (FCFS) with identical exponential service-time distributions across customer classes, last-come-first-served with preemptive resume (LCFS-PR), processor sharing (PS), or infinite-server (IS) queues. Under these conditions, the steady-state distribution takes a product form, where the joint probability is proportional to the product of individual queue state probabilities, normalized by a constant G(N) computed via successive convolutions over the possible state allocations of the N customers. This normalization ensures the probabilities sum to unity and facilitates computational analysis of performance metrics. Queueing networks build on single-node models by linking multiple queues via probabilistic routing, with node types often specified using extensions of Kendall's notation.
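For open networks, the effective arrival rates \lambda_k that enter the stability condition solve the traffic equations \lambda_j = \gamma_j + \sum_i \lambda_i p_{ij}, where \gamma_j is the external arrival rate at node j. A minimal sketch solving them by fixed-point iteration (the function name and the two-node example are illustrative assumptions; iteration converges when the routing matrix is substochastic with exits):

```python
def effective_rates(gamma, P, iters=1000):
    """Solve lam_j = gamma_j + sum_i lam_i * P[i][j] by fixed-point iteration."""
    lam = list(gamma)
    for _ in range(iters):
        lam = [g + sum(lam[i] * P[i][j] for i in range(len(lam)))
               for j, g in enumerate(gamma)]
    return lam

# Two nodes: external Poisson arrivals only at node 0; half of node-0's
# output feeds node 1; everything leaving node 1 exits the network.
gamma = [1.0, 0.0]
P = [[0.0, 0.5],
     [0.0, 0.0]]
lam = effective_rates(gamma, P)                      # [1.0, 0.5]
mu = [2.0, 1.0]
stable = all(l / m < 1 for l, m in zip(lam, mu))     # rho_k < 1 at every node
```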

Routing and Product-Form Solutions

In queueing networks, routing mechanisms determine how customers move between service nodes after completing service. Probabilistic routing, often modeled with Bernoulli probabilities p_{ij}, specifies that a customer departing node i proceeds to node j with fixed probability p_{ij} (for j \neq 0) or exits the network with probability p_{i0} = 1 - \sum_{j \neq 0} p_{ij}. Tandem networks represent a special case of linear topology where routing is deterministic, with p_{i,i+1} = 1 until the final node, simplifying analysis compared to general topologies that allow arbitrary connections and cycles. Extensions include state-dependent routing, where probabilities p_{ij} vary with the current system state (e.g., queue lengths), and load-dependent routing, where service rates or server allocations adjust based on the number of customers at a node. Product-form solutions provide exact stationary distributions for certain queueing networks, where the joint probability factors into a product of marginal distributions for individual nodes. For an open Jackson network consisting of single-server nodes with exponential service times and external arrivals, the stationary distribution is given by \pi(\mathbf{n}) = \frac{1}{G} \prod_{i=1}^M \pi_i(n_i), where \mathbf{n} = (n_1, \dots, n_M) denotes the vector of queue lengths, G = 1 (as each marginal normalizes independently), and each \pi_i(n_i) = (1 - \rho_i) \rho_i^{n_i} with load \rho_i = \lambda_i / \mu_i < 1 (here, \lambda_i is the effective arrival rate to node i and \mu_i its service rate). This factorization holds under probabilistic routing and Markovian assumptions, implying that queues behave independently in steady state despite interactions. 
For closed networks with a fixed population of K customers and no external arrivals or departures, Gordon and Newell established a similar product form: \pi(\mathbf{n}) = \frac{1}{G(K)} \prod_{i=1}^M \left( \frac{e_i}{\mu_i} \right)^{n_i}, where e_i is the relative visit rate to node i (solved from the traffic equations e_i = \sum_j e_j p_{ji}), \mu_i the service rate, and G(K) normalizes over all states with \sum_i n_i = K. Burke's theorem underpins these results by establishing reversibility in Markovian queues. For an M/M/1 queue in steady state with arrival rate \lambda and service rate \mu > \lambda, the departure process is Poisson with rate \lambda, and the number of customers at time t is independent of the departures prior to t. This property ensures that outputs from one queue form valid Poisson inputs to downstream queues in a network, preserving the product-form structure through time-reversibility. Mean-value analysis (MVA) offers an efficient recursive algorithm to compute performance measures in closed product-form networks without evaluating the full state space. For a single-class closed network with M nodes and N customers, MVA recursively calculates the response time at node k for N customers as R_k(N) = S_k (1 + Q_k(N-1)), where S_k = 1 / \mu_k is the service time at node k, and Q_k(N-1) is the mean queue length at node k with N-1 customers (computed via Little's law: Q_k(n) = X(n) V_k R_k(n), with throughput X(n) = n / \sum_k V_k R_k(n) and visit ratio V_k); marginal queue lengths follow similarly. This approach avoids computing the normalization constant directly and scales well for moderate N, with approximations like the Bard-Schweitzer method used for large N by approximating the queue lengths. For multi-class closed networks with a fixed number of customers per class, MVA extends via the arrival theorem, whereby arriving customers see the time-average queue lengths of a network with one fewer customer of their own class. Extensions broaden product-form applicability. Kelly networks incorporate state-dependent routing and quasi-reversible queues, maintaining the product form under conditions like symmetric routing probabilities.
The BCMP theorem generalizes to multiclass open, closed, or mixed networks with processor-sharing, infinite-server, last-come-first-served preemptive-resume, or first-come-first-served disciplines (the last requiring exponential service times identical across classes), yielding product forms insensitive to service time distributions beyond their means for the PS, IS, and LCFS-PR disciplines. Insensitivity results, notably within the BCMP framework, show that stationary distributions depend only on mean service times for certain disciplines, even with general distributions. Computational algorithms facilitate exact solutions. For closed networks, the convolution algorithm efficiently computes the normalization constant G(K) via recursive partial sums: define g_k(r) as the partial sum for the first k nodes with r customers, then g_k(r) = \sum_{j=0}^r g_{k-1}(r-j) \cdot \left( \frac{e_k}{\mu_k} \right)^j for single-server nodes, with G(K) = g_M(K); the time complexity is O(M K^2). Approximate MVA variants, such as those using load-dependent bounds, extend scalability for large populations or complex topologies.
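The exact MVA recursion described above is compact enough to state in full. A minimal sketch for a single-class closed network (the function name and the two-station example are illustrative assumptions), iterating R_k(n) = S_k (1 + Q_k(n-1)), X(n) = n / \sum_k V_k R_k(n), Q_k(n) = X(n) V_k R_k(n) from n = 1 up to the population size:

```python
def mva(service_times, visit_ratios, n_customers):
    """Exact mean-value analysis for a single-class closed network.

    Returns (throughput, per-node mean queue lengths) at the full population.
    """
    M = len(service_times)
    Q = [0.0] * M          # queue lengths with 0 customers
    X = 0.0
    for n in range(1, n_customers + 1):
        R = [s * (1.0 + q) for s, q in zip(service_times, Q)]        # response times
        X = n / sum(v * r for v, r in zip(visit_ratios, R))          # throughput
        Q = [X * v * r for v, r in zip(visit_ratios, R)]             # Little's law
    return X, Q

# Two balanced stations, unit service demand each, 3 circulating customers:
X, Q = mva([1.0, 1.0], [1.0, 1.0], n_customers=3)
```

Note that the queue lengths always sum to the population size, a built-in consistency check on the recursion.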

Approximation Methods

Mean-Field Approximations

Mean-field approximations provide a powerful framework for analyzing large-scale queueing systems by replacing complex interactions among many components with their average behavior, enabling tractable deterministic models. In queueing theory, this approach is particularly useful when the number of queues or customers, denoted N, grows large, allowing probabilistic limit theorems to simplify the dynamics. The core idea draws from statistical physics, where individual particle interactions are approximated by interaction with a mean field representing the system's average state.

The mean-field limit arises through a law of large numbers, whereby the empirical distribution of the system's state converges to a deterministic limit as N \to \infty. For a system of N interacting queues, the stochastic processes describing queue lengths or occupancies fluctuate around their means, and in the limit the state evolves according to a system of ordinary differential equations (ODEs) that capture the average flow of customers. This limit holds under conditions of weak interaction, where each queue's dynamics depend on the global average rather than on specific pairwise couplings, ensuring the approximation's validity for homogeneous or symmetrically structured networks.

In typical setups, consider N identical queues sharing resources such as bandwidth or servers, with arrival rates \lambda and service rates \mu influenced by the aggregate state. In equilibrium, the mean occupancies \bar{n}_i for queue i satisfy fixed-point equations derived from balancing inflows and outflows, often solved iteratively to yield stationary approximations. These equations replace the full Markov-chain analysis with algebraic or differential constraints, making computation feasible for otherwise intractable models. For instance, in balanced fairness allocations, the fixed points ensure equitable resource distribution across queues, minimizing variance in performance metrics.
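The fixed-point iteration can be illustrated with a toy model (an assumption for illustration, not from the source): N symmetric M/M/1-type queues whose per-queue arrival rate is damped by the mean occupancy m, say \lambda(m) = \lambda_0 / (1 + m).

```python
# Illustrative mean-field fixed point.  Each queue behaves like an M/M/1
# queue with load rho = lambda(m) / mu, whose stationary mean occupancy is
# rho / (1 - rho); the mean-field equilibrium solves m = rho(m) / (1 - rho(m)).

def mean_field_fixed_point(lam0=0.8, mu=1.0, tol=1e-12):
    m = 0.0
    for _ in range(10_000):
        rho = lam0 / ((1.0 + m) * mu)   # effective per-queue load
        m_new = rho / (1.0 - rho)       # stationary M/M/1 mean occupancy
        if abs(m_new - m) < tol:
            break
        m = m_new
    return m

m_star = mean_field_fixed_point()
```

For this particular interaction the fixed point has a closed form, m* = \lambda_0 / \mu (substituting \rho(1+m) = \lambda_0/\mu into m = \rho(1+m)), so the iteration can be checked against m* = 0.8.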
Applications of mean-field approximations span diverse domains, including processor-sharing networks, where multiple jobs share server capacity proportionally to their numbers, and random-access protocols in wireless networks, where nodes contend for channel access. In processor-sharing systems, the approximation computes throughput and response times by modeling the average share per job, crucial for resource allocation in communication infrastructures. Similarly, for slotted ALOHA, mean-field models analyze collision probabilities and stability under random backoff, revealing long-term throughput trends in large populations of nodes.

The theoretical foundation relies on propagation of chaos, which posits that in the mean-field limit the queues become asymptotically independent despite initial correlations, justifying the replacement of joint distributions with products of marginals. Error bounds for these approximations are typically of order O(1/\sqrt{N}), quantifying the deviation between the finite-N stochastic system and its deterministic limit, with tighter bounds available under additional regularity assumptions on the arrival and service processes.

A representative example is the mean-field analysis of closed Jackson networks, where a fixed number of customers circulate among N nodes without external arrivals. Under mean-field scaling, the average occupancy vector \bar{\mathbf{n}}(t) evolves according to the fluid equation \frac{d \bar{\mathbf{n}}}{dt} = \lambda(\bar{\mathbf{n}}) - \mu(\bar{\mathbf{n}}) \bar{\mathbf{n}}, where \lambda(\cdot) and \mu(\cdot) denote routing and service rate functions dependent on the mean state, leading to equilibrium solutions that approximate cycle times and utilizations. These deterministic dynamics capture condensation phenomena, where queue lengths concentrate at certain nodes in unbalanced topologies.
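A scalar version of the fluid equation above can be integrated by forward Euler; taking \lambda and \mu constant is a toy assumption, under which the trajectory relaxes to the equilibrium n* = \lambda/\mu.

```python
# Forward-Euler integration of a scalar mean-field fluid equation
# d n/dt = lam - mu * n, with constant rates (a toy assumption).
# The equilibrium is n* = lam / mu.

def fluid_trajectory(lam=3.0, mu=2.0, n0=0.0, dt=1e-3, T=20.0):
    n = n0
    for _ in range(int(T / dt)):
        n += dt * (lam - mu * n)        # drift toward the fixed point
    return n

n_T = fluid_trajectory()                # should be close to 3.0 / 2.0 = 1.5
```

Starting from an empty system, the occupancy converges exponentially fast (at rate \mu) to the fixed point, mirroring how the ODE limit approximates the stochastic system's mean behavior.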
The primary advantage of mean-field approximations lies in their scalability to massive systems, such as data centers with thousands of servers, where exact analysis is computationally prohibitive; by reducing dimensionality to ODEs or fixed points, they enable performance prediction and optimization in cloud environments. Recent advances as of 2025 include refined mean-field approximations for discrete-time queueing networks with changing population sizes, improving accuracy for dynamic systems.

Diffusion and Fluid Approximations

Fluid approximations model queueing systems by scaling time and space by a factor of 1/\epsilon, where \epsilon is a small parameter representing system size or load, transforming stochastic queues into deterministic fluid flows that capture macroscopic behavior in large-scale limits. In this framework, queue lengths and workloads evolve as continuous, deterministic processes governed by ordinary differential equations, where inflows and outflows balance except at boundaries, which are enforced via reflection mechanisms. The Skorokhod reflection map provides a pathwise solution to these boundary constraints, mapping an unconstrained net-input process to a regulated output that remains non-negative; for a single queue, the workload W(t) satisfies W(t) = X(t) - \mu t + L(t), where X(t) is the cumulative input, \mu is the service rate, and L(t) is the regulator process (accumulating only during idle periods) that ensures non-negativity. This deterministic approximation is particularly useful for analyzing stability and transient dynamics in complex networks, as established in foundational work on fluid limits for queueing networks.

Diffusion approximations extend fluid models by focusing on fluctuations around the deterministic path, especially in heavy-traffic regimes where the traffic intensity \rho approaches 1. For a single-server queue, the scaled queue-length process converges to a reflected Brownian motion (RBM), a one-dimensional diffusion with drift -\mu(1-\rho) and variance incorporating arrival and service variabilities, reflected at zero to maintain non-negativity. In multiserver or network settings, the limit is a multidimensional RBM on the non-negative orthant, with oblique reflections determined by routing and allocation policies, capturing interactions across queues. These limits arise as the number of servers or arrival rates scale to infinity while \rho \to 1, providing Gaussian approximations for performance measures such as waiting times.
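In one dimension the Skorokhod map has an explicit solution, L(t) = max(0, -min_{s \le t} (X(s) - \mu s)), which is easy to apply to a sampled net-input path; the path below is illustrative.

```python
# Discrete one-dimensional Skorokhod reflection map: given a sampled
# net-input path x, the regulator is L(t) = max(0, -min_{s<=t} x(s))
# and the reflected path W = X + L stays nonnegative.

def skorokhod_reflect(x):
    w, l = [], []
    running_min = 0.0
    for xt in x:
        running_min = min(running_min, xt)
        lt = max(0.0, -running_min)     # cumulative pushing at the boundary
        l.append(lt)
        w.append(xt + lt)
    return w, l

# Net input that dips below zero and recovers.
W, L = skorokhod_reflect([1.0, -0.5, -2.0, -1.0, 0.5])
```

Note that L is nondecreasing and increases only while W sits at zero, exactly the complementarity condition that characterizes the regulator.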
Heavy-traffic analysis parameterizes the arrival rate as \lambda_n = n \mu (1 - \beta / \sqrt{n}) for large n, where \beta > 0 controls the approach to saturation, yielding diffusion-scaled processes that converge to Gaussian limits rather than deterministic ones. This scaling centers the process around the fluid limit while magnifying deviations by \sqrt{n}, resulting in Brownian approximations for transient and steady-state behavior. For the G/G/1 queue, Kingman's formula approximates the steady-state waiting time distribution as exponential with rate \frac{2(1-\rho)}{\rho (c_a^2 + c_s^2)} (with time measured in units of the mean service time), where c_a^2 and c_s^2 are the squared coefficients of variation of the interarrival and service times; the approximation is asymptotically exact as \rho \to 1. In networks, similar scalings lead to semimartingale reflected Brownian motions (SRBMs), whose stationary distributions inform delay predictions.

The theoretical foundation rests on functional central limit theorems (FCLTs) and invariance principles, which establish weak convergence of scaled input processes (e.g., arrivals) to Brownian motion in the Skorokhod space of cadlag functions, extended via continuous mappings to output processes such as queue lengths. These principles, applied to net inputs and regulators, justify the limits for general service disciplines and topologies, ensuring that the approximations hold under mild moment conditions on the primitives. Fluid scaling addresses mean paths and large deviations, while diffusion scaling captures central-limit-scale fluctuations, the former serving as a zeroth-order approximation and the latter as a refinement around it.

Applications include bandwidth-sharing protocols in communication networks, where fluid limits model fair allocation under proportional scheduling, revealing stability regions and convergence to equilibria. In inventory management, diffusion approximations assess stock levels under heavy demand, approximating reorder policies via reflected processes to minimize holding costs.
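The exponential waiting-time approximation above is equivalent to Kingman's mean-wait formula, which can be packaged as a small helper; the mean service time ES is made explicit here (the rate quoted in the text corresponds to ES = 1).

```python
# Kingman's heavy-traffic approximation for the mean steady-state wait in a
# G/G/1 queue: E[W] ~ (rho / (1 - rho)) * (ca2 + cs2) / 2 * ES, where ca2 and
# cs2 are squared coefficients of variation of interarrival and service times.

def kingman_mean_wait(rho, ca2, cs2, ES=1.0):
    if not 0 < rho < 1:
        raise ValueError("requires a stable queue, 0 < rho < 1")
    return (rho / (1.0 - rho)) * (ca2 + cs2) / 2.0 * ES

# Sanity check: with ca2 = cs2 = 1 (M/M/1) the formula is exact,
# recovering rho / (1 - rho) service times of waiting.
w = kingman_mean_wait(rho=0.9, ca2=1.0, cs2=1.0)
```

At \rho = 0.9 the M/M/1 case gives a mean wait of nine service times, illustrating how delay blows up as \rho \to 1 at rate 1/(1-\rho).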
Heavy-traffic limits also prove asymptotic optimality of policies such as maximum pressure or the cμ rule, where scheduling decisions minimize the diffusion's workload drift, achieving near-optimal performance as \rho \to 1. Recent developments as of November 2025 include diffusion approximations for loss queueing systems with multiple sources and error bounds for general clock models, enhancing precision in heavy-traffic analyses.

Applications

Telecommunications and Networks

Queueing theory has been foundational in telecommunications for modeling call traffic in circuit-switched networks, where resources are allocated for the duration of a call. In loss systems, such as the M/M/c/c model, arriving calls are blocked if all c circuits are occupied, with no waiting allowed. The Erlang B formula calculates the blocking probability as a function of offered load and the number of circuits, providing a key metric for dimensioning trunk groups to minimize call blocking. The Erlang C formula addresses delay systems with c servers in which blocked calls may wait in queue, estimating the probability that a call must wait, which informs staffing and trunk provisioning in switches.

In packet-switched networks, queueing models address buffer management and congestion control. The M/M/1 queue serves as a basic model for router buffers, where packets arrive according to a Poisson process and experience exponential service times, yielding average queue length and waiting time in closed form (the Pollaczek-Khinchine formula covers the more general M/G/1 case). For network-wide performance, Kleinrock's independence approximation assumes that arrival processes at successive queues are independent and Poisson, enabling end-to-end delay to be estimated as the sum of individual queue delays despite correlations in tandem systems. This approximation has been widely used to predict average packet delay in interconnected networks, facilitating capacity allocation and routing decisions.

Modern applications leverage advanced queueing models for wireless and IP-based services. In random-access networks, the ALOHA protocol, particularly slotted ALOHA, can be analyzed as an M/G/1 queue with retransmissions, where collisions lead to exceptional service for the first packet in a busy period, modeling throughput and delay under contention. For voice over IP (VoIP), the G.711 codec, which uses pulse-code modulation without compression, is modeled with queueing theory to assess packetization delays and jitter, ensuring low-latency transmission over UDP by treating voice packets as a fluid flow in M/M/1 or priority queues.
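The Erlang B blocking probability mentioned above is usually computed with the numerically stable recursion B(0, a) = 1, B(c, a) = a B(c-1, a) / (c + a B(c-1, a)); the traffic values below are illustrative.

```python
# Erlang B blocking probability for an M/M/c/c loss system, where
# a = lambda / mu is the offered load in erlangs and c the circuit count.

def erlang_b(c, a):
    b = 1.0                             # B(0, a) = 1
    for k in range(1, c + 1):
        b = a * b / (k + a * b)         # stable one-step recursion
    return b

# 10 erlangs of offered traffic on a trunk group of 15 circuits.
p_block = erlang_b(15, 10.0)            # roughly a 3-4% blocking probability
```

Dimensioning then amounts to increasing c until p_block falls below a grade-of-service target such as 1%.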
In 5G networks, network slicing employs priority queues to isolate services such as ultra-reliable low-latency communication (uRLLC), enhanced mobile broadband (eMBB), and massive machine-type communications (mMTC), with queueing models assigning higher priority to uRLLC to meet stringent delay bounds. Performance evaluation in such networks often focuses on end-to-end delays in tandem queues, where packets traverse multiple nodes and the total delay approximates the sum of per-node delays under independence assumptions, though correlations can inflate variances. Quality of service (QoS) mechanisms use priority queues to differentiate traffic classes, for example assigning strict priority to voice flows over best-effort data, reducing latency for delay-sensitive applications while maintaining fairness via weighted scheduling.

Practical examples illustrate these models' utility. Call centers, integral to telecommunications support, are modeled as M/M/c queues with abandonment, where impatient customers renege after an exponentially distributed patience time, allowing computation of service levels and abandonment rates to optimize agent staffing. In router buffers, tail-drop policies discard packets only when buffers overflow, leading to global synchronization and unfairness among TCP flows, whereas active queue management (AQM) techniques like random early detection probabilistically drop packets to signal congestion early, stabilizing queues and improving throughput for bursty traffic.

Challenges in applying queueing theory to telecommunications include handling non-stationary traffic, where arrival rates vary over time, requiring time-dependent models or fluid approximations to capture diurnal patterns in call volumes. Burstiness, often modeled via self-similar processes with long-range dependence, arises from aggregated sources such as web traffic, exhibiting fractal-like variability across time scales that amplifies queue lengths beyond Poisson predictions and necessitates heavy-tailed distributions for accurate delay forecasting.
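The call-center staffing problem above reduces, in the basic M/M/c case without abandonment, to the Erlang C delay probability; a minimal sketch (the load and service-level target are illustrative) searches for the smallest agent count meeting the target.

```python
# Erlang C (probability an arriving call must wait) built from the Erlang B
# recursion, plus a staffing search for the smallest c meeting a target.

def erlang_b(c, a):
    b = 1.0
    for k in range(1, c + 1):
        b = a * b / (k + a * b)
    return b

def erlang_c(c, a):
    if a >= c:
        return 1.0                      # unstable: every call waits
    b = erlang_b(c, a)
    return c * b / (c - a * (1.0 - b))  # delay probability

def agents_needed(a, target):
    c = int(a) + 1                      # stability requires c > a
    while erlang_c(c, a) > target:
        c += 1
    return c

# 10 erlangs of call load, at most 20% of callers may have to wait.
c_star = agents_needed(a=10.0, target=0.2)
```

The search returns 14 agents for this load, a few more than the bare-stability minimum of 11, quantifying the safety capacity that delay targets demand.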

Computing and Operations Research

In computer systems, queueing theory informs CPU scheduling strategies such as shortest job first (SJF), which prioritizes processes with the smallest expected execution time to minimize average waiting time, as demonstrated in foundational analyses of non-preemptive scheduling policies. For disk I/O management, first-come, first-served (FCFS) serves requests in arrival order but can incur high seek times under scattered request patterns, whereas the SCAN (elevator) algorithm sweeps the disk head bidirectionally to serve requests more efficiently, reducing average head movement in multi-queue environments. In cloud computing, virtual machines are often modeled as closed queueing networks with fixed job populations circulating among servers, enabling performance prediction for capacity planning and workload balancing in multi-tenant environments.

Operations research applies queueing models to inventory management through (s-1, s) policies, where an order is placed to replenish to level s whenever the inventory position reaches s-1, optimizing holding costs and stockouts for perishable items under Poisson demand. Supply chains leverage tandem queue models to represent assembly lines as sequential stations, where blocking and buffers mitigate delays, as explored in make-to-stock systems with planned inventories at each stage. Simulation tools such as Arena integrate queueing models to validate designs, for example optimizing checkout lanes by varying cashier counts to reduce average queue lengths and customer wait times in discrete-event simulations.

In healthcare, emergency departments are frequently modeled as M/M/c queues with reneging, where patients abandon the queue after exceeding their patience thresholds, allowing estimation of abandonment rates and staffing needs to improve throughput. Appointment scheduling employs queueing frameworks to balance no-shows and overbooking, minimizing provider idle time while ensuring access in outpatient clinics.
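The SJF advantage is easy to demonstrate on a fixed batch of jobs all present at time 0 (a simplifying assumption; the job sizes are illustrative): serving in increasing-size order minimizes the mean waiting time.

```python
# Non-preemptive single-server comparison: mean waiting time under FCFS
# (arrival order) versus SJF (shortest job first) for a batch at time 0.

def mean_wait(sizes):
    clock, total = 0.0, 0.0
    for s in sizes:
        total += clock      # this job waited until the server reached it
        clock += s          # then occupied the server for its full size
    return total / len(sizes)

jobs = [6.0, 2.0, 8.0, 1.0, 3.0]
fcfs = mean_wait(jobs)                  # serve in arrival order
sjf = mean_wait(sorted(jobs))           # serve shortest first
```

Here FCFS yields a mean wait of 9.4 versus 4.4 for SJF: short jobs no longer queue behind long ones, the same convoy effect that motivates SCAN over FCFS for disk scheduling.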
Financial transaction processing, particularly in high-frequency trading, uses queueing theory to model order books as priority queues, analyzing execution delays and liquidity impacts from rapid order arrivals in limit order systems. Recent advancements incorporate machine learning for queue-length prediction, enhancing traditional models with data-driven forecasts of arrival rates in dynamic environments such as service counters. Reinforcement learning optimizes dynamic control in queueing networks by learning admission and scheduling policies that minimize delays, outperforming static heuristics in simulated multi-server setups. Little's Law underpins service level agreement (SLA) compliance in computing by relating average queue lengths to arrival rates and response times, guiding capacity planning to meet latency targets in data centers and cloud services.
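The SLA use of Little's Law amounts to one multiplication; the traffic numbers below are illustrative.

```python
# Little's law (L = lambda * W) as an SLA sanity check: a measured arrival
# rate and a latency target imply the steady-state number of requests that
# must fit in flight, bounding the concurrency the service must sustain.

arrival_rate = 200.0       # requests per second (measured)
latency_target = 0.05      # seconds (SLA: mean response time <= 50 ms)

in_flight = arrival_rate * latency_target   # L = lambda * W
```

If the service cannot hold 10 concurrent requests without queueing delays growing, the 50 ms target is unreachable at this load, a capacity-planning conclusion drawn without any distributional assumptions.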
