Multilevel feedback queue
A multilevel feedback queue (MLFQ) is a CPU scheduling algorithm employed in operating systems to efficiently manage process execution by dividing the ready queue into multiple priority levels, where processes can dynamically move between queues based on their observed CPU usage and behavior, approximating shortest-job-first scheduling without prior knowledge of process burst times.[1] Originally described by Fernando J. Corbató and colleagues in 1962 as part of the Compatible Time-Sharing System (CTSS), the MLFQ scheduler addresses the challenges of mixed workloads, including interactive and batch jobs, by prioritizing short or I/O-bound processes while ensuring progress for longer-running ones.[1]

Key to its operation are several rules: processes begin in the highest-priority queue and are scheduled using round-robin (RR) within each queue; upon exhausting a queue's time quantum, a process is demoted to the next lower-priority queue, which typically has a larger quantum; and mechanisms such as periodic priority boosts prevent indefinite starvation of low-priority jobs.[1][2] This adaptive feedback mechanism allows MLFQ to favor responsive, short jobs, reducing average response times, while compute-intensive jobs gradually descend to lower queues, balancing throughput and fairness without requiring advance predictions of job lengths.[2][1]

Implementations vary across systems: early versions appeared in Multics and BSD UNIX, while modern examples include Solaris (with up to 60 queues and configurable time slices from 20 ms to hundreds of milliseconds) and elements of Windows NT scheduling.[1][3] Despite its strengths, MLFQ can risk process starvation if boosts are infrequent, though tunable parameters such as maximum wait times mitigate this in practice.[2][3]
Introduction
Definition and Purpose
A multilevel feedback queue (MLFQ) is a class of CPU scheduling algorithms employed in operating systems to manage process execution through a hierarchy of priority-based queues, where processes can dynamically shift between queues according to their runtime behavior. This structure enables the system to observe and adapt to process characteristics, such as CPU burst lengths, without requiring advance knowledge of job durations. Originally developed for time-sharing systems, MLFQ approximates the efficiency of shortest-job-first (SJF) scheduling by initially treating all processes as potentially short-running, thereby promoting fairness and efficiency in multiprogrammed environments.[4][1]

The primary purpose of MLFQ is to enhance user responsiveness by prioritizing short, interactive, or I/O-bound processes, which typically require quick execution slices, while mitigating the risk of long-running CPU-bound processes monopolizing system resources. By demoting processes that exhibit prolonged CPU usage to lower-priority queues, MLFQ prevents system hogging and maintains high overall throughput, ensuring that the majority of jobs complete in a timely manner even under varying workloads. This dynamic adaptation supports balanced performance in interactive computing scenarios, where low response times for foreground tasks are critical alongside efficient batch processing.[1][4]

At its core, MLFQ addresses the shortcomings of static scheduling algorithms, such as round-robin, which apply uniform time quanta to all processes regardless of their I/O or CPU intensity, often leading to suboptimal responsiveness for mixed workloads. Through feedback mechanisms that adjust priorities based on empirical usage patterns, MLFQ incorporates learning to better allocate CPU time, fostering a more equitable and performant system that evolves with process demands.[1]
Historical Development
The multilevel feedback queue (MLFQ) scheduling algorithm originated in 1962 with the work of Fernando J. Corbató and colleagues at MIT, as described in their paper on the Compatible Time-Sharing System (CTSS). This system, running on modified IBM 709 hardware, pioneered time-sharing by allowing multiple users to access the computer interactively through remote terminals, a departure from the era's dominant batch processing paradigms.[5]

A key milestone was the first practical implementation of MLFQ in CTSS, which supported up to 30 simultaneous users by dynamically adjusting process priorities to favor short, interactive tasks over long-running computations.[5] This innovation enabled efficient resource sharing on a single machine, fundamentally shifting computing from sequential job batches to concurrent, user-driven sessions and laying groundwork for modern multitasking environments. For these contributions to time-sharing, Corbató received the 1990 ACM A.M. Turing Award.[6]

The MLFQ built upon earlier multilevel queue concepts from 1950s batch processing systems, where fixed-priority queues managed job streams to optimize throughput.[7] It was further refined in the 1960s within Multics, the successor to CTSS co-developed by MIT, Bell Labs, and General Electric, to accommodate growing interactive workloads in a multi-user utility computing model.[5] While influencing variants in later operating systems like Unix, the algorithm's core structure has endured as a foundational approach without significant modifications since the 1970s.[1]
Fundamentals of Process Scheduling
Basic Concepts in CPU Scheduling
In operating systems, processes execute through a cycle of CPU and I/O bursts, where a CPU burst represents the time a process requires the processor for computation before performing input/output operations or blocking.[8] CPU bursts vary in length; short bursts are typical of I/O-bound processes, which spend most time waiting for I/O and thus interact frequently with the system, while long bursts characterize CPU-bound processes that perform intensive calculations with minimal I/O.[9]

Key performance metrics for evaluating scheduling effectiveness include turnaround time, defined as the interval from process arrival to completion; waiting time, the duration spent in the ready queue excluding actual execution; and response time, the period from arrival until the process first receives CPU attention.[8] These metrics balance goals like maximizing CPU utilization and throughput while minimizing delays, particularly in systems handling diverse process types.[9]

Fundamental scheduling algorithms provide baselines for managing process execution. First-come, first-served (FCFS) allocates the CPU to the longest-waiting ready process in arrival order, offering simplicity and fairness without preemption.[8] Its strengths lie in low overhead and ease of implementation, but it suffers from the convoy effect, where a single long CPU-bound process delays many short I/O-bound ones, leading to poor average waiting times (e.g., 17 ms in example workloads).[8]

Shortest-job-first (SJF) prioritizes processes with the smallest estimated CPU burst, achieving optimality in minimizing average waiting and response times for non-preemptive scenarios assuming accurate burst predictions.[8] While SJF improves throughput over FCFS (e.g., reducing average wait to 7 ms), it risks indefinite starvation for longer jobs if short ones continually arrive and requires prior knowledge of burst times, which is often unavailable.[8]

Round-robin (RR) scheduling assigns fixed time quanta (e.g., 10-100 ms) to each ready process in a circular queue, preempting upon expiration to promote fairness and responsiveness.[8] It excels in time-sharing environments by preventing starvation and supporting interactive workloads, yielding balanced waiting times (e.g., 5.66 ms with appropriate quanta).[8] However, RR incurs context-switching overhead, and poorly chosen quanta can mimic FCFS's convoy effect for CPU-bound processes or degrade throughput via excessive preemptions in mixed settings.[8]

Static algorithms like FCFS, SJF, and RR falter in mixed workloads combining I/O-bound and CPU-bound processes, as fixed ordering or quanta fail to adapt to varying burst patterns, resulting in inefficiencies such as prolonged waits for interactive tasks or underutilized CPU during I/O phases.[10] For instance, FCFS and RR exhibit convoy delays when CPU-intensive jobs block shorter ones, while SJF's reliance on static estimates exacerbates starvation without dynamic adjustments.[11] This underscores the need for adaptive scheduling that observes process behavior over time and dynamically adjusts priorities based on measured bursts, enhancing responsiveness and utilization across heterogeneous demands.[9]
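The waiting-time figures above can be reproduced with a short simulation. The sketch below assumes the classic textbook workload of three jobs with 24 ms, 3 ms, and 3 ms CPU bursts, all arriving at time 0; it yields the 17 ms FCFS average and the roughly 5.66 ms RR average with a 4 ms quantum (the 7 ms SJF figure cited above comes from a different four-job example).

```python
from collections import deque

def fcfs_waits(bursts):
    """Waiting time of each job when run strictly in arrival order."""
    waits, clock = [], 0
    for burst in bursts:
        waits.append(clock)
        clock += burst
    return waits

def sjf_waits(bursts):
    """Non-preemptive SJF: run jobs in order of increasing burst length."""
    order = sorted(range(len(bursts)), key=lambda i: bursts[i])
    waits, clock = [0] * len(bursts), 0
    for i in order:
        waits[i] = clock
        clock += bursts[i]
    return waits

def rr_waits(bursts, quantum):
    """Round-robin with a fixed quantum; wait = completion time - burst."""
    remaining, finish = list(bursts), [0] * len(bursts)
    ready, clock = deque(range(len(bursts))), 0
    while ready:
        i = ready.popleft()
        run = min(quantum, remaining[i])
        clock += run
        remaining[i] -= run
        if remaining[i] > 0:
            ready.append(i)          # quantum used up: back of the queue
        else:
            finish[i] = clock
    return [finish[i] - bursts[i] for i in range(len(bursts))]

bursts = [24, 3, 3]                                  # CPU bursts in ms
print(sum(fcfs_waits(bursts)) / len(bursts))         # 17.0
print(sum(sjf_waits(bursts)) / len(bursts))          # 3.0
print(sum(rr_waits(bursts, 4)) / len(bursts))        # 5.666...
```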
Evolution to Multilevel Queues
Single-queue scheduling algorithms, such as round-robin (RR), treat all processes equally by allocating CPU time in fixed quanta to each ready process, which ensures fairness but often results in inefficiencies for mixed workloads combining interactive and CPU-bound jobs.[1] For instance, short interactive tasks may suffer long wait times behind long-running batch processes, degrading response times, while simple FCFS scheduling exacerbates convoy effects where short jobs are delayed by a single long job.[1]

To address these issues, multilevel queue scheduling emerged as an advancement, partitioning processes into multiple distinct queues based on fixed attributes like process type or priority at the time of creation.[1] Each queue operates independently with its own scheduling policy: higher-priority queues, typically for interactive or system processes, might employ RR with short quanta to prioritize responsiveness, while lower-priority queues for batch jobs could use FCFS to minimize overhead for long computations.[1] This hierarchical structure allows the scheduler to favor critical tasks without uniformly applying a single algorithm, improving overall system balance in early multiprogrammed environments.[1]

Despite these benefits, multilevel queues suffer from rigid assignments, where processes cannot migrate between queues even if their behavior evolves; for example, an initially interactive process turning CPU-intensive remains stuck in a high-priority queue, potentially starving others.[1] This lack of adaptability limits responsiveness to dynamic workload changes, highlighting the need for feedback mechanisms to enable priority adjustments based on observed execution patterns.[1] Early implementations in 1960s multiprogramming systems demonstrated how fixed assignments could constrain flexibility, paving the way for dynamic reclassification in subsequent designs.[1]
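A minimal sketch of such a fixed (non-feedback) multilevel queue follows; the two classes, the 10 ms quantum, and the function names are illustrative rather than drawn from any particular system. It shows both the per-queue policies and the rigid assignment that feedback later relaxes.

```python
from collections import deque

queues = {
    "interactive": deque(),   # higher priority: round-robin, short quantum
    "batch": deque(),         # lower priority: first-come, first-served
}

def admit(job, job_class):
    # The class is fixed at creation time; the job never migrates,
    # even if its behavior later changes.
    queues[job_class].append(job)

def dispatch():
    # Strict priority between queues: batch jobs run only when no
    # interactive job is ready.
    if queues["interactive"]:
        return queues["interactive"].popleft(), 10    # 10 ms RR slice
    if queues["batch"]:
        return queues["batch"].popleft(), None        # run to completion
    return None, None
```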
The MLFQ Algorithm
Queue Structure and Priorities
The multilevel feedback queue (MLFQ) employs a hierarchical arrangement of multiple queues, each corresponding to a distinct priority level, to manage process scheduling efficiently. Systems typically implement a small, fixed number of queues, with three being a common configuration that balances simplicity against the ability to distinguish process behaviors. The queues are ordered such that the uppermost queue holds the highest priority, and priority levels decrease progressively down the structure, ensuring that higher-priority queues always preempt those below them. This fixed hierarchy implies priorities through queue position rather than assigning explicit numerical values to individual processes.[1]

Upon entering the system, all processes are placed in the topmost (highest-priority) queue, regardless of their initial characteristics. This initial assignment favors interactive or short-duration tasks, such as I/O-bound processes that require quick CPU bursts. The scheduler selects the process from the head of the highest-priority non-empty queue, only proceeding to lower queues when upper ones are vacant. In this way, lower queues, like the bottommost one, execute solely when all superior queues lack ready processes.[1]

A representative 3-queue structure illustrates this organization: the first queue prioritizes new or short jobs expected to complete rapidly, the second queue handles medium-duration tasks, and the third queue accommodates long-running processes that demand sustained CPU time. This setup, first conceptualized in early time-sharing systems, allows the scheduler to allocate resources based on queue level, with higher queues receiving immediate attention to minimize response times for critical workloads.[1]
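A minimal sketch of this structure (the queue count and function names are illustrative): new jobs enter the top queue, and dispatch always draws from the highest non-empty level.

```python
from collections import deque

NUM_LEVELS = 3                          # queue 0 is the highest priority
queues = [deque() for _ in range(NUM_LEVELS)]

def admit(job):
    # Every newly arrived process starts in the topmost queue.
    queues[0].append(job)

def next_to_run():
    # Scan from the top: dispatch the head of the highest non-empty queue;
    # a lower queue runs only when every queue above it is empty.
    for level, queue in enumerate(queues):
        if queue:
            return level, queue.popleft()
    return None, None                   # nothing is ready
```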
Process Movement and Feedback Rules
The multilevel feedback queue (MLFQ) scheduling algorithm incorporates a dynamic feedback mechanism that enables processes to move between queues based on their observed CPU usage patterns, allowing the system to adaptively prioritize short, interactive jobs while managing longer, CPU-intensive ones. This movement is governed by a set of core rules that observe process behavior over time, approximating the performance of shortest-job-first (SJF) scheduling without prior knowledge of burst lengths. Short CPU bursts keep processes in higher-priority queues for quick execution, whereas prolonged CPU usage leads to demotion, ensuring efficient resource allocation in time-sharing environments.[1]

Central to this mechanism is the demotion rule: when a process exhausts its allocated time quantum at a given queue level, regardless of how many scheduling quanta it has received there, it is moved to the next lower-priority queue. This feedback penalizes CPU-bound processes by reducing their priority, preventing them from monopolizing the CPU and allowing interactive processes to progress. For instance, a process starting in the highest-priority queue that fully consumes its quantum will shift downward, reflecting its longer burst characteristics.[1]

Promotion rules counterbalance demotion to ensure fairness and prevent indefinite starvation. New processes entering the system are always placed in the highest-priority queue, giving them an initial opportunity for rapid execution. Additionally, to address potential starvation of low-priority processes, a periodic boost mechanism elevates all processes back to the top queue after a fixed time period S, effectively implementing aging based on system-wide wait times. Processes that voluntarily yield the CPU, such as those blocking on I/O operations, typically remain at their current queue level upon resumption, though some variants promote them to a higher queue to favor I/O-bound jobs. Together these rules form a simple decision procedure: higher-priority queues are serviced first, equal-priority processes share the CPU round-robin, and movements are triggered solely by quantum exhaustion or system timers, without regard to intra-queue order.[1]

The feedback principle underlying these movements relies on historical CPU burst observations to dynamically classify processes, favoring those with short bursts (e.g., interactive tasks) by retaining them in upper queues while sinking CPU-intensive jobs lower. This empirical approximation of SJF enhances responsiveness in mixed workloads, as demonstrated in early time-sharing systems where short jobs complete swiftly without explicit burst predictions.[1]
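Continuing the sketch from the previous section (the queues list and NUM_LEVELS), the core feedback rules might be expressed as follows; the boost interval and the choice to keep I/O-blocked jobs at their level are illustrative assumptions.

```python
def on_quantum_expired(job, level):
    # Demotion rule: a job that used its entire quantum drops one level,
    # unless it is already in the bottom queue.
    queues[min(level + 1, NUM_LEVELS - 1)].append(job)

def on_io_complete(job, level):
    # Common variant: a job that blocked for I/O resumes at the same level;
    # some implementations instead promote it to favor I/O-bound work.
    queues[level].append(job)

BOOST_INTERVAL_MS = 1000      # the period S; an illustrative value

def priority_boost():
    # Every S time units, move all jobs back to the top queue so that
    # long-running jobs stranded in lower queues cannot starve.
    for queue in queues[1:]:
        while queue:
            queues[0].append(queue.popleft())
```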
Intra-Queue Scheduling
In multilevel feedback queue (MLFQ) scheduling, once the highest-priority non-empty queue is selected for execution, processes within that queue are managed using queue-specific algorithms designed to balance responsiveness and throughput. Higher-priority queues, intended for interactive or short jobs, typically employ round-robin (RR) scheduling with short time quanta to ensure quick response times for user-facing processes.[1] For example, a top-priority queue might use RR with a 10 ms quantum, allowing multiple processes to share the CPU in a circular manner without long waits.[1] This approach preempts a running process after its quantum expires, returning it to the end of the same queue for fair sharing among peers.[1]

Lower-priority queues, which handle CPU-bound or long-running jobs, often use first-come, first-served (FCFS) or RR with longer quanta to prioritize overall system throughput over individual responsiveness. In FCFS queues, the process at the head runs to completion without intra-queue preemption, minimizing context-switch overhead for batch-oriented workloads.[12] Alternatively, RR in these queues may employ quanta of 100 ms or more, enabling some interleaving while favoring sustained execution.[1] The choice between FCFS and extended RR in lower queues depends on system goals, with FCFS common in the base (lowest) queue to ensure completion of demoted processes.[12]

Queue selection follows a strict priority order: the scheduler always dispatches from the highest non-empty queue, preempting any lower-queue process if a higher-priority one becomes ready.[1] This inter-queue preemption occurs immediately upon arrival to a higher queue, suspending the current execution and switching contexts.[1] Within an RR queue, ties in readiness are resolved through standard circular queuing, where processes are served in the order they were enqueued, with each getting an equal turn up to the quantum limit.[1] In the base queue using FCFS, no such intra-queue ties arise, as execution proceeds sequentially until all processes finish.[12]
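A small sketch of these per-queue policy choices and the inter-queue preemption check, continuing the earlier Python fragments; the per-level quanta are illustrative values only.

```python
# Per-level policy: top queues use short round-robin slices; the bottom
# queue uses FCFS (no intra-queue preemption), marked here by a None quantum.
QUANTUM_MS = {0: 10, 1: 100, 2: None}

def quantum_for(level):
    return QUANTUM_MS[level]

def should_preempt(running_level):
    # Inter-queue preemption: a running job is interrupted as soon as any
    # higher-priority queue becomes non-empty.
    return any(queues[l] for l in range(running_level))
```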
Configuration and Tuning
Time Quantum Assignments
In multilevel feedback queue (MLFQ) scheduling, time quanta are assigned differently to each queue to optimize for varying process behaviors, with shorter durations in higher-priority queues to favor interactive or short-burst tasks and longer durations in lower-priority queues to accommodate CPU-intensive workloads. This variation ensures that short jobs receive quick service without excessive waiting, while long-running jobs progress gradually without monopolizing the CPU. The original MLFQ design in the Compatible Time-Sharing System (CTSS) used a fixed base quantum $q$ but allowed a process at level $\ell$ (with levels numbered downward from the highest priority) to run for an effective slice of $2^\ell \times q$, doubling the allowable runtime per level to reflect growing tolerance for larger bursts.[13]

The assignment of time quanta is primarily determined by expected process burst times, balancing the trade-off between system responsiveness and overhead from context switches. Quanta that are too short in higher queues (e.g., under 10 ms) lead to frequent preemptions and increased context-switching costs, which can degrade overall throughput, whereas excessively long quanta reduce interactivity for users awaiting a response. In practice, higher-priority queues typically use quanta of 8-10 ms to approximate human perception limits for responsiveness, while lower queues extend to 16-32 ms or more, and the bottom queue may employ an infinite quantum under first-come, first-served (FCFS) to complete batch jobs efficiently. These choices stem from empirical tuning based on workload characteristics, as overly aggressive short quanta exacerbate switching overhead in systems with high process density.[1]

Tuning quanta often involves scaling them progressively across queues, such as doubling per level in a three-queue system, for example 8 ms for the top queue (prioritizing quick interactive tasks), 16 ms for the middle (medium bursts), and 32 ms or an infinite quantum for the bottom (long CPU-bound jobs), to align with anticipated burst distributions and ensure fairness. This exponential growth allows short processes to complete within the initial small quanta, while longer ones accumulate service time across levels without starvation. The cumulative CPU time a process can consume before being demoted past level $k$ is the sum of the effective quanta up to that level, $\sum_{i=0}^{k} 2^i q = (2^{k+1} - 1)q$, highlighting how the early levels contribute minimally to delay for short jobs while longer jobs accumulate service across levels. Demotion occurs when a process exhausts its assigned quantum in a queue, prompting re-evaluation at the next lower-priority level. In real-world implementations like Solaris, quanta range from 20 ms at the highest priority to hundreds of milliseconds at the lowest, configurable via system tables to adapt to specific hardware and load profiles.[1][14]
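As a concrete illustration of the doubling rule (the 8 ms base quantum is only an example), the per-level slice and cumulative service can be tabulated:

```python
BASE_QUANTUM_MS = 8                       # illustrative base quantum q

def quantum_at(level):
    # Effective slice at a level: 2**level * q, i.e. 8, 16, 32, ... ms.
    return (2 ** level) * BASE_QUANTUM_MS

def cumulative_service(level):
    # Total CPU time available before a job is demoted past `level`:
    # sum_{i=0}^{k} 2**i * q = (2**(k+1) - 1) * q.
    return (2 ** (level + 1) - 1) * BASE_QUANTUM_MS

for k in range(3):
    print(k, quantum_at(k), cumulative_service(k))
# prints: 0 8 8 / 1 16 24 / 2 32 56
```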
Priority Adjustment Mechanisms
In multilevel feedback queue (MLFQ) scheduling, priority adjustment mechanisms are essential for maintaining fairness and preventing issues such as starvation, where lower-priority processes might wait indefinitely without CPU access. The primary mechanism is aging, which periodically increases the priority of processes that have been waiting excessively in lower queues. This involves monitoring the wait time of a process and promoting it to a higher-priority queue after a predefined interval, ensuring that even CPU-bound or long-running jobs eventually receive service. For instance, under Rule 5 of a standard MLFQ formulation, all jobs are elevated to the topmost queue every S time units, where S serves as a tunable parameter, often set to around 100 milliseconds or 1 second, to balance responsiveness and equity.[1]

In practical systems such as Solaris, aging is implemented through a starvation check that examines process wait times at regular intervals, such as every second via the ts_update routine. If a process's dispatch wait time (ts_dispwait) exceeds its maximum wait threshold (ts_maxwait), its priority is boosted to a higher level, typically ts_lwait (e.g., 50 or above), with the default ts_maxwait set to 0 for most priorities but extended to 32000 ticks at the lowest priority (59) to guarantee eventual promotion. This approach ensures no process starves indefinitely, as the aging rate can be parameterized, for example promoting after 100 ms of wait, to adapt to workload characteristics. Demotion thresholds may also extend beyond single-quantum exhaustion: in refined MLFQ variants, a process is moved down only after consuming its full time allotment at the current level, even if that allotment is spread across multiple scheduling quanta, preventing premature penalization of processes with variable bursts.[15][1]
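A generic aging pass in the spirit of these mechanisms might look as follows, continuing the earlier sketches (queues, NUM_LEVELS, and deque); the 100 ms threshold and the enqueued_at timestamp field are hypothetical.

```python
from collections import deque

MAX_WAIT_MS = 100                      # illustrative aging threshold

def aging_pass(now_ms):
    # Promote any job that has waited too long at a lower level by one
    # level, so low-priority jobs cannot be starved indefinitely.
    for level in range(1, NUM_LEVELS):
        kept = deque()
        while queues[level]:
            job = queues[level].popleft()
            if now_ms - job.enqueued_at >= MAX_WAIT_MS:   # hypothetical field
                queues[level - 1].append(job)
            else:
                kept.append(job)
        queues[level] = kept
```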
Additional adjustments enhance fairness in specialized MLFQ configurations. For example, tracking an estimated burst time, often denoted as τ, allows dynamic priority recalculation based on historical CPU usage, promoting short-burst interactive jobs while demoting those exceeding their τ estimate. Optional integration of lottery scheduling within queues allocates CPU shares probabilistically to low-priority processes, using tickets proportional to wait time to boost their chances without rigid promotions. These mechanisms, while not universal, address edge cases such as priority inversion and expose tunable parameters that implementations can adjust for specific environments.[1]
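A minimal sketch of these two optional refinements, assuming a job record that carries a wait_time field; the exponential-averaging constant and the ticket weighting are illustrative choices rather than part of any standard MLFQ definition.

```python
import random

def update_burst_estimate(tau, last_burst, alpha=0.5):
    # Exponential average of observed CPU bursts, a common way to maintain
    # the per-job estimate denoted tau above.
    return alpha * last_burst + (1 - alpha) * tau

def lottery_pick(jobs):
    # Probabilistic selection among low-priority jobs: tickets proportional
    # to accumulated wait time, so long waiters win the CPU more often.
    tickets = [max(1, job["wait_time"]) for job in jobs]
    return random.choices(jobs, weights=tickets, k=1)[0]
```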