Least frequently used
Least Frequently Used (LFU) is a cache eviction algorithm employed in computer systems to manage limited cache memory by removing the items that have been accessed the fewest times when the cache capacity is exceeded.[1] The policy tracks the access frequency of each cached item, prioritizing the retention of more popular data to optimize hit rates in scenarios with skewed access patterns.[2]
In operation, LFU maintains a frequency counter for every item upon insertion or access, incrementing the count with each use to reflect ongoing popularity.[2] When eviction is required, the algorithm selects and discards the item(s) with the lowest counter value; in implementations handling ties, secondary mechanisms like recency (e.g., least recently used) may break them.[2] Efficient realizations, such as those using probabilistic counters with logarithmic scaling and decay factors, achieve near-constant time complexity for insertions, updates, and evictions, addressing traditional LFU's logarithmic overhead.[2]
LFU excels in environments with stable, repetitive access distributions, such as content delivery networks or databases with persistent "hot" items, by retaining frequently requested data longer than recency-based alternatives like LRU.[2] However, it is susceptible to cache pollution, where transiently popular items accumulate high counts and resist eviction despite later irrelevance, and it adapts slowly to shifting patterns compared to LRU.[2] Implementation complexity is higher due to counter management, though modern variants mitigate this.[3]
Notable extensions include the Least Recently/Frequently Used (LRFU) policy, which blends frequency and recency via a tunable parameter to form a spectrum subsuming both pure LFU and LRU, improving adaptability across workloads.[4] Hybrid approaches, such as combining LFU with LRU in segmented caches, further enhance performance by leveraging frequency for long-term popularity and recency for short-term trends.[1] These developments underscore LFU's role in advancing cache efficiency for diverse applications, from web proxies to in-memory stores.[5]
Fundamentals
Definition and Purpose
The Least Frequently Used (LFU) cache replacement policy is an eviction strategy that removes the cached item with the lowest access-frequency count, approximating Belady's optimal offline algorithm by using historical frequency as a prediction of future reuse under stationary request patterns.[6]
The core purpose of LFU in memory management and caching is to improve cache hit rates by retaining high-frequency items that are expected to be accessed repeatedly, making it particularly suitable for workloads with skewed or heavy-tailed access distributions, such as web proxies where popular content dominates requests or databases handling recurring queries on hot data.[7]
LFU was introduced in the context of virtual memory paging research in the late 1960s and early 1970s, with formal evaluation techniques including frequency-based policies like LFU described by Mattson et al. in 1970.[8]
This policy rests on the fundamental assumption that item access frequencies are stable or slowly varying over time, enabling frequency metrics to serve as reliable proxies for long-term reuse probability.
Unlike the recency-focused Least Recently Used (LRU) policy, LFU emphasizes cumulative access counts to address scenarios where item popularity persists beyond immediate temporal locality.[7]
Core Mechanism
In the Least Frequently Used (LFU) cache replacement policy, each cached item maintains a dedicated counter to track its access frequency, which is incremented by 1 each time the item is accessed or referenced. Upon insertion of a new item—typically triggered by a cache miss—its frequency counter is initialized to 1 to reflect the initial access event. This frequency-based tracking prioritizes retaining items that demonstrate higher usage rates over time.[9]
The eviction process in standard LFU occurs when the cache reaches its capacity and a new item must be added. In its basic form, the policy identifies the item(s) with the minimum frequency counter value by scanning the cache contents and selects one for removal to make space. While auxiliary data structures can optimize this selection for efficiency in practice, the core rule remains eviction of the least frequently accessed item.[9]
When multiple items share the same minimum frequency, a tie-breaking rule is applied to resolve the selection. A common approach integrates recency by evicting the least recently used item among the tied candidates, effectively combining LFU's frequency focus with an LRU-like mechanism for fairness and to mitigate issues like cache pollution from infrequent but recent insertions.[2]
A basic outline of frequency updates and eviction selection in pseudocode illustrates these rules:
function access(key):
    if key in cache:
        cache[key].frequency += 1
        update_recency(key)  // optional, for tie-breaking
    else:
        if cache is full:
            min_freq = minimum frequency in cache
            candidates = items with frequency == min_freq
            evict_key = least recently used among candidates
            remove(evict_key)
        insert(key, frequency=1)
        update_recency(key)
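A minimal runnable sketch of this naive form in Python (the class and method names are illustrative, not a standard library API) makes the scan-based eviction and the LRU tie-break explicit:

```python
class NaiveLFUCache:
    """Minimal LFU cache: O(n) scan on eviction, LRU tie-breaking."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = {}      # key -> value
        self.freq = {}       # key -> access count
        self.last_used = {}  # key -> logical timestamp of last access
        self.clock = 0       # logical clock used for recency tie-breaking

    def get(self, key):
        if key not in self.store:
            return None                   # cache miss
        self.clock += 1
        self.freq[key] += 1               # increment frequency on every hit
        self.last_used[key] = self.clock
        return self.store[key]

    def put(self, key, value):
        self.clock += 1
        if key in self.store:             # updating an entry is treated as an access here
            self.store[key] = value
            self.freq[key] += 1
            self.last_used[key] = self.clock
            return
        if len(self.store) >= self.capacity:
            # Scan for the minimum frequency; break ties by the oldest access.
            victim = min(self.store, key=lambda k: (self.freq[k], self.last_used[k]))
            for table in (self.store, self.freq, self.last_used):
                del table[victim]
        self.store[key] = value
        self.freq[key] = 1                # new items start with a count of 1
        self.last_used[key] = self.clock
```

Because `put` scans every key when evicting, each miss on a full cache costs O(n); the bucketed structures described under Implementation remove this scan.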
Implementation
Required Data Structures
To implement the Least Frequently Used (LFU) cache efficiently, a hash map is employed for constant-time (O(1)) lookup from keys to corresponding item nodes, with each node storing the associated value and metadata such as a frequency counter.[10]
Items are grouped by their access frequency using a secondary hash map that associates each frequency value with a doubly linked list (or equivalent priority queue structure) containing all nodes at that frequency level; this enables rapid identification and eviction of the minimum-frequency group in O(1) time.[10]
When multiple items share the same minimum frequency, tie-breaking favors recency using least recently used (LRU) ordering within the frequency group, achieved by maintaining the doubly linked list such that accessed items are moved to the tail (most recent position), with eviction from the head (least recently used position).[2][10]
The overall space complexity is O(c), where c denotes the cache capacity, as each of the c items requires O(1) additional space for its frequency counter, pointers, and optional timestamp.[10]
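A brief Python sketch of these structures (the names are illustrative) shows how the two hash maps and the per-frequency LRU ordering fit together; the operations that manipulate them are outlined in the next subsection:

```python
from collections import OrderedDict, defaultdict

class LFUEntry:
    """Node stored in the key -> node hash map."""
    __slots__ = ("key", "value", "frequency")

    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.frequency = 1  # the insertion counts as the first access

class LFUStructures:
    """Bookkeeping only: key map, frequency buckets, and the minimum frequency."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}                        # key -> LFUEntry, O(1) lookup
        self.buckets = defaultdict(OrderedDict)  # frequency -> keys in LRU order
        self.min_freq = 0                        # lowest frequency currently present
```

Appending a key to its bucket on every access keeps the least recently used key of each frequency group at the front of its `OrderedDict`, while `min_freq` identifies the eviction bucket without scanning.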
Eviction Algorithm
The eviction algorithm in Least Frequently Used (LFU) caching operates by maintaining a count of access frequencies for each item and prioritizing the removal of those with the lowest counts when the cache reaches capacity. Upon a cache hit, the frequency of the accessed item is incremented, and the item is repositioned within the frequency-ordered structure to reflect its updated priority. This ensures that more frequently accessed items are protected from eviction.[10]
For a cache miss, a new item is inserted with an initial frequency of 1. If the cache is full, the algorithm identifies and evicts the item with the minimum frequency before adding the new one; in case of ties among minimum-frequency items, a secondary policy such as least recently used (LRU) is commonly applied within the frequency group. These operations leverage data structures like frequency-sorted doubly linked lists and hash maps for efficient tracking, as detailed in efficient implementations.[10][2]
The full eviction cycle can be outlined in pseudocode as follows, assuming a hash map for key-to-node mapping, a min-frequency doubly linked list for quick access to the lowest frequency bucket, and per-frequency doubly linked lists for items:
function access(key):
    if key not in cache:
        return miss  // handled by insert
    node = hash_map[key]
    old_freq = node.frequency
    remove node from old_freq list
    node.frequency += 1
    new_freq = node.frequency
    add node to new_freq list
    if old_freq list is empty:
        remove old_freq from min_freq structure

function insert(key, value):
    if cache size == capacity:
        evict()
    node = new Node(key, value, frequency=1)
    add node to frequency 1 list
    hash_map[key] = node
    if frequency 1 list was empty:
        add frequency 1 to min_freq structure

function evict():
    min_freq_bucket = get_min_freq_bucket()  // O(1) via head of min_freq list
    node_to_evict = min_freq_bucket.head     // least recently used in min bucket
    remove node_to_evict from its frequency list
    delete hash_map[node_to_evict.key]
    free node_to_evict
    if min_freq_bucket is now empty:
        remove min_freq_bucket from min_freq structure
        update min_freq to next lowest if needed
This pseudocode represents an amortized O(1) implementation using doubly linked lists per frequency level and a separate structure to track the minimum frequency, avoiding full scans. In a naive implementation without optimized structures, finding the minimum frequency item requires scanning all entries, resulting in O(n) time complexity for eviction, where n is the cache size.[10]
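The same design can be rendered compactly in Python; the following is a sketch under the assumptions above (per-frequency `OrderedDict` buckets for LRU tie-breaking and a tracked minimum frequency), not a reference implementation:

```python
from collections import OrderedDict, defaultdict

class LFUCache:
    """LFU cache with amortized O(1) get/put using per-frequency buckets."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.values = {}                         # key -> value
        self.freqs = {}                          # key -> current frequency
        self.buckets = defaultdict(OrderedDict)  # frequency -> keys in LRU order
        self.min_freq = 0

    def _touch(self, key):
        """Move a key from its current bucket to the next-higher one."""
        f = self.freqs[key]
        del self.buckets[f][key]
        if not self.buckets[f]:
            del self.buckets[f]
            if self.min_freq == f:
                self.min_freq = f + 1            # the touched key now holds the minimum
        self.freqs[key] = f + 1
        self.buckets[f + 1][key] = None          # append as most recently used

    def get(self, key):
        if key not in self.values:
            return None                          # miss; caller may insert via put
        self._touch(key)
        return self.values[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key in self.values:                   # updating counts as an access
            self.values[key] = value
            self._touch(key)
            return
        if len(self.values) >= self.capacity:
            # Evict the least recently used key in the minimum-frequency bucket.
            victim, _ = self.buckets[self.min_freq].popitem(last=False)
            if not self.buckets[self.min_freq]:
                del self.buckets[self.min_freq]
            del self.values[victim]
            del self.freqs[victim]
        self.values[key] = value
        self.freqs[key] = 1
        self.buckets[1][key] = None
        self.min_freq = 1                        # a new item always has the lowest count
```

Each operation touches only the affected buckets and the `min_freq` marker, which is what yields the amortized O(1) behaviour noted above.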
Variants and Extensions
Adaptive LFU
Adaptive LFU variants incorporate mechanisms to dynamically adjust access frequencies, enabling the policy to better handle evolving workloads where access patterns shift over time. These adaptations address limitations in standard LFU, such as cache pollution from items that were once popular but are no longer relevant, by introducing temporal decay or recency-based adjustments.[11]
A key adaptation is the aging mechanism, which periodically reduces frequency counts so that outdated accesses are forgotten and stale items cannot retain high priority indefinitely. In the LFU-Aging variant, all frequency counts are halved whenever the average frequency in the cache exceeds a predefined threshold, with an additional parameter capping the maximum count so that exceptionally popular items cannot dominate; the related LFU with Dynamic Aging (LFU-DA) policy instead adds a running cache-age term to each item's priority value, allowing newly inserted items to compete with long-resident ones. Simpler variants decrement frequencies by a fixed value, such as 1, at regular intervals, ensuring that infrequently accessed items gradually lose priority without abrupt eviction.[12]
Promotion policies further enhance adaptability by boosting the priority of recently accessed items, integrating recency to balance long-term frequency with short-term relevance. The Least Recently/Frequently Used (LRFU) policy exemplifies this by assigning each item a Combined Recency and Frequency (CRF) score in which every past reference contributes a weight that decays exponentially with its age; on an access occurring \delta time units after the previous one, the score is updated as C_{\text{new}} = 1 + 2^{-\lambda \delta} \cdot C_{\text{old}}, where the parameter \lambda \in [0, 1] controls the emphasis, with \lambda = 0 reducing LRFU to pure LFU and \lambda = 1 making it behave like LRU. This integration allows the policy to respond more quickly to emerging hot items in dynamic environments.[13]
Another prominent adaptive approach is the Adaptive Replacement Cache (ARC), which maintains two lists to dynamically balance recency (LRU-like) and frequency (LFU-like) without fixed parameters, adapting to workload changes by adjusting list sizes based on past evictions.[14]
These adaptations prove particularly beneficial for non-stationary workloads, where access distributions change frequently, as aging and promotion prevent stagnation and improve hit rates over plain LFU in various traces. Parameter choices, such as the decay rate \alpha in exponential aging (updating freq_{new} = freq_{old} \times \alpha per time unit, with 0 < \alpha < 1), allow fine-tuning for workload volatility; lower \alpha accelerates forgetting for rapid changes, while higher values retain stability in steadier patterns.[12]
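As an illustration of such exponential aging, the sketch below applies one decay step to a frequency table; the helper and its parameters are hypothetical, and real systems typically amortize or approximate the pass over all counters rather than running it eagerly:

```python
def decay_frequencies(freqs, alpha=0.5, floor=0.0):
    """Apply one aging step: freq_new = freq_old * alpha.

    `freqs` maps keys to frequency counts; `alpha` in (0, 1) controls how quickly
    old popularity is forgotten, and `floor` keeps counts from collapsing entirely.
    """
    for key in freqs:
        freqs[key] = max(floor, freqs[key] * alpha)
    return freqs

# Example: a once-hot item loses priority after two decay steps without accesses.
counts = {"old_hot": 100.0, "steady": 12.0}
decay_frequencies(counts)   # {"old_hot": 50.0, "steady": 6.0}
counts["steady"] += 5       # recent accesses keep raising the still-live item
decay_frequencies(counts)   # {"old_hot": 25.0, "steady": 5.5}
```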
LFU with Windowing
LFU with windowing addresses the high maintenance overhead of pure LFU by approximating frequency counts through a limited observation of recent accesses, thereby reducing space and computational costs while preserving much of the frequency-based eviction benefits.[7] In a fixed-size window approach, frequencies are tracked only over the last k accesses, where k is a tunable parameter much smaller than the full access history, allowing the policy to approximate long-term popularity without storing exhaustive records.[7] This method, often termed Window LFU (WLFU), slides the window forward with each new access, updating counts for items within it and discarding older data to focus on recent trends.[7]
The sliding window LFU (SW-LFU) variant refines this by maintaining a FIFO queue of recent items as the window, counting hits exclusively within that bounded set to adjust frequencies dynamically.[15] Upon a cache miss or access, the window advances by adding the new item and removing the oldest if full, ensuring only recent interactions influence counts; eviction then selects the item with the minimum adjusted frequency from the window.[15] This recency-limited sampling helps mitigate the popularity skew from one-time bursts but relies on the window size to balance responsiveness and stability.[15]
Implementation typically employs a double-ended queue (deque) to manage the FIFO window efficiently, allowing O(1) insertions and deletions at both ends, paired with a hash table to store and update per-item frequency counters within the current window.[15] For eviction, the policy computes an adjusted score as the sum of the window-specific frequency and a historic frequency estimate (often from a compact sketch like a Counting Bloom Filter for scalability), selecting the minimum across candidates to remove.[7] This structure supports amortized O(1) operations per access, making it suitable for high-throughput systems.[7]
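The window bookkeeping itself is small; the sketch below, which assumes a plain deque plus counter rather than any particular library's API, records accesses and exposes the windowed frequency that an eviction decision would consult:

```python
from collections import Counter, deque

class SlidingWindowFrequency:
    """Track item frequencies over only the last `window_size` accesses."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()    # FIFO of the most recent accesses
        self.counts = Counter()  # frequencies restricted to the window

    def record(self, key):
        """Register one access; drop the oldest access once the window is full."""
        if len(self.window) >= self.window_size:
            oldest = self.window.popleft()
            self.counts[oldest] -= 1
            if self.counts[oldest] == 0:
                del self.counts[oldest]
        self.window.append(key)
        self.counts[key] += 1

    def frequency(self, key):
        """Windowed frequency estimate consulted when choosing an eviction victim."""
        return self.counts[key]
```

An LFU-style policy would then evict the cached item whose windowed count, optionally combined with a longer-term sketch as noted above, is smallest.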
Compared to pure LFU, windowing variants require significantly less space (for instance, TinyLFU implementations use about 5 bytes per cache entry versus nearly 100 bytes for exact window tracking), but they may introduce inaccuracies in workloads with sudden bursts, where short-term spikes can overshadow stable long-term frequencies.[7] Such approximations have been adopted in production caches such as the Caffeine Java library, where a small probationary window (e.g., 1% of capacity) filters admissions before frequency-based promotion, demonstrating improved hit rates over traditional LFU in diverse traces.[7]
Theoretical Guarantees
The Least Frequently Used (LFU) policy approximates Belady's MIN algorithm, the optimal offline eviction strategy that removes the item whose next access occurs farthest in the future, by instead evicting the item with the lowest historical access frequency, which serves as a proxy for low future demand under certain assumptions.[16] This approximation enables online analysis of competitive ratios, defined as the ratio of LFU's cache misses to the minimum achievable by Belady's MIN, though LFU lacks a constant-factor guarantee in the general paging model, unlike LRU's k-competitive bound where k is the cache size.[16]
Under the Independent Reference Model (IRM), where accesses to items are independent and follow fixed probabilities p_i, LFU minimizes the expected number of misses by retaining the c items with the highest probabilities, achieving the optimal steady-state hit probability of P_{\text{hit}} = \sum_{i=1}^{c} p_i, where the p_i are sorted in decreasing order.[17] Here, frequencies serve as maximum-likelihood estimates of the p_i, so LFU's eviction aligns with selecting the lowest-probability candidate. Stack distance (the number of distinct items accessed between two references to the same item) and reuse distance (similar but allowing repeats) metrics further support this: under IRM, LFU effectively minimizes the expected reuse distance across the cache by prioritizing high-frequency items, whose geometric inter-arrival times yield shorter average distances compared to low-frequency ones.[14]
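As a concrete instance of this hit-probability formula, the snippet below evaluates P_{\text{hit}} = \sum_{i=1}^{c} p_i for a Zipf-distributed popularity model; the parameters are chosen purely for illustration:

```python
def irm_optimal_hit_rate(num_items, cache_size, alpha=1.0):
    """P_hit = sum of the cache_size largest access probabilities under Zipf(alpha)."""
    weights = [1.0 / (i ** alpha) for i in range(1, num_items + 1)]
    total = sum(weights)
    probs = sorted((w / total for w in weights), reverse=True)
    return sum(probs[:cache_size])

# Caching the 100 most popular of 10,000 items under Zipf(1.0) skew.
print(round(irm_optimal_hit_rate(10_000, 100, alpha=1.0), 3))  # ~0.53
```

In this model, retaining just 1% of the items already captures roughly 53% of requests, which is the ceiling LFU converges to under the IRM.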
In worst-case scenarios, LFU's competitive ratio is unbounded, as demonstrated by request sequences with many unique low-frequency items that pollute the cache, forcing repeated evictions of high-demand items; however, in models assuming bounded frequencies or access patterns, variants achieve an O(\log n) ratio relative to Belady's MIN, where n is the number of items.[16] Unlike LRU, which offers a constant-factor guarantee but performs poorly on frequency-skewed workloads, LFU has no such universal bound but excels in probabilistic settings.
A proof sketch for LFU's near-optimality under stationary IRM proceeds as follows: assume item probabilities p_1 \geq p_2 \geq \cdots \geq p_m > 0 with \sum p_i = 1, and cache size c < m. In steady state, the probability an access hits the cache is the sum of probabilities for the retained items. LFU, by evicting the lowest-frequency item (estimating the lowest p_i), converges to retaining the top-c probability items, matching Belady's MIN under IRM since future accesses are identically distributed to past ones. For Zipf distributions, common in practice where p_i \propto 1/i^\alpha for \alpha > 0, this yields near-optimal performance, with LFU's hit rate approaching the optimum as frequencies stabilize, often within 5-10% for web traces.[14]
Empirical Evaluations
Empirical evaluations of the Least Frequently Used (LFU) cache eviction policy have demonstrated its effectiveness in workloads exhibiting strong frequency-based locality, though it often incurs higher computational overhead compared to simpler policies like Least Recently Used (LRU). Early studies using 1990s Digital Equipment Corporation (DEC) traces, which capture file system and web proxy accesses, showed LFU achieving slightly higher hit ratios than LRU in scenarios with moderate skew, such as the DEC trace where LFU reached 36.0% hit ratio versus LRU's 35.4% at a 1 GB cache size.[18] In more skewed access patterns modeled by Zipf distributions, LFU has been found to outperform LRU by 10-20% in absolute hit rate terms, particularly when LRU hit rates range from 10-50%, as LFU better captures long-term popularity in independent reference model (IRM) simulations.[19]
Key metrics in these evaluations include cache hit ratio, miss rate, and eviction overhead. For instance, in trace-driven simulations on database and file system workloads, LFU exhibited lower miss rates than LRU in frequency-dominant scenarios but higher eviction costs due to maintaining and updating frequency counters, often requiring O(log n) operations per access in naive implementations.[9] The Adaptive Replacement Cache (ARC) study on DEC traces (e.g., P4, P8, P12) and OLTP workloads revealed LFU's hit ratios lagging behind adaptive policies, with LFU at 56.22% for a 15,000-entry cache in OLTP versus ARC's 65.40%, highlighting LFU's vulnerability to evolving access patterns where frequency alone fails to adapt.[14] Eviction overhead for LFU is typically 2-5 times higher than LRU in these benchmarks due to priority queue maintenance, though optimized variants mitigate this to near O(1) amortized time.
LFU excels in read-heavy workloads following power-law distributions, such as web caches and databases with Zipf-like access skew (α ≈ 0.8-1.1), where it sustains 15-20% higher hit ratios than LRU by prioritizing persistently popular items.[19] However, it degrades in sequential scan-heavy traces, like SPC1-like benchmarks, where LFU's static frequency focus leads to pollution from one-time accesses, resulting in miss rates up to 44% higher than LRU or adaptive policies.
Post-2010 evaluations in cloud caching environments, including approximations like TinyLFU integrated with LRU, confirm LFU's advantages in dynamic, skewed workloads such as YouTube video requests and search engine queries. On real traces from these domains, TinyLFU-augmented policies achieved 5-15% hit ratio improvements over pure LRU, with windowed variants matching or exceeding state-of-the-art like ARC in hit rates up to 50% on Wikipedia accesses at 20x cache sampling.[7] Recent studies as of 2025, such as hybrid LRU-LFU strategies, show continued improvements in cache efficiency for modern applications, outperforming pure LFU by 10-20% in thermal-aware and ML-optimized scenarios.[5][20] These results underscore LFU's edge in Redis-like setups for frequency-skewed cloud applications, though hybrid approaches are often needed to balance overhead and adaptability.[7]
| Workload Type | Example Trace | LFU Hit Ratio Improvement over LRU | Source |
|---|---|---|---|
| Skewed (Zipf) | IRM Simulations | 10-20% absolute | Hasslinger et al. (2017)[19] |
| File System | DEC traces | ~0.6% (at 1 GB) | Cao and Irani (2000)[18] |
| Scan-Heavy | SPC1-like | Significantly worse (higher miss rates) | Megiddo & Modha (2003)[14] |
| Cloud (Video/Search) | YouTube/Wikipedia | 5-15% | Einziger et al. (2017)[7] |
Applications and Comparisons
Real-World Usage
In software systems, the Least Frequently Used (LFU) policy is deployed for efficient caching in high-traffic environments. The Varnish HTTP accelerator employs LFU within its Massive Storage Engine (MSE) to handle terabyte-scale content, where a segmented LFU variant replaces the basic implementation to boost cache hit rates by prioritizing frequently accessed objects in disk-backed storage.[21] Similarly, distributed key-value stores in the Memcached tradition, such as Redis (since version 4.0, released in 2017), integrate approximated LFU eviction to retain high-frequency items, enhancing performance in read-heavy workloads.[2] In Java applications, the Caffeine library adopts Window TinyLFU, an optimized LFU approximation, for in-memory caching since its initial release in 2016, enabling near-optimal hit rates with low overhead through frequency sketching.[22]
Hardware implementations leverage LFU elements for memory management. Research on Translation Lookaside Buffers (TLBs) has proposed frequency-based eviction policies such as Random-Least-Frequently-Used (RLFU), particularly for workloads with large instruction footprints, reducing miss rates by favoring recurrent translations.[23] Operating system kernels, such as historical versions of Linux (e.g., 2.0), incorporated LFU in page replacement to approximate least-used pages.[24] For database systems, PostgreSQL's buffer manager can benefit from LFU as an alternative eviction policy in extensions or custom setups, though studies often favor adaptive policies like ARC over both LFU and the default LRU for query caches handling skewed access patterns.[25]
Configuration of LFU in production involves tuning parameters to suit workload dynamics. A maximum frequency cap limits counter overflow in high-access scenarios, preventing skewed eviction, while window size in variants like Window TinyLFU—typically set to 1% of total cache capacity—balances recency and frequency; increasing it to 20-40% adapts better to bursty traffic, as seen in OLTP traces.[7] In content delivery networks (CDNs), LFU configurations have yielded measurable gains; a study on streaming media CDNs found that LFU-SIZE, a size-aware LFU variant, improved hit ratios over traditional LFU and LRU, reducing bandwidth usage and response times in resource-constrained setups.[26]
Despite these advantages, LFU faces challenges in production due to computational overhead from maintaining access counters, which can degrade performance in highly dynamic environments with rapidly shifting patterns.[2] This often leads to hybrid deployments, such as combining LFU frequency estimation with LRU recency in policies like Adaptive Replacement Cache (ARC) or Window TinyLFU, to mitigate overhead while preserving strong hit rates in systems like Redis and Caffeine.[5] As of 2025, LFU approximations continue to be used in cloud services like AWS ElastiCache for managed Redis instances.[27]
Comparison to Other Policies
The Least Frequently Used (LFU) policy contrasts with the Least Recently Used (LRU) policy primarily in its emphasis on access frequency over recency. LFU excels in workloads exhibiting frequency skew, such as those following a Zipf distribution where a small portion of items (e.g., 20% of objects accounting for 80% of accesses) dominate requests, as it prioritizes retaining high-frequency items regardless of their recent usage.[7] In contrast, LRU performs better in scenarios with recency bursts, where temporal locality drives sudden spikes in access to recently introduced items, as LFU may retain outdated popular items during shifts.[28] For instance, in stable frequency-based patterns like content recommendation systems, LFU achieves higher hit ratios than LRU, but it can degrade by up to 40% relative to LRU during workload transitions due to poor adaptability.[29]
| Workload Type | Cache Size | LFU Hit Ratio | LRU Hit Ratio | Source |
|---|---|---|---|---|
| Zipf 0.9 (synthetic, frequency-skewed) | 1,000 items | ~45% (approximated via TinyLFU) | ~35% | TinyLFU Paper |
| Wikipedia trace (real, mixed) | 1,000 items | ~60% (via WLFU variant) | ~50% | TinyLFU Paper |
| OLTP trace (real, recency-heavy) | 1,000 pages | 27.98% | 32.83% | ARC Paper |
Compared to the First-In-First-Out (FIFO) policy, LFU avoids evicting frequently accessed items that enter the cache early and remain popular, particularly in looping or cyclic workloads where FIFO rigidly removes the oldest entry regardless of usage patterns.[30] This leads to substantial improvements in hit rates for frequency-dominated scenarios; for example, LFU variants can outperform FIFO by over 30% in simulations involving repeated access loops over a fixed set of objects, as FIFO repeatedly discards viable candidates stuck at the head of the queue.[31] FIFO, with its low overhead, suits streaming or uniform access patterns but underperforms in any environment with persistent hot items.
As an online approximation to Belady's optimal offline policy (which evicts the item with the furthest next access), LFU provides competitive performance under the independent reference model by targeting low-frequency items, but it incurs linear regret in dynamic settings due to its inability to foresee future accesses.[32] Theoretical analyses show LFU's competitive ratio approaches optimality for stationary distributions, with regret bounds scaling with cache size and request horizon, though it remains sub-optimal compared to clairvoyant strategies in adversarial traces.[33]
Hybrid approaches, such as combining LFU with LRU elements (e.g., Window-LFU), are recommended for workloads with static popularity distributions, where LFU's frequency focus maximizes long-term hits, whereas pure LRU or adaptive policies like ARC suit highly dynamic environments with frequent shifts in access patterns.[2] Selection depends on workload characteristics: LFU for predictable skew, others for volatility.[30]