
Cuckoo hashing

Cuckoo hashing is a collision-resolution technique for hash tables that uses two independent hash functions and two associated tables, allowing each key to occupy one of two possible locations and guaranteeing worst-case O(1) lookup time through at most two memory probes. Introduced by Rasmus Pagh and Flemming Friche Rodler in 2001, the method draws its name from the parasitic nesting behavior of the cuckoo bird: inserting a new key may "kick out" an existing key from its position, prompting a chain of relocations until an empty slot is found or the process fails, necessitating a full rehash of the table. In operation, lookups for a key x check only the positions h1(x) in the first table and h2(x) in the second; if neither holds x, it is absent. Insertions start by attempting to place x in an available position, displacing any occupant to its alternate location if needed, with the process repeating up to a bounded number of steps to avoid infinite loops. Deletions are handled by simply removing the key from its stored position, identified via the same two-probe lookup, without requiring reorganization. The scheme achieves amortized expected O(1) time for insertions and deletions under a load factor of at most 1/2, using approximately three words of space per key on average, making it space-efficient compared to alternatives like binary search trees. Theoretical guarantees rely on (2, k)-universal hash families, ensuring that with table size r ≥ (1 + ε)n for n keys and ε > 0, the probability of insertion failure is O(1/n), with rehashing occurring in expected O(n) time. Cuckoo hashing's practical appeal lies in its simplicity and performance in cache-friendly scenarios, particularly for small-to-medium dictionaries, though it can suffer from poor worst-case insertion times at high load factors or with suboptimal hash functions.
Variants, such as those using multiple hash functions or oriented hashing, extend its applicability to scenarios like approximate membership queries in cuckoo filters.

Fundamentals

Definition and Motivation

Cuckoo hashing is a probabilistic technique for implementing dictionaries, specifically a form of open addressing that employs two (or more) independent hash functions to map each key to one of multiple possible slots across two arrays. Unlike traditional open-addressing techniques, it permits the relocation of existing keys during insertion to resolve collisions, ensuring that each key occupies exactly one of its designated alternative positions. This design enables lookups to be performed in worst-case constant time by examining only a fixed number of potential locations per key, typically two memory probes. The name "cuckoo hashing" derives from the brood parasitism of cuckoo birds, which lay eggs in the nests of other species and evict the host's eggs, analogous to the eviction process during insertion. Hash tables resolve collisions, the event where multiple keys hash to the same slot, through methods such as separate chaining or open addressing. In separate chaining, each array slot contains a pointer to a linked list (or equivalent structure) holding colliding keys, which introduces space overhead from the pointers and can lead to cache inefficiencies due to non-contiguous access; the expected number of probes for an unsuccessful search is 1 + \alpha, where \alpha is the load factor (number of keys divided by table size). Open addressing, by contrast, stores all keys directly in the array and uses probing sequences to find vacant slots, avoiding pointer overhead but risking performance degradation from clustering effects. Linear probing, a common open-addressing variant, increments the index linearly from the initial hash position, but it suffers from primary clustering: sequences of occupied slots form contiguous blocks, causing the expected number of probes for an unsuccessful search to rise to \frac{1}{2} \left(1 + \frac{1}{(1 - \alpha)^2}\right), with the longest probe sequence reaching O(\log n) in the worst case.
Cuckoo hashing was motivated by the need for a practical hashing scheme that combines the space efficiency of open addressing with guaranteed constant-time lookups, mitigating the space costs of chaining and the clustering issues of probing methods, while supporting load factors up to approximately 50% without rehashing. It achieves these goals through its multi-hash relocation mechanism, delivering amortized constant-time insertions and deletions alongside worst-case O(1) lookups, using roughly twice the space of the stored keys. Introduced in 2001 as a simpler alternative to prior dynamic perfect hashing techniques, which required around 35 times the key space for similar guarantees, cuckoo hashing prioritizes conceptual simplicity and real-world applicability in memory-constrained environments like routers.

Hash Table Setup and Hash Functions

Cuckoo hashing utilizes a hash table structure composed of two separate arrays, denoted as T_1 and T_2, each consisting of m slots, where m is a fixed size typically selected as a power of two to optimize indexing operations with multiply-shift hash functions. This configuration provides 2m total slots, allowing each key to occupy exactly one of two possible positions determined by the hash functions. The table size m is chosen such that m \geq (1 + \epsilon) n for some constant \epsilon > 0 and n the expected number of keys, ensuring the tables remain under half full to facilitate key placements. Keys are elements from a universe U, and the setup employs two hash functions h_1: U \to [0, m-1] and h_2: U \to [0, m-1], which map any key k \in U to indices in the arrays. These functions are independent and selected from a strongly universal family, specifically an (O(1), O(\log n))-universal family, to achieve approximate uniformity and independence of hash values. This property ensures that collisions are probabilistically balanced, mimicking truly random functions with high probability. Upon initialization, both T_1 and T_2 are filled with a special value \perp in every slot, signifying an empty slot ready for insertions. No keys are present at this stage, and the structure supports lookup by examining the two candidate positions T_1[h_1(k)] and T_2[h_2(k)].
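As an illustrative sketch (not a prescribed implementation), the setup above might look as follows in Python, with a simplified multiply-shift function standing in for the universal families the analysis assumes; the table size and parameters here are arbitrary choices:

```python
import random

M = 8   # slots per table; a power of two makes the index a cheap bit mask

def make_hash(m):
    """A simplified multiply-shift hash standing in for the
    (O(1), O(log n))-universal families assumed by the analysis."""
    a = random.randrange(1, 2**64) | 1    # random odd multiplier
    b = random.randrange(2**64)
    return lambda k: (((a * k + b) % 2**64) >> 32) & (m - 1)

h1, h2 = make_hash(M), make_hash(M)   # two independently drawn functions
T1 = [None] * M   # None plays the role of the empty marker ⊥
T2 = [None] * M
```

Drawing h1 and h2 independently, with fresh random parameters, mirrors the requirement that the two functions be independent members of the family.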

Operations

Lookup

In cuckoo hashing, the lookup operation for a given key k begins by computing the two candidate indices using the predefined hash functions h_1 and h_2, denoted as i_1 = h_1(k) and i_2 = h_2(k). The hash tables T_1 and T_2 are then examined at these exact positions to determine if the key resides there. Specifically, if T_1[i_1] = k or T_2[i_2] = k, the key is present and returned; otherwise, the key is absent from the table. This direct verification process requires at most two memory probes, independent of the table's load factor or the distribution of other keys. The design ensures worst-case O(1) time complexity for lookups, as the operation is non-iterative and bounded by a constant number of steps, regardless of table size or occupancy. Unlike chained hashing, which may require traversing linked lists of varying lengths, or linear/quadratic probing, which can involve sequential scans along collision chains, cuckoo hashing eliminates such variability by restricting searches to fixed locations. This makes it particularly suitable for applications demanding predictable retrieval performance, such as in high-throughput data structures. Empirical evaluations on modern processors confirm its efficiency, demonstrating lookup times of approximately 6.7 cycles per operation on architectures like the Itanium 2, outperforming chained hashing by up to a factor of 3 at high load factors due to better cache locality and fewer branch mispredictions. Edge cases in lookup are handled deterministically without altering the table state. If a candidate slot is empty (null or uninitialized), or if it contains a different key, the check simply fails for that position, and the process moves to the next without any relocation or additional probes. For instance, when both slots hold unrelated keys or are vacant, the operation terminates immediately with a "not found" result, preserving the integrity of the structure for subsequent operations.
This simplicity contrasts with probing methods, where an occupied slot holding a different key triggers further probes along the sequence; in cuckoo hashing, no such extension occurs. The following illustrates the lookup algorithm:
function lookup(key k):
    i1 ← h1(k)
    i2 ← h2(k)
    if T1[i1] == k:
        return k
    if T2[i2] == k:
        return k
    return not present
This code highlights the operation's brevity and reliance on equality comparisons alone.
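The pseudocode translates directly into executable form. In this minimal Python sketch, the hash functions and key placements are hand-chosen toy values, not draws from a universal family:

```python
M = 4
h1 = lambda k: k % M              # toy hash functions for illustration only
h2 = lambda k: (k // M) % M
T1 = [None, None, 10, None]       # 10 sits at T1[h1(10)] = T1[2]
T2 = [None, 20, None, None]       # 20 sits at T2[h2(20)] = T2[1]

def lookup(k):
    """At most two probes, independent of how full the tables are."""
    return T1[h1(k)] == k or T2[h2(k)] == k
```

Here lookup(10) and lookup(20) each succeed after probing fixed slots, while a key such as 30 fails both equality checks and is reported absent with no further probing.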

Insertion

Insertion in cuckoo hashing begins by verifying that the key to be inserted does not already exist in the table, which can be done using the lookup procedure. If the key is absent, the algorithm attempts to place it in its primary position, defined by the first hash function as h_1(k) in the first table T_1. If this position is empty, the insertion succeeds immediately. When the primary position is occupied by another key k', the insertion triggers the cuckoo eviction process: k' is evicted and relocated to its alternate position h_2(k') in the second table T_2. If that alternate position is also occupied by some key k'', then k'' is evicted in turn to its other possible position, and the process continues iteratively. This eviction chain, known as the cuckoo walk, alternates between the two tables, displacing keys as needed until an empty slot is found for the current nestless key. To prevent infinite loops from cycles in the eviction graph, the algorithm limits the number of iterations in the cuckoo walk to a threshold, such as \lfloor 3 \log_{1+\epsilon} r \rfloor, where r is the table size and \epsilon > 0 is a load factor parameter. If the threshold is exceeded without finding an empty slot, the insertion fails, and the entire hash table is rehashed using new hash functions, followed by reinserting all existing keys before retrying the original insertion. The following pseudocode illustrates the core insertion procedure, assuming two tables T_1 and T_2 of size r, and hash functions h_1 and h_2:
function insert(key k):
    if lookup(k) returns true:
        return  // Key already present
    current_key = k
    current_pos = h1(k)
    current_table = T1
    iterations = 0
    while iterations < MaxLoop:
        if current_table[current_pos] is empty:
            current_table[current_pos] = current_key
            return
        else:
            swap current_key with current_table[current_pos]
            // Now current_key is the evicted one; find its alternate
            if current_table == T1:
                current_pos = h2(current_key)
                current_table = T2
            else:
                current_pos = h1(current_key)
                current_table = T1
            iterations += 1
    // Failure: rehash
    rehash_tables()
    insert(k)  // Retry
This implementation uses a simple swapping mechanism within a loop to manage evictions, ensuring each key is placed in one of its two possible locations.
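A runnable sketch of the same loop, under assumed simplifications (toy multiplicative hash functions, a MaxLoop of 6m rather than the logarithmic bound, and rehashing that keeps the table size fixed):

```python
import random

class CuckooTable:
    """Minimal two-table cuckoo hash following the pseudocode above.
    Hash family, sizing, and MaxLoop are illustrative choices."""

    def __init__(self, m=16):
        self.m = m
        self.t = [[None] * m, [None] * m]
        self._pick_hashes()

    def _pick_hashes(self):
        # Toy multiplicative hashes; odd seeds keep them well spread mod m.
        self.seeds = [random.randrange(1, 2**32) | 1 for _ in range(2)]

    def _h(self, i, k):
        return (self.seeds[i] * (k + 1)) % self.m

    def lookup(self, k):
        return self.t[0][self._h(0, k)] == k or self.t[1][self._h(1, k)] == k

    def insert(self, k):
        if self.lookup(k):
            return                      # key already present
        cur, table = k, 0
        for _ in range(6 * self.m):     # MaxLoop bound on the cuckoo walk
            pos = self._h(table, cur)
            if self.t[table][pos] is None:
                self.t[table][pos] = cur
                return
            # Evict the occupant and carry it to its slot in the other table.
            cur, self.t[table][pos] = self.t[table][pos], cur
            table = 1 - table
        self._rehash()                  # walk failed: rebuild with new hashes
        self.insert(cur)                # retry the still-nestless key

    def _rehash(self):
        keys = [x for row in self.t for x in row if x is not None]
        self.t = [[None] * self.m, [None] * self.m]
        self._pick_hashes()
        for x in keys:
            self.insert(x)
```

The swap inside the loop mirrors the pseudocode's eviction step; on failure, the nestless key carried at that moment (not necessarily the originally inserted one) is retried after the rehash.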

Deletion

Deletion in cuckoo hashing is a straightforward operation that maintains the integrity of the hash table without requiring relocation of other keys or rebalancing. To delete a key x, the algorithm first performs a lookup to determine its position by computing the two possible hash locations h_1(x) and h_2(x), checking the corresponding slots in the two tables T_1 and T_2. If the key is found in either slot—say at h_1(x) in table T_1—the slot is simply set to empty (denoted as \perp or null), removing the key from the structure. This process ensures constant-time worst-case performance, as it involves only a fixed number of hash computations and memory accesses. Unlike insertion, which may trigger eviction chains, deletion does not disrupt the placement of existing keys, as vacancies do not block future operations and instead provide additional space that can facilitate subsequent insertions by reducing the overall load factor. No resizing or reorganization is necessary immediately after deletion, though tables may be shrunk if the load becomes too low to optimize space usage. An edge case arises if the key is not present in the table; in this scenario, the lookup simply confirms absence, and the table remains unchanged without any modifications. If the key resides in the alternate position h_2(x) rather than the primary h_1(x), the algorithm verifies both locations sequentially to identify and nullify the correct slot. This handles variations in key placement due to prior insertions without ambiguity. The following pseudocode illustrates the deletion procedure, assuming two tables T_1 and T_2:
Algorithm: Delete(x)
Input: Key x to delete
Output: Updated hash tables T1 and T2 with x removed if present

1. Compute h₁(x) and h₂(x)
2. If T1[h₁(x)] = x then
   3. Set T1[h₁(x)] to empty
   4. Return
5. If T2[h₂(x)] = x then
   6. Set T2[h₂(x)] to empty
   7. Return
8. Return  // x not found
This implementation ensures the operation is correct and efficient, directly extending the lookup mechanism.
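A minimal executable sketch of this procedure, with hand-picked toy hash functions and placements chosen to match them:

```python
M = 4
h1 = lambda k: k % M              # toy hashes consistent with the placements
h2 = lambda k: (k // 2) % M
T1 = [None, None, 10, None]       # 10 at T1[h1(10)] = T1[2]
T2 = [None, None, 20, None]       # 20 at T2[h2(20)] = T2[2]

def delete(k):
    """Probe both candidate slots; empty whichever holds k."""
    if T1[h1(k)] == k:
        T1[h1(k)] = None
        return True
    if T2[h2(k)] == k:
        T2[h2(k)] = None
        return True
    return False  # k was not present; the tables are unchanged
```

Note that deleting an absent key (such as 30 here) leaves both tables untouched, and no other key is relocated by any deletion.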

Theoretical Analysis

Expected Performance and Load Factors

In cuckoo hashing, the load factor \alpha = \frac{n}{m} is defined as the ratio of the number of keys n to the total number of table slots m. For the standard variant using two hash functions, the scheme achieves efficient performance when \alpha < 0.5, as this ensures that the underlying random bipartite graph remains subcritical, allowing insertions to succeed with high probability. At this load factor threshold, the probability of insertion failure approaches zero as the table size grows, enabling constant-time operations in expectation. Lookups and deletions in cuckoo hashing require a constant number of probes, specifically up to two hash computations and memory accesses, yielding worst-case O(1) time regardless of the load factor. Insertions, however, involve a sequence of potential evictions along alternating paths in the cuckoo graph, but under \alpha < 0.5, the expected length of these paths is constant, resulting in amortized O(1) time per insertion with high probability. This amortized bound accounts for occasional rehashes, which occur with probability O(1/n) and take O(n) time overall, but their rarity ensures the average cost remains constant. The probability analysis relies on modeling the cuckoo graph as a random bipartite multigraph whose vertices are the table slots and whose edges correspond to keys, each edge joining a key's two candidate slots. Eviction paths during insertion are analyzed using branching process approximations, revealing that for \alpha < 0.5, the process is subcritical with mean offspring less than 1, leading to finite paths with overwhelming probability. Specifically, the probability that an insertion fails (i.e., requires more than O(\log n) steps or triggers an infinite loop) is bounded by O\left(\frac{(\log n)^4}{n}\right), which is polynomially small in the table size. This bound implies that all n insertions succeed with probability 1 - O\left(\frac{(\log n)^4}{n^2}\right), confirming linear expected construction time O(n).

Worst-Case Guarantees and Thresholds

In cuckoo hashing, the worst-case time for an insertion operation can reach O(n), where n is the number of elements, if the eviction chain forms a long path or cycle in the underlying graph structure. However, the probability that an eviction chain exceeds logarithmic length is bounded by O(1/n^2), i.e., polynomially small (n^{-\Omega(1)}). To mitigate this, the algorithm limits the number of eviction steps to O(\log n) and triggers a full rehashing of the table upon failure, ensuring that the amortized insertion time remains O(1) across all operations. The load factor thresholds in cuckoo hashing determine the maximum utilization of the table before the probability of insertion failure becomes significant. For the standard case with k=2 hash functions, the maximum sustainable load factor is approximately 0.5; above this threshold, the structure fails to accommodate all elements with constant probability, necessitating rehashing. For general k \geq 2 hash functions, the threshold c_k rises with k, allowing loads up to about 0.92 for k=3 and 0.98 for k=4, with the gap 1 - c_k shrinking rapidly as k grows. Above c_k, the failure probability is \Omega(1), while below it, success occurs with probability 1 - o(1/n). From a graph-theoretic perspective, cuckoo hashing corresponds to a random k-regular bipartite multigraph between n keys and m table slots, where each key connects to k possible slots via the hash functions. Insertion succeeds if there exists an injection from keys to slots respecting the edges (i.e., a matching covering all keys); failure occurs if a connected component in this graph has more edges than vertices or contains an augmenting path that cannot be resolved without cycles. Chernoff-like concentration bounds ensure that oversized problematic components (e.g., those exceeding O(\log n) size) arise with probability o(1/n) below the load threshold c_k, guaranteeing reliability with high probability.
The precise load threshold c_k for k-ary cuckoo hashing is characterized by the supremum load where the random hypergraph's 2-core admits a perfect matching with high probability, satisfying the fixed-point equation \beta \Pr[\mathrm{Po}(\beta) \geq 1]^{k-1} = 1 for the critical branching factor \beta, with asymptotic \beta_k \approx 1 / (k \ln 2) providing an approximate scaling for the threshold behavior and failure probability o(1/n) below c_k.

Practical Implementation

Space and Time Trade-offs

Cuckoo hashing achieves space efficiency through a compact layout: the two subtables can be stored in a single contiguous allocation of m total slots, where m is typically approximately 2n to maintain a load factor of 0.5, resulting in O(m) overall space complexity without additional storage for pointers or lists. Unlike chained hashing, which requires more space for n elements due to pointer overhead in linked lists, cuckoo hashing stores keys and values directly in table slots, minimizing memory footprint. Runtime performance involves constant-time operations dominated by hash function computations, with lookups requiring at most two hash evaluations and memory accesses. A key trade-off arises in selecting hash functions: faster, simpler ones (e.g., multiply-shift) reduce computation time but may offer less statistical independence, potentially increasing insertion failure rates, while more independent hashes improve reliability at the cost of higher constant factors. The contiguous layout of the hash table enhances memory efficiency by promoting cache-friendly access patterns, as sequential or nearby slot probes benefit from spatial locality when the working set fits in cache. For practical tuning, selecting table sizes as powers of two replaces modulo operations with bit masks for indexing, avoiding expensive division in hash computations. This aligns with theoretical load factors below 0.5 to ensure high success probability in insertions.
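The power-of-two masking trick can be sketched as follows; the multiplier constant and shift amount are illustrative choices, not prescribed values:

```python
m = 1 << 10                  # 1024 slots: a power of two
MASK = m - 1
A = 0x9E3779B97F4A7C15       # a fixed odd 64-bit multiplier (illustrative)

def index(key):
    """Multiply-shift then mask: no integer division on the hot path."""
    h = (A * key) % 2**64    # 64-bit wrapping multiply
    return (h >> 32) & MASK  # take well-mixed high bits, mask to [0, m-1]
```

Taking high bits before masking matters for multiplicative hashes, whose low-order bits mix poorly; the mask then replaces the modulo that a prime-sized table would require.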

Handling Collisions and Resizing

In cuckoo hashing, collisions during insertion are handled through an eviction process where a new key displaces an existing key from one of its two possible locations, prompting the displaced key to move to its alternate location. This process continues iteratively until an empty slot is found or a maximum number of iterations is reached to prevent infinite loops. If the eviction loop exceeds this threshold, typically set to a small constant like 500 iterations in practice or to O(log n) in theory to bound the probability of failure, the insertion fails, and a full rehash of the entire table is triggered using newly generated random hash functions to rearrange all keys. To maintain efficiency as the number of keys grows, the hash table resizes dynamically when the load factor approaches a predefined threshold, typically below 50% (e.g., at roughly 42%) for standard implementations, though variants can handle up to 90%. Resizing typically involves doubling the table size (or halving it upon deletions to reclaim space), followed by reinserting all existing keys using the updated table structure and the same or new hash functions. This operation has an amortized expected time complexity of O(1) per insertion, as the cost is spread across multiple operations, though individual resizes take O(n) time in the worst case. Recent variants enable load factors exceeding 90% while maintaining efficiency. An advanced technique for improving resizing efficiency is partial rehashing, which rehashes only the affected portions of the table rather than the entire structure.
In implementations like DyCuckoo for GPU environments, this is achieved by dividing the table into multiple subtables and resizing only the smallest (for growth) or largest (for shrinkage) subtable incrementally when the overall fill factor deviates from a target range, such as 50% to 85%, thereby limiting rehashed elements to a fraction of the total and preserving amortized O(1) performance without global disruptions. To further mitigate insertion failures and guarantee worst-case O(1) lookup and insertion times with high probability, cuckoo hashing can incorporate a small stash, an auxiliary buffer outside the main table, for temporarily storing keys that cannot be placed via the standard eviction process. Even a constant-size stash of s entries reduces the probability of an overflow requiring a rehash to O(1/n^s); a failed insertion is resolved by placing the nestless key in the stash, which is then also scanned during lookups and deletions.
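A sketch of the stash idea, with illustrative sizing (16-slot tables, stash capacity 4) and toy multiplicative hashes; a production design would bound and tune both differently:

```python
import random

class StashedCuckoo:
    """Cuckoo table plus a small stash; sizes and hashes are illustrative."""

    def __init__(self, m=16, stash_cap=4):
        self.m, self.stash_cap = m, stash_cap
        self.t = [[None] * m, [None] * m]
        self.stash = []
        self.seeds = [random.randrange(1, 2**32) | 1 for _ in range(2)]

    def _h(self, i, k):
        return (self.seeds[i] * (k + 1)) % self.m

    def lookup(self, k):
        return (self.t[0][self._h(0, k)] == k
                or self.t[1][self._h(1, k)] == k
                or k in self.stash)     # constant-bounded extra scan

    def insert(self, k):
        cur, table = k, 0
        for _ in range(4 * self.m):     # bounded cuckoo walk
            pos = self._h(table, cur)
            if self.t[table][pos] is None:
                self.t[table][pos] = cur
                return True
            cur, self.t[table][pos] = self.t[table][pos], cur
            table = 1 - table
        if len(self.stash) < self.stash_cap:
            self.stash.append(cur)      # park the nestless key in the stash
            return True
        return False                    # stash full: a rehash would be needed
```

Because the stash is tiny, scanning it adds only a constant to each lookup, while converting the rare walk failure from an immediate rehash into a cheap append.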

Examples and Illustrations

Basic Insertion Example

To illustrate the core mechanics of insertion in cuckoo hashing, consider a simple hash table with 4 slots, indexed from 0 to 3, initially empty, and two hash functions h_1 and h_2 that map keys to these slots. The initial state of the table is as follows:
Slot:  0    1    2    3
Key:   -    -    -    -
First, insert the key 10, for which h_1(10) = 0 and h_2(10) = 3. Since slot 0 is empty, place 10 there. The updated table state is:
Slot:  0    1    2    3
Key:   10   -    -    -
Next, insert the key 20, for which h_1(20) = 0 and h_2(20) = 1. Slot 0 is occupied, creating a conflict, so attempt to place 20 at its alternate slot h_2(20) = 1, which is empty. Place 20 there without needing to displace any existing key. The final table state is:
Slot:  0    1    2    3
Key:   10   20   -    -
This insertion succeeds with a single relocation attempt for the new key, avoiding any eviction chain or cycle.
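The worked example can be replayed in code; the hash values below are hard-coded from the example rather than computed by real hash functions:

```python
table = [None] * 4
# Hash values copied from the worked example; hard-coded, not computed.
h1 = {10: 0, 20: 0}
h2 = {10: 3, 20: 1}

def insert(k):
    """Place k at h1(k) if that slot is free, else try its alternate h2(k)."""
    if table[h1[k]] is None:
        table[h1[k]] = k
    elif table[h2[k]] is None:
        table[h2[k]] = k

insert(10)   # slot 0 is empty, so 10 lands there
insert(20)   # slot 0 is taken by 10, so 20 goes to its alternate slot 1
```

The final state matches the table shown above: 10 at slot 0, 20 at slot 1, slots 2 and 3 empty.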

Cycle Formation and Resolution

In cuckoo hashing, cycle formation occurs during insertion when the eviction process creates a closed loop in the placement graph, preventing the new key from finding an empty slot without infinite relocation. Consider a simple example with a hash table of size 4 (indices 0 to 3) and two hash functions h_1 and h_2. Suppose the table initially holds two keys: key 10 at position 0 (where h_1(10) = 0) and key 20 at position 3 (where h_1(20) = 3). Now, insert key 30, which hashes to the same conflicting positions: h_1(30) = 0 and h_2(30) = 3. The insertion begins by attempting to place 30 at position 0, which is occupied by 10. Thus, 10 is evicted and relocated to its alternate position h_2(10) = 3, which holds 20. In turn, 20 is evicted to its alternate position h_2(20) = 0, the original starting point of the eviction chain. This forms a cycle: 0 → 3 → 0, where the process loops indefinitely without freeing a slot for 30. Cycle detection typically occurs after a bounded number of eviction steps (e.g., proportional to the table size), identifying the loop when a previously visited position is revisited. To resolve the cycle, cuckoo hashing abandons the current insertion attempt and performs a full rehash: new independent hash functions h_1' and h_2' are generated, and all existing keys (including the new one) are reinserted from scratch into the now-empty table. In this example, after rehashing, suppose the new functions assign positions as follows: h_1'(10) = 1, h_2'(10) = 2; h_1'(20) = 0, h_2'(20) = 3; h_1'(30) = 2, h_2'(30) = 1. Reinsertion proceeds without cycles, placing 20 at 0, 10 at 1, and 30 at 2, breaking the prior conflict. This resolution ensures amortized constant-time insertions under typical load factors below the threshold (around 0.5 for two hashes). The eviction path in the cycle example can be visualized as a directed graph over the table positions:
  • Start: Insert 30 → attempt position 0 (occupied by 10)
    → Evict 10 → attempt h_2(10) = 3 (occupied by 20)
    → Evict 20 → attempt h_2(20) = 0 (cycle detected)
Post-rehash table state (size 4):
Position:  0    1    2    3
Key:       20   10   30   (empty)
This illustrates how cycles, though rare, are systematically resolved to maintain the structure's efficiency.
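The cycle in this example can be demonstrated with a bounded eviction walk; the hash values are hard-coded from the example, and the walk mutates the table as a real insertion would before the subsequent rehash rebuilds it:

```python
# Hash values hard-coded from the example above (not computed functions).
h1 = {10: 0, 20: 3, 30: 0}
h2 = {10: 3, 20: 0, 30: 3}
table = {0: 10, 3: 20}      # initial state: 10 at slot 0, 20 at slot 3
MAX_STEPS = 8               # bound on the walk; a cycle exhausts it

def insert_detect(k):
    """Return True on success, False when the eviction walk cycles."""
    cur, pos = k, h1[k]
    for _ in range(MAX_STEPS):
        if table.get(pos) is None:
            table[pos] = cur
            return True
        cur, table[pos] = table[pos], cur                # evict occupant
        pos = h2[cur] if pos == h1[cur] else h1[cur]     # its other slot
    return False  # cycle 0 -> 3 -> 0: a full rehash is required
```

Inserting 30 bounces between slots 0 and 3 until the step bound is exhausted, at which point a real implementation would draw new hash functions and rebuild.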

Variations

Standard Modifications

One common modification to the basic cuckoo hashing algorithm involves using more than two hash functions, known as k-ary or d-ary cuckoo hashing, where k > 2 provides multiple alternative positions for each key. In this approach, each key is hashed to k distinct locations, and during insertion, the cuckoo walk selects a random alternate position from the remaining k-1 options if the current one is occupied, allowing for higher load factors while maintaining constant-time lookups. Experiments demonstrate that with k=4 hash functions, space utilization reaches approximately 97%, significantly improving efficiency over the standard 50% load factor of 2-ary cuckoo hashing. Another standard tweak is oriented cuckoo hashing, which models the cuckoo graph with directed edges to represent possible assignments, thereby reducing the likelihood of cycles in the eviction chain. By orienting edges from keys to their potential slots such that no slot receives more than one incoming edge (an orientation with maximum in-degree 1), the algorithm ensures a valid matching exists without cyclic dependencies, facilitating smoother insertions. This directed structure helps bound the insertion time and avoids infinite loops by prioritizing acyclic paths during eviction. Balanced allocations modify standard cuckoo hashing by incorporating the two-choice paradigm to ensure an even distribution between the positions defined by h1 and h2 (or more generally across alternatives). In this variant, keys are assigned to one of two random buckets per hash function, with rearrangements during insertion promoting tight packing and load balance across the table's two halves, achieving near-optimal space usage without overflow in constant-sized bins. This approach guarantees that the number of items placed via h1 and h2 remains roughly equal, minimizing imbalances that could degrade performance.
For static sets where the full key collection is known in advance, peelable cuckoo hashing orders insertions to eliminate runtime evictions entirely. The construction begins by identifying "peelable" hypergraphs, where items with only one available position (leaves) are placed first, iteratively removing them to expose new leaves until the entire set is assigned. This offline peeling process succeeds with high probability below the peeling threshold (about 0.82 for k=3, somewhat below the 0.92 matching threshold), constructing a valid table in linear time without displacements.
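A sketch of k-ary insertion (here k = 3) with a random-walk eviction policy, using illustrative multiplicative hashes over a single table; candidate generation and the step bound are arbitrary choices:

```python
import random

K, M = 3, 16   # three hash functions over one 16-slot table (illustrative)
SEEDS = [random.randrange(1, 2**32) | 1 for _ in range(K)]  # odd multipliers
table = [None] * M

def slots(k):
    """The K candidate positions of key k."""
    return [(s * (k + 1)) % M for s in SEEDS]

def insert(k, max_steps=64):
    """k-ary insertion: fill any empty candidate, else evict a random one."""
    cur = k
    for _ in range(max_steps):
        cand = slots(cur)
        for p in cand:                  # take the first empty candidate
            if table[p] is None:
                table[p] = cur
                return True
        p = random.choice(cand)         # all full: evict a random occupant
        cur, table[p] = table[p], cur
    return False    # walk exhausted; a rehash or stash would be needed
```

The random choice among the k-1 alternates is what distinguishes the random-walk policy from the strict alternation of the two-table scheme.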

Advanced Extensions

Cuckoo++ hashing augments standard cuckoo hashing by incorporating a small "stash" that functions as partial overflow storage, allowing a limited number of elements to be stored outside the main tables in a separate area. This extension ensures worst-case constant-time lookups by bounding the maximum chain length in the stash to a small constant, such as 3 or 4 elements, which resolves the rare failure cases in standard cuckoo hashing without requiring rehashing. The stash size remains logarithmic in the failure probability but constant in expectation for high load factors up to 95%, providing robust performance guarantees. Tabulation hashing enhances cuckoo hashing by replacing complex universal hash functions with simple lookup tables, where each character or byte of the key indexes into a precomputed table to produce hash values for h1 and h2. This approach accelerates hash computations significantly, as table lookups are faster than arithmetic operations in universal hashing, while maintaining the probabilistic guarantees of cuckoo hashing with high probability of success exceeding 1 - 1/n^{1/3}. Seminal analysis shows that simple tabulation suffices for cuckoo hashing, enabling practical implementations with minimal overhead and strong space bounds. Blocked cuckoo hashing partitions the hash table into blocks of fixed size b, where each block can hold up to b key-value pairs, improving cache efficiency by aligning operations to cache line boundaries and reducing the number of memory accesses during insertions and lookups. This variant supports higher load factors approaching 1 - Θ(1/log log n) while preserving constant expected insertion time, and it facilitates parallelism by allowing independent operations within blocks. The design, originally proposed for tightly packed bins, demonstrates that with b ≈ log(1/ε)/log log(1/ε), failure probabilities drop to ε with only a constant factor space increase.
Bubble-up cuckoo hashing is a recent advancement in d-ary cuckoo hashing that supports load factors up to 1 - ε using a maximum of d = ⌈ln(1/ε) + α⌉ hash functions for small α > 0. It employs a "bubble-up" mechanism where elements preferentially move to higher-indexed hash positions during insertions, enabling online insertions with expected O(1/δ) time at load factor 1 - δ (for any δ ≥ ε) and O(1) expected positive query time independent of d and ε.

Comparisons

With Chaining Hashing

Cuckoo hashing differs fundamentally from separate chaining in its approach to collision resolution. In cuckoo hashing, each key is assigned to one of two fixed positions determined by two independent hash functions, and collisions are handled by relocating existing keys to their alternative positions through a process known as cuckoo eviction, without using dynamic data structures. In contrast, separate chaining employs a single hash function to map keys to buckets, where each bucket maintains a linked list (or similar dynamic structure) to store multiple colliding keys, allowing unlimited growth per bucket. Performance characteristics also diverge notably. Cuckoo hashing guarantees worst-case O(1) lookup time by checking at most two positions, providing predictable access even at load factors up to 0.5, though insertions exhibit higher variance due to potential eviction chains. Separate chaining, however, offers expected O(1 + α) time for lookups and insertions, where α is the load factor, but performance can degrade to O(n) in the worst case if chains become unbalanced, and it typically operates efficiently at load factors around 0.7. Overall, cuckoo hashing achieves better space efficiency at high loads by avoiding pointer overhead, using approximately three words of space per key on average, compared to chaining's additional pointer storage for list nodes. Key advantages of cuckoo hashing over separate chaining include its cache-friendliness, as lookups involve a fixed number of direct accesses without pointer chasing through lists, which reduces cache misses on modern hardware. Chaining, while simpler to implement and more robust to poor hash functions, supports deletions that require no relocation but can lead to fragmented memory and slower traversals in long chains. Cuckoo hashing is particularly preferred in read-heavy workloads, such as key-value stores with temporal locality, where its consistent lookup speed outperforms chaining by up to 2-3 times in experimental settings.

With Linear Probing

Cuckoo hashing and linear probing represent two distinct approaches within open addressing, a collision resolution strategy that stores all elements in the primary array without auxiliary structures. In cuckoo hashing, each key is assigned exactly two candidate positions in the table, determined by two independent hash functions, allowing insertions to evict existing keys to their alternate positions if necessary. In contrast, linear probing begins at the initial hash position h(k) for a key k and sequentially examines the next slots in a linear fashion until an empty slot is found or the key is located. A key performance distinction arises from their handling of collisions: cuckoo hashing avoids the formation of long probe sequences by limiting lookups to at most two fixed positions, yielding a true worst-case O(1) bound for searches and deletions. Linear probing, however, is susceptible to primary clustering, where collided keys tend to occupy contiguous blocks, resulting in longer average probe sequences; the expected number of probes for an unsuccessful search is approximately \frac{1}{2}\left(1 + \frac{1}{(1 - \alpha)^2}\right), where \alpha is the load factor, which degrades significantly as \alpha approaches 1. These differences lead to complementary strengths and weaknesses. Cuckoo hashing provides more resilient performance against clustering effects, even with moderately imperfect hash functions, but its insertion is more intricate due to potential eviction chains that may require rehashing in rare cases. Linear probing offers a simpler implementation with excellent locality from sequential accesses, yet it becomes inefficient at high load factors (typically above 0.7), where clustering amplifies search times. In some experimental evaluations of distributed systems, cuckoo hashing variants outperform linear probing at load factors exceeding 90%, though such high loads are impractical for standard two-function implementations.
In practice, cuckoo hashing suits scenarios demanding uniform, predictable access patterns and high load efficiency, such as embedded systems or databases with balanced workloads. Linear probing, by virtue of its straightforward design, is often favored in applications prioritizing ease of implementation and low-constant-factor operations, such as initial prototyping or cache-sensitive environments.
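The clustering penalty can be made concrete by tabulating Knuth's classical approximation for unsuccessful searches under linear probing, \frac{1}{2}\left(1 + \frac{1}{(1-\alpha)^2}\right), against cuckoo hashing's fixed two probes:

```python
# Expected probes per unsuccessful search: linear probing (Knuth's
# classical approximation) versus cuckoo hashing's fixed two probes.

def linear_probing_unsuccessful(alpha):
    """Approximate expected probes for an unsuccessful search at load alpha."""
    return 0.5 * (1 + 1 / (1 - alpha) ** 2)

for alpha in (0.5, 0.7, 0.9, 0.95):
    lp = linear_probing_unsuccessful(alpha)
    print(f"alpha={alpha:.2f}: linear probing ~{lp:.1f} probes, cuckoo: 2")
```

At \alpha = 0.5 linear probing averages about 2.5 probes, roughly on par with cuckoo hashing's two, but by \alpha = 0.9 the approximation exceeds 50 probes, which is why the comparison flips so decisively at high load.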

Applications

Real-World Usage

Cuckoo hashing was introduced in 2001 by Rasmus Pagh and Flemming Friche Rodler, offering a collision resolution technique that achieves worst-case constant-time lookups while maintaining high space efficiency. The method has seen adoption in production systems requiring high-performance lookups, such as the libcuckoo library for concurrent hash tables used in networking software like DPDK. In database systems, RocksDB incorporates a cuckoo-hashing-based SST file format for optimized point lookups. Variants of Cassandra, such as Rockssandra, which uses RocksDB as its storage engine, benefit from this format for efficient handling of high insert and delete rates. Practical trade-offs in these implementations often involve occasional rehashing during insertions to resolve cycles, balancing worst-case guarantees with amortized performance.

Performance in Specific Domains

In networking applications, such as IP address lookups in routers, cuckoo hashing provides worst-case O(1) lookup time, making it suitable for high-speed packet processing where low latency is critical. By using multiple hash functions and relocating items when needed, it keeps memory accesses to typically 2-3 per lookup, offering performance comparable to Ternary Content Addressable Memory (TCAM) hardware but at lower cost and with greater scalability in software implementations. To achieve strict worst-case guarantees akin to TCAM, variants incorporate a small stash for handling rare relocation failures, ensuring constant-time operations even under adverse conditions.

In database indexing, particularly for NoSQL systems such as Cassandra variants (e.g., Rockssandra, which uses RocksDB as its storage engine), cuckoo hashing supports efficient handling of high insert and delete rates, compared to balanced-tree indexes that require logarithmic traversals. Its cache-friendly design enables 1-2 memory accesses per lookup, reducing overhead in key-value stores and improving throughput for dynamic workloads with frequent updates. This is especially beneficial in distributed environments where partition keys require fast resolution without deep traversals.

In machine learning, cuckoo hashing can enhance feature hashing for sparse vectors by dynamically resolving collisions through relocation, leading to fewer conflicts than standard hash maps and better preservation of feature sparsity. This approach supports efficient storage and access for high-dimensional data, such as text or recommendation systems, where vector operations benefit from constant-time lookups.

Benchmarks in read-intensive applications demonstrate that cuckoo hashing can achieve faster lookups than chaining-based hash tables, primarily due to avoided pointer chasing and predictable access patterns. For instance, in networking workloads, optimized cuckoo implementations deliver higher throughput for negative lookups compared to chaining alternatives.
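The stash idea mentioned above admits a very small sketch: items whose eviction chain fails go into a tiny overflow list that every lookup also scans, so lookups stay constant time (two probes plus a constant-size scan) while full rehashes become much rarer. The class below is a hypothetical illustration, with salted built-in hashes standing in for independent hash functions; it is not the construction from any specific stash paper or library.

```python
# Sketch of cuckoo hashing with a small stash: failed eviction chains
# park their carried item in a constant-size overflow list instead of
# forcing an immediate rehash. Illustrative only.

class StashedCuckoo:
    def __init__(self, size=16, max_kicks=16, stash_size=4):
        self.size, self.max_kicks, self.stash_size = size, max_kicks, stash_size
        self.tables = [[None] * size, [None] * size]
        self.stash = []  # bounded overflow area scanned on every lookup

    def _h(self, i, key):
        # Salted built-in hash stands in for two independent hash functions.
        return hash((i, key)) % self.size

    def lookup(self, key):
        # Still O(1): two table probes plus a constant-size stash scan.
        if any(self.tables[i][self._h(i, key)] == key for i in (0, 1)):
            return True
        return key in self.stash

    def insert(self, key):
        if self.lookup(key):
            return True  # already present
        i = 0
        for _ in range(self.max_kicks):
            pos = self._h(i, key)
            if self.tables[i][pos] is None:
                self.tables[i][pos] = key
                return True
            key, self.tables[i][pos] = self.tables[i][pos], key
            i = 1 - i
        if len(self.stash) < self.stash_size:
            self.stash.append(key)  # absorb the failed chain
            return True
        return False  # stash full: a real table would rehash now
```

Because the stash has constant capacity, the worst-case lookup cost remains bounded, which is the property that makes stash variants competitive with TCAM-style guarantees.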
