Double hashing
Double hashing is a collision resolution technique employed in open-addressing hash tables, where two distinct hash functions are used to compute both the initial probe position and the increment for subsequent probes when a collision occurs.[1] The primary hash function h_1(k) determines the starting slot for a key k, while the secondary hash function h_2(k) provides the step size, generating the probe sequence h(k, i) = (h_1(k) + i \cdot h_2(k)) \mod m for i = 0, 1, 2, \dots, where m is the table size and h_2(k) is chosen to be coprime with m to ensure all slots are probed.[2] This method was first introduced by W. W. Peterson in his 1957 paper on addressing for random-access storage, marking an early advancement in hashing strategies for efficient data retrieval.[2]
Unlike linear probing, which uses a fixed step size of 1 and suffers from primary clustering where collisions tend to form long contiguous runs, double hashing mitigates clustering by varying the probe increments based on the key, leading to more uniform distribution of probes across the table.[1] It approximates the performance of ideal uniform hashing, with average search times asymptotically equivalent to uniform probing for load factors up to approximately 0.31, and empirically strong performance even at higher loads when keys are uniformly distributed.[2] Compared with alternatives such as quadratic probing or Robin Hood hashing, double hashing has been reported to produce up to 79.6% fewer collisions than Robin Hood hashing at a 0.8 load factor, though it may incur longer average probe lengths in some implementations.[3]
In practice, double hashing requires careful selection of hash functions to avoid degenerate sequences, and it operates under open addressing, meaning all elements are stored directly in the table array without auxiliary structures like linked lists.[1] While its worst-case time complexity for insertions, deletions, and searches is O(n) due to potential full table traversal, average-case performance remains O(1) for load factors below 0.8, making it suitable for applications demanding fast lookups in static or moderately dynamic datasets.[3] Extensions such as double hashing with passbits, as well as adaptations for parallel environments, further tailor it to specific hardware constraints.[4][5]
Overview
Definition and Purpose
Double hashing is a collision resolution technique employed in open-addressing hash tables, where all elements are stored directly in a single array without the use of auxiliary structures like linked lists. In this method, two distinct hash functions are utilized: the primary hash function determines the initial position for a key in the table, while the secondary hash function specifies the increment or step size for probing alternative positions when a collision occurs at the initial slot. This approach ensures that probes for a given key follow a sequence that visits distinct slots until an empty one is found or the key is located.
The primary purpose of double hashing is to efficiently resolve collisions in hash tables by producing probe sequences that approximate a random permutation of the table slots, thereby minimizing clustering effects that degrade performance in simpler methods. Unlike linear probing, which can lead to primary clustering where consecutive probes form long chains of occupied slots, double hashing distributes probes more evenly across the table, reducing the likelihood of long search times. Similarly, it mitigates secondary clustering inherent in quadratic probing, where keys sharing the same initial hash position follow identical probe paths; here, the secondary function ensures unique step sizes for each key, promoting better load distribution and faster insertions, deletions, and searches.[6]
Double hashing was first introduced by W. W. Peterson in 1957 in his paper "Addressing for random-access storage," building on earlier collision resolution techniques to address their limitations in handling high load factors. It was notably discussed and analyzed in the seminal work by Donald E. Knuth, who examined its theoretical foundations and performance characteristics in the context of searching and sorting algorithms.[7][8][6]
To illustrate, consider a hash table of size 7 (m = 7) into which the keys 14 and 21 are inserted. With a primary hash h_1(k) = k mod 7, both keys hash to slot 0 (h_1(14) = h_1(21) = 0), so 21 collides with 14, which already occupies slot 0. With a secondary hash h_2(k) = 1 + (k mod (m-1)), h_2(14) = 3 and h_2(21) = 4. For 21, the probe sequence starts at slot 0 (occupied) and then steps by 4 to slot 4 (empty), where 21 is placed. A later search for 21 retraces the same path, probing slot 0 and then slot 4; continued further, the sequence 0, 4, 1, 5, 2, 6, 3 would visit every slot, avoiding the dense runs characteristic of linear probing.
Comparison to Other Collision Resolution Methods
Double hashing is a technique within open addressing, where all probe sequences for resolving collisions remain confined to the contiguous array of the hash table, unlike separate chaining, which allocates separate lists (often linked lists) at each slot to handle multiple keys, potentially leading to pointer traversals and higher memory overhead.[9]
In comparison to other open addressing methods, linear probing advances through the table using a fixed increment, such as +1 modulo the table size, which is simple to implement but prone to primary clustering, where consecutive occupied slots form long chains that degrade search performance.[9] Double hashing mitigates this by employing a key-dependent step size derived from a secondary hash function, producing more dispersed probe sequences and avoiding the formation of such clusters.
Quadratic probing addresses primary clustering to some extent by using quadratic increments (e.g., +i² modulo the table size), which spreads probes more evenly than linear probing, but it can introduce secondary clustering where keys with the same initial hash value follow identical probe paths, and it requires careful table sizing (often prime) to guarantee full table traversal.[9] Double hashing further improves upon this by generating truly key-dependent and pseudo-random probe steps, reducing both primary and secondary clustering when the secondary hash function is chosen appropriately relative to the table size.
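The difference can be made concrete by generating the probe sequences each scheme produces for the same key. The short Python sketch below is an illustration only, with a table size of 7 and hash functions chosen for the example; it lists the first seven probes of linear probing, quadratic probing, and double hashing.

    def probe_sequences(key, m=7):
        # Illustrative hash functions; any reasonable choices would behave similarly.
        h1 = key % m                  # initial slot, shared by all three schemes
        h2 = 1 + (key % (m - 1))      # key-dependent step used only by double hashing
        linear    = [(h1 + i) % m       for i in range(m)]
        quadratic = [(h1 + i * i) % m   for i in range(m)]
        double    = [(h1 + i * h2) % m  for i in range(m)]
        return linear, quadratic, double

    # Keys 14 and 21 share the initial slot 0 but receive different double-hashing steps.
    print(probe_sequences(14))
    print(probe_sequences(21))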
A key trade-off of double hashing is the computational overhead of evaluating two hash functions per insertion or search, making it slightly slower than linear probing's single hash and fixed-step arithmetic, yet it achieves probe sequences that approximate uniform hashing, enabling efficient operation at load factors up to approximately 0.8 without excessive clustering.[10] In contrast, chaining avoids clustering entirely but incurs costs from list traversals, while open addressing variants like double hashing benefit from better cache locality due to sequential array access.[9]
Double hashing is particularly advantageous in memory-constrained settings over chaining, as it maintains a denser table structure with superior spatial locality for cache performance, making it suitable for applications where minimizing memory footprint and maximizing probe efficiency are priorities.[9]
Mechanism
Primary and Secondary Hash Functions
In double hashing, the primary hash function, denoted h_1(k), computes the initial probe position for a key k in the hash table of size m. This function maps the key to an index in the range \{0, 1, \dots, m-1\}, serving as the starting point for collision resolution. A common choice for h_1(k) is the division method, expressed as h_1(k) = k \mod m, which distributes keys based on the remainder of division by the table size.[11] This approach ensures a straightforward initial placement while relying on the uniformity of the key distribution to minimize initial collisions.[12]
The secondary hash function, h_2(k), determines the step size used to generate subsequent probe positions when a collision occurs at the initial index. Unlike the primary function, h_2(k) must produce a positive integer that is never zero for any key k, as a zero step would trap the probing sequence at the starting position. Additionally, to guarantee that the probe sequence can traverse the entire table and avoid cycles shorter than m, h_2(k) should be relatively prime to the table size m for every key. This property ensures that the increments cover all slots before repeating, providing a permutation of the table indices.[11][12]
Both hash functions should be computationally efficient to maintain the overall constant-time performance of hash table operations, while exhibiting good uniformity to approximate random distribution of keys. Ideally, h_1(k) and h_2(k) are pairwise independent, meaning that for any distinct keys k_1 and k_2, the pair (h_1(k_1), h_2(k_1)) is independent of (h_1(k_2), h_2(k_2)), which enhances the randomness of probe sequences and reduces clustering effects.[13] Common implementations include using multiplicative hashing for h_1(k), such as h_1(k) = \lfloor m (k \cdot A \mod 1) \rfloor where A is a constant like the golden ratio, to achieve better distribution for integer keys. For h_2(k), a typical form is h_2(k) = R - (k \mod R), where R is a prime number smaller than m, ensuring the required non-zero and coprime properties.[11][14]
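As a concrete illustration (not a prescribed choice), the following Python sketch pairs the division method for h_1 with the R - (k mod R) form for h_2, using a prime table size M = 13 and a smaller prime R = 11 chosen for the example; a multiplicative variant of h_1 is also shown.

    import math

    M = 13                      # table size (prime, chosen for illustration)
    R = 11                      # a prime smaller than M
    A = (math.sqrt(5) - 1) / 2  # multiplicative constant (golden ratio conjugate)

    def h1(k):
        return k % M            # division method: initial probe position

    def h1_multiplicative(k):
        return int(M * ((k * A) % 1))  # multiplicative method for integer keys

    def h2(k):
        return R - (k % R)      # always in 1..R, never zero, and coprime with the prime M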
Probe Sequence Computation
In double hashing, the probe sequence for a key k is computed using the formula
h(i, k) = (h_1(k) + i \cdot h_2(k)) \mod m,
where i is the probe number starting from 0, h_1(k) is the primary hash function providing the initial position, h_2(k) is the secondary hash function determining the step size, and m is the table size. This arithmetic progression generates a unique probe path for each key, traversing the table in increments dictated by h_2(k).[11][15]
The insertion process begins by evaluating h(0, k) to determine the starting slot. If this slot is empty, the key is inserted there. Otherwise, i is incremented to 1, and h(1, k) is computed; this continues sequentially until an empty slot is located, where the key is placed, or until all slots have been probed, indicating a full table. To ensure the probe sequence visits every slot exactly once and avoids premature cycling, h_2(k) must be coprime with m; if \gcd(h_2(k), m) > 1, the sequence may enter an infinite loop by revisiting positions without covering the entire table.[11][15]
For searching a key k, the algorithm follows the identical probe sequence starting from i = 0, examining each slot until the key is found or an empty slot is encountered, at which point the search concludes that the key is absent. Deletion follows a similar path: the sequence is probed until the matching key is located, after which the slot is marked with a special deletion indicator (rather than emptied) to preserve the integrity of probe sequences for other keys during future searches.[11][15]
To illustrate, consider a hash table of size m = 7 with h_1(k) = k \mod 7 and h_2(k) = 1 + (k \mod 5). Inserting the key 12 yields h_1(12) = 5 and h_2(12) = 3, so the initial probe (i=0) places it at position 5, assuming the slot is empty. Next, inserting 19: h_1(19) = 5 (colliding with the prior key) and h_2(19) = 5, so the next probe (i=1) computes (5 + 1 \cdot 5) \mod 7 = 3, and the key is inserted at position 3, assuming it is empty. This step-by-step probing demonstrates how double hashing resolves collisions by following key-specific increments.
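The insertion and search loops described above follow directly from the formula. The Python sketch below is a minimal illustration using the hash functions from the preceding example, with None marking an empty slot.

    def h1(k): return k % 7
    def h2(k): return 1 + (k % 5)

    def probe(k, i, m):
        return (h1(k) + i * h2(k)) % m   # h(i, k) as defined above

    def insert(table, k):
        m = len(table)
        for i in range(m):               # at most m probes
            pos = probe(k, i, m)
            if table[pos] is None or table[pos] == k:
                table[pos] = k
                return pos
        raise RuntimeError("table full") # no empty slot reached along the sequence

    def search(table, k):
        m = len(table)
        for i in range(m):
            pos = probe(k, i, m)
            if table[pos] is None:       # empty slot: the key was never inserted
                return None
            if table[pos] == k:
                return pos
        return None

    table = [None] * 7
    insert(table, 12)   # placed at slot 5
    insert(table, 19)   # collides at slot 5, then placed at slot 3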
Design Considerations
Choosing Hash Functions
The primary hash function h_1(k) in double hashing should distribute keys uniformly across the table indices to minimize the probability of initial collisions and promote even load distribution.[16] Additionally, it should demonstrate the avalanche effect, ensuring that small changes in the input key k result in substantial changes to the output hash value, thereby reducing patterns in collisions.[16]
The secondary hash function h_2(k) must satisfy stricter conditions to generate effective probe steps. It should never evaluate to zero, as this would trap probes at the initial position h_1(k), preventing traversal of the table.[17] Critically, \gcd(h_2(k), m) = 1 must hold for all keys k, where m is the table size, to guarantee that the probe sequence permutes all m slots and avoids short cycles or unreachable positions.[18] h_2(k) should also be inexpensive to compute and independent of h_1(k), to preserve the randomness of probe paths and prevent secondary clustering where keys sharing h_1(k) would otherwise follow identical sequences.[16]
Practical selections for h_2(k) depend on the table size m. When m is prime, simpler modular functions suffice, such as h_2(k) = 1 + (k \mod (m-1)), which yields values from 1 to m-1 and ensures the gcd condition holds due to primality.[18] For m a power of 2, h_2(k) must produce odd values to maintain coprimality with m; a common approach is to force the lowest bit of the secondary hash to 1 (for example, by computing an arbitrary hash of k and applying a bitwise OR with 1), which guarantees an odd, and hence coprime, step size for every key.[17]
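For power-of-two table sizes, the coprimality condition therefore reduces to requiring an odd step. A minimal sketch of this adjustment, assuming m is a power of 2 and using Python's built-in hash purely as a stand-in for the underlying hash function:

    def h2_odd(key, m):
        # Force the lowest bit to 1 so the step is odd and therefore coprime
        # with the power-of-two table size m.
        return (hash(key) % m) | 1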
To validate choices, empirical testing through simulations is recommended, measuring metrics like average insertion probe lengths or degree of clustering under representative key distributions to confirm uniform coverage without degenerate paths.[16] Common pitfalls include selecting an h_2(k) that is constant or even for power-of-2 m, which reduces double hashing to linear probing behavior and creates long clusters; similarly, dependence on h_1(k) can cause all colliding keys to share probe sequences, exacerbating performance degradation.[16]
Table Size Requirements
In double hashing, the table size m is ideally chosen as a prime number so that any non-zero step value h_2(k) < m is automatically coprime with m, ensuring the probe sequence can traverse all slots without premature cycling.[19][20] This coprimality condition, where h_2(k) is the secondary hash value serving as the step size, guarantees a full permutation of the table indices, mimicking a random probe order and minimizing clustering.[21]
If \gcd(h_2(k), m) > 1, the probe sequence will cycle through only a subset of the m slots, specifically m / d positions where d = \gcd(h_2(k), m), potentially leading to failed insertions or searches even when space is available elsewhere in the table.[21] For example, with m = 7 (a prime), any non-zero step h_2(k) between 1 and 6 allows full traversal of all 7 slots; in contrast, for m = 8 (a power of 2) and an even step h_2(k) = 2, the sequence visits only 4 slots because \gcd(2, 8) = 2, restricting access to the even indices.[22]
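The restriction can be checked directly by enumerating the slots a probe sequence reaches; the short Python check below reproduces the m = 8, step 2 case.

    m, start, step = 8, 0, 2
    visited = sorted({(start + i * step) % m for i in range(m)})
    print(visited)   # [0, 2, 4, 6]: only m / gcd(step, m) = 4 of the 8 slots are reachable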
Powers of 2 for m are acceptable alternatives if h_2(k) is adjusted to ensure coprimality, such as by making it always odd, though this may introduce minor clustering risks compared to primes.[23] To maintain performance as the table fills, a common resizing strategy involves doubling the table size to the next suitable value (prime or power of 2) and rehashing all elements when the load factor exceeds a threshold like 0.75, preventing excessive probe lengths.[24]
In modern implementations, table sizes are often aligned with cache line boundaries using powers of 2 to optimize memory access patterns and reduce cache misses, despite the potential for slightly reduced permutation quality in double hashing.[25][26]
Expected Probe Lengths
Under the uniform hashing assumption, where the primary hash function h_1 and secondary hash function h_2 are independent and uniformly distribute keys over the table, the probe sequences in double hashing approximate random permutations, reducing primary clustering effects compared to simpler methods.[27]
For an unsuccessful search, which probes until an empty slot is found, the expected number of probes is \frac{1}{1 - \alpha}, where \alpha = \frac{n}{m} is the load factor with n elements and table size m. This follows from the geometric distribution, as each probe has success probability 1 - \alpha of finding an empty slot.[27]
For a successful search, which stops upon finding the target key, the expected number of probes is \frac{1}{\alpha} \ln \frac{1}{1 - \alpha} under the uniform distribution assumption. This formula arises from averaging the insertion cost over the table's history: a key inserted when the load factor was x required about \frac{1}{1 - x} probes, and averaging over x from 0 to \alpha gives \frac{1}{\alpha} \int_0^{\alpha} \frac{dx}{1 - x} = \frac{1}{\alpha} \ln \frac{1}{1 - \alpha}.[18]
Insertion follows a process similar to an unsuccessful search to locate an empty slot, yielding the same expected probe count of \frac{1}{1 - \alpha}; however, it may incur additional amortized cost from rehashing the entire table if the load factor exceeds a predefined threshold to maintain performance bounds.[27]
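For example, at a load factor of \alpha = 0.75 these expressions give \frac{1}{1 - 0.75} = 4 expected probes for an unsuccessful search or insertion, and \frac{1}{0.75} \ln \frac{1}{1 - 0.75} = \frac{4}{3} \ln 4 \approx 1.85 expected probes for a successful search.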
Deviations from independence or uniformity in h_1 and h_2 can introduce secondary clustering, increasing actual probe lengths beyond these expectations.[27]
Asymptotically, these expected probe lengths scale as O\left(\frac{1}{1 - \alpha}\right) under high load factors, outperforming linear probing's O\left(\frac{1}{(1 - \alpha)^2}\right) for unsuccessful searches due to reduced clustering.[2]
Load Factor and Capacity
In double hashing, the load factor α is defined as the ratio of the number of elements stored (n) to the total number of slots in the table (m), representing the fraction of the table that is occupied. To ensure efficient operation and bound the average number of probes in an unsuccessful search to approximately 4 or fewer, implementations typically maintain α below 0.75. This threshold approximates the behavior of uniform probing, minimizing clustering while allowing reasonable space utilization.[18]
Capacity management in double hashing involves monitoring α during insertions; when it exceeds the predefined threshold (often 0.75), the table size is increased—commonly to the next prime number larger than twice the current size—to restore the load factor and prevent excessive collisions. All existing elements are then rehashed into the new table, a process that, while temporarily costly, maintains amortized O(1) time per insertion through geometric growth in capacity.[28]
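A resizing step of this kind might be sketched as follows (Python; next_prime is a hypothetical helper returning the smallest prime at least as large as its argument, and insert is assumed to perform a standard double-hashing insertion into the given table).

    def maybe_resize(table, count, threshold=0.75):
        # count is the number of stored elements; table uses None for empty slots.
        if count / len(table) <= threshold:
            return table                          # load factor acceptable, keep the table
        new_size = next_prime(2 * len(table))     # hypothetical helper: next prime >= 2m
        new_table = [None] * new_size
        for key in table:
            if key is not None:
                insert(new_table, key)            # rehash every existing element
        return new_table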
Deletions pose challenges in double hashing due to the fixed probe sequences used in open addressing; simply removing an element could disrupt searches for subsequent keys in the same sequence. To address this, tombstone markers are employed to flag deleted slots as available for new insertions but traversable during searches, preserving sequence integrity at the cost of increasing the effective load factor since tombstones continue to occupy space and contribute to probe lengths.
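A deletion routine along these lines can be sketched as follows (Python; probe is the probe-sequence function from the earlier sketch, and TOMBSTONE is a sentinel introduced here for illustration — insertion would additionally have to treat tombstoned slots as available).

    TOMBSTONE = object()                 # sentinel distinct from None (empty) and from any key

    def delete(table, k):
        m = len(table)
        for i in range(m):
            pos = probe(k, i, m)
            if table[pos] is None:       # truly empty slot: the key is not in the table
                return False
            if table[pos] == k:
                table[pos] = TOMBSTONE   # keep the slot traversable for later searches
                return True
        return False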
Empirical studies through simulations demonstrate that performance degrades noticeably above α = 0.8, with probe lengths exceeding 40 in some cases and operation times rising due to heightened collision rates, underscoring the need for proactive resizing. The amortized cost of rehashing remains O(1) per operation, as the infrequent resizing events are offset by the benefits across many insertions.[3]
Variants and Extensions
Triple Hashing
Triple hashing is a conceptual extension of double hashing in open addressing hash tables, where additional hash functions (e.g., a third) can be used to further diversify probe sequences and reduce clustering.[29] While not commonly implemented, it generalizes the approach by applying multiple independent hash functions sequentially or in combination if collisions persist with the first two. Standard double hashing typically suffices for most applications, and triple hashing introduces added computational overhead without proportional benefits in practice.
Enhanced Double Hashing
Enhanced double hashing refines the double hashing approach by incorporating a quadratic term into the probe sequence formula, primarily developed for generating multiple hash locations in Bloom filters but analogous to probing in open-addressed hash tables. The probe position for the i-th attempt is given by
h(i, k) = \left( h_1(k) + i \cdot h_2(k) + \frac{i(i-1)}{2} \right) \mod m,
where h_1(k) and h_2(k) are the primary and secondary hash functions, respectively, and m is the table size. This adjustment helps approximate more uniform permutations, reducing risks of short cycles or clustering compared to basic double hashing, particularly when table sizes are not prime.[30]
The primary purpose of this enhancement is to maintain performance guarantees similar to double hashing while easing constraints on hash function selection. Although originating in probabilistic data structures like Bloom filters, the technique can be applied to hash table probing to improve distribution in constrained environments. However, it is not widely adopted in standard hash table implementations, where simpler double hashing with prime table sizes is preferred.[30]
Implementation Example
An efficient implementation of enhanced double hashing uses incremental updates to compute each probe in O(1) time, avoiding recomputation of the quadratic term: since h(i+1, k) - h(i, k) = h_2(k) + i \pmod m, each successive probe is obtained by adding a step that grows by one per iteration. The following pseudocode illustrates insertion and search, assuming the table size m is prime and the hash functions h1 and h2 produce values in [0, m-1].
function insert(key, table, m):
    h = h1(key) % m
    step = h2(key) % m
    if step == 0:
        step = 1                              # avoid a zero step, which would never advance
    i = 0
    pos = h
    current_step = step
    while table[pos] is occupied and table[pos] != key:
        i += 1
        pos = (pos + current_step) % m        # next probe position
        current_step = (current_step + 1) % m # step grows by one per probe (quadratic term)
        if i >= m:
            return failure                    # probed m slots without success: table full
    table[pos] = key                          # slot is empty or already holds key
    return success

function search(key, table, m):
    h = h1(key) % m
    step = h2(key) % m
    if step == 0:
        step = 1
    i = 0
    pos = h
    current_step = step
    while table[pos] is not empty and table[pos] != key:
        i += 1
        pos = (pos + current_step) % m
        current_step = (current_step + 1) % m
        if i >= m:
            return not found
    return table[pos] == key
This approach ensures O(1) time per probe, matching standard double hashing, with improved distribution from the quadratic adjustment. The method was introduced in work on Bloom filter verification and has influenced optimizations in probabilistic structures, offering potential benefits for hash tables in resource-limited settings, though empirical gains vary by load factor and hash quality.[30]