
Hash collision

A hash collision is a situation in which two distinct inputs to a hash function produce identical output values, also known as a hash clash. In computer science, hash functions map data of arbitrary size to fixed-size values, typically for efficient storage and retrieval in data structures like hash tables, where collisions arise inevitably from the pigeonhole principle when the input space exceeds the output space. To manage collisions in hash tables, common resolution strategies include separate chaining, which links colliding elements in lists at the same index, and open addressing, which probes for the next available slot using methods like linear probing or double hashing. In cryptography, hash collisions pose significant security risks, as they can undermine the integrity of digital signatures, certificates, and other applications by allowing attackers to forge data with the same hash. Cryptographic hash functions are thus designed for collision resistance, making it computationally infeasible for adversaries to find such pairs using probabilistic polynomial-time algorithms, often relying on related properties like preimage and second-preimage resistance as well. Notable real-world vulnerabilities include the 2004 practical collision attack on MD5, which led to demonstrations of forged certificates, and the 2017 SHAttered attack on SHA-1, which produced colliding PDFs and accelerated its deprecation in favor of stronger standards like SHA-256 from the NIST SHA-2 family and the SHA-3 family. These developments highlight the ongoing evolution of hash functions to counter advancing computational power and cryptanalytic techniques.

Fundamentals

Hash functions

A hash function is a deterministic function that takes an input of arbitrary size and produces a fixed-size output value, often referred to as a hash code, hash value, or digest. This mapping enables efficient data organization and retrieval in structures like hash tables, where the output serves as an index into a fixed-size array. Key properties of effective hash functions include determinism, which ensures that identical inputs always yield the same output; uniformity, which promotes an even distribution of hash values across the output space to minimize clustering; and computational efficiency, allowing rapid calculation even for large inputs. These attributes are crucial for practical applications, as poor uniformity can lead to uneven load distribution, while inefficiency hampers performance in time-sensitive operations. Hash functions are broadly categorized into non-cryptographic and cryptographic types. Non-cryptographic hash functions, such as those used in hash tables, prioritize speed and uniformity over security, with examples including multiplicative hashing, where the input key k is multiplied by a constant fraction and scaled by the table size m, yielding h(k) = \lfloor m \cdot \{k a\} \rfloor for some constant a between 0 and 1, where \{x\} denotes the fractional part. In contrast, cryptographic hash functions, like SHA-256, emphasize resistance to attacks such as finding collisions or preimages, making them suitable for security protocols but often slower for general-purpose use. Mathematically, a simple hash function can be represented using the division method: h(k) = k \mod m, where k is the input key treated as an integer and m is the size of the table, producing an index in the range [0, m-1]. This approach is straightforward but requires careful selection of m (often a prime) to achieve good distribution.
Representative examples include the polynomial rolling hash, commonly used for string processing, defined as h(s) = \sum_{i=0}^{n-1} s_i \cdot b^{n-1-i} \mod p, where s_i is the code of the i-th character, b is a base (e.g., 31), and p is a large prime modulus; this allows efficient incremental updates for substrings. Another example is Java's hashCode() method in the Object class, which returns a consistent integer for the object to support hashing in collections like HashMap, with the contract requiring equal objects to produce equal hash codes for consistent bucketing.
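As an illustration, the polynomial rolling hash can be sketched in Python via Horner's rule; the base 31 and the prime modulus 10^9 + 7 are conventional choices assumed here, not mandated by the text.

```python
def poly_hash(s: str, b: int = 31, p: int = 10**9 + 7) -> int:
    """Polynomial rolling hash: sum of s_i * b^(n-1-i) mod p,
    evaluated incrementally by Horner's rule."""
    h = 0
    for ch in s:
        h = (h * b + ord(ch)) % p
    return h

# Determinism: equal inputs always produce equal hashes.
print(poly_hash("apple") == poly_hash("apple"))  # True
# Distinct inputs usually (but not always) produce distinct hashes.
print(poly_hash("apple") == poly_hash("apricot"))
```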

Hash tables

A hash table is an array-based data structure that implements an associative array, mapping keys to values by using a hash function to compute an index into an array of slots or buckets from which the corresponding value can be retrieved. This structure enables efficient storage and access, supporting operations such as lookups, insertions, and deletions in average constant time under suitable conditions. The hash function serves as the primary indexing mechanism, transforming keys into array indices to facilitate rapid access. The core operations of a hash table are straightforward in their basic form. For insertion, the hash function is applied to the key to determine the target slot in the array, where the key-value pair is then stored. Search involves computing the hash of the key to locate the slot and retrieve the associated value if present. Deletion follows a similar process: the hash function identifies the slot, and the key-value pair is removed from it. In the ideal scenario with no collisions, where each key maps uniquely to a distinct slot, these operations achieve O(1) time complexity, akin to direct array access. Selecting an appropriate table size is crucial for effective performance. The array size is often chosen as a prime number to promote even distribution of keys when using modular hashing, or as a power of 2 to enable efficient bitwise operations in certain implementations. The load factor, defined as the ratio of the number of stored elements to the total number of slots, is typically maintained below 0.7; exceeding this threshold increases the potential for performance degradation, prompting resizing such as doubling the table size and rehashing all elements.
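A minimal sketch of the sizing policy described above; the 0.7 threshold and the doubling strategy follow the text, while the bucket-list representation is just scaffolding to keep the example runnable (collision resolution proper is covered in later sections).

```python
class SimpleHashTable:
    """Illustrative table using h(k) = hash(k) mod m and doubling
    when the load factor n/m exceeds 0.7."""

    def __init__(self, m: int = 7):
        self.m = m                        # number of slots
        self.n = 0                        # number of stored elements
        self.slots = [[] for _ in range(m)]

    def load_factor(self) -> float:
        return self.n / self.m

    def insert(self, key, value):
        self.slots[hash(key) % self.m].append((key, value))
        self.n += 1
        if self.load_factor() > 0.7:      # threshold from the text
            self._resize(2 * self.m)      # double and rehash everything

    def _resize(self, new_m: int):
        pairs = [p for slot in self.slots for p in slot]
        self.m, self.n = new_m, 0
        self.slots = [[] for _ in range(new_m)]
        for k, v in pairs:                # every key is rehashed
            self.insert(k, v)
```

After inserting ten keys into a seven-slot table, the table will have doubled (possibly more than once) to keep the load factor at or below 0.7.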

Definition of collision

In hashing, a collision occurs when two distinct keys, denoted as k_1 and k_2 where k_1 \neq k_2, are mapped to the same hash value by the hash function, such that h(k_1) = h(k_2). This phenomenon arises in hash tables, where the hash function h transforms keys from a potentially large key space into a fixed number of slots in an array. The term "collision" specifically refers to this mapping overlap, distinguishing it from other hashing issues like poor distribution. Collisions are inevitable in practical hash tables due to the pigeonhole principle: if the number of keys n exceeds the number of available slots m (i.e., n > m), at least one slot must contain more than one key, guaranteeing a collision. Even when n \leq m, collisions can occur probabilistically because the hash function compresses a vast key space into a finite range, making perfect injectivity impossible for most inputs. In the hashing literature, colliding keys are sometimes called "synonyms," a term used to describe elements that share the same hash address, though "collision" is the more standard modern designation. To illustrate, consider a simple hash table with 5 slots (indices 0 to 4) and a hash function h(k) = k \mod 5. Inserting keys 3, 8, and 13 yields h(3) = 3, h(8) = 3, and h(13) = 3, causing all three to collide at slot 3:
Slot:  0    1    2    3         4
Keys:  -    -    -    3, 8, 13  -
This clustering at a single slot exemplifies how multiple distinct keys can target the same position. Without proper resolution, collisions degrade performance significantly: ideal operations like search, insert, and delete achieve O(1) average time by direct slot access, but in the worst case—such as when all keys collide in one slot—they revert to O(n) time due to linear scanning of the chain or probes. This underscores collisions as a fundamental failure mode in the otherwise efficient direct-addressing paradigm of hash tables.
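The slot-3 pile-up in the example above is easy to reproduce directly:

```python
from collections import defaultdict

m = 5
table = defaultdict(list)        # slot index -> keys that landed there
for k in [3, 8, 13]:
    table[k % m].append(k)       # h(k) = k mod 5

print(dict(table))               # {3: [3, 8, 13]}: all three keys collide
```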

Causes and probability

Deterministic factors

Deterministic factors contributing to hash collisions arise primarily from flaws in hash function design and predictable patterns in input data, leading to non-uniform distribution of keys across buckets. A common issue is the selection of a poor hash function, such as one that uses division by a non-prime m, which can result in systematic clustering where multiple keys map to the same subset of buckets. For instance, if the table size is a power of 2 or 10, keys sharing similar low-order bits or digits, such as strings or numbers with repeating patterns, will frequently collide because the modulo operation preserves these correlations, exacerbating uneven load distribution. Input clustering further amplifies deterministic collisions when similar keys, like consecutive integers or strings with shared prefixes (e.g., "apple" and "apricot"), are processed. In such cases, a simplistic hash function that emphasizes only certain bits or characters may direct these keys to identical or adjacent buckets, creating localized overloads independent of table load factor. An illustrative example involves treating strings as base-256 integers and hashing modulo 255, where permutations like "mac" and "cam" yield the same value since 256 \equiv 1 \pmod{255}, causing unavoidable collisions for patterned textual data. Fixed table sizes also introduce deterministic effects during resizing operations, where rehashing all entries into a larger array can temporarily spike collisions if the new size poorly interacts with the existing hash function. This rehashing, typically triggered when the load factor exceeds a threshold like 0.75, requires recomputing hashes for every key, potentially leading to short-term performance degradation before the reduced load factor stabilizes distribution.
Certain hash functions, such as linear congruential generators of the form h(k) = (a k + b) \mod m, are particularly prone to failures on patterned inputs; for example, when applied to strings with common prefixes or sequential data, poor choices of the parameters a, b, or m relative to the input radix result in systematic mappings to few buckets, undermining uniformity. To mitigate these issues, selecting a prime modulus m promotes better spreading by minimizing arithmetic correlations, while employing universal families of hash functions, where the probability of collision between any two distinct keys is at most 1/m, provides robustness against adversarial or patterned inputs without relying on fixed designs.
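A universal family of the Carter-Wegman form h_{a,b}(k) = ((a k + b) mod p) mod m can be sketched as follows; the Mersenne prime p = 2^61 - 1 is an assumed implementation choice, not part of the definition.

```python
import random

def make_universal_hash(m: int, p: int = 2**61 - 1):
    """Draw one member of the family h(k) = ((a*k + b) mod p) mod m.
    For distinct integer keys below p, the collision probability over
    the random draw of (a, b) is at most about 1/m."""
    a = random.randrange(1, p)    # a must be nonzero
    b = random.randrange(0, p)
    return lambda k: ((a * k + b) % p) % m

h = make_universal_hash(100)
print(h(42), h(43))               # bucket indices in [0, 100)
```

Because a and b are drawn at random per table, an adversary cannot pre-compute a set of keys that all collide, unlike with a fixed hash function.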

Probabilistic analysis

In probabilistic analysis of hash collisions, a key assumption is that of uniform random hashing, where each key is independently and uniformly mapped to any of the m slots in the hash table with equal probability 1/m. This model, introduced in the context of universal hash functions, ensures that for any two distinct keys x and y, the probability that they hash to the same slot is at most 1/m. Under this assumption, the probability that no collisions occur when inserting n keys into a table of size m can be bounded and approximated. Specifically, the probability p(n, m) that all keys hash to distinct slots satisfies p(n, m) \leq \exp\left(-\frac{n(n-1)}{2m}\right), derived from the union bound over all pairwise collisions, where each pair collides with probability at most 1/m. For large m and uniform hashing, this bound is asymptotically tight, yielding the approximation p(n, m) \approx \exp\left(-\frac{n(n-1)}{2m}\right). The expected number of collisions can be derived using linearity of expectation on indicator random variables for each pair of keys. Let X_{ij} be the indicator that keys i and j (for 1 \leq i < j \leq n) collide, so \Pr(X_{ij} = 1) \leq 1/m. The total number of collisions X = \sum_{1 \leq i < j \leq n} X_{ij} has expectation \mathbb{E}[X] = \sum_{1 \leq i < j \leq n} \mathbb{E}[X_{ij}] \leq \binom{n}{2} \cdot \frac{1}{m} = \frac{n(n-1)}{2m}. For uniform hashing, equality holds in the approximation.[](https://www.cs.princeton.edu/~hy2/teaching/fall22-cos521/notes/lec2.pdf) The load factor $\alpha = n/m$ plays a central role, as the expected number of collisions grows quadratically with $\alpha$, approximately $\alpha^2 m / 2$.
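These quantities are easy to check numerically; a short sketch comparing the exact no-collision probability with the expectation formula under the uniform-hashing model:

```python
import math

def no_collision_prob(n: int, m: int) -> float:
    """Exact probability that n uniformly hashed keys occupy distinct slots."""
    p = 1.0
    for i in range(n):
        p *= (m - i) / m
    return p

def expected_collisions(n: int, m: int) -> float:
    """E[X] = n(n-1)/(2m) by linearity of expectation."""
    return n * (n - 1) / (2 * m)

n, m = 100, 1000
print(expected_collisions(n, m))          # 4.95 expected colliding pairs
print(no_collision_prob(n, m))            # small: a collision is near-certain
print(math.exp(-n * (n - 1) / (2 * m)))   # the exponential approximation
```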
Collisions thus remain low for $\alpha \ll 1$ but escalate rapidly as $\alpha$ approaches 1, motivating table resizing in practice.[](https://courses.engr.illinois.edu/cs473/sp2016/notes/12-hashing.pdf) To illustrate, consider varying $n$ and $m$: for $m = 1000$ and $n = 10$ ($\alpha = 0.01$), the probability of no collision is approximately $\exp(-0.045) \approx 0.956$, with expected collisions $\approx 0.045$. For $n = 100$ ($\alpha = 0.1$), this drops to $\approx \exp(-4.95) \approx 0.007$, with expected collisions $\approx 4.95$. For $n = 500$ ($\alpha = 0.5$), the probability is $\approx \exp(-124.75) \approx 10^{-54}$, with expected collisions $\approx 124.75$, showing near-certainty of multiple collisions. These values highlight how collision rates surge with increasing load.[](https://didawiki.cli.di.unipi.it/lib/exe/fetch.php/matematica/asd/asd_17/clrsperfecthash.pdf)

Birthday paradox

The birthday paradox illustrates a counterintuitive aspect of probability: in a group of just 23 randomly selected people, the probability that at least two share the same birthday exceeds 50%, assuming 365 equally likely birthdays and ignoring leap years. This result arises because the likelihood of a shared birthday grows quadratically with the group size, rather than linearly as intuition might suggest.[](http://belohlavek.inf.upol.cz/vyuka/Cormen-birthday-paradox-analysis.pdf) In the context of hashing, the birthday paradox serves as a direct analogy for collisions in hash tables, where birthdays represent hash slots and people represent keys. For n keys hashed uniformly at random into m slots, the probability of at least one collision is approximately $1 - e^{-n^2 / (2m)}$, which reaches 50% when $n \approx \sqrt{2 m \ln 2}$. To sketch the derivation, consider the probability of no collisions: it equals the product $\prod_{i=1}^{n-1} \left(1 - \frac{i}{m}\right)$.
For $n \ll m$, this approximates $e^{-\sum_{i=1}^{n-1} i/m} = e^{-n(n-1)/(2m)} \approx e^{-n^2 / (2m)}$, so the collision probability follows as its complement. This shows collisions become likely far sooner than the naive $n > m$ threshold.[](http://belohlavek.inf.upol.cz/vyuka/Cormen-birthday-paradox-analysis.pdf) Applying this to practical hashing, a 32-bit [hash function](/page/Hash_function) provides $m = 2^{32} \approx 4.3 \times 10^9$ slots, yet inserting about 77,000 keys yields a 50% chance of collision, demonstrating how even vast hash spaces fill unexpectedly quickly under random mapping. The [birthday problem](/page/Birthday_problem) itself originated in [probability theory](/page/Probability_theory) with an analysis by [Richard von Mises](/page/Richard_von_Mises) in 1939 and gained prominence in [computer science](/page/Computer_science) for [hash table](/page/Hash_table) analysis starting in the 1970s, notably in seminal texts on algorithms.[](http://belohlavek.inf.upol.cz/vyuka/Cormen-birthday-paradox-analysis.pdf)[](https://mathworld.wolfram.com/BirthdayProblem.html)

Resolution strategies

Separate chaining

Separate chaining is a collision resolution technique for hash tables in which each slot of the underlying [array](/page/Array) points to a [linked list](/page/Linked_list) that stores all keys hashing to that slot, thereby grouping colliding elements together without overwriting any.
This approach transforms the hash table into an [array](/page/Array) of lists, where collisions—occurrences of multiple distinct keys mapping to the same slot—are handled by appending elements to the appropriate list.[](https://ocw.mit.edu/courses/6-006-introduction-to-algorithms-fall-2011/d3e4d64266d481c74c9e7c15e09999fe_MIT6_006F11_lec08.pdf) Insertion involves computing the [hash](/page/Hash) value of the [key](/page/Key) to identify the target slot and then appending the key-value pair to the front of the [linked list](/page/Linked_list) at that slot, ensuring constant-time addition under simple uniform hashing assumptions. Search and deletion operations require traversing the [linked list](/page/Linked_list) at the computed slot linearly from the head until the matching [key](/page/Key) is located or the list ends, with deletion typically involving pointer adjustments to remove the [node](/page/Node). These processes leverage the auxiliary storage of the lists to maintain all relevant elements without probing other slots.[](https://ocw.mit.edu/courses/6-006-introduction-to-algorithms-fall-2011/d3e4d64266d481c74c9e7c15e09999fe_MIT6_006F11_lec08.pdf)[](https://www.cs.princeton.edu/courses/archive/fall05/cos226/lectures/hash.pdf) The average-case time complexity for insertion, search, and deletion is $O(1 + \alpha)$, where $\alpha = n/m$ is the load factor with $n$ elements and $m$ slots, assuming keys [hash](/page/Hash) uniformly and independently; this remains efficient even for $\alpha > 1$.
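A minimal sketch of these chaining operations, with Python lists standing in for the linked lists:

```python
class ChainedHashTable:
    """Separate chaining: each slot holds a chain of (key, value) pairs."""

    def __init__(self, m: int = 8):
        self.m = m
        self.slots = [[] for _ in range(m)]

    def _chain(self, key):
        return self.slots[hash(key) % self.m]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:                  # key already present: update
                chain[i] = (key, value)
                return
        chain.insert(0, (key, value))     # prepend, like list-head insertion

    def search(self, key):
        for k, v in self._chain(key):     # linear scan of the chain
            if k == key:
                return v
        return None

    def delete(self, key) -> bool:
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain.pop(i)
                return True
        return False
```

Constructing the table with m = 1 forces every key into a single chain, which reproduces the O(n) worst case.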
In the worst case, performance degrades to $O(n)$ if all keys collide into a single list.[](https://ocw.mit.edu/courses/6-006-introduction-to-algorithms-fall-2011/d3e4d64266d481c74c9e7c15e09999fe_MIT6_006F11_lec08.pdf)[](https://www.cs.princeton.edu/courses/archive/fall05/cos226/lectures/hash.pdf) Separate chaining offers simplicity in implementation and robust handling of high load factors without requiring table resizing during insertions, making it suitable for scenarios with unpredictable collision rates. A key drawback is the memory overhead from pointers in linked lists and potential inefficiency from pointer chasing, which can hinder performance on modern hardware. For practical implementations, especially with long chains, linked lists may be replaced by balanced binary search trees to achieve $O(\log k)$ search time per chain, where $k$ is the chain length, though this adds complexity.[](https://ocw.mit.edu/courses/6-006-introduction-to-algorithms-fall-2011/d3e4d64266d481c74c9e7c15e09999fe_MIT6_006F11_lec08.pdf)[](https://www.cs.princeton.edu/courses/archive/fall05/cos226/lectures/hash.pdf)

Open addressing

Open addressing, also known as closed hashing, is a collision resolution strategy in hash tables where all elements are stored directly within the fixed-size [array](/page/Array) of the table itself, without using external data structures. When a collision occurs, meaning the computed hash index for a key is already occupied, the method probes for the next available slot in the array according to a predefined sequence. This approach contrasts with separate chaining, which resolves collisions by linking multiple elements at the same index.
The probe sequence is typically defined as $ h(k, i) = (h'(k) + f(i)) \mod m $, where $ h'(k) $ is the initial [hash function](/page/Hash_function), $ i $ is the probe number starting from 0, $ f(i) $ is the probing function, and $ m $ is the table size.[](https://users.ece.utexas.edu/~adnan/360C/hash.pdf) Several common probing methods exist to determine the offset $ f(i) $. Linear probing uses $ f(i) = i $, sequentially checking the next slots, which is simple but prone to primary clustering where occupied slots form long contiguous blocks, increasing probe lengths. Quadratic probing employs $ f(i) = i^2 $ (or more generally $ c_1 i + c_2 i^2 $), which spreads probes more evenly and mitigates primary clustering, though it can still exhibit secondary clustering for keys sharing the same initial hash; it guarantees finding a slot within $ m/2 + 1 $ probes if $ m $ is prime and the load factor $ \alpha \leq 0.5 $. Double hashing uses $ f(i) = i \cdot h_2(k) $, where $ h_2(k) $ is a second independent hash function, providing better distribution and minimizing clustering, with experimental evidence showing strong performance when the table size is prime and $ \alpha < 0.5 $.[](https://users.ece.utexas.edu/~adnan/360C/hash.pdf)[](https://www.rose-hulman.edu/class/cs/csse230/202020/Slides/19-HashTableAnalysis.pdf) For insertion, the process begins at the initial hash index and follows the probe sequence until an empty slot (NIL) is found, at which point the key is placed there; the average number of probes required is $ 1/(1 - \alpha) $, assuming uniform hashing and load factor $ \alpha = n/m < 1 $. Searching follows the same probe sequence from the initial index, continuing until the key is matched or an empty slot is encountered, indicating the key is absent; the expected probes for an unsuccessful search is at most $ 1/(1 - \alpha) $, while for a successful search it approximates $ (1 + 1/(1 - \alpha))/2 $. 
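The probe sequence above, specialized to linear probing (f(i) = i), can be sketched as follows; deletion is omitted here because it requires the tombstone mechanism:

```python
class LinearProbingTable:
    """Open addressing with linear probing: h(k, i) = (h'(k) + i) mod m."""

    _EMPTY = object()                         # sentinel for a free slot

    def __init__(self, m: int = 8):
        self.m = m
        self.keys = [self._EMPTY] * m
        self.vals = [None] * m

    def insert(self, key, value):
        for i in range(self.m):               # probe sequence f(i) = i
            j = (hash(key) + i) % self.m
            if self.keys[j] is self._EMPTY or self.keys[j] == key:
                self.keys[j], self.vals[j] = key, value
                return
        raise RuntimeError("table full")      # load factor reached 1

    def search(self, key):
        for i in range(self.m):
            j = (hash(key) + i) % self.m
            if self.keys[j] is self._EMPTY:   # empty slot ends the sequence
                return None
            if self.keys[j] == key:
                return self.vals[j]
        return None
```

In CPython, small integers hash to themselves, so inserting keys 0, 8, and 16 into an 8-slot table sends all three to slot 0 and they settle in slots 0, 1, and 2, a small instance of primary clustering.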
Deletion is more challenging, as simply emptying a slot would break probe sequences for subsequent searches; instead, a "tombstone" or deleted marker is used to indicate the slot is available for insertion but must be traversed during searches to preserve chain integrity.[](https://users.ece.utexas.edu/~adnan/360C/hash.pdf) The time complexity for average-case insertion, search, and deletion in open addressing is $ O(1) $ when $ \alpha $ is kept below 0.75, but performance degrades sharply as $ \alpha $ approaches 1, potentially leading to $ O(n) $ worst-case probes without resizing. To maintain efficiency, tables are typically resized (e.g., doubled) and rehashed when $ \alpha > 0.5 $ or 0.75. Advantages include better [cache](/page/Cache) locality due to sequential [memory](/page/Memory) access and no overhead from pointers, making it space-efficient at low loads. However, it suffers from clustering issues in simpler probing schemes, complicates deletions due to tombstones (which effectively increase the load factor), and cannot handle $ \alpha > 1 $ without overflow. These trade-offs are analyzed in detail in foundational works on hashing.[](https://www.rose-hulman.edu/class/cs/csse230/202020/Slides/19-HashTableAnalysis.pdf)

Coalesced hashing

Coalesced hashing is a collision resolution strategy for [hash tables](/page/Hash_table) that integrates aspects of [open addressing](/page/Open_addressing) and separate [chaining](/page/Chaining), utilizing a unified overflow structure to manage collisions more efficiently than pure [linear probing](/page/Linear_probing). In this method, the hash table consists of a primary address region followed by a single overflow area implemented as linked lists.
When a collision occurs at the initial hash-computed slot, [linear probing](/page/Linear_probing) begins from that slot to locate the first available position in the overflow area, where the new entry is inserted and linked back to the original collision point, forming coalesced chains that tend to grow contiguously and reduce primary clustering.[](https://dl.acm.org/doi/10.1145/358728.358745) The technique was originally proposed in [1959](/page/1959) as part of early efforts to handle identifiers in language processors, marking one of the initial practical implementations of hash tables during the [1950s](/page/1950s) and [1960s](/page/1960s). For insertion, a key is first hashed to its primary slot; if empty, it is placed directly there. On collision, probing proceeds linearly through occupied slots until an empty one is found in the overflow area, at which point the new entry is stored and a backward link is established from the overflow slot to the primary [hash](/page/Hash) slot, ensuring chains coalesce around collision origins.[](https://www2.math.uu.se/~svantejs/papers/sj177.pdf) Search operations in coalesced hashing start at the primary hash slot for the key. An empty primary slot indicates the key is absent; if the slot contains a non-matching key, the process follows the forward link to traverse the associated chain in the overflow area until the key is found or the chain ends, indicating an unsuccessful search. This approach limits probe sequences primarily to chain traversals rather than extensive linear scans, enhancing locality compared to standard open addressing.[](https://dl.acm.org/doi/10.1145/358728.358745)[](https://epubs.siam.org/doi/10.1137/0212046) Coalesced hashing mitigates the clustering issues inherent in pure [linear probing](/page/Linear_probing) by confining overflow chains to a dedicated area, leading to more predictable performance under high load factors.
The expected number of probes for search is approximately $1 + \frac{\alpha}{2(1 - \alpha)}$, and for insertion and deletion approximately $1 + \frac{\alpha}{1 - \alpha}$, where $\alpha$ is the load factor. This provides better performance than linear probing's $\frac{1}{1 - \alpha}$ for unsuccessful searches while using fewer pointers than separate chaining.[](https://dl.acm.org/doi/10.1145/322374.322375)[](https://epubs.siam.org/doi/10.1137/0212046) Variants of coalesced hashing include early-insertion schemes, where colliding entries are linked immediately after the primary slot rather than at the chain's end, and deletion methods that use deferred removal to maintain chain integrity by marking slots as logically deleted without immediate relocation. These adaptations, analyzed in subsequent works, further optimize probe counts and support dynamic operations in resource-constrained environments.[](https://dl.acm.org/doi/10.1145/358728.358745)[](https://epubs.siam.org/doi/10.1137/0212046)

Performance considerations

Time and space complexity

The [time complexity](/page/Time_complexity) of hash table operations, including insertion, search, and deletion, is analyzed using amortized bounds, which account for the average cost over a sequence of operations assuming uniform hashing and low load factors. Under these conditions, all major collision resolution methods (separate [chaining](/page/Chaining), [open addressing](/page/Open_addressing), and coalesced hashing) achieve an average-case [time complexity](/page/Time_complexity) of O(1) per operation, as collisions are resolved efficiently without excessive probing or traversal.
However, the worst-case [time complexity](/page/Time_complexity) can degrade to O(n) for separate [chaining](/page/Chaining) if all keys hash to the same bucket, or for [open addressing](/page/Open_addressing) and coalesced hashing in cases of degenerate probing sequences leading to primary clustering.[](https://algs4.cs.princeton.edu/lectures/keynote/34HashTables.pdf)[](https://www.cs.rice.edu/~as143/COMP480_580_Fall24/scribe/chaining_probing_Analysis.pdf)[](https://dl.acm.org/doi/pdf/10.1145/358728.358745) Space complexity varies by resolution method. Separate chaining requires O(n + m) space, where n is the number of keys and m is the number of buckets, due to additional storage for pointers in linked lists at each bucket. In contrast, open addressing uses a fixed [array](/page/Array) of size m (with m > n) for O(m) [space](/page/Space_complexity), avoiding extra pointers but necessitating larger m to maintain performance. Coalesced hashing, a hybrid approach, achieves O(m) [space](/page/Space_complexity) similar to open addressing while incorporating limited linking for collisions, enabling higher load factors without overflow.[](https://graphics.stanford.edu/courses/cs161-18-winter/LectureSlides/ValiantCS161Lecture08.pdf)[](https://algs4.cs.princeton.edu/lectures/keynote/34HashTables.pdf)[](https://dl.acm.org/doi/pdf/10.1145/358728.358745) The load factor α = n/m significantly influences performance, as it measures table utilization and collision likelihood. In separate chaining, the average chain length approximates α, leading to O(α) time for traversals in the average case. For [linear probing](/page/Linear_probing) in [open addressing](/page/Open_addressing), the average probe sequence length for unsuccessful searches is approximately $\frac{1}{2(1-\alpha)^2}$, which grows rapidly as α approaches 1, often prompting resizing at α ≈ 0.5 to keep costs near O(1).
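Evaluating the two average-cost expressions above makes the degradation near α = 1 concrete:

```python
# Average chain length for separate chaining (~alpha) versus the quoted
# approximation for unsuccessful linear-probing searches (~1/(2(1-alpha)^2)).
for alpha in (0.25, 0.50, 0.75, 0.90):
    chain_len = alpha
    probes = 1 / (2 * (1 - alpha) ** 2)
    print(f"alpha={alpha:.2f}  chain~{chain_len:.2f}  probes~{probes:.1f}")
# Chaining grows linearly with alpha, while the probing estimate
# rises from 2.0 at alpha=0.50 to 50.0 at alpha=0.90.
```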
Coalesced hashing mitigates clustering, yielding average probe lengths around 1.3–1.4 at α ≈ 0.73, outperforming pure [linear probing](/page/Linear_probing) at higher loads.[](https://www.jsums.edu/nmeghanathan/files/2017/08/CSC323-Fall2017-Module-7-Hashing.pdf)[](https://www.cs.rice.edu/~as143/COMP480_580_Fall24/scribe/chaining_probing_Analysis.pdf)[](https://kuscholarworks.ku.edu/server/api/core/bitstreams/2c4b0386-f00d-4a2c-a004-28c0711c1f8f/content) To manage load factor growth, hash tables employ rehashing: when α exceeds a [threshold](/page/Threshold) (typically 0.75), the table size doubles to 2m, and all n keys are reinserted into the new [array](/page/Array), incurring O(n) cost for that [operation](/page/Operation). Over a sequence of n insertions starting from an empty [table](/page/Table), the total rehashing cost is O(n), since the reinsertion counts across doublings form a geometric series summing to less than 2n, yielding an amortized O(1) cost per insertion across all methods.[](https://www.cs.cornell.edu/courses/cs3110/2008fa/lectures/lec22_amort.html)

| Operation | Separate Chaining (Average/Worst) | Open Addressing (Average/Worst) | Coalesced Hashing (Average/Worst) |
|-----------|-----------------------------------|---------------------------------|-----------------------------------|
| Insert    | O(1) / O(n)                       | O(1) / O(n)                     | O(1) / O(n)                       |
| Search    | O(1) / O(n)                       | O(1) / O(n)                     | O(1) / O(n)                       |
| Delete    | O(1) / O(n)                       | O(1) / O(n)                     | O(1) / O(n)                       |

This table summarizes the big-O complexities under uniform hashing assumptions, with averages holding for α < 1 and worst cases arising from adversarial inputs or high loads.[](https://algs4.cs.princeton.edu/lectures/keynote/34HashTables.pdf)[](https://www.cs.rice.edu/~as143/COMP480_580_Fall24/scribe/chaining_probing_Analysis.pdf)[](https://dl.acm.org/doi/pdf/10.1145/358728.358745)

Cache efficiency

In hash tables, collision resolution strategies significantly influence [CPU cache](/page/CPU_cache) performance due to how they access
[memory](/page/Memory). Separate [chaining](/page/Chaining), which links colliding elements via pointers, often leads to random [memory](/page/Memory) jumps during traversal, resulting in poor spatial locality and increased cache misses as each pointer chase may fetch a new [cache](/page/Cache) line.[](https://arxiv.org/pdf/1807.04345) In contrast, open addressing methods like [linear probing](/page/Linear_probing) promote sequential [memory](/page/Memory) access, enhancing spatial locality by keeping related probes within the same or adjacent [cache](/page/Cache) lines, thereby reducing cache misses per operation.[](https://www.cs.cornell.edu/courses/JavaAndDS/files/CachingAffectsHashing.pdf) To further optimize cache efficiency, cache-conscious variants of [open addressing](/page/Open_addressing) have been developed. [Cuckoo hashing](/page/Cuckoo_hashing), which uses multiple hash functions and displaces elements between tables or positions, can suffer from scattered accesses but benefits from compact storage that minimizes overall footprint. Hopscotch hashing improves upon this by restricting displacements to a small "neighborhood" (typically 32-64 slots) around the ideal position, ensuring related elements remain within a few cache lines and limiting lookups to at most two cache loads even at high load factors. Key metrics for cache efficiency include cache misses per lookup or insertion.
Linear probing generally incurs fewer misses than double hashing, as the latter's non-sequential jumps disrupt locality, though double hashing may perform comparably or better with larger records where probe length savings offset the cache penalty.[](https://www.cs.cornell.edu/courses/JavaAndDS/files/CachingAffectsHashing.pdf) For instance, at 70% load factors with uniform data distributions, linear probing can achieve up to 20-30% fewer cache misses than double hashing for small records (e.g., 8 bytes).[](https://www.cs.cornell.edu/courses/JavaAndDS/files/CachingAffectsHashing.pdf) Additional optimizations target [hardware](/page/Hardware) specifics. Using power-of-two table sizes enables modulo-free indexing via bit masking, avoiding expensive division operations and improving instruction-level [cache](/page/Cache) utilization during resizing and probing.[](https://www.usenix.org/system/files/conference/atc16/atc16_paper-breslow.pdf) SIMD instructions further enhance efficiency by vectorizing probe sequences, allowing multiple comparisons per [cache](/page/Cache) line load; for example, [SSE](/page/SSE)/AVX extensions can process 4-8 probes simultaneously, reducing effective misses in [linear probing](/page/Linear_probing) schemes.[](https://www.usenix.org/system/files/conference/atc16/atc16_paper-breslow.pdf) Empirical studies from the [2000s](/page/2000s) highlight the impact of these designs. Benchmarks on early multi-core systems showed [hopscotch](/page/Hopscotch) hashing delivering up to 2x the throughput of traditional concurrent hash maps (e.g., Java's ConcurrentHashMap) at 40% load, with sustained [performance](/page/Performance) up to 90% density due to minimized [cache](/page/Cache) misses, compared to 2-3x slowdowns in [cuckoo hashing](/page/Cuckoo_hashing) from poor locality. 
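The modulo-free indexing mentioned above depends on the table size being a power of two, so that `h % m` reduces to a bitwise AND with `m - 1`; a minimal illustration:

```python
m = 1 << 4           # table size 16, a power of two
mask = m - 1         # 0b1111

def index(h: int) -> int:
    return h & mask  # equivalent to h % m, without a division instruction

for h in (5, 21, 1_000_003):
    assert index(h) == h % m
print(index(21))     # 5
```

The trade-off, as noted earlier in the discussion of deterministic collision causes, is that masking keeps only the low-order bits, so the hash function itself must mix those bits well.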
Overall, cache-friendly collision resolution yielded 2-5x speedups over naive [chaining](/page/Chaining) in memory-bound workloads on processors like [Intel](/page/Intel) [Xeon](/page/Xeon) ([2000s](/page/2000s) era).[](http://jakubiuk.net/stuff/hash_tables_cache_performance.pdf)

## Cryptographic aspects

### Collision resistance

Collision resistance is a core security property of cryptographic hash functions: it must be computationally infeasible for an adversary to find two distinct inputs that produce the same output hash value.[](https://eprint.iacr.org/2004/035.pdf) This property ensures the hash function behaves as a one-way shrinking [map](/page/Map) while resisting deliberate attempts to generate collisions, distinguishing it from preimage resistance—where recovering any input from a given hash is hard—and second-preimage resistance—where finding a different input matching the hash of a known input is infeasible.[](https://eprint.iacr.org/2004/035.pdf) Unlike these, collision resistance concerns the discovery of any pair of colliding inputs, with no constraint tying either input to a prior target.[](https://csrc.nist.gov/glossary/term/cryptographic_hash_function) For an ideal n-bit [cryptographic hash function](/page/Cryptographic_hash_function), finding a collision requires approximately $2^{n/2}$ operations, the generic attack complexity set by the birthday paradox.[](https://eprint.iacr.org/2004/035.pdf) This square-root reduction in effort relative to the full output space underscores the need for sufficiently large output sizes to maintain practical security; for instance, the SHA-256 [hash function](/page/Hash_function), with a 256-bit output, provides 128 bits of [collision resistance](/page/Collision_resistance), meaning an attacker would require about $2^{128}$ operations for a successful collision under generic
attacks.[](https://nvlpubs.nist.gov/nistpubs/legacy/sp/nistspecialpublication800-107r1.pdf) The importance of collision resistance manifests in critical applications where hash integrity prevents tampering or forgery. In digital signatures, it guarantees that an attacker cannot replace a signed message with a colliding one that verifies under the same signature, preserving authenticity and [non-repudiation](/page/Non-repudiation).[](https://nvlpubs.nist.gov/nistpubs/legacy/sp/nistspecialpublication800-107r1.pdf) [Blockchain](/page/Blockchain) systems rely on it to secure hash-based structures like chains and Merkle trees, ensuring that alterations to transaction data would require infeasible collisions to go undetected.[](https://eprint.iacr.org/2019/315.pdf) Even in password hashing, collision resistance bolsters overall resilience by mitigating risks from hash equalities that could indirectly aid in dictionary or brute-force attacks on stored credentials.[](https://auth0.com/blog/birthday-attacks-collisions-and-password-strength/)

### Known attacks and vulnerabilities

MD5, designed in 1992 by Ronald Rivest as a 128-bit [cryptographic hash function](/page/Cryptographic_hash_function), was first demonstrated to be vulnerable to collision attacks in 2004, when researchers led by Xiaoyun Wang published a method to find collisions in fewer than $2^{39}$ MD5 compression function calls using differential cryptanalysis. The breakthrough relied on identifying differential paths through MD5's structure, allowing the construction of two distinct two-block (1024-bit) messages with the same hash value.
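The generic birthday-bound complexity described above is easy to demonstrate empirically by truncating a strong hash. The sketch below is illustrative only (the helper `truncated_hash` and the 24-bit width are arbitrary choices): it finds a collision on the first 24 bits of SHA-256 after roughly $2^{12}$ evaluations, matching the $2^{n/2}$ estimate.

```python
import hashlib

def truncated_hash(data: bytes, bits: int = 24) -> int:
    # First `bits` bits of SHA-256, standing in for a toy n-bit hash.
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - bits)

def birthday_collision(bits: int = 24):
    # Generic birthday search: hash distinct inputs until two agree.
    # Expected work is on the order of 2**(bits / 2) hash evaluations.
    seen = {}
    count = 0
    while True:
        msg = count.to_bytes(8, "big")
        h = truncated_hash(msg, bits)
        if h in seen:
            return seen[h], msg, count + 1   # colliding pair, trials used
        seen[h] = msg
        count += 1

m1, m2, trials = birthday_collision(24)
assert m1 != m2 and truncated_hash(m1) == truncated_hash(m2)
```

Against a full 256-bit output the identical search would need on the order of $2^{128}$ evaluations, which is why the attacks below rely on structural weaknesses rather than generic search.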
By 2007, Marc Stevens and collaborators extended this to chosen-prefix collisions, enabling attackers to forge messages with arbitrary differing prefixes that collide after appending specific suffixes, at a cost of approximately $2^{50}$ MD5 evaluations.[](https://marc-stevens.nl/research/papers/EC07-SLdW.pdf) Practical exploitation materialized in 2008 when researchers, including Stevens and Alexander Sotirov, created colliding [X.509](/page/X.509) certificates using MD5, demonstrating the ability to forge digital certificates for different identities that browsers would accept as valid.[](https://marc-stevens.nl/research/hashclash/rogue-ca/) SHA-1, a 160-bit [hash function](/page/Hash_function) standardized by NIST in 1995, faced increasing scrutiny leading to its deprecation for most uses by NIST in 2011 due to theoretical weaknesses and partial [collision attack](/page/Collision_attack)s.[](https://csrc.nist.gov/projects/hash-functions) A landmark practical collision was achieved in 2017 by a team from [Google](/page/Google) and CWI [Amsterdam](/page/Amsterdam), who generated two distinct PDF files with identical [SHA-1](/page/SHA-1) hashes using an identical-prefix collision attack requiring about $2^{63}$ [SHA-1](/page/SHA-1) computations on a custom [GPU cluster](/page/GPU_cluster).[](https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html) The attack built on earlier differential cryptanalysis of SHA-1, chaining two near-collision blocks along carefully constructed differential paths through the function's 80 rounds. Chosen-prefix variants of the MD5 and [SHA-1](/page/SHA-1) attacks additionally exploit the birthday paradox in a birthday search phase on intermediate hash values, bringing the two prefixes' internal states close enough for near-collision blocks to cancel the remaining difference; this is what makes forgeries with meaningful, attacker-chosen content feasible. These vulnerabilities have had significant real-world impacts.
In 2012, the Flame malware exploited an MD5 chosen-prefix collision to forge a Microsoft code-signing certificate, allowing it to masquerade as a legitimate Windows update and infect systems in targeted espionage campaigns.[](https://www.microsoft.com/en-us/msrc/blog/2012/06/flame-malware-collision-attack-explained) For SHA-1, the 2017 collision raised alarms for systems like Git, which uses SHA-1 for object integrity; while Git's design mitigates some risks through content-addressing, attackers could potentially create colliding objects to subvert version history.[](https://lwn.net/Articles/715716/) Similarly, SHA-1's use in SSL/TLS certificates led to its phase-out by major browsers in 2017, as collisions could enable man-in-the-middle attacks by forging valid certificates. These incidents underscore the failure of collision resistance in legacy hashes, prompting urgent mitigations.[](https://csrc.nist.gov/projects/hash-functions) In response, the cryptographic community has accelerated transitions to more secure alternatives. NIST finalized [SHA-3](/page/SHA-3) in 2015 as a sponge-based hash family designed to resist known attack vectors, recommending its adoption alongside [SHA-2](/page/SHA-2) for applications requiring high [collision resistance](/page/Collision_resistance).[](https://csrc.nist.gov/pubs/fips/202/final) BLAKE2, an optimized variant of the SHA-3 finalist BLAKE, offers faster performance than [MD5](/page/MD5) or [SHA-1](/page/SHA-1) while producing digests of up to 512 bits (providing up to 256 bits of collision resistance), and is specified in [RFC](/page/RFC) 7693 for broad use.
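The recommended replacements are widely available in standard libraries, so migrating is often a one-line change. As an illustration only (Python's standard hashlib module with an arbitrary input):

```python
import hashlib

data = b"example message"

# Legacy: practical collision attacks exist; avoid for integrity protection.
legacy = hashlib.sha1(data).hexdigest()       # 160-bit digest

# Recommended replacements, all in Python's standard library:
sha2  = hashlib.sha256(data).hexdigest()      # SHA-2 family, 256-bit digest
sha3  = hashlib.sha3_256(data).hexdigest()    # SHA-3 (sponge-based), 256-bit
blake = hashlib.blake2b(data).hexdigest()     # BLAKE2b, 512-bit digest

# Digest lengths in bits (each hex character encodes 4 bits):
for hx, bits in [(legacy, 160), (sha2, 256), (sha3, 256), (blake, 512)]:
    assert len(hx) * 4 == bits
```

Each n-bit digest offers at most n/2 bits of collision resistance against generic attack, which is why 256-bit and larger outputs are now the norm.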
Current NIST guidance, set out in SP 800-131A Revision 2 (2019) and a 2022 transition announcement, disallows [SHA-1](/page/SHA-1) for generating new digital signatures (a restriction in effect since the end of 2013) and mandates phasing it out entirely by December 31, 2030, urging migration to [SHA-2](/page/SHA-2) or [SHA-3](/page/SHA-3) for all cryptographic protections.[](https://csrc.nist.gov/news/2022/nist-transitioning-away-from-sha-1-for-all-apps)
