Fact-checked by Grok 2 weeks ago

Birthday attack

In , a birthday attack is a type of collision-finding attack on functions that leverages the birthday paradox—a probabilistic phenomenon where the likelihood of at least one shared birthday in a group of randomly selected people rises sharply with group size—to identify two distinct inputs producing the same output with far fewer computations than exhaustive search would demand. For a function with an output space of size n (e.g., 2^b for b-bit hashes), the attack requires approximately √(2 * n * ln(2)) ≈ 1.177 √n hash evaluations to achieve a 50% probability of collision, reducing the effort from O(n) to O(√n). This makes it a significant threat to systems relying on , such as digital signatures and integrity checks, as demonstrated in early proposals like Gideon Yuval's 1979 method to forge signatures in Rabin's cryptosystem by generating colliding message pairs. The itself, which underpins the attack, illustrates that in a room of just 23 people, the probability of at least two sharing a exceeds 50%, calculated as 1 minus the product of (365 - i)/365 for i from 0 to 22, yielding approximately 0.507. This result stems from the quadratic growth in pairwise comparisons (k choose 2 for k samples), making collisions inevitable sooner than intuition suggests in uniform random distributions. In cryptographic contexts, the paradox translates directly to outputs treated as "birthdays," enabling attackers to precompute and store many hashes until a match emerges, often using storage-optimized variants like for efficiency. Historically, the birthday attack gained prominence through Yuval's work, which highlighted its potential against naive -based by allowing an adversary to craft equivalent messages indistinguishable to the verifier. While pure birthday attacks on strong modern hashes like SHA-256 (requiring ~2^128 operations) remain computationally infeasible, they set the baseline for security analysis and have informed vulnerabilities in weaker functions, such as , where collisions were found in 2^64 effort bounds, though actual breaks used more sophisticated differential cryptanalysis. Defenses include using larger hash outputs, randomized salting, or target-specific in protocols like digital signatures to mitigate the attack's impact.

Background and Paradox

The Birthday Paradox

The birthday paradox, also known as the , describes the counterintuitive probability that, in a random group of 23 people assuming 365 possible birthdays and , there is approximately a 50% chance that at least two individuals share the same birthday. This result defies common intuition, which might suggest a much larger group size is needed given the 365 possible days, but it highlights how probabilities of shared events accumulate rapidly in moderate-sized sets. The intuitive explanation centers on pairwise comparisons rather than sequential checks against all possible dates. In a group of n people, there are \binom{n}{2} = \frac{n(n-1)}{2} unique pairs, and each pair has a \frac{1}{365} probability of matching birthdays under the uniform assumption, ignoring leap years and exact date distributions. As n increases, the number of pairs grows quadratically, amplifying the likelihood of at least one match even though individual pair probabilities remain small. The exact probability P of at least one shared birthday in a group of n people is given by
P(n) = 1 - \frac{_{365}P_n}{365^n},
where _{365}P_n = \frac{365!}{(365-n)!} is the number of permutations of 365 days taken n at a time, valid for n ≤ 365. For n ≪ 365, this simplifies computationally to the product form
P(\text{no shared}) = \prod_{k=1}^{n-1} \left(1 - \frac{k}{365}\right),
so P(at least one shared) = 1 minus that value; for example, at n=23, P ≈ 0.507, and at n=57, P ≈ 0.991.
The problem was first published in 1939 by Austrian mathematician , though its conceptual roots lie in earlier work on probability coincidences. It was popularized in the United States through Martin Gardner's "Mathematical Games" column in , where he discussed it as a striking example of probabilistic surprise. This paradox extends analogously to scenarios like finding collisions in cryptographic hash functions, where output values behave like birthdays.

Relation to Balls and Bins

The balls and bins model serves as a foundational for analyzing collision probabilities in distributions. In this setup, m balls are thrown independently and uniformly at random into n bins, with a collision defined as at least one bin receiving two or more balls. The probability of at least one such collision is given by the complement of the probability that all balls land in distinct bins, which approximates to $1 - e^{-m(m-1)/(2n)} when m is much smaller than n. This framework maps directly to the birthday paradox by treating the n bins as the possible birthdays (typically n = 365 days, ignoring ) and the m balls as individuals. The approximation shows that a collision probability of roughly 50% arises when m ≈ √(2n ln 2), or approximately √(2n) for rough estimation. For the standard case of n = 365, m ≈ 23 yields a collision probability exceeding 50%. The model extends beyond birthdays to any uniform random mapping from a to a of size n, such as random selections or functions over finite sets. The core insight lies in the quadratic scaling of pairwise comparisons: with m elements, there are \binom{m}{2} ≈ m²/2 possible pairs, each colliding with probability 1/n, yielding an expected number of collisions of m²/(2n). Collisions thus become probable after Θ(√n) trials, rather than Θ(n), due to this pairwise growth.

Cryptographic Application

Hash Function Collisions

A occurs when two distinct inputs produce the same output value from a . In cryptographic contexts, ideal are assumed to behave like random functions that map inputs uniformly to a fixed-size output space of $2^k possible values, where k is the of the . This uniformity assumption underpins analyses, treating the hash as unpredictable and evenly distributed across its . Under this model, the birthday paradox implies that finding a collision requires evaluating approximately \sqrt{2^k} = 2^{k/2} distinct inputs to achieve about a 50% probability of success. This effort is significantly less than the exhaustive $2^k trials needed to cover the entire output space, highlighting the practical vulnerability of hash functions to collision searches. Unlike a second preimage attack, which seeks a different input that matches the hash of a specific given input, a collision attack aims only to identify any pair of distinct inputs sharing the same hash value, without targeting a particular output. This distinction makes collision resistance a stricter requirement, as it demands security against finding arbitrary pairs rather than targeted inversions. The cryptographic relevance of such collisions was first noted in 1976 by Diffie and Hellman, who discussed the challenges of constructing collision-resistant one-way functions for authentication. The birthday attack itself was formalized in 1979 by Yuval, who demonstrated its application to forging digital signatures by exploiting hash collisions.

Attack Execution

The primary goal of a birthday attack is to generate two distinct messages that produce the same hash value, enabling an attacker to forge data or circumvent integrity verification mechanisms in cryptographic systems. The procedure involves selecting a with an output size of k bits, yielding $2^k possible values, and then generating approximately $2^{k/2} random input messages. For each input, compute its and store the pairs of (, input) in a such as a , sorted list, or tree to facilitate quick lookups. Continue until a duplicate is detected, indicating a collision between two different inputs. This process exploits the probabilistic nature of hash distributions to find matches with high likelihood after roughly \sqrt{2 \cdot 2^k} trials. The of this attack is O(2^{k/2}) in both time and space under the standard approach, as it requires evaluating the that many times and storing the results. For instance, with k = 80, the attack demands about $2^{40} operations, which is feasible with modern hardware, whereas for k = 256, the $2^{128} scale renders it practically impossible. To address the high storage demands, optimizations like Floyd's cycle detection algorithm (also known as the tortoise-and-hare method) can be applied, treating the hash chain as a functional graph and detecting cycles with constant O(1) space by advancing two pointers at different speeds through iterated applications until they collide. Parallel computation across multiple machines can further distribute the workload, reducing effective memory per node while maintaining the overall . A hypothetical pure birthday attack on , which has k = 128, would require approximately $2^{64} operations, though in 2004, Wang et al. demonstrated practical MD5 collisions using a differential method with far lower complexity of about $2^{39} evaluations, highlighting the function's vulnerability to collision attacks.

Mathematical Foundations

Probability Derivation

The probability of no collision after m trials in a space of n possible outputs is given exactly by the product \prod_{i=1}^{m-1} \left(1 - \frac{i}{n}\right) = \frac{n!}{(n-m)! n^{m-1}}, assuming m \leq n; this follows from the sequential probability that each new output differs from all previous ones, starting with the first trial having probability 1, the second having (n-1)/n, the third (n-2)/n, and so on up to the m-th having (n-m+1)/n. To derive this, consider the outputs as independent and uniformly distributed over \{1, 2, \dots, n\}, akin to drawing balls with replacement from an of n distinct balls; the probability of no repeated ball in m draws is the number of injective functions from m elements to n elements divided by the total n^m possible outcomes, yielding the falling factorial form above. Under the assumption of (a weaker condition than full for collision probabilities), the pairwise collision probability for any two distinct trials is exactly $1/n, leading to the expected number of collisions E = \binom{m}{2} \cdot (1/n) = m(m-1)/(2n). For approximation, take the natural logarithm of the exact no-collision probability: \ln \left( \prod_{i=1}^{m-1} \left(1 - \frac{i}{n}\right) \right) = \sum_{i=1}^{m-1} \ln \left(1 - \frac{i}{n}\right). When m \ll n, use the Taylor expansion \ln(1 - x) \approx -x - x^2/2 - \cdots for small x = i/n, truncating at the linear term to obtain \sum_{i=1}^{m-1} (-i/n) = -m(m-1)/(2n); exponentiating yields the approximation \prod_{i=1}^{m-1} \left(1 - \frac{i}{n}\right) \approx e^{-m(m-1)/(2n)}, so the collision probability is approximately $1 - e^{-m(m-1)/(2n)}. This Poisson-like approximation holds well when m = O(\sqrt{n}), as higher-order terms in the expansion become negligible. Solving for m such that the collision probability equals p, set $1 - e^{-m(m-1)/(2n)} = p, or e^{-m(m-1)/(2n)} = 1-p; taking logs gives m(m-1)/(2n) \approx \ln(1/(1-p)), and for small p or large n, m \approx \sqrt{2n \ln(1/(1-p))}. For a 50% chance (p=0.5), \ln(2) \approx 0.693, so E \approx 0.693 and m \approx \sqrt{2n \ln 2} \approx 1.177 \sqrt{n}. The assumptions of uniform randomness and independence are critical; if outputs are not uniformly distributed, collisions may occur sooner (if biased toward certain values) or later, potentially invalidating the model—real hash functions aim to behave like random oracles to satisfy these. For the exact probability when m/n is small, the binomial expansion of (1 - i/n) can refine the approximation, but the exponential form dominates for practical collision thresholds. This discrete model parallels the balls-and-bins paradigm in probability theory.

Approximation Methods

The Poisson approximation provides a practical for estimating the probability of a collision in the birthday attack without computing the exact combinatorial formula. Under this approximation, the number of potential pairwise collisions follows a with parameter \lambda = \frac{m(m-1)}{2n} \approx \frac{m^2}{2n}, where m is the number of trials and n is the size of the output space. The probability of at least one collision is then p \approx 1 - e^{-\lambda}. To find the number of trials m required for a specific success probability p, solve $1 - e^{-\lambda} = p, yielding \lambda = -\ln(1-p) and thus m \approx \sqrt{2n \lambda} = \sqrt{-2n \ln(1-p)}. This allows quick estimates of attack feasibility. For example, a 50% success probability (p = 0.5) requires \lambda = \ln 2 \approx 0.693, so m \approx \sqrt{1.386 n} \approx 1.177 \sqrt{n}. For higher success probabilities, the required trials scale accordingly. Achieving p = 0.99 demands \lambda \approx 4.605, resulting in m \approx \sqrt{9.21 n} \approx 3.035 \sqrt{n}. These heuristics, derived from the model, enable cryptanalysts to assess vulnerability based on hash output length without exhaustive computation. In cryptographic contexts, such approximations highlight the impracticality of generic birthday attacks against secure hash functions. For SHA-1 with a 160-bit output (n = 2^{160}), approximately $1.177 \times 2^{80} \approx 2^{80} trials are needed for a 50% collision chance. Although the raw computational effort of $2^{80} hash evaluations is large—requiring on the order of years for a large at rates around $10^{18} hashes per second as of 2025—the vanilla attack also demands infeasible storage of \approx 2^{80} hash values (on the order of yottabytes). Space-efficient methods, such as , reduce storage to O(1) or O(\sqrt{n}) with parallelization but retain the O(\sqrt{n}) computational cost, keeping generic attacks on 160-bit hashes challenging for non-nation-state adversaries. This approximation assumes m \ll n to neglect higher-order effects like multiple collisions in the same output value, breaking down when m approaches n and the space nears full occupancy, where the exact probability saturates at 1 but the model overestimates intermediate risks.

Vulnerabilities and Examples

Digital Signature Weaknesses

Digital signature schemes that employ a hash-then-sign paradigm are particularly susceptible to birthday attacks, as these attacks exploit the reduced computational effort required to find hash collisions compared to exhaustive search. In such schemes, the signer computes the hash of the message and signs the hash value using a public-key algorithm like , rather than the entire message. An attacker can generate numerous message variants until two messages, one benign (m1) and one malicious (m2), produce the same hash output, h(m1) = h(m2). By tricking the signer into producing a valid signature on m1, the attacker can then attach that signature to m2, forging without the signer's knowledge or consent. This mechanism was first outlined in the context of Rabin's signature scheme, highlighting how the birthday paradox enables efficient collision discovery to undermine integrity. Vulnerable schemes include classic constructions like RSA-PKCS#1 v1.5 with short hash functions such as or , where collisions in the hash directly compromise the signature's validity for altered content. If an attacker forges a collision before the signing process, the signature remains cryptographically valid across both messages, bypassing checks that assume hash uniqueness. Historical demonstrations underscore this weakness: in 2004, Wang et al. revealed practical collisions for , enabling theoretical forgery in MD5-based signatures with feasible computation. Building on this, Stevens, Lenstra, and de Weger in 2007 extended the attack to chosen-prefix collisions for , allowing targeted forgeries like colliding certificates for different identities, further exposing signature protocols reliant on weak hashes. The impact of these attacks severely erodes the non-repudiation property of digital signatures, as the signer cannot plausibly deny authoring the forged message since the signature verifies correctly against it. To counter this, signers must rigorously verify the full message content prior to signing, rather than trusting the hash alone, though this adds operational overhead. A prominent real-world exploitation occurred in 2012 with the Flame malware, which leveraged a sophisticated MD5 chosen-prefix collision— a variant of techniques from prior research—to forge Microsoft Windows update certificates. This allowed the malware to propagate as legitimate signed software on infected systems, evading detection and highlighting the practical dangers of legacy hash functions in signature chains.

Other Cryptographic Contexts

The birthday attack extends to certificates, where collisions in certificate hashes can enable the creation of forged certificates that appear valid to relying parties. In 2017, the SHAttered attack demonstrated the first practical collision for full , raising immediate concerns about its use in certificate signing, as such collisions could allow attackers to generate rogue certificates mimicking legitimate ones issued by trusted authorities, similar to prior MD5-based forgeries that compromised security. Although modern browsers have phased out SHA-1 support, legacy systems and certain TLS deployments remain vulnerable to chosen-prefix collisions, which build on birthday attack principles to target structured certificate data like serial numbers. In password storage, unsalted hashes are susceptible to collision searches via birthday attacks, particularly with short output lengths that reduce the effective search space. For instance, the legacy (LM) hash, with its 128-bit output, allows attackers to find collisions after approximately 2^{64} attempts in theory; however, practical vulnerabilities arise more from other design flaws, such as unsalted storage and a reduced effective input space due to uppercase conversion and length limits up to 14 characters, enabling efficient preimage and dictionary attacks rather than birthday collisions. This vulnerability is exacerbated in unsalted environments, where precomputed tables like rainbow tables can exploit hash chains, but direct collision attacks further undermine security by allowing multiple weak passwords to map to identical hashes. Weak generators (RNGs) with limited pools heighten the risk of collisions in cryptographic protocols, as the birthday paradox implies that even modestly sized sets can produce duplicates when drawn from a reduced output space. In systems relying on pseudorandom number generation, insufficient —such as from predictable seeds—effectively shrinks the space, making it feasible for adversaries to encounter collisions after generating on the order of the of the bits, compromising confidentiality in applications like session s or nonces. In and Merkle trees, short functions introduce potential collisions during proof-of-work mining, where the birthday paradox halves the effective security margin of the output length. For Merkle trees aggregating into a root commitment, collisions in intermediate nodes could enable or invalid inclusions if the size is insufficient, as attackers might find matching for conflicting after roughly 2^{n/2} trials for an n-bit . As of 2025, increasingly accounts for birthday bounds in designing lattice-based functions, ensuring against both classical and quantum adversaries in protocols like Fiat-Shamir signatures. Lattice-based constructions, such as those in NIST-standardized schemes, parameterize outputs to withstand attacks up to security levels of 2^{128} or higher, incorporating the paradox's probabilistic scaling to maintain integrity in quantum-resistant environments.

Variations and Extensions

Yuval's birthday attack

Yuval's birthday attack is a variant of the birthday attack that finds a collision between two families of related inputs, such as variants of a legitimate m_1 and a fraudulent m_2, such that there exist m_1' and m_2' with \hash(m_1') = \hash(m_2'), by exploiting the birthday paradox through targeted message modifications. This approach adapts the standard collision-finding technique to scenarios where the attacker controls variations around specific inputs, making it particularly relevant for applications like or bypassing verification mechanisms that rely on unique hashes. First described by Gideon Yuval in 1979, the attack highlights vulnerabilities in hash-based systems where minor message tweaks can be used to align hashes probabilistically. In Yuval's method, the attacker starts with a legitimate target m_1 and a desired fraudulent m_2. The process generates approximately t = 2^{n/2} minor modifications of m_1 (e.g., changing a few bits or padding subtly to preserve semantic meaning), computes the of each, and stores the results indexed by the first n/2 bits of the value, where n is the output length. Then, the attacker generates minor modifications of m_2, computes their es, and checks if the leading n/2 bits match any stored entry from the m_1 variations. Due to the birthday paradox, a match in the full is expected after generating about $2^{n/2} such modifications for m_2. Upon finding a match, the corresponding full messages m_1' and m_2' collide under the , allowing the legitimate (or ) for m_1' to apply to m_2'. This requires O(2^{n/2}) computations and , similar to a standard birthday attack but with roughly twice the constant factor due to the dual-set generation. (citing Yuval's method in context of ) The attack's complexity remains O(2^{n/2}) overall, offering a significant improvement over the generic collision search in some contexts, though the higher constants and need for message modifications limit its efficiency for arbitrary forgeries. It is most effective against hash functions with weak collision resistance, such as early designs lacking strong padding or strengthening. For instance, the attack has been applied to broken hash functions like HAVAL (with 128-bit output), where practical collision vulnerabilities (e.g., 5-pass HAVAL collisions in $2^{123} work) enable the targeted variant to succeed within feasible bounds, facilitating verification bypass in legacy systems. Unlike the standard birthday attack, which seeks any collision without prior knowledge of inputs, Yuval's variant requires target-specific adaptations, rendering it less versatile for blind forgery but valuable in scenarios demanding collisions tied to known data.

Generalized Collision Attacks

Generalized collision attacks extend the birthday paradox principle to scenarios involving multiple inputs colliding on the same output or hybrid resource tradeoffs, enabling more sophisticated exploits against -based structures. Multi-collision attacks aim to find sets of $2^t distinct messages that all map to the identical value under a with k-bit outputs. Unlike pairwise collisions, these require identifying larger clusters in the hash space. A generic approach leverages the structure of iterated functions, building collisions incrementally by appending colliding suffixes to prefixes, achieving a of approximately $2^{k/2 + t - 1} evaluations using techniques that exploit cycles and trees in the functional of the hash iteration. In 2004, Jacques Joux introduced an efficient method for constructing such multi-collisions in iterated functions, demonstrating that finding an r-collision (with r = 2^t) requires only about r \cdot 2^{k/2} evaluations, far less than the naive r^2 \cdot 2^k expected for independent searches. This technique iteratively finds collisions at each level of message expansion, allowing the attacker to "peel" the hash computation to reveal multiple paths to the same output. For t = k/2, yielding r = 2^{k/2} colliding messages, the attack runs in O(2^k) time—equivalent to a preimage search—effectively breaking the of tree-based hash constructions like Merkle trees, where an attacker can select from a massive set of colliding inputs to forge valid proofs across branches. Hellman-Merkle time-memory tradeoffs adapt the classic for resource-constrained environments, such as partial searches where full enumeration is infeasible. Originally proposed by in 1980 for key recovery, the method precomputes chains of iterations from starting points, storing endpoints in tables to trade memory M for online time T, satisfying T M^{1/3} \approx 2^k for a k-bit search . In collision contexts, this enables finding hash collisions by treating hash outputs as chain endpoints and using multiple reduction functions to cover the probabilistically, reducing the effective birthday bound when memory is limited (e.g., achieving near-$2^{k/2} collisions with T = M = 2^{2k/3}). refined this in 1980 with distinguished points to optimize chain storage, making it practical for partial attacks in low-memory settings like devices. These attacks have practical applications beyond pure , such as in denial-of-service () via hash flooding, where attackers generate colliding inputs to degrade performance in network devices. In the , vulnerabilities in routers and servers exploited weak hash functions (e.g., in and implementations), causing quadratic resizing and memory exhaustion with minimal traffic, as demonstrated in attacks that force worst-case behavior through collisions. Similar issues affected routing protocols, amplifying impact in BGP sessions. In , multi-collisions aid by identifying multiple candidate fragments matching the same hash signature, though this requires careful validation to avoid false positives. Modern extensions consider quantum threats, where , combined with the Brassard-Høyer-Tapp (BHT) variant, reduces the classical birthday bound from O(2^{k/2}) to O(2^{k/3}) for collision searches. The BHT method uses to simulate multiple parallel birthday trials, querying the hash oracle $2^{k/3} times to find a collision with high probability, halving the effective security of symmetric hashes like SHA-256 against quantum adversaries. As of 2025, with advancing quantum hardware, this prompts recommendations for larger hash outputs (e.g., 384+ bits) in post-quantum designs to restore $2^{k/2} classical-equivalent security.

Countermeasures

Hash Length Recommendations

To achieve against attacks, the output length k of a must be selected such that the expected number of operations required, approximately $2^{k/2}, exceeds the computational resources available to potential attackers. NIST recommends aligning the 's strength with the overall , where the strength for is typically k/2 bits. For a desired level of s bits, a output length of k = 2s bits is thus advised to ensure the bound provides at least s-bit . NIST guidelines specify that for 128-bit security—suitable for most modern cryptographic applications such as digital signatures and key derivation—a minimum hash output length of 256 bits is required, corresponding to the birthday bound of $2^{128} operations. Examples include and SHA3-256, both providing 128-bit under classical computing models. SHA-256 remains secure against practical birthday attacks as of 2025, as the $2^{128} computational cost far exceeds the capabilities of current classical supercomputers, which perform on the order of $10^{18} operations per second. Longer hash outputs enhance resistance but introduce tradeoffs, including increased computational overhead for hashing and storage. For instance, SHA-512 or offers 256-bit security but requires roughly twice the processing time of 256-bit variants on typical hardware. To balance security and performance while future-proofing against potential advances in collision-finding techniques, NIST-standardized (based on the Keccak sponge construction) is recommended over family members for new designs, as it provides equivalent security levels with a different internal structure less vulnerable to certain differential attacks. Additionally, BLAKE3—a parallelizable derived from BLAKE2—offers 256-bit output with superior speed on multi-core systems (up to 10-20 times faster than SHA-256 for large inputs) while maintaining provable under the same birthday bounds, making it suitable for high-throughput applications. Historical shifts in hash adoption underscore the importance of these recommendations. Following the practical collision attack on demonstrated in 2005, which reduced its effective security below 64 bits using differential cryptanalysis, widespread migration occurred to SHA-256 for 128-bit security. Similarly, after and CWI researchers produced the first practical collision in 2017—confirming its 80-bit security was inadequate—NIST deprecated for all uses, accelerating adoption of longer hashes like SHA-256 and SHA-3.

Alternative Security Measures

One approach to mitigating birthday attacks involves employing with associated data (AEAD) schemes or hash-then-MAC constructions, which integrate message directly into the process without relying solely on unauthenticated es for verification. These methods prevent collision-based forgery by ensuring that any alteration in the input, including those induced by hash collisions, invalidates the authentication tag, thereby protecting against unauthorized modifications in protocols like secure messaging. For instance, AEAD modes such as those analyzed in generic constructions provide security against forgery even when underlying primitives approach their birthday bounds, offering robustness in scenarios where direct hash signing is avoided. Randomized hashing techniques further enhance resistance by incorporating salts or nonces into the hashing process for each unique computation, effectively expanding the input domain and forcing attackers to generate distinct collision sets for every instance rather than reusing precomputed ones. In digital signature schemes, this is exemplified by the Probabilistic Signature Scheme (PSS), where a random salt is appended to the message before hashing, randomizing the input and thwarting existential forgery attacks that exploit hash collisions across multiple messages. This randomization ensures that even if a collision exists in the base hash function, the probabilistic padding makes it computationally infeasible to craft colliding padded messages that align with valid signatures. To achieve provable security against collision-based attacks, cryptographers often model hash functions as s, an idealized framework where the hash behaves as a truly random function, providing up to the birthday bound with high probability. Under this model, schemes like hash-and-sign signatures can be proven secure assuming the underlying trapdoor permutation (e.g., ) is hard, as the random oracle prevents efficient collision finding beyond the of the output size. This approach underpins widely adopted standards, ensuring that protocols remain secure as long as the hash instantiation approximates the ideal oracle behavior. Detection mechanisms, such as hash tree verification using , enable efficient integrity checks in high-volume systems by allowing parties to verify subsets of without recomputing all , potentially identifying anomalous collisions if they propagate to inconsistent . In such structures, each holds a , and internal nodes aggregate pairwise , providing a compact proof of or alteration; if a birthday-induced collision affects a leaf, the verifier can traverse the path to detect discrepancies during audits. complements this by monitoring distributions in large-scale deployments, flagging unusual collision rates that may indicate an . These methods are particularly useful in distributed systems like blockchains, where verifying vast datasets against collisions is essential. Post-quantum cryptographic alternatives, such as lattice-based digital signature schemes like CRYSTALS-, offer inherent resistance to birthday bounds by basing on the hardness of problems rather than solely on resistance, while still incorporating secure for transforms like Fiat-Shamir. 's module-lattice construction achieves chosen-message with parameters tuned to withstand both classical and quantum adversaries, including those exploiting quantum-accelerated searches on auxiliary , ensuring the scheme remains viable even as classical hash lengths alone prove insufficient against advanced threats. This shift to non-hash-centric primitives provides long-term protection in environments transitioning to quantum-resistant .

References

  1. [1]
    [PDF] birthday attack - Columbia Math Department
    The birthday attack is a method to find collisions in a cryptographic hash function, based on the birthday paradox. Having the same birthday is the analogue of ...
  2. [2]
    Birthday Attack - an overview | ScienceDirect Topics
    A birthday attack is a cryptographic attack where the probability of two or more people sharing a birthday is greater than 50% in a room with 23 or more people.
  3. [3]
  4. [4]
    [PDF] Hash Function Balance and its Impact on Birthday Attacks - CSE Home
    In practice, hash functions need to be designed not only to resist the birthday attack but also to resist cryptanalytic attacks, and for this, random behavior ...
  5. [5]
  6. [6]
    Birthday problem | Definition, Solution, Equations, & Facts - Britannica
    Oct 3, 2025 · Although there is debate over who came up with the original birthday problem, a version was first published by Austrian mathematician Richard ...
  7. [7]
    Birthday Problem -- from Wolfram MathWorld
    Consider the probability Q_1(n,d) that no two people out of a group of n will have matching birthdays out of d equally possible birthdays.Missing: source | Show results with:source
  8. [8]
    [PDF] Hexaflexagons and Other Mathematical Diversions
    Hexaflexagons and other mathematical diversions : the first Scientific American book of puzzles & games : with a new afterword 1 by Martin Gardner. -University ...
  9. [9]
    [PDF] Lecture 5 - Independence 1 Independent Events - csail
    3 The Birthday Paradox. Suppose that there are 23 people in a room. ... I throw at random into a collection of 1,000,000 bins before the probability that some bin.
  10. [10]
    [PDF] Lecture 3: Balls in Bins
    Theorem 15 (Birthday Paradox). If you gather 23 people in a room, then with probability 50% there will be two sharing a birthday. ... balls fall in this bin ...
  11. [11]
    collision - Glossary | CSRC
    Definitions: An event in which two different messages have the same message digest. Sources: NIST SP 800-106 under Collision. NIST SP 800-107 Rev.
  12. [12]
    [PDF] Random Oracles are Practical: A Paradigm for Designing Efficient ...
    For simplicity assume it is uniform; see below for how to patch things for standard ones. Let H: {0,1}∗ → {0,1}k denote as usual a random hash function. Let G ...
  13. [13]
    [PDF] New Directions in Cryptography - Stanford Electrical Engineering
    Diffie and M. E. Hellman, “Multiuser cryptographic techniques,” presented at National Computer Conference, New York, June 7-10,. 1976. [6] D. Knuth, The Art of ...
  14. [14]
    HOW TO SWINDLE RABIN
    HOW TO SWINDLE RABIN. Gideon Yuval. CRYPTOLOGIA. Abstract: a computational short-cut is shown, which can compromise the secur- ity of Rabin's digital signature ...
  15. [15]
    [PDF] Notes #12: Collision Finding 12.1 Birthday Attack
    The birthday attack finds collisions by computing H(k, x) for different x values until a collision is found, exploiting the birthday paradox.
  16. [16]
    [PDF] How to Break MD5 and Other Hash Functions - Merlot
    In this paper we present a new powerful attack on MD5 which allows us to find collisions efficiently. We used this attack to find collisions of MD5 in about 15 ...
  17. [17]
    None
    ### Summary of Birthday Attacks and Probability of Collision from Chapter 2 of Handbook of Applied Cryptography
  18. [18]
    [PDF] SHA-1 is a Shambles∗ - Cryptology ePrint Archive
    Our actual attack required two months of computations using 900 Nvidia GTX 1060 GPUs (we paid US$ 75k because GPU prices were higher, and we wasted some time ...
  19. [19]
    [PDF] Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD
    Aug 17, 2004 · MD5 is the hash function designed by Ron Rivest [9] as a strengthened version of MD4 [8]. In 1993 Bert den. Boer and Antoon Bosselaers [1] ...
  20. [20]
    [PDF] Chosen-prefix Collisions for MD5 and Colliding X.509 Certificates ...
    We describe how chosen-prefix collisions for MD5 can be constructed, and show that our method is practical by constructing two MD5 based X.509 certificates with ...
  21. [21]
    Flame malware collision attack explained - Microsoft
    Jun 6, 2012 · On systems that pre-date Windows Vista, an attack is possible without an MD5 hash collision. This certificate and all certificates from the ...
  22. [22]
    [PDF] Reverse-engineering of the cryptanalytic attack used in the Flame ...
    Sep 7, 2015 · Stevens recently proposed to counter these threats using counter-cryptanalysis. [Ste13], specifically a collision detection algorithm, i.e., an ...
  23. [23]
    [PDF] The first collision for full SHA-1 - SHAttered.io
    Feb 23, 2017 · As it turned out, it used a forged signature to infect Windows machines via a man-in-the-middle attack on. Windows Update. Using a new technique ...
  24. [24]
    [PDF] First Chosen-Prefix Collision on SHA-1 and Application to the PGP ...
    Aug 12, 2020 · When renting cheap GPUs, this translates to a cost of US$ 11k for a collision, and US$ 45k for a chosen-prefix collision, within the means of ...<|control11|><|separator|>
  25. [25]
    Of History & Hashes: A Brief History of Password… - TrustedSec
    May 30, 2015 · This is the classic “Birthday Attack”. ... The two cipher texts are then concatenated to produce the LM hash and store the hash in the SAM file.
  26. [26]
    Birthday Attacks, Collisions, And Password Strength - Auth0
    Mar 23, 2021 · The answer to this question is better illustrated by a famous problem in probability theory known as the Birthday Problem.
  27. [27]
    [PDF] Cryptanalytic Attacks on Pseudorandom Number Generators
    A permanent compromise attack occurs if, once an attacker compromises S at time t, all future and past. S values are vulnerable to attack. (c) Iterative ...<|separator|>
  28. [28]
    [PDF] Merkle Trees in Blockchain: A Study of Collision Probability ... - arXiv
    Therefore, the "birthday paradox" strategy leads to a halving of the "effective" hash code length. In the context of our problem, this means that a ...
  29. [29]
    A post-quantum lattice-based lightweight anonymous authentication ...
    Apr 7, 2025 · However, this scheme is not secure against the quantum attack. An ... Utilizing the birthday paradox [16], we get the following result:.
  30. [30]
    Multicollisions in Iterated Hash Functions. Application to Cascaded ...
    Advances in Cryptology – CRYPTO 2004 (CRYPTO 2004). Multicollisions in ... We also discuss the potential impact of our attack on several published schemes.
  31. [31]
    [PDF] A Cryptanalytic Time - Memory Trade-Off
    They think of a cipher- text-only attack in which the cryptanalyst possesses only ciphertext and some statistical knowledge of the plaintext. Aside from the ...
  32. [32]
    [PDF] Recommendation for Key Management: Part 1 - General
    May 5, 2020 · NIST SP 800-57 PART 1 REV. 5. RECOMMENDATION FOR KEY MANAGEMENT: PART ... The security strength of a hash function is determined by the properties ...
  33. [33]
  34. [34]
    Hash Functions | CSRC - NIST Computer Security Resource Center
    Jan 4, 2017 · Collision resistance: It is computationally infeasible to find two different inputs to the hash function that have the same hash value.NIST Policy · News & Updates · Events · SHA-3 Standardization
  35. [35]
  36. [36]
    [PDF] BLAKE3 - GitHub
    We present BLAKE3, an evolution of the BLAKE2 cryptographic hash that is both faster and also more consistently fast across different platforms and input ...<|control11|><|separator|>
  37. [37]
    Improved Collision Attack on Hash Function MD5
    In this paper, we present a fast attack algorithm to find two-block collision of hash function MD5. The algorithm is based on the two-block collision ...
  38. [38]
    [PDF] A Simple and Generic Construction of Authenticated Encryption With ...
    We describe a simple and generic construction of an AEAD protocol from an AE protocol and a collision resistant hash function. This is done in two simple steps.
  39. [39]
    [PDF] GLEVIAN and VIGORNIAN: Robust beyond-birthday AEAD modes
    Sep 20, 2023 · GLEVIAN and VIGORNIAN are AEAD modes with beyond-birthday security, security against nonce misuse, and against the release of unverified ...
  40. [40]
    6.8. Using a Nonce to Protect Against Birthday Attacks - O'Reilly
    Use a nonce or salt before and after your message (preferably a securely generated random salt), padding the nonce to the internal block size of the hash ...
  41. [41]
    [PDF] Exact Security Analysis of Hash-then-Mask Type Probabilistic MAC ...
    Thus, an appropriate usage of salt can enhance the security to go beyond birthday barrier without requiring larger domain PRF. Different Types of Hash-then-MAC ...Missing: prevent | Show results with:prevent
  42. [42]
    Dilithium - CRYSTALS
    Feb 16, 2021 · Dilithium is a digital signature scheme that is strongly secure under chosen message attacks based on the hardness of lattice problems over module lattices.Missing: birthday | Show results with:birthday