Birthday attack
In cryptography, a birthday attack is a type of collision-finding attack on hash functions that leverages the birthday paradox—a probabilistic phenomenon where the likelihood of at least one shared birthday in a group of randomly selected people rises sharply with group size—to identify two distinct inputs producing the same hash output with far fewer computations than exhaustive search would demand.[1] For a hash function with an output space of size n (e.g., 2^b for b-bit hashes), the attack requires approximately √(2 * n * ln(2)) ≈ 1.177 √n hash evaluations to achieve a 50% probability of collision, reducing the effort from O(n) to O(√n).[2] This makes it a significant threat to systems relying on collision resistance, such as digital signatures and integrity checks, as demonstrated in early proposals like Gideon Yuval's 1979 method to forge signatures in Rabin's cryptosystem by generating colliding message pairs. The birthday paradox itself, which underpins the attack, illustrates that in a room of just 23 people, the probability of at least two sharing a birthday exceeds 50%, calculated as 1 minus the product of (365 - i)/365 for i from 0 to 22, yielding approximately 0.507.[1] This result stems from the quadratic growth in pairwise comparisons (k choose 2 for k samples), making collisions inevitable sooner than intuition suggests in uniform random distributions.[3] In cryptographic contexts, the paradox translates directly to hash outputs treated as "birthdays," enabling attackers to precompute and store many hashes until a match emerges, often using storage-optimized variants like Pollard's rho algorithm for efficiency.[2] Historically, the birthday attack gained prominence through Yuval's work, which highlighted its potential against naive hash-based authentication by allowing an adversary to craft equivalent messages indistinguishable to the verifier. While pure birthday attacks on strong modern hashes like SHA-256 (requiring ~2^128 operations) remain computationally infeasible, they set the baseline for security analysis and have informed vulnerabilities in weaker functions, such as MD5, where collisions were found in 2^64 effort bounds, though actual breaks used more sophisticated differential cryptanalysis.[4] Defenses include using larger hash outputs, randomized salting, or target-specific collision resistance in protocols like digital signatures to mitigate the attack's impact.[5]Background and Paradox
The Birthday Paradox
The birthday paradox, also known as the birthday problem, describes the counterintuitive probability that, in a random group of 23 people assuming 365 possible birthdays and uniform distribution, there is approximately a 50% chance that at least two individuals share the same birthday.[6] This result defies common intuition, which might suggest a much larger group size is needed given the 365 possible days, but it highlights how probabilities of shared events accumulate rapidly in moderate-sized sets.[7] The intuitive explanation centers on pairwise comparisons rather than sequential checks against all possible dates. In a group of n people, there are \binom{n}{2} = \frac{n(n-1)}{2} unique pairs, and each pair has a \frac{1}{365} probability of matching birthdays under the uniform assumption, ignoring leap years and exact date distributions.[6] As n increases, the number of pairs grows quadratically, amplifying the likelihood of at least one match even though individual pair probabilities remain small.[7] The exact probability P of at least one shared birthday in a group of n people is given byP(n) = 1 - \frac{_{365}P_n}{365^n},
where _{365}P_n = \frac{365!}{(365-n)!} is the number of permutations of 365 days taken n at a time, valid for n ≤ 365.[7] For n ≪ 365, this simplifies computationally to the product form
P(\text{no shared}) = \prod_{k=1}^{n-1} \left(1 - \frac{k}{365}\right),
so P(at least one shared) = 1 minus that value; for example, at n=23, P ≈ 0.507, and at n=57, P ≈ 0.991.[6] The problem was first published in 1939 by Austrian mathematician Richard von Mises, though its conceptual roots lie in earlier work on probability coincidences.[6] It was popularized in the United States through Martin Gardner's "Mathematical Games" column in Scientific American, where he discussed it as a striking example of probabilistic surprise.[8] This paradox extends analogously to scenarios like finding collisions in cryptographic hash functions, where output values behave like birthdays.[7]