Fact-checked by Grok 2 weeks ago

Birthday problem

The birthday problem, also known as the birthday paradox, is a probability puzzle that examines the likelihood that at least two people in a group share the same birthday, assuming birthdays are uniformly distributed across 365 days of the year and ignoring leap years and year differences.^[1]^[2] The counterintuitive core result is that, in a group of just 23 people, the probability of at least one shared birthday exceeds 50%, far lower than the intuitive expectation of needing roughly half the number of days (around 183) for such an outcome.^[1]^[2] The probability is calculated as 1 minus the likelihood that all birthdays are distinct, given by the formula $1 - \frac{365! / (365 - n)!}{365^n} for n people, where the numerator represents the number of ways to assign unique birthdays and the denominator the total possible assignments.^[1]^[2] This yields approximately 50.7% for n = 23 and approaches certainty (over 99%) for groups larger than 60.^[2] An approximation for the threshold where the probability reaches 50% is n \approx \sqrt{2 \times 365 \times \ln 2}, highlighting the quadratic growth in pairwise comparisons (about n(n-1)/2) that drives the result.^[2] The problem's origins trace to Richard von Mises, who introduced it in a 1939 lecture, though it gained prominence in the 1950s through works like William Feller's probability textbook.^[3]^[4]^[5] Beyond its pedagogical value in illustrating probabilistic intuition and the pigeonhole principle, the birthday problem has significant applications in fields like cryptography, where it underpins "birthday attacks" on hash functions—exploiting collisions in output spaces of size m with roughly \sqrt{m} trials rather than m/2, as seen in analyses of functions like MD5.^[6]^[7] Generalizations extend to non-uniform birthday distributions, multiple matches, or larger calendars, and it informs algorithms in computer science for collision detection and randomized data structures.^[3]^[2]

Core Problem

Problem Statement

The birthday problem addresses the probability that at least two individuals in a randomly selected group of n people share the same birthday. This classic question in probability theory assumes a non-leap year with exactly 365 possible birthdays, uniform distribution of birthdays across all days, and independence among the individuals' birth dates, meaning no correlations such as familial or cultural patterns influence the outcomes.^[8] The counterintuitive nature of the problem arises from how quickly the probability of a shared birthday rises with group size; for instance, in a gathering of just 23 people, the chance exceeds 50%, far higher than most initial intuitions might suggest.^[9] This surprise stems from the combinatorial explosion of possible pairwise comparisons, which amplifies the likelihood of at least one match even in modestly sized groups. Introduced by Austrian mathematician Richard von Mises in 1939, the problem originated as an illustration in probability theory.^[10] Von Mises posed it to demonstrate practical applications of probabilistic reasoning in everyday scenarios. These foundational assumptions simplify the model by disregarding real-world factors, such as uneven birthday distributions due to seasonal birth variations or the slight increase from leap years, to focus on the core mathematical insight.^[11]

Exact Probability Calculation

The exact probability of at least one shared birthday among n people, assuming 365 days and uniform, independent birthdays, is derived using the complementary probability that all birthdays are distinct.^[12] This complementary probability q is the product \prod_{k=1}^{n-1} \left(1 - \frac{k}{365}\right), which represents the sequential likelihood that each additional person's birthday differs from all previous ones: the second person avoids the first's birthday with probability $364/365, the third avoids the first two with $363/365, and so on up to the nth person avoiding the prior n-1 with (365 - n + 1)/365.^[12] The desired probability p is then p = 1 - q.^[12] Equivalently, the complementary probability can be expressed in factorial form as q = \frac{365!}{365^n (365 - n)!} for n \leq 365, reflecting the number of ways to assign distinct birthdays (a permutation of 365 days taken n at a time) divided by the total number of possible birthday assignments $365^n.^[12] Thus, p = 1 - \frac{365!}{365^n (365 - n)!}.^[12] This form highlights the combinatorial nature of the problem but poses significant computational challenges for even moderate n, as the factorials grow enormously large, leading to overflow in standard arithmetic computations.^[13] To compute the exact value practically, the product form is preferred, often evaluated using logarithms to avoid overflow: \ln q = \sum_{k=1}^{n-1} \ln\left(1 - \frac{k}{365}\right), followed by q = e^{\ln q} and p = 1 - q.^[13] For large n, while Stirling's approximation can estimate the factorials, it introduces error and is not used for exact computation; instead, high-precision libraries or iterative multiplication with scaling handle the product directly.^[13] A classic example is n = 23: the complementary probability q is computed as the product from k=1 to $22 of (365 - k)/365, yielding q \approx 0.492703 (or exactly \frac{75091883268515350125426207425223147563269805908203125 - 38093904702297390785243708291056390518886454060947061}{75091883268515350125426207425223147563269805908203125}), so p \approx 0.507297.^[12] This step-by-step multiplication—starting with 1, multiplying by $364/365 \approx 0.997260, then $363/365 \approx 0.994521, and continuing through 22 terms—demonstrates how the accumulating reductions lead to just over a 50% chance of a shared birthday.^[12]

Approximations and Bounds

Common Approximations

One common approximation for the probability P(n) that at least two people share a birthday in a group of n individuals, assuming 365 days and uniform distribution, derives from the Taylor expansion of the logarithm in the exact complementary probability. The exact probability of no shared birthdays is \prod_{k=1}^{n-1} \left(1 - \frac{k}{365}\right). Taking the natural logarithm yields \ln\left(\prod_{k=1}^{n-1} \left(1 - \frac{k}{365}\right)\right) = \sum_{k=1}^{n-1} \ln\left(1 - \frac{k}{365}\right). For small k/365, \ln(1 - x) \approx -x, so the sum approximates -\sum_{k=1}^{n-1} \frac{k}{365} = -\frac{n(n-1)}{2 \times 365}. Thus, the product approximates \exp\left(-\frac{n(n-1)}{2 \times 365}\right), and P(n) \approx 1 - \exp\left(-\frac{n(n-1)}{2 \times 365}\right).^[14] This exponential approximation is equivalent to the Poisson paradigm, which models the number of shared birthday pairs as a Poisson random variable with parameter \lambda = \frac{n(n-1)}{2 \times 365}, treating pairs as rare, nearly independent events. The probability of at least one pair is then $1 - e^{-\lambda}, matching the exponential form. This Poisson approach leverages the fact that the sum of indicator variables for each possible pair having a match converges in distribution to Poisson(\lambda) under suitable conditions.^[3] A useful special case is the square-root approximation for the group size n yielding P(n) \approx 0.5. Setting $1 - e^{-\lambda} = 0.5 gives \lambda = \ln 2 \approx 0.693, so \frac{n(n-1)}{2 \times 365} \approx 0.693, and solving the quadratic yields n \approx \sqrt{2 \times 365 \times \ln 2} \approx 22.99, or roughly 23. This heuristic extends generally as n \approx \sqrt{2 c \ln(1/(1-p))} for probability p and c days.^[4] The exponential (or Poisson) approximation performs well for moderate n, with absolute errors typically under 0.01 for n \leq 100 in the uniform case, though it slightly underestimates the true probability as n increases due to growing dependencies among pairs. For n \leq 50, the maximum error is about 0.007 (e.g., at n=23). The Poisson distribution itself provides a slightly less precise fit for the full count of pairs at larger n, where higher moments deviate more, but remains effective for the at-least-one probability.^[15] The following table compares exact probabilities P(n) (computed via the product formula) against the approximation $1 - e^{-\lambda} for selected n, illustrating the accuracy:

n	Exact P(n)	Approx. P(n)	Absolute Error
10	0.11695	0.1164	0.00055
20	0.41144	0.4061	0.00534
23	0.50730	0.5000	0.00730
30	0.70632	0.6963	0.01002
50	0.97045	0.9652	0.00525

^[16]^[17]

Probability Bounds

The probability P that at least two individuals in a group of n people share a birthday, assuming 365 equally likely days and ignoring leap years, admits several useful mathematical bounds that provide guarantees without requiring exact computation. An upper bound arises from the union bound applied to the pairwise collision events. Specifically, there are \binom{n}{2} = \frac{n(n-1)}{2} pairs, each with collision probability \frac{1}{365}, yielding P \leq \frac{n(n-1)}{2 \times 365}.^[18] This bound follows by considering indicator variables for each pair's collision and summing their expectations, as the probability of the union is at most the sum of individual probabilities.^[18] A complementary lower bound on P leverages the inequality $1 - x \leq e^{-x} for x \geq 0. The probability of no shared birthday is Q = \prod_{i=1}^{n-1} \left(1 - \frac{i}{365}\right). Taking the natural logarithm gives \ln Q = \sum_{i=1}^{n-1} \ln\left(1 - \frac{i}{365}\right) \leq \sum_{i=1}^{n-1} \left( -\frac{i}{365} \right) = -\frac{n(n-1)}{2 \times 365}, since \ln(1 - x) \leq -x. Thus, Q \leq e^{-\frac{n(n-1)}{2 \times 365}}, and P = 1 - Q \geq 1 - e^{-\frac{n(n-1)}{2 \times 365}}.^[18] This inequality provides a conservative estimate that is tight for moderate n relative to 365. These bounds enable deriving a lower bound on the minimal group size n required for P > p, for $0 < p < 1. Setting the lower bound equal to p and solving $1 - e^{-\frac{n(n-1)}{2 \times 365}} = p yields e^{-\frac{n(n-1)}{2 \times 365}} = 1 - p, so \frac{n(n-1)}{2 \times 365} = \ln\frac{1}{1-p}. Approximating n(n-1) \approx n^2 gives n > \sqrt{2 \times 365 \times \ln\frac{1}{1-p}}.^[18] For p = 0.5, \ln 2 \approx 0.693, so n > \sqrt{2 \times 365 \times 0.693} \approx 22.49, implying n \geq 23 suffices for P > 0.5.^[18] Such bounds are valuable for quick assessments in applications like hash collision analysis, where exact probabilities may be computationally intensive, offering provable limits on collision likelihood without full enumeration.^[18]

Generalizations

Variable Number of Days

The birthday problem generalizes naturally to an arbitrary number of possible days d in a year or cycle, rather than the standard 365 days, allowing analysis of scenarios with varying calendar lengths or analogous discrete uniform distributions. In this setting, the probability P(n, d) that at least two out of n randomly selected individuals share the same "birthday" assumes uniform and independent assignments across the d possibilities. This formulation was part of the original presentation by Richard von Mises in 1939, who considered variable partition and occupancy probabilities in combinatorial settings.^[19] The exact probability of at least one shared birthday is given by

P(n, d) = 1 - \frac{d!}{d^n (d - n)!}

for n \leq d, where the subtracted term represents the probability that all n birthdays are distinct, expressed using the falling factorial d(d-1)\cdots(d-n+1). This formula arises from counting the total d^n possible birthday assignments and the favorable P(d, n) permutations of n distinct days out of d. For n > d, P(n, d) = 1 by the pigeonhole principle.^[12] A common approximation, derived via Poissonization of the occupancy process, yields

P(n, d) \approx 1 - e^{-n(n-1)/(2d)},

which becomes accurate when n is much smaller than d but n^2 / d is order 1; it models the number of colliding pairs as approximately Poisson with mean n(n-1)/(2d). For a 50% probability threshold, solving $1 - e^{-n(n-1)/(2d)} \approx 1/2 gives n \approx \sqrt{2 d \ln 2}, highlighting how the required group size scales with the square root of d.^[12]^[20] For the classic case of d = 365, this approximation predicts n \approx 23 for 50% probability, matching the exact value closely. With fewer days, such as d = 100, the threshold drops to about n \approx 13, implying higher collision risks in shorter cycles like lunar calendars. Conversely, longer cycles (e.g., d = 1000) require n \approx 36 for the same probability, illustrating the problem's sensitivity to d. These examples underscore applications in systems with variable hash spaces or periodic events.^[12] An adapted upper bound from the union bound over all \binom{n}{2} pairs gives P(n, d) \leq n(n-1)/(2d), since each pair collides with probability $1/d and linearity of expectation bounds the probability of at least one collision. Inverting this for the minimal n such that the bound exceeds a target probability p yields n \gtrsim \sqrt{2 p d}, providing a quick conservative estimate for large d.^[18]

Shared Birthdays in Groups

The probability of at least one group of three or more people sharing the same birthday in a group of n individuals, assuming 365 equally likely birthdays and ignoring leap years, extends the classic pairwise birthday problem. This event, known as a triple or higher-order collision, can be computed exactly by considering the complementary probability of no such group, which involves summing over all possible configurations where the maximum number of people per birthday is at most two (i.e., singletons and pairs only). The exact formula is

P(\text{at least one triple}) = 1 - \sum_{i=0}^{\lfloor n/2 \rfloor} \frac{365! \cdot n!}{i! \cdot (n - 2i)! \cdot 2^i \cdot (365 - n + i)! \cdot 365^n},

where the sum enumerates the ways to partition the n people into i pairs and n - 2i singletons across the days. This approach relies on multinomial coefficients and avoids overcounting by fixing the occupancy structure, though it becomes computationally intensive for large n. More generally, exact counts for at least one k-tuple (group of k \geq 3) sharing a birthday can be derived using inclusion-exclusion over the days or by partitioning the people with Stirling numbers of the second kind to account for the number of ways to assign groups to birthdays without exceeding k-1 per day in the complement. For the specific case of a triple (k=3), a useful approximation treats the expected number of triples as a Poisson random variable. The parameter is \lambda_3 = \binom{n}{3} / 365^2 = n(n-1)(n-2)/(6 \cdot 365^2), and the probability of at least one triple is approximately $1 - e^{-\lambda_3}. This Poisson approximation performs well when n is much smaller than 365, as the occurrences on different days are nearly independent. For example, with n=88 people, \lambda_3 \approx 0.824 and the approximate probability is about 56%, closely matching the exact value of 0.511. The probability that exactly k specific people share a particular birthday, with the remaining n-k having different birthdays from that day and possibly among themselves, involves binomial probabilities adjusted for the uniform distribution. For a fixed day and fixed set of k people, this is \binom{n}{k} (1/365)^k ((364/365)^{n-k}), but to find the overall probability of exactly k people on some day requires further integration over choices of day and group, often approximated or bounded for practicality. The extreme case where all n people share exactly one common birthday has probability $365 \cdot (1/365)^n = 365^{1-n}, accounting for any of the 365 days being the shared one. This probability decreases rapidly with n; for n=2, it is $1/365 \approx 0.0027, and for n=10, it is approximately $2.5 \times 10^{-18}. Generalizations to d days yield d^{1-n}. The pairwise shared birthday probability serves as a foundational prerequisite, where for n=23, it reaches about 50%, highlighting how higher-order shares require larger groups.

Collision Probabilities

In the context of hash functions or random mappings, the birthday problem generalizes to the probability of a collision, where n items are assigned to one of d possible values uniformly at random. The exact probability of at least one collision is

P(\text{collision}) = 1 - \prod_{i=1}^{n-1} \left(1 - \frac{i}{d}\right),

which represents the complement of the probability that all assignments are unique. This formulation frames the classic birthday scenario (with d=365) as a special case for detecting shared hashes or identifiers. The probability of exactly one colliding pair, assuming no higher-order collisions, can be approximated using the Poisson paradigm, where the expected number of pairs is \lambda = \binom{n}{2}/d. Under this approximation, the probability is roughly the expected number of such pairs times the probability of no other collisions, yielding

P(\text{exactly one pair}) \approx \frac{n(n-1)}{2d} \, e^{-n(n-1)/(2d)}.

This expression is derived via inclusion-exclusion principles and is accurate when \lambda is small and n \ll \sqrt{d}.^[3] For scenarios involving m distinct types or categories of items, each type independently assigned to its own space of d possibilities, the probability of at least one collision within any single type can be bounded using the union bound. This gives an upper estimate of \sum_{j=1}^m P(\text{collision in type } j), where each term follows the pairwise collision formula above, useful when individual probabilities are low to avoid overcounting overlaps across types.^[21] In cryptography, this collision framework underpins the birthday attack on hash functions, where d = 2^b for a b-bit output. To achieve a collision probability p, the required number of queries n satisfies n \approx \sqrt{2 d \ln(1/(1-p))}. For p = 0.5, this simplifies to approximately 1.177 \sqrt{d}. As originally described for undermining digital signatures, the attack exploits the quadratic scaling to find collisions with far fewer than d attempts. A practical example is a 128-bit hash function (d = 2^{128}), where roughly n \approx 2^{64} inputs yield a 50% chance of collision, demonstrating the vulnerability of even large output spaces to birthday-based attacks.

Covering All Birthdays

The problem of covering all birthdays with a group of people is a variant of the classic birthday problem, recast as a coupon collector's scenario where each birthday represents a distinct "coupon" to be collected. Here, the objective is to find the number of randomly selected individuals required to ensure every possible birthday date is represented at least once, assuming uniform distribution over d days and ignoring leap years or other complications. This shifts the emphasis from detecting the first duplicate birthday to achieving complete coverage of the calendar. The expected number of people T needed to cover all d birthdays is given exactly by E[T] = d H_d, where H_d = \sum_{k=1}^d \frac{1}{k} is the d-th harmonic number.^[22] For large d, this admits the approximation E[T] \approx d \ln d + \gamma d, with \gamma \approx 0.57721 denoting the Euler-Mascheroni constant.^[23] The random variable T arises as the sum of d independent geometric random variables, each representing the additional people needed to acquire a new birthday after k-1 distinct ones have been collected, for k = 1 to d. The variance is then \operatorname{Var}(T) = d^2 \sum_{k=1}^d \frac{1}{k^2} - d H_d, which for large d approximates to \frac{\pi^2}{6} d^2 \approx 1.64493 d^2.^[23] The probability that exactly n people suffice to cover all d birthdays equals \frac{d! \, S(n, d)}{d^n}, where S(n, d) denotes the Stirling number of the second kind, which counts the partitions of n items into d non-empty unlabeled subsets.^[24] This expression arises from the total surjective mappings from n people to d birthdays, divided by the d^n possible assignments. For illustration, with d = 365, the expected value E[T] \approx 365 \ln 365 + \gamma \cdot 365 \approx 2368.^[23] In contrast to the standard birthday problem, which quantifies the onset of collisions, this covering variant addresses the scale required for exhaustive representation, highlighting the asymmetry between partial overlap and full completion in uniform random assignments.

Partition and Matching Variants

The partition problem in the context of the birthday problem involves distributing n people into 365 birthday "bins" such that no bin is empty, meaning every day of the year has at least one person born on it. This is equivalent to the number of surjective functions from a set of n elements to a set of 365 elements, divided by the total number of possible assignments $365^n. The exact probability is given by the inclusion-exclusion principle:

P(\text{all days covered}) = \sum_{k=0}^{365} (-1)^k \binom{365}{k} \left(1 - \frac{k}{365}\right)^n.

This formula arises from subtracting the cases where at least one specific day is missed, adding back intersections where at least two are missed, and so on.^[25] For more balanced partitions, where the birthdays are distributed according to a specific occupancy vector (n_1, n_2, \dots, n_{365}) with \sum n_i = n and each n_i \geq 0, the probability is

\frac{n!}{n_1! n_2! \cdots n_{365}! \cdot 365^n},

using the multinomial coefficient to count the ways to assign people to days with those exact counts, under the uniform assumption. For non-empty balanced partitions (each n_i \geq 1), sum this over all such vectors. These multinomial probabilities describe the full distribution of partition types in the occupancy model underlying the birthday problem. A notable example is the uniform partition for n = 365, where each day has exactly one birthday (all n_i = 1). The probability is \frac{365!}{365^{365}}. Using Stirling's approximation n! \approx \sqrt{2 \pi n} \, (n/e)^n, this simplifies to

\frac{365!}{365^{365}} \approx \sqrt{2 \pi \cdot 365} \cdot e^{-365} \approx 4.8 \times 10^{-157},

illustrating how extraordinarily unlikely it is to have exactly one person per day in a group of 365.^[13] The first match variant considers the expected number of people E[N] needed until the first shared birthday occurs. The exact value is E[N] = 1 + \sum_{k=1}^{365} \prod_{i=1}^{k-1} \left(1 - \frac{i}{365}\right), which evaluates to approximately 24.6166 for 365 days. An asymptotic approximation for large d (number of days) is E[N] \approx \sqrt{\frac{\pi d}{2}} + 1, yielding about 24.95 for d=365, closely matching the exact figure and highlighting the counterintuitive efficiency of collisions. To arrive at this approximation, note that the probability of no collision after k people is \prod_{i=1}^{k-1} (1 - i/d) \approx \exp(-k^2 / 2d); the expected waiting time is then the sum of survival probabilities, which integrates to the square root form via the Gaussian integral \int_0^\infty \exp(-x^2 / 2) \, dx = \sqrt{\pi / 2}.^[3] For the number of shared birthdays, consider the expected number of days with exactly k birthdays in a group of n people. Each day's count follows a \text{Binomial}(n, 1/365) distribution, which for large n and fixed \lambda = n/365 is well-approximated by \text{Poisson}(\lambda). Thus, the probability a specific day has exactly k birthdays is approximately e^{-\lambda} \lambda^k / k!, and the expected number of such days is $365 \cdot e^{-\lambda} \lambda^k / k!. This Poisson approximation captures the clustering of birthdays into shared days, with \lambda controlling the average occupancy per day. For instance, with n=100 (\lambda \approx 0.274), the expected number of days with exactly 2 birthdays is about 10.4.

Applications and Extensions

Computer Science Applications

In cryptography, the birthday attack leverages the birthday paradox to find collisions in hash functions, allowing an attacker to generate two distinct inputs that produce the same hash output with significantly reduced effort compared to exhaustive search.^[26] This attack achieves a time complexity of O(\sqrt{d}), where d is the size of the hash output space, in contrast to the O(d) complexity required for a brute-force preimage attack.^[26] Such collisions undermine the security of hash-based systems like digital signatures and message authentication, where uniqueness is assumed. A notable example occurred in 2004 when Xiaoyun Wang and colleagues demonstrated practical collisions for the MD5 hash function, producing two different messages with identical 128-bit hashes using only standard computing resources. This breakthrough accelerated the deprecation of MD5 in security protocols, as it exposed vulnerabilities in applications relying on its collision resistance. Similarly, the 2017 SHAttered attack revealed chosen-prefix collisions for SHA-1, prompting NIST to formally deprecate SHA-1 for all cryptographic protections by December 31, 2030, due to the practical feasibility of attacks on certificate authorities and code signing.^[27] Beyond cryptography, the birthday problem applies to database systems, where hash functions map records to keys in a space of size d, and the probability of duplicate keys (collisions) rises unexpectedly with the number of records.^[28] For instance, with d \approx 10^9 possible keys, inserting around $10^5 records yields a non-negligible chance of collision, potentially leading to data integrity issues in hash tables or indexing structures.^[28] To mitigate these risks in both cryptographic and database contexts, designers employ larger hash output sizes (e.g., 256 bits or more) to expand d exponentially, or incorporate salting—unique random values per input—to effectively increase the collision space and thwart targeted attacks. In the era of quantum computing, the birthday attack faces further evolution through algorithms like Brassard-Høyer-Tapp (BHT), which combines Grover's search with quantum parallelism to find collisions in O(d^{1/3}) time, reducing the security margin for n-bit hashes to roughly n/3 bits.^[29] As of 2025, ongoing research highlights this as a pressing threat to symmetric cryptography, urging transitions to quantum-resistant hash functions like those based on lattices or sponges to maintain post-quantum security levels.^[29]

Near and Reverse Variants

In the near birthday variant, the focus shifts from exact shared birthdays to pairs whose birthdays fall within k days of each other, often modeled on a circular calendar to account for the wrap-around from December to January. For two individuals, the probability of such a near match is exactly \frac{2k+1}{d} under uniform distribution, where d = 365 is the number of days, assuming leap years are ignored and k is small relative to d.^[30] This setup treats adjacent days symmetrically, increasing the effective collision likelihood compared to the classic problem. For a group of n people, the probability of at least one near match is approximated by adapting the classic birthday formula, effectively reducing the number of "pigeonholes" by the factor $2k+1. The group size n yielding a 50% probability is roughly n \approx 1.2 \sqrt{\frac{d}{2k+1}}.^[4] Exact computations for this variant, derived from combinatorial counting of non-matching assignments, confirm that for k=1 and d=365, n \approx 14 achieves about 50% probability, while n=23 reaches approximately 89%.^[3] This adjustment highlights how allowing proximity significantly lowers the threshold for likely coincidences, with further reductions for larger k (e.g., n \approx 8 for k=5).^[31] The reverse birthday problem inverts the classic formulation by solving for the smallest n such that the probability of at least one shared birthday reaches a given p. The standard approximation derives from the pair-wise collision estimate, where the probability is $1 - \exp\left(-\frac{n(n-1)}{2d}\right) \approx p, leading to n \approx \sqrt{-2d \ln(1-p)} for large d and moderate p.^[4] For d=365 and p=0.99, this yields n \approx 57, meaning a group of 57 people has nearly a 99% chance of a shared birthday.^[31] In the near-match context, the same inversion applies with the adjusted pair probability \frac{2k+1}{d}, further decreasing n (e.g., to about 14 for k=1 and p=0.5).^[3] Another extension examines the distribution of birthdays across days, approximating the number per day via the Poisson paradigm for large groups. Each day receives birthdays according to a Poisson random variable with mean \lambda = n/d, so the probability a specific day has at least k birthdays is $1 - \sum_{j=0}^{k-1} \frac{e^{-\lambda} \lambda^j}{j!}. The expected number of days with at least k birthdays is then d times this value, providing insight into multiple collisions beyond pairs.^[32] The average group size until the first shared birthday, or expected waiting time for the initial collision, integrates the survival function over the approximation \exp\left(-\frac{x^2}{2d}\right), yielding \int_0^\infty \exp\left(-\frac{x^2}{2d}\right) \, dx \approx \sqrt{\frac{\pi d}{2}}.^[33] For d=365, this expected value is approximately 24, slightly above the 23 for 50% probability, reflecting the mean over the distribution of first-collision times.^[32]

References

[1]
The Birthday Problem - explanation, MSTE, University of Illinois
If there is no match after two people, the third person has 363 different birthdays that do not match the other two. So, the probability of a match is 1 - (365)( ...
[2]
[PDF] The Birthday Paradox
In reality, there is a 50:50 chance that two people will share a birthday in a group. We will explain this solution, as well as the problem in general, and the ...
[3]
[PDF] The matching, birthday and the strong birthday problem
The article is partially historical, and partially forward looking. For example, we address a new problem that we call the strong birthday problem. Section ...
[4]
[PDF] Methods for Studying Coincidences - UC Berkeley Statistics
As far as we know, the earliest mention of the birthday problem was made by Von Mises. (1939). Problem 2: Many Types of Categories. The first variant offers ...Missing: origin | Show results with:origin
[5]
[PDF] Combinatorics - Chance
The birthday problem does not seem to have a very old history. Problems of this type were first discussed by von Mises.8 It was made popular in the 1950s by.Missing: origin | Show results with:origin
[6]
The Birthday Problem
This has applications to the design of cryptographically strong hash functions. The hash function should not be invertible - it should be computationally too ...
[7]
[PDF] The Birthday Paradox Andreas Klappenecker
An interesting application is the birthday attack on cryptographic hash functions. Our motivation of Pollard's ρ algorithm was also based on the birthday ...Missing: problem | Show results with:problem
[8]
The Uniformity Assumption in the Birthday Problem - jstor
It is easily solved under the assumptions that each person's birthday is determined independently, and that the 365 possible birthdays (ignoring leap years) are ...
[9]
Probability and the Birthday Paradox | Scientific American
Mar 29, 2012 · If you survey a random group of just 23 people there is actually about a 50–50 chance that two of them will have the same birthday. George ...
[10]
[PDF] Von Mises' Frequentist Approach to Probability - StatLit.org
Richard Von Mises (1883-1953), the originator of the birthday problem, viewed statistics and probability theory as a science that deals with and is based on ...Missing: origin | Show results with:origin
[11]
The birthday problem - Borja - 2007 - Royal Statistical Society - Wiley
Aug 30, 2007 · The assumption, when applying these calculations to birthdays, is that all possible birth dates are equally likely. But this assumption is false ...
[12]
Birthday Problem -- from Wolfram MathWorld
Reprinted in Selected Papers of Richard von Mises, Vol. 2 (Ed. P. Frank, S. Goldstein, M. Kac, W. Prager, G. Szegö, and G. Birkhoff).Missing: original | Show results with:original
[13]
[PDF] Combinatorics (2.6) The Birthday Problem (2.7) - UCSD Math
What's the probability p that at least two people share a birthday? Equivalently, compute q = 1 − p, the probability that all birthdays are different. Prof.
[14]
[PDF] 1.1. Birthday Problem. If there are 2 people, the chance that they do ...
Sep 24, 2010 · One partial explanation for the counter-intuitive high answer is that, among the others, there are likely to be many pairs that share the same ...
[15]
[PDF] approximations to the birthday problem with unequal occurrence ...
From this expression we derive a recurrence relation for rn. Also we give two types of approximations of non-coincidence probabilities. The second approximation ...
[16]
Probability
Here's a table of probabilities that everyone has a different birthday, for N=1 to N=25. To get the probability that two people have the same brithday, ...
[17]
None
### Extracted Table of n and Probabilities
[18]
[PDF] Appendix A The Birthday Problem
The following gives a more exact rendering, providing both upper and lower bounds on this probability. Theorem A.1 [Birthday bound] Let C(N,q) denote the ...Missing: derivation | Show results with:derivation
[19]
[PDF] Happy Birthday to . . . Two? - American Statistical Association
In 1939, Richard von Mises posed the Birthday Problem: “How many people would need to be in a room so that it was more likely than not that at least two of them ...
[20]
Feller, W. (1968) An Introduction to Probability Theory and Its ...
The birthday problem is: Determine the number of independent and identically distributed random variables required so there is a probability of at least 1/2 ...
[21]
[PDF] Memoryless Near-Collisions, Revisited - arXiv
The generic method for finding collisions for a given hash function is based on the birthday paradox and attributed to Yuval [22]. There are well ...
[22]
[PDF] ON A CLASSICAL PROBLEM OF PROBABILITY THEORY b We ...
We consider the following classical “urn-problem”. Suppose that there are 71 urns given, and that balls are placed at random in these urns one after.
[23]
https://arxiv.org/pdf/2003.04720.pdf
[24]
[PDF] Using Stirling numbers to solve coupon collector problems
Sep 17, 2024 · By way of enrichment here is a proof using Stirling numbers of the second kind which encapsulates inclusion-exclusion in the generating function ...
[25]
[PDF] Lecture Notes for Introductory Probability - UC Davis Math
Hughes, not understanding the Birthday Problem, did not check for ... (a) Write down the exact probability that exactly 3 male tourists will get married while ...
[26]
Birthday Problem - Cornell Mathematics
Birthday Problem. As an application of the Poisson approximation to Binomial, we consider the Birthday problem, which is quite interesting.
[27]
[PDF] How to Launch A Birthday Attack Against DES
Hence, the birthday attack should evaluate 248 candidates for K. Thus, the attack has a computational complexity of 248.
[28]
NIST Transitioning Away from SHA-1 for All Applications | CSRC
NIST will transition away from the use of SHA-1 for applying cryptographic protection to all applications by December 31, 2030.Missing: implications | Show results with:implications
[29]
[PDF] Lecture 3: Concentration Bounds and Hashing 3.1 Birthday Paradox
Jan 11, 2017 · So, to prove the claim we upper bound the variance of Y and use Chebyshev inequality. First, observe that the random variables Yi,j are pairwise ...Missing: derivation | Show results with:derivation
[30]
[PDF] Applying Grover's Algorithm to Hash Functions - arXiv
Feb 22, 2022 · birthday attack, which attempts to generate a collision. The BHT algorithm is a quantum birthday attack that makes use of Grover's Algorithm ...
[31]
More Birthday Surprises - Taylor & Francis Online
Apr 11, 2018 · (1970). More Birthday Surprises. The American Mathematical Monthly: Vol. 77, No. 8, pp. 856-858.
[32]
[PDF] the birthday problem and generalizations - Cloudfront.net
Poisson variables are very frequently used to model counts, and though the infinite state space that they usually take would not seem to fit the multinomial ...
[33]
[PDF] Asymptotic distribution for the birthday problem with multiple ...
The classical occupancy problem is specified in terms of a fixed number of balls, and a fixed number of equally likely bins. We choose the notation b balls and ...Missing: lambda = | Show results with:lambda =
[34]
Asymptotic distribution for the birthday problem with multiple ... - arXiv
Oct 26, 2013 · Asymptotic distribution for the birthday problem with multiple coincidences, via an embedding of the collision process. Authors:R. Arratia, S.