Cryptographically secure pseudorandom number generator
A cryptographically secure pseudorandom number generator (CSPRNG) is a deterministic algorithm that expands a short, uniformly random seed into a longer sequence of bits that is computationally indistinguishable from true randomness, satisfying properties essential for cryptographic security such as unpredictability even when an adversary observes portions of the output.[1][2] CSPRNGs differ from ordinary pseudorandom number generators by resisting attacks that exploit knowledge of internal state or previous outputs, ensuring that no polynomial-time algorithm can predict future bits with advantage greater than negligible probability.[2] These generators are foundational to cryptographic protocols, providing the entropy needed for secure key generation, digital signatures, and nonce creation, where any predictability could enable attacks like key recovery or replay exploits. Standards from the National Institute of Standards and Technology (NIST), such as Special Publication 800-90A, define approved constructions including deterministic random bit generators (DRBGs) based on hash functions, HMAC, or block ciphers like AES in counter mode, which have been rigorously vetted for security under realistic threat models. A defining characteristic is state compromise resistance, encompassing forward security—where compromise of the current state does not reveal prior outputs—and backward security, preventing reconstruction of earlier states from future ones, though full implementations often require entropy sources for reseeding to maintain long-term unpredictability.[2] Notable controversies highlight the stakes of CSPRNG design; the Dual_EC_DRBG, endorsed by NIST in 2006, was later exposed for structural weaknesses allowing efficient prediction if certain elliptic curve points were known, leading to suspicions of deliberate NSA insertion of a backdoor via non-standard parameters, prompting its removal from standards in 2014 and underscoring the need for transparent, provably secure algorithms.[3][4] Despite such setbacks, advancements in CSPRNGs continue to bolster cryptographic systems, with empirical validation through suites like NIST's Statistical Test Suite confirming statistical randomness while theoretical foundations ensure computational hardness.[5]Fundamentals
Definitions and Core Concepts
A pseudorandom number generator (PRNG) is a deterministic algorithm that accepts an initial seed value and produces an output sequence exhibiting statistical properties approximating those of independent, uniformly distributed random bits.[1] This output is generated through mathematical transformations that expand or iterate on the seed, ensuring reproducibility under identical starting conditions. PRNGs differ fundamentally from true random number generators (TRNGs), which derive bits from physical entropy sources, as PRNGs rely solely on computational processes without inherent unpredictability beyond the seed. A cryptographically secure pseudorandom number generator (CSPRNG), also termed a deterministic random bit generator (DRBG) in standards such as NIST SP 800-90A, is a PRNG designed for cryptographic applications where security against adversarial prediction is paramount. It must produce output that is computationally indistinguishable from uniformly random bits, assuming the seed possesses sufficient entropy, such that no polynomial-time distinguisher algorithm succeeds beyond negligible advantage.[2] Core to this security is forward unpredictability: given any prefix of the output sequence, no efficient adversary can predict subsequent bits with success probability exceeding 1/2 plus a negligible function.[6] Additionally, CSPRNGs incorporate backtracking resistance, preventing reconstruction of prior internal states from observed outputs, thereby protecting against attempts to infer the seed or earlier bits.[7] In cryptographic theory, CSPRNGs align with the concept of a secure pseudorandom generator (PRG), formalized as a function G_k: \{0,1\}^k \to \{0,1\}^{p(k)} where p(k) > k stretches a short, random seed into a longer pseudorandom string, secure under the assumption that distinguishing G(s) from random requires superpolynomial resources for random seed s.[8] Practical CSPRNGs often employ cryptographic primitives like block ciphers or hash functions in modes such as counter or hash-based DRBGs, ensuring diffusion and confusion properties that thwart state recovery. These generators typically require periodic reseeding from high-entropy sources to maintain security over extended outputs, as deterministic expansion alone cannot indefinitely resist exhaustive search without fresh randomness.[7]Security Requirements
The fundamental security requirement for a cryptographically secure pseudorandom number generator (CSPRNG) is that its output, derived from a short uniform random seed, must be computationally indistinguishable from a truly random bit string of equal length by any probabilistic polynomial-time adversary. This means that for a security parameter k, if G_k maps a k-bit seed to a p(k)-bit output with p(k) > k, the advantage of distinguishing G_k(U_k) from U_{p(k)} (where U_n denotes the uniform distribution over n bits) is negligible in k.[9] Such indistinguishability ensures that no efficient algorithm can exploit the pseudorandom output for cryptographic purposes in a manner that compromises security. Equivalence holds between this property and passing the next-bit test: given any prefix of the output sequence, predicting the subsequent bit succeeds with probability at most $1/2 plus a negligible function of the security parameter, for all PPT predictors.[10] This test captures the unpredictability essential for applications like key generation, where any bias or predictability could enable attacks such as distinguishing ciphertexts from random strings or forging signatures. The next-bit test implies resistance to all polynomial-time statistical tests, confirming the output's suitability as a substitute for true randomness in computational settings.[10] Beyond basic expansion, CSPRNGs require that the seed or internal state remains computationally infeasible to recover from any polynomially many output bits, preventing reconstruction attacks that could enable prediction of unreleased portions.[11] In deterministic random bit generator (DRBG) constructions standardized for cryptographic use, such as those in NIST SP 800-90A, outputs must remain unpredictable even given knowledge of prior outputs and the algorithm, with security strength defined up to 256 bits against state determination or seed inference.[12] For iterative generators, additional properties like backtracking resistance—ensuring current state compromise does not reveal past outputs—and forward security via reseeding with fresh entropy are often mandated to mitigate persistent threats from state exposure.[11] These requirements collectively ensure the CSPRNG withstands adversarial efforts to exploit deterministic structure, grounding its reliability in the hardness of underlying computational assumptions like one-way functions.
Differences from Conventional PRNGs
Conventional pseudorandom number generators (PRNGs) produce sequences that pass statistical tests for randomness but lack guarantees against computational adversaries who may observe outputs or partial state information.[13] In contrast, cryptographically secure PRNGs (CSPRNGs) must satisfy stringent security properties, including computational indistinguishability from true random sources, where no polynomial-time algorithm can distinguish the output from uniform randomness with non-negligible advantage.[2] This requires basing CSPRNG designs on well-studied cryptographic primitives, such as block ciphers in counter mode, ensuring resistance to prediction even if an adversary knows prior outputs.[14] A key distinction lies in unpredictability: for CSPRNGs, given any sequence of prior bits, no efficient algorithm can predict the next bit with probability significantly exceeding 1/2.[2] Conventional PRNGs, such as the Mersenne Twister, often fail this, as their internal state—typically recoverable from a small number of consecutive outputs (e.g., 624 32-bit words for MT19937)—allows full sequence reconstruction and prediction.[15] CSPRNGs additionally provide backtracking resistance, preventing computation of prior states or outputs from the current state, a property absent in linear-feedback shift register-based or congruential conventional PRNGs.[16] CSPRNGs incorporate mechanisms for entropy reseeding to maintain security over long outputs, countering state compromise that could propagate in deterministic conventional PRNGs without such input.[17] While conventional PRNGs prioritize speed and long periods for non-adversarial uses like Monte Carlo simulations, CSPRNGs trade performance for cryptographic strength, often running orders of magnitude slower due to reliance on one-way functions.[18] This renders conventional PRNGs unsuitable for applications like key generation, where predictability could enable attacks, as evidenced by historical vulnerabilities in systems using them for cryptographic purposes.[14]Historical Context
Origins and Early Theoretical Foundations
The theoretical foundations of cryptographically secure pseudorandom number generators (CSPRNGs) emerged in the early 1980s amid the development of modern cryptography, driven by the need to expand short, truly random seeds into longer sequences indistinguishable from uniform randomness by computationally bounded adversaries. Prior to this, pseudorandom number generators existed for statistical simulations, such as von Neumann's 1946 middle-square method, but lacked cryptographic security guarantees against prediction or inversion by efficient algorithms.[19] The key challenge was formalizing "pseudorandomness" in a way that resisted polynomial-time attacks, rooted in the assumption that certain computational problems, like factoring or discrete logarithms, are intractable. In 1982, Andrew Yao established the core theoretical framework by defining pseudorandom sequences as those computationally indistinguishable from truly random ones by any probabilistic polynomial-time distinguisher.[20] Yao proved that this indistinguishability is equivalent to passing the "next-bit test": no efficient algorithm can predict the next bit of the sequence with success probability significantly better than 1/2, given all prior bits.[21] This equivalence demonstrated that a single predictive test suffices to validate security against all polynomial-time statistical tests, providing a foundational criterion for CSPRNGs without requiring exhaustive verification. Yao's work implied that such generators exist if one-way functions do, though unconditional constructions awaited later results; it shifted focus from statistical to computational notions of randomness, enabling practical cryptographic applications like key generation and encryption padding.[22] Building directly on Yao's definitions, Manuel Blum and Silvio Micali proposed the first concrete CSPRNG construction in 1984, using a one-way permutation to iteratively compute hardcore bits—those unpredictable even given surrounding outputs.[21] Their generator starts with a random seed s_0 in a domain where computing discrete logarithms or quadratic residuosity is hard, then outputs the least significant bit of g^{s_i} \mod p for successive s_{i+1} = g^{s_i} \mod p, where g is a generator and p a safe prime. Security relies on the intractability of predicting these bits, provably stretching the seed exponentially while satisfying Yao's next-bit criterion.[23] This paradigm influenced subsequent designs, emphasizing provable reductions to assumed-hard problems rather than ad hoc heuristics. Early limitations included efficiency concerns and reliance on specific number-theoretic assumptions, but it marked the transition from theory to constructible primitives. The formal model of a PRG as a family of functions G_k: \{0,1\}^k \to \{0,1\}^{p(k)} with p(k) > k, secure if no efficient adversary distinguishes G_k(U_k) from U_{p(k)}, encapsulates these foundations.[22]Evolution Through Key Publications and Implementations
The foundational theoretical framework for cryptographically secure pseudorandom number generators (CSPRNGs) emerged in the early 1980s amid efforts to construct sequences indistinguishable from true randomness under computational constraints. In November 1982, Manuel Blum and Silvio Micali presented a construction relying on one-way permutations, where security hinges on the unpredictability of the next output bit given prior bits, assuming the underlying function's hardness (such as discrete exponentiation).[24] This work formalized CSPRNGs as expanders from short seeds to longer pseudorandom strings, resistant to polynomial-time prediction attacks. Concurrently, Andrew Chi-Chih Yao established that passing the next-bit unpredictability test suffices for security against all polynomial-time statistical distinguishers, linking pseudorandomness to one-way functions and enabling reductions from cryptographic assumptions. Building on these ideas, Lenore Blum, Manuel Blum, and Michael Shub introduced the Blum-Blum-Shub (BBS) generator in 1986, a simple iterative scheme based on the quadratic residuosity assumption modulo a Blum integer (product of two large primes congruent to 3 mod 4). Starting from a quadratic residue seed s_0, each state s_{i+1} = s_i^2 \mod n yields a pseudorandom bit via the least significant bit (or parity), proven secure against next-bit predictors if factoring n is hard.[25] While theoretically sound, BBS's quadratic expansion and large modulus requirements limited early practicality, influencing subsequent designs prioritizing efficiency alongside provable security. Practical implementations gained traction in the late 1990s, addressing real-world entropy integration and reseeding to mitigate state compromise. In 1999, John Kelsey, Bruce Schneier, and Niels Ferguson proposed Yarrow, a family of CSPRNGs separating entropy collection (via pooling from diverse sources like hardware interrupts) from generation (using a block cipher in counter mode for expansion), with automatic reseeding after fixed output volumes to bound forward security risks.[26] This design emphasized resilience to poor entropy estimation, influencing system-level RNGs in operating systems. Refining Yarrow's approach, Ferguson and Schneier detailed Fortuna in 2003, partitioning entropy sources into pools for staggered reseeding based on pool maturity, eschewing explicit estimation to avoid underestimation vulnerabilities while supporting incremental entropy addition.[27] Standardization efforts culminated in NIST Special Publication 800-90 (2006), specifying deterministic random bit generators (DRBGs) including Hash_DRBG (SHA-based), HMAC_DRBG, CTR_DRBG (block cipher counter mode), and Dual_EC_DRBG (elliptic curve point generation). However, Dual_EC_DRBG drew scrutiny in 2007 when Dan Shumow and Niels Ferguson demonstrated that NSA-chosen curve parameters enabled efficient prediction with a 32-byte secret, raising backdoor concerns due to non-standard point selection and excessive output dependence on seeds.[28] NIST revised SP 800-90A in 2011, removing Dual_EC_DRBG amid these issues and emphasizing provable security for approved mechanisms, with further updates in 2015 incorporating strengthened instantiation requirements.[11] These developments underscored the tension between theoretical ideals and deployable, auditable designs resistant to institutional influence.Entropy Management
Sources of True Randomness
True random number sources, essential for seeding cryptographically secure pseudorandom number generators (CSPRNGs), rely on inherently unpredictable physical phenomena or stochastic processes to furnish high-quality entropy, typically measured in min-entropy bits per sample. These sources must undergo rigorous validation, such as the statistical tests outlined in NIST SP 800-90B, which include assessments for independence, uniformity, and entropy estimation via methods like histogram analysis, permutation testing, and compression estimators, to confirm their suitability for cryptographic use. Entropy from such sources ensures the CSPRNG's initial state resists prediction, as even a single predictable seed can compromise the entire output stream's security.[29] Hardware-based entropy sources predominate in cryptographic applications due to their direct coupling with physical randomness, often yielding rates from kilobits to megabits per second. Thermal noise, arising from random electron movements in resistors or diodes at finite temperatures (governed by Johnson-Nyquist noise with variance proportional to temperature and bandwidth), serves as a foundational source in many integrated circuits; for instance, implementations amplify and digitize this noise after amplification and conditioning to extract bits. Shot noise from discrete charge carrier flows in semiconductors or vacuum tubes provides another classical source, exhibiting Poisson statistics that yield near-ideal randomness when properly sampled. Radioactive decay processes, such as alpha particle emissions from isotopes like Americium-241 (half-life of 432.2 years), offer timing unpredictability, though practical devices like the HotBits service have historically generated bits from decay counts at rates around 1 per second per detector.[30][31] Quantum mechanical sources leverage fundamental indeterminacy for superior entropy purity, often validated under NIST frameworks. Photonic quantum random number generators (QRNGs), which measure photon arrival times or polarizations in attenuated laser beams split by beam splitters (exploiting single-photon detection with dark count rates below 100 Hz), achieve entropy rates exceeding 1 Gbps in commercial modules; Quside's implementations, certified compliant with SP 800-90B as of 2023, use integrated photonics for tamper-resistant operation. Vacuum fluctuations or spontaneous parametric down-conversion in nonlinear crystals generate entangled photon pairs, with detection asymmetries providing bits immune to classical correlations. These outperform classical sources in resisting environmental biases but require post-processing like von Neumann debiasing to mitigate detector inefficiencies.[32][33] System-level software aggregation of entropy, while convenient, draws from indirect observables that approximate true randomness through high variance and low predictability. Operating systems like Linux's /dev/random pool entropy from hardware interrupts (e.g., timer ticks varying by microseconds due to jitter), disk seek times, inter-packet arrival delays in network traffic, and user inputs such as keystroke latencies (typically 10-100 ms with sub-ms resolution limits). These sources, hashed into an entropy pool (often 256-4096 bits), provide seeds but demand conditioning to address potential depletion or correlation; for example, Intel's RDRAND instruction, integrated since 2012 Ivy Bridge processors, internally seeds from thermal and jitter-based hardware entropy before outputting conditioned bits. Validation per SP 800-90B remains critical, as unvetted aggregation can yield min-entropy below 0.5 bits per sample under adversarial conditions like virtualized environments.[29][34]Extraction and Conditioning Methods
Conditioning components process outputs from noise sources to mitigate bias, correlations, and other imperfections, thereby increasing the entropy rate and producing bits suitable for seeding cryptographically secure pseudorandom number generators (CSPRNGs). These methods ensure that the resulting randomness meets cryptographic security requirements, such as indistinguishability from uniform randomness under computational bounded adversaries, by leveraging primitives resistant to known attacks. Inadequate conditioning can propagate weaknesses, leading to predictable CSPRNG outputs if an adversary exploits source predictability.[29] Non-cryptographic extractors, such as the von Neumann debiaser introduced in 1951, provide a simple baseline for handling biased Bernoulli sources. It pairs consecutive bits, outputting 0 for a 01 pair, 1 for a 10 pair, and discarding 00 or 11 pairs; the output bits are unbiased (probability 0.5 each) regardless of input bias p, with extraction rate $2p(1-p), which equals the binary entropy H(p). This method extracts entropy without assuming uniformity but suffers from inefficiency—e.g., rate 0.11 bits per input bit at p=0.9—and vulnerability to structured non-i.i.d. biases, rendering it unsuitable for cryptographic use without augmentation. Iterated variants, like Peres' multiple-pass extractor from 1992, improve efficiency by recursively applying von Neumann to discarded pairs, achieving rates approaching full input entropy for mildly biased sources.[35][36] For cryptographic security, standards mandate vetted conditioning using primitives that preserve or amplify min-entropy while resisting forgery or inversion attacks. NIST SP 800-90B, published January 2018, approves specific components for entropy sources feeding random bit generators, allowing full-entropy claims (up to output length) if input min-entropy exceeds a threshold (e.g., via the formula Output_Entropy = min(n_out, h_in * (n_in / n_w))), where n_w is the narrowest input width. These functions process fixed-length inputs, often concatenated noise samples with nonces, to produce uniform-like outputs. Truncation may follow, proportionally reducing claimed entropy. Non-vetted methods cap entropy at 0.999 times output length to account for unquantified risks.[29] Approved conditioning functions per NIST SP 800-90B include:| Function | Type | Keyed/Unkeyed | Block Size / Output Length | Primitive Basis |
|---|---|---|---|---|
| HMAC | Deterministic | Keyed | Hash output size | Approved hash (e.g., SHA-256 per FIPS 180-4) |
| CMAC | Deterministic | Keyed | 128 bits | AES (FIPS 197) per SP 800-38B |
| CBC-MAC | Deterministic | Keyed | 128 bits | AES, restricted to RBG use (Appendix F) |
| Hash | Deterministic | Unkeyed | Hash output size | Approved hash (FIPS 180/202) |
| Hash_df | Deterministic | Unkeyed | Hash output size | Approved hash per SP 800-90A |
| Block_Cipher_df | Deterministic | Unkeyed | AES key size (128/256 bits) | AES per SP 800-90A |