Fact-checked by Grok 2 weeks ago

Linear congruential generator

A linear congruential generator (LCG) is a pseudorandom number generator (PRNG) algorithm that produces a sequence of numbers approximating randomness through the iterative formula X_{n+1} = (a X_n + c) \mod m, where m > 0 is the modulus defining the range, a is the multiplier (with $0 < a < m), c is the increment (with $0 \leq c < m), and X_0 is the initial seed value (with $0 \leq X_0 < m). The output is typically normalized to uniform variates in the interval [0, 1) via U_n = X_n / m, making it suitable for simulations requiring sequences that mimic independent uniform random variables. LCGs are among the simplest and most efficient PRNGs, relying on modular arithmetic to ensure the sequence is deterministic yet appears random for practical purposes. The LCG was first proposed by mathematician D. H. Lehmer in 1949 during a symposium on large-scale computing at Harvard University, as a method to generate reliable random numbers for Monte Carlo simulations on early electronic computers. Lehmer detailed the approach in his 1951 paper "Mathematical Methods in Large-Scale Computing Units," emphasizing its use of number theory to achieve long periods without complex machinery. This innovation addressed the need for computational randomness in statistical sampling and scientific computing during the post-World War II era of digital machinery development. Early implementations, such as those on the ENIAC, demonstrated its practicality, though initial parameters were chosen heuristically before rigorous analysis emerged. The quality of an LCG, particularly its period (the length before the sequence repeats), depends critically on the choice of parameters a, c, and m. The maximum possible period is m, achieved when the sequence cycles through all residues modulo m exactly once. The Hull–Dobell theorem (1962) provides necessary and sufficient conditions for full-period LCGs: c and m must be coprime; a - 1 must be divisible by every prime factor of m; and if 4 divides m, then 4 must divide a - 1. Common choices include prime moduli like m = 2^{31} - 1 (a Mersenne prime) for large periods, with c = 0 yielding multiplicative congruential generators (a special case known as Lehmer generators). Poor parameter selection can result in short periods or detectable patterns, compromising randomness. LCGs have been widely adopted in software libraries and systems due to their speed and low memory requirements, powering simulations in fields like physics, finance, and computer graphics. Notable examples include the RANDU generator (IBM, 1960s, with m = 2^{31}, a = 65539, c = 0), though it suffered from lattice structure defects leading to correlations in multidimensional projections. Modern variants, such as those in the Numerical Recipes library or Java's java.util.Random (using m = 2^{48}, a = 25214903917, c = 11), incorporate improved parameters to pass statistical tests like the Diehard suite. Despite their efficiency, LCGs exhibit limitations in high dimensions, where they fail spectral tests and produce uneven distributions, prompting the development of more advanced PRNGs like Mersenne Twister for demanding applications. They remain foundational in teaching random number generation and as components in combined generators.

Fundamentals

Definition

A linear congruential generator (LCG) is an algorithmic method for producing sequences of pseudorandom numbers through deterministic linear arithmetic operations, creating the appearance of randomness from an initial seed value. This approach relies on simple recurrence to generate integers that, when scaled, yield values uniformly distributed across a defined interval, mimicking true random behavior for computational purposes. LCGs serve as a foundational tool in applications requiring simulated randomness, such as Monte Carlo methods for numerical integration, stochastic modeling of physical systems, and statistical testing of hypotheses. Their appeal lies in their computational simplicity and efficiency, as they require minimal storage—only the current state—and execute rapidly on basic hardware, making them suitable for resource-constrained environments. In essence, an LCG produces a repeatable sequence of integers within a fixed modular range that statistically resemble independent random draws, enabling reproducible experiments in simulations. The quality of such sequences is partly assessed by their period length, the number of steps before repetition occurs, which influences their reliability in long-running applications. LCGs played a pivotal role in early computing by providing an accessible means to generate pseudorandom numbers when hardware-based randomness was unavailable or impractical.

Formulation

A linear congruential generator produces a sequence of integers through the iterative formula X_{n+1} = (a X_n + c) \mod m, where X_n represents the current state (an integer between 0 and m-1), m > 0 is the modulus defining the range of states, a is the multiplier (an integer such that $0 \leq a < m), and c is the increment (an integer such that $0 \leq c < m). The sequence begins with an initial seed value X_0, and each subsequent state is computed by scaling the previous state by a, adding c, and taking the result modulo m to ensure it remains within the bounded interval. This linear transformation followed by modular reduction forms the core mechanism, making the process computationally efficient while aiming to produce numbers that appear statistically random. To generate pseudo-random numbers uniformly distributed in the interval [0, 1), the integer state X_n is normalized by dividing by the modulus: U_n = \frac{X_n}{m}. This scaling maps the discrete integers from 0 to m-1 to fractional values from 0 inclusive to 1 exclusive, providing a continuous-like uniform distribution suitable for many simulations and modeling applications. For illustration, consider parameters m=10, a=3, c=1, and seed X_0=0:
  • X_1 = (3 \cdot 0 + 1) \mod 10 = 1
  • X_2 = (3 \cdot 1 + 1) \mod 10 = 4
  • X_3 = (3 \cdot 4 + 1) \mod 10 = 13 \mod 10 = 3
  • X_4 = (3 \cdot 3 + 1) \mod 10 = 10 \mod 10 = 0
The corresponding outputs are U_0 = 0/10 = 0, U_1 = 1/10 = 0.1, U_2 = 4/10 = 0.4, U_3 = 3/10 = 0.3, and U_4 = 0/10 = 0. This step-by-step computation demonstrates the straightforward iteration without evaluating deeper properties. As a deterministic algorithm operating on a finite set of m possible states, the LCG's sequence must eventually repeat, forming a cycle after at most m steps due to the pigeonhole principle applied to the state transitions.

Parameters

A linear congruential generator (LCG) is defined by four key parameters: the multiplier a, the increment c, the modulus m, and the seed X_0. The multiplier a is an integer that scales the previous value, influencing the mixing and distribution properties of the generated sequence. The increment c serves as an additive constant, helping to introduce variety and avoid degenerate cycles. The modulus m determines the range of the output values, typically producing integers from 0 to m-1, and thus controls the upper bound of the sequence period. The seed X_0 is the initial value that starts the sequence, which can be any integer in the interval [0, m-1] to ensure the generator begins within its defined state space. Selecting appropriate values for these parameters is crucial for achieving desirable statistical properties in the output sequence. The modulus m is often chosen as a large power of 2 (e.g., $2^{31} or $2^{48}) for computational efficiency in modular arithmetic on binary hardware, or as a large prime to facilitate full-period generation. The multiplier a should be selected to promote good mixing, such as values that are primitive roots modulo m or satisfy conditions ensuring low serial correlations, typically in the range where a \mod m avoids small residues. The increment c is commonly set to 1 or another odd integer coprime to m to enhance the potential for a maximal period and better randomness. The seed X_0 has minimal impact on overall quality if the other parameters are well-chosen, but varying it across runs allows reproducibility while exploring different sequence starts. Poor choices of parameters can severely degrade the randomness of the sequence. For instance, setting a = 1 results in an arithmetic progression X_{n+1} = (X_n + c) \mod m, which exhibits clear predictability and fails basic statistical tests for uniformity. Similarly, if c = 0 and a shares factors with m, the sequence may collapse into a short cycle confined to a subgroup of residues, leading to clustered outputs rather than uniform distribution. Such flaws manifest in visible patterns, like lattice structures in scatter plots of successive values, compromising applications in simulation or Monte Carlo methods. To evaluate parameter quality beyond period length, the spectral test is commonly employed as a figure of merit. This test assesses the lattice structure formed by pairs or tuples of successive LCG outputs in the unit hypercube, measuring the minimal distance between hyperplanes containing these points; larger normalized distances indicate better uniformity and less detectable structure. It guides the selection of a and m by identifying sets that maximize this distance up to a specified dimension.

Period Length

Hull-Dobell Theorem

The Hull-Dobell Theorem establishes the necessary and sufficient conditions under which a linear congruential generator (LCG), defined by the recurrence X_{n+1} = (a X_n + c) \mod m, achieves its maximal period of length m, meaning the sequence visits every residue class modulo m exactly once before repeating. Specifically, the period is m if and only if: (i) c and m are coprime (i.e., \gcd(c, m) = 1); (ii) a \equiv 1 \pmod{p} for every prime p dividing m; and (iii) if $4 divides m, then a \equiv 1 \pmod{4}. The proof begins by deriving the closed-form expression for the nth term (assuming a \neq 1): X_n = a^n X_0 + c \frac{a^n - 1}{a - 1} \pmod{m}. For the period to be m, the smallest positive integer n such that X_n \equiv X_0 \pmod{m} for all initial seeds X_0 must be n = m. The conditions ensure that a^m \equiv 1 \pmod{m} and the multiplier term \frac{a^m - 1}{a - 1} \equiv 0 \pmod{m}, implying the sequence returns to the starting point only after m steps. This is verified first for prime power moduli m = p^k (where p is prime) by showing divisibility up to p^k, then extended to general composite m via the Chinese Remainder Theorem. The "only if" direction confirms no smaller period exists by checking divisors of m. These conditions guide the selection of LCG parameters to attain the full period, thereby maximizing the cycle length and avoiding short cycles that could compromise randomness. For instance, when m is a power of 2 (common in binary computers), c must be odd and a \equiv 5 \pmod{8} to satisfy the criteria. Similarly, for decimal moduli like m = 10^k, c must avoid factors of 2 or 5, and a \equiv 1 \pmod{20}. Failure to meet any condition results in a period strictly less than m. To illustrate, consider m = 5, a = 6, c = 1. Here, \gcd(1, 5) = 1, and the only prime dividing 5 is p = 5 with $6 \equiv 1 \pmod{5}; since 4 does not divide 5, condition (iii) is vacuously true. Starting from X_0 = 0, the sequence is 0, 1, 2, 3, 4, 0, confirming a full period of 5.

Prime Modulus with Zero Increment

When the increment c = 0 and the modulus m is a prime number, the linear congruential generator reduces to the multiplicative congruential generator, defined by the simplified recurrence relation X_{n+1} = a X_n \pmod{m}, where a is the multiplier and the initial seed X_0 is typically chosen to be coprime to m (i.e., nonzero since m is prime). This form excludes zero from the generated sequence unless X_0 = 0, in which case the sequence remains fixed at zero. The period of this sequence, starting from a nonzero X_0, is the multiplicative order of a modulo m, which is the smallest positive integer k such that a^k \equiv 1 \pmod{m}. By Fermat's Little Theorem, since m is prime and a is not divisible by m, the order k divides m-1. The maximum possible period of m-1—achieving a full cycle through all nonzero residues modulo m—occurs precisely when the order of a is m-1, meaning a is a primitive root modulo m. A primitive root exists for every prime m, and there are \phi(m-1) such multipliers a, where \phi is Euler's totient function. To illustrate, consider m = 7 (a prime) and X_0 = 1. If a = 3, which is a primitive root modulo 7 (as its powers modulo 7 are 3, 2, 6, 4, 5, 1, confirming order 6), the sequence is 1, 3, 2, 6, 4, 5, then returns to 1, yielding a full period of 6 and visiting all nonzero residues. In contrast, for a = 2 (order 3, since $2^3 \equiv 1 \pmod{7}), the sequence is 1, 2, 4, then returns to 1, resulting in a shorter period of 3 and only three distinct nonzero values.

Power-of-2 Modulus with Zero Increment

When the modulus m is a power of 2, denoted as m = 2^k with k \geq 3, and the increment c = 0, the linear congruential generator simplifies to a multiplicative congruential generator with the recurrence relation X_{n+1} = a X_n \mod m, where a is the multiplier and X_0 is the initial seed. In this configuration, the period—the length before the sequence repeats—is strictly limited and cannot achieve the full modulus length m. The maximum possible period is m/4 = 2^{k-2}, attained only if the seed X_0 is odd (relatively prime to m) and the multiplier a satisfies a \equiv 3 \pmod{8} or a \equiv 5 \pmod{8}. Otherwise, the period is shorter, often degenerating to 1 or 2 depending on a and X_0. This result follows from the Hull–Dobell theorem, which provides necessary and sufficient conditions for maximal period in multiplicative generators, highlighting the structural constraints imposed by powers-of-2 moduli. The period limitation arises because c = 0 causes the sequence to remain within residues sharing the same parity as X_0: if X_0 is even, all subsequent values are even, restricting the sequence to at most m/2 possible states. To maximize the period, X_0 must be odd, confining the sequence to the m/2 odd residues modulo m. However, the multiplicative structure further halves this effective range, as the order of a in the multiplicative group of units modulo $2^k (the smallest positive integer d such that a^d \equiv 1 \pmod{2^k}) cannot exceed $2^{k-2}. Thus, even starting from an odd seed, the cycle length tops out at m/4. For illustration, consider m = 8 (k = 3) and a = 5 (\equiv 5 \pmod{8}). With odd seed X_0 = 1, the sequence is $1, 5, 1, \dots, yielding period 2 (= 8/4). In contrast, for a = 1, the sequence remains fixed at X_0 regardless of the seed, giving period 1.

Power-of-2 Modulus with Non-Zero Increment

When the modulus m = 2^k for k \geq 1 and the increment c \neq 0, a linear congruential generator can achieve the full period of length m under conditions derived from the Hull-Dobell theorem specialized to powers of 2. Specifically, the period is m if c is odd (i.e., \gcd(c, m) = 1), a \equiv 1 \pmod{2} (i.e., a is odd), and for m \geq 4 (i.e., k \geq 2), a \equiv 1 \pmod{4}. These ensure the recurrence relation produces a permutation of all residues modulo m. A common choice satisfying a \equiv 1 \pmod{4} is a \equiv 5 \pmod{8}, which additionally promotes better uniformity in the lower bits of the generated numbers. The condition that c is odd guarantees that adding the increment shifts the sequence across all parity classes, allowing it to explore the entire residue system rather than being trapped in even or odd subsets. The requirements on a prevent premature cycling by ensuring the multiplicative step mixes the values sufficiently within the binary hierarchy of the modulus; for instance, a \equiv 1 \pmod{4} for m \geq 4 makes the map x \mapsto a x + c \pmod{m} bijective, avoiding fixed subgroups or short orbits in \mathbb{Z}/m\mathbb{Z}. Without these, the period drops, such as to m/2 if c is even or if a \not\equiv 1 \pmod{4}. Consider the parameters m = 16, a = 5, c = 1 (noting c = 1 is odd and a = 5 \equiv 1 \pmod{4}, $5 \equiv 5 \pmod{8}). Starting with seed X_0 = 0, the sequence is: \begin{align*} X_1 &= 1, \\ X_2 &= 6, \\ X_3 &= 15, \\ X_4 &= 12, \\ X_5 &= 13, \\ X_6 &= 2, \\ X_7 &= 11, \\ X_8 &= 8, \\ X_9 &= 9, \\ X_{10} &= 14, \\ X_{11} &= 7, \\ X_{12} &= 4, \\ X_{13} &= 5, \\ X_{14} &= 10, \\ X_{15} &= 3, \\ X_{16} &= 0. \end{align*} This visits all 16 residues exactly once, confirming the full period of 16. In a failure case with a = 3 (odd but $3 \not\equiv 1 \pmod{4}) and same m, c, X_0, the sequence is 0, 1, 4, 13, 8, 9, 12, 5, 0, yielding only period 8 by cycling through a proper subgroup. For small moduli, limitations arise inherently. For m = 2, any valid LCG (with odd c and a) has period 2, as the sequence simply alternates between 0 and 1 regardless of parameters. For m = 4, full period 4 is possible with the conditions above, but the limited state space often results in sequences with poor randomness, such as incremental shifts.

Common Parameters

Historical Selections

Early implementations of linear congruential generators (LCGs) were heavily influenced by the hardware constraints of mid-20th-century computers, particularly the 32-bit integer word sizes common in systems like IBM's mainframes. These choices prioritized computational efficiency, such as multipliers that could be evaluated using simple bit shifts and additions, over optimal statistical properties, often leading to generators with long periods but detectable correlations. One prominent historical example is RANDU, introduced by IBM in the 1960s as part of their Scientific Subroutine Package for systems like the IBM 360. It uses parameters m = 2^{31}, a = 65539, and c = 0, where the multiplier a = 2^{16} + 3 was selected for its hardware-friendly form, enabling fast computation via shifts on 32-bit architectures. Despite achieving a period of $2^{29}, RANDU exhibited poor quality, with successive triples of outputs lying on at most 15 hyperplanes in the unit cube, causing significant correlations unsuitable for multidimensional simulations. In response to such limitations, a improved Lehmer-style generator was proposed for the IBM System/360 in 1969, with parameters m = 2^{31} - 1 (a Mersenne prime), a = 16807, and c = 0. This selection leveraged the prime modulus for a maximal period of m - 1, while the multiplier was chosen to satisfy number-theoretic conditions for full-period behavior and to align with 32-bit multiplication efficiency on the System/360's architecture. The parameters ensured portability across similar hardware and passed early statistical tests, making it a step forward from RANDU despite the era's computational trade-offs. These parameters were later formalized as the "minimal standard" LCG by Park and Miller in 1988, retaining m = 2^{31} - 1, a = 16807, and c = 0 for their proven period of $2^{31} - 2 and robustness under spectral and other tests. The endorsement highlighted their suitability for 32-bit systems, building on the 1969 proposal while addressing implementation pitfalls like overflow handling. Historical selections like these underscore the evolution from hardware-driven compromises to theoretically grounded designs.

Contemporary Choices

In modern software libraries, linear congruential generators (LCGs) continue to be employed for non-cryptographic applications such as simulations and general-purpose randomization, with parameter selections optimized for larger state spaces and better statistical properties compared to earlier implementations. One prominent example is the LCG used in Java's java.util.Random class, which employs a modulus m = 2^{48}, multiplier a = 25214903917, and increment c = 11, achieving a full period of $2^{48} due to the power-of-2 modulus and appropriate parameter choices that satisfy the Hull-Dobell conditions. The "minimal standard" LCG, widely adopted in scientific computing and referenced in Numerical Recipes, uses a prime modulus m = 2^{31} - 1 = 2147483647, multiplier a = 16807, and increment c = 0 (a multiplicative congruential generator variant), providing a period of m - 1 and passing key spectral tests for randomness in 32-bit applications. This set balances computational efficiency with acceptable uniformity for Monte Carlo methods and is implemented in various numerical libraries. For 64-bit systems, contemporary LCG designs often select moduli like $2^{64} or large primes near $2^{64} to leverage native hardware word sizes, but implementations must address potential arithmetic overflow in the multiplication step a \cdot X_n, which can exceed 128 bits; this is typically mitigated by using extended-precision arithmetic (e.g., 128-bit integers) or unsigned 64-bit operations that naturally wrap around modulo $2^{64}. Standards bodies like NIST advise against using LCGs for cryptographic purposes due to their predictability and failure in advanced statistical tests (e.g., linear complexity and runs tests), recommending them instead only for low-stakes simulations where speed is prioritized over security, with thorough validation using suites like SP 800-22.

Properties

Advantages

Linear congruential generators (LCGs) are renowned for their computational efficiency, as each pseudorandom number is produced using a simple recurrence relation involving only multiplication, addition, and modulo operations, which are optimized in most computer architectures. This minimal set of arithmetic operations enables extremely fast generation rates, often outperforming more complex pseudorandom number generators in resource-constrained environments. A key advantage of LCGs is their low memory footprint, requiring storage for just a single integer to represent the generator's state, in contrast to methods that need arrays or multiple values. This compactness makes LCGs particularly suitable for embedded systems or applications with limited RAM. LCGs exhibit strong portability across different hardware and software platforms due to their reliance on basic integer arithmetic, which is standardized in programming languages and avoids dependencies on specialized hardware instructions. This simplicity facilitates easy implementation and consistent behavior in diverse computing environments, from desktops to parallel systems. Finally, the deterministic nature of LCGs ensures predictable and reproducible sequences when initialized with a fixed seed, allowing exact replication of random number streams for debugging, validation, and comparative simulations without variability from true randomness.

Disadvantages

Linear congruential generators (LCGs) exhibit poor statistical properties, particularly in their tendency to produce sequences that form visible lattice structures when plotted in two or three dimensions. This arises because successive points generated by an LCG lie on a finite number of parallel hyperplanes in n-dimensional space, leading to non-uniform distributions that deviate from true randomness. A notorious example is the RANDU generator, which uses parameters a = 65539, c = 0, and m = 2^{31}, resulting in three-dimensional points confined to just 15 planes, severely limiting its utility for multidimensional simulations. Despite achieving a maximal theoretical period under certain conditions, LCGs suffer from a short effective period in low dimensions, where the lattice structure restricts the sequence's coverage and introduces correlations that undermine randomness in practical applications like Monte Carlo methods. This dimensional weakness means that even full-period LCGs fail to provide adequate randomness for tasks requiring high-dimensional uniformity, as the points cluster in low-dimensional subspaces rather than filling the space evenly. LCGs are unsuitable for cryptographic purposes due to their predictability; given a few consecutive outputs, an attacker can infer the internal state and parameters, allowing reconstruction of the entire sequence. This vulnerability stems from the linear recurrence relation, which enables efficient lattice reduction techniques to solve for the multiplier and modulus from partial observations, rendering secret LCGs cryptographically insecure even with hidden parameters. The quality of LCG output is highly sensitive to parameter choices, with poor selections leading to visible patterns such as short cycles or correlated bits that are easily detectable in scatter plots or statistical tests. For instance, inadequate multipliers can amplify low-dimensional lattice effects, producing sequences with obvious regularities that compromise simulation integrity.

Statistical Properties

The statistical properties of linear congruential generators (LCGs) are evaluated using empirical and theoretical tests to assess their uniformity, independence, and overall randomness quality. These tests reveal that while LCGs can perform adequately in one dimension, they often exhibit structural deficiencies in higher dimensions due to their inherent lattice structure. Key assessments include the spectral test for lattice irregularities, chi-square tests for uniformity, and serial correlation measures for dependence between successive values. The spectral test, a theoretical measure of an LCG's multidimensional uniformity, quantifies the worst-case lattice structure by examining the spacing between parallel hyperplanes on which the generated points lie. For an LCG with modulus m, the test computes the maximum distance \delta_d between adjacent hyperplanes in d dimensions; a good generator maximizes this minimum spacing to approximate uniform distribution. The performance is bounded below by \delta_d^* = 1 / (\gamma_d m^{1/d}), where \gamma_d is a dimension-dependent constant (approximately $3.9^d / d! for large d), and the normalized criterion g_d = \delta_d m^{1/d} should approach \gamma_d for optimal equidistribution. LCGs typically achieve only a fraction of this bound, with the number of hyperplanes limited to at most m^{1/d}, leading to detectable patterns via Fourier analysis. Empirical tests like the chi-square test evaluate one-dimensional uniformity by comparing observed frequencies in bins to expected uniform probabilities. The test statistic is \chi^2 = \sum_{j=1}^k \frac{(N_j - n p_j)^2}{n p_j}, where N_j is the count in bin j, n is the sample size, p_j = 1/k for equal bins, and it follows a \chi^2 distribution with k-1 degrees of freedom under the null hypothesis of uniformity; rejection occurs if \chi^2 > \chi^2_{k-1,1-\alpha}. Serial correlation tests assess independence by computing the coefficient r_k = \frac{\sum_{i=1}^{n-k} (U_i - \bar{U})(U_{i+k} - \bar{U})}{\sum_{i=1}^n (U_i - \bar{U})^2} for lag k, which should be near zero for random sequences; significant deviations indicate dependence. LCGs often pass basic chi-square tests in one dimension but show elevated serial correlations, especially for poor parameters. A notorious example is the RANDU generator, defined by X_{n+1} = 65539 X_n \mod 2^{31}, which passes one- and two-dimensional uniformity tests but fails miserably in higher dimensions. In three dimensions, its points lie on only 15 parallel planes with separation approximately 0.092, causing clumping and poor coverage detectable by the spectral test and multidimensional chi-square variants. This structural flaw renders RANDU unsuitable for simulations requiring high-dimensional uniformity, despite its full period of $2^{31}. Regarding equidistribution, an LCG with full period is equidistributed in one dimension, meaning the normalized sequence \{X_n / m\} uniformly fills [0,1) over a complete cycle. However, in d > 1 dimensions, full-period LCGs achieve (d, w)-equidistribution only up to limited word lengths w, as points form a distorted lattice rather than an independent uniform grid, failing to ensure statistical independence between coordinates. This property holds theoretically for the sequence but breaks down empirically in higher dimensions due to correlations.

Implementations

Algorithm Description

A linear congruential generator (LCG) produces a sequence of pseudo-random integers through an iterative linear recurrence, defined by the relation X_{n+1} = (a X_n + c) \mod m, where m > 0 is the modulus, $0 < a < m is the multiplier, $0 \leq c < m is the increment, and the initial value X_0 (with $0 \leq X_0 < m) serves as the seed. To obtain uniform random numbers in the interval [0, 1), each generated X_n is normalized by dividing by m, yielding U_n = X_n / m. This method, originally proposed in multiplicative form (with c = 0) for early computing machinery, forms the basis for efficient pseudorandom number generation in fixed-point arithmetic. The implementation proceeds in two main phases: initialization and iterative generation. First, the parameters a, c, and m are chosen to optimize properties like period length, followed by setting the seed X_0. Seeding strategies include using a fixed value for reproducible sequences in testing or a time-based value (such as the current system timestamp) to introduce variability across runs, ensuring the initial state is within the valid range to avoid degenerate sequences. The generation loop then computes successive states until a predetermined count n is reached or the full period is exhausted, though practical applications typically halt after generating a fixed number of values to suit computational needs. In pseudocode, the algorithm is expressed as follows:
Initialize X ← X_0  // Seed value, 0 ≤ X_0 < m
For i = 1 to n:     // Generate n numbers
    X ← (a * X + c) mod m
    Output X / m     // Normalized uniform random number in [0, 1)
To handle potential overflow during the multiplication a \times X when using fixed-width integers (e.g., 32-bit for m = 2^{32}), computations are performed in a wider type such as 64-bit integers before applying the modulo operation, ensuring the intermediate product fits without loss. This approach maintains numerical stability and leverages hardware support for modular arithmetic in power-of-two moduli.

Python Example

The following Python implementation provides a basic linear congruential generator using the original minimal standard parameters by Park and Miller: multiplier a = 16807, increment c = 0, and modulus m = 2^{31} - 1 = 2147483647. These parameters yield a period of m-1 for seeds not equal to 0, making it suitable for illustrative purposes.
python
def lcg(seed, a=16807, c=0, m=2147483647, n=10):
    """
    Generate n pseudo-random numbers in [0, 1) using a linear congruential generator.
    
    Parameters:
    - seed: Initial seed value (integer).
    - a: Multiplier.
    - c: Increment.
    - m: Modulus.
    - n: Number of values to generate.
    
    Returns:
    - List of floats in [0, 1).
    """
    x = seed % m  # Normalize seed to [0, m)
    sequence = []
    for _ in range(n):
        x = (a * x + c) % m
        u = x / m  # Scale to [0, 1)
        sequence.append(u)
    return sequence

# Example usage with a fixed seed for reproducibility
seed = 12345
numbers = lcg(seed, n=10)
print("First 10 generated numbers in [0, 1):")
for i, num in enumerate(numbers, 1):
    print(f"{i}: {num:.10f}")
To use a random seed instead, import the random module and set seed = int(random.random() * m) before calling the function; this provides variability across runs while staying within the valid range. Avoid seed=0 to achieve the full period when c=0. Python's arbitrary-precision integers ensure that the intermediate product a \times x does not overflow before applying the modulo operation, simplifying implementation compared to fixed-width integer languages.

C Example

A practical C implementation of a linear congruential generator (LCG) can utilize the parameters recommended by Park and Miller for their minimal standard generator, where the modulus m = 2^{31} - 1 = 2147483647, the multiplier a = 16807, and the increment c = 0 (making it a multiplicative congruential generator). These choices ensure a period of m-1 = 2{,}147{,}483{,}646 when the seed is not 0, providing a sequence of up to 2,147,483,646 distinct values before repetition. To prevent arithmetic overflow during computation—since the product a \times seed can exceed 32-bit integer limits (reaching approximately $2^{44})—the implementation employs 64-bit unsigned integers (unsigned long long) for intermediate calculations, which are standard in modern C compilers and portable across 32-bit and 64-bit systems. The modulo operation is then applied directly, leveraging the hardware's 64-bit arithmetic to maintain precision without needing specialized methods like Schrage's algorithm for this modulus size. The following standalone C program demonstrates the LCG, initializing with a seed of 1 and printing the first 10 values in the sequence as floating-point numbers between 0 and 1 (exclusive), scaled by dividing by m. It includes necessary headers for input/output and uses printf for output.
c
#include <stdio.h>

unsigned long long lcg(unsigned long long seed, unsigned long long a, unsigned long long c, unsigned long long m) {
    return (a * seed + c) % m;
}

int main() {
    unsigned long long seed = 1ULL;
    unsigned long long a = 16807ULL;
    unsigned long long c = 0ULL;
    unsigned long long m = 2147483647ULL;
    
    printf("First 10 LCG values (scaled to [0,1)):\n");
    for (int i = 0; i < 10; ++i) {
        seed = lcg(seed, a, c, m);
        double value = (double)seed / m;
        printf("%.6f\n", value);
    }
    return 0;
}
This code is compatible with both 32-bit and 64-bit architectures, as unsigned long long is guaranteed to be at least 64 bits wide per the C99 standard, ensuring the multiplication does not overflow before the modulo reduction. On 32-bit systems, it relies on compiler support for 64-bit operations, which is ubiquitous in contemporary environments like GCC and Clang. Compiling with gcc example.c -o lcg yields an executable that produces the sequence starting with approximately 0.000008, 0.000132, and so on.

Derivatives

Multiply-with-Carry Generators

Multiply-with-carry (MWC) generators extend the linear congruential generator framework by incorporating a carry term derived from the multiplication overflow, enabling multi-word states that enhance randomness properties. This mechanism treats the state as a multi-digit number in base m, where each iteration updates the least significant digit and propagates a carry to the next, effectively simulating a linear recurrence across the entire state vector. The core formulation for a basic single-word MWC is given by
Let \text{temp} = a X_n + c + \text{carry}_n; then
X_{n+1} = \text{temp} \mod m, \text{carry}_{n+1} = \left\lfloor \frac{\text{temp}}{m} \right\rfloor, where a is the multiplier, c is the increment (often 0), m is the modulus, and the initial carry is typically small (e.g., 1). For multi-word versions of order k, the state consists of k words (X_n, X_{n-1}, \dots, X_{n-k+1}), and the recurrence applies similarly, with the carry influencing subsequent words to achieve full-period behavior. Compared to standard LCGs, MWC generators offer significantly longer periods, potentially reaching m^k for a k-word state when parameters are chosen such that the associated connection polynomial yields full periodicity, far exceeding the m limit of basic LCGs. They also exhibit improved equidistribution, satisfying optimal d-dimensional uniformity for dimensions d < k under suitable parameter selection, leading to better spectral properties and reduced lattice structure artifacts in low dimensions. A representative multi-word example is the 4-lag MWC with m = 2^{32}, coefficients 5115, 1776, 1492, 2111111111, and initial carry up to 2111119494, which provides a long period suitable for 32-bit arithmetic. For multi-word implementations, parameters are selected to ensure the multiplier a makes the effective modulus prime or yields maximal order, enhancing overall cycle length. MWC generators are particularly valuable in combined pseudorandom number systems, where they are paired with other methods (e.g., shift-register or additional LCG components) to generate high-dimensional sequences with superior statistical independence and longer effective periods for applications like Monte Carlo simulations.

Other LCG Variants

Shuffled linear congruential generators (LCGs) modify the basic LCG structure by applying a permutation to the generated outputs, typically using a buffer and an auxiliary generator to select indices randomly. This shuffling disrupts lattice structures and correlations inherent in standard LCG sequences, improving statistical properties such as uniformity and independence without altering the core recurrence. For instance, one common approach fills a fixed-length vector (e.g., 128 elements) with values from the primary LCG, then uses a secondary LCG to choose a random index for output, replacing the selected slot with the next primary value. Lagged Fibonacci generators based on LCGs initialize their state using an LCG to produce the initial lag values, then apply a lagged recurrence for subsequent numbers, achieving longer periods than the seeding LCG alone. This hybrid approach leverages the simplicity of LCG seeding to set up the lag buffer (e.g., lags of 24 and 55), after which the generator computes values via addition or multiplication modulo m, enhancing period length to near 2^k for k-bit states. Such variants appear in implementations like Go's math/rand package, where an LCG transforms the seed into LFG state. Additive congruential generators extend LCGs by varying the increment c across multiple recursive steps, often as a special case of multiple recursive generators where additions replace multiplications. The recurrence typically takes the form X_n = (X_{n-p} + X_{n-q} + c) mod m, with fixed lags p and q, allowing for full-period properties under certain conditions on m and the lags. This variant avoids multiplication for faster computation on parallel systems and exhibits good equidistribution in serial and vectorized applications. A practical example of a shuffled LCG is the rand() function in Microsoft Visual C++, which uses parameters m = 2^{31}, a = 214013, and c = 2531011, followed by a right-shift by 16 bits and masking to 15 bits for output. This bit shuffling mitigates low-bit dependencies, producing values in [0, 32767] with improved randomness for general-purpose use.

History

Early Development

The development of linear congruential generators (LCGs) traces its roots to the mid-1940s, when the need for reliable pseudo-random numbers emerged in computational simulations. In 1946, mathematicians John von Neumann and Stanislaw Ulam, working at Los Alamos National Laboratory, pioneered the Monte Carlo method for solving complex problems in physics, particularly neutron diffusion in nuclear weapons research, which required sequences of random numbers to approximate probabilistic outcomes. Their work highlighted the limitations of manual random number generation and spurred the search for algorithmic alternatives, laying the groundwork for systematic pseudo-random number production in early electronic computers. A pivotal advancement came in 1949 from mathematician Derrick H. Lehmer, who introduced the multiplicative form of the LCG—characterized by an increment of zero (c=0) and a prime modulus m—during a symposium on large-scale computing. Lehmer designed this generator for practical implementation on the ENIAC, the first general-purpose electronic computer, to support number-theoretic computations and simulations, producing sequences with periods up to nearly the modulus size for suitable parameters. His approach addressed the shortcomings of earlier methods like von Neumann's middle-square technique, offering a more efficient recursive formula that could generate thousands of digits per second on ENIAC. By the 1950s, LCGs gained widespread adoption in major computing systems, including UNIVAC and IBM machines, where they powered Monte Carlo simulations in scientific and engineering applications. For instance, parameters such as moduli of 2^{43}, 2^{39}, or 10^{11} were employed to optimize performance on binary and decimal architectures, achieving periods close to m/4 for binary cases. This integration reflected the growing demand for pseudo-random numbers in statistical modeling, particularly in physics for particle simulations and in operations research for optimization problems like inventory control and queuing theory.

Key Milestones

In 1962, T. E. Hull and A. R. Dobell published a seminal analysis of linear congruential generators, establishing the full-period theorem that specifies the necessary and sufficient conditions for an LCG to achieve its maximum period equal to the modulus m. Their work provided a foundational theoretical framework for selecting parameters that ensure the longest possible cycle length, influencing subsequent designs of pseudorandom number generators. Building on this foundation, George Marsaglia's 1968 paper critiqued the structural weaknesses of LCGs, demonstrating that sequences from these generators tend to cluster on a limited number of parallel hyperplanes in multi-dimensional spaces, leading to poor uniformity beyond one dimension. He introduced the spectral test as a quantitative measure to evaluate the quality of LCG multipliers by assessing the maximum spacing between these hyperplanes, highlighting the need for improved parameter selection to mitigate lattice-based artifacts. In 1988, Stephen K. Park and Keith W. Miller proposed the "minimal standard" LCG, defined by the parameters multiplier a = 16807, increment c = 0, and modulus m = 2^{31} - 1 = 2147483647, as a portable and efficient benchmark for research and implementation. This Lehmer-style generator was selected for its proven full period, good spectral test performance, and simplicity, serving as a reference point that addressed earlier criticisms while remaining computationally lightweight. By the 1990s, the limitations of standalone LCGs—such as inadequate multi-dimensional uniformity and insufficient period lengths for emerging high-performance computing applications—prompted a shift toward combined generators that integrate multiple LCG components to enhance statistical properties and scalability in parallel simulations. This transition was driven by the demands of Monte Carlo methods in scientific computing, where LCGs alone failed to meet the rigor required for large-scale, multi-threaded environments.

Comparisons

With Linear PRNGs

Linear congruential generators (LCGs) offer simplicity and ease of implementation compared to other linear pseudorandom number generators, such as lagged Fibonacci generators, but at the cost of shorter periods and smaller state efficiency. LCGs require only a single state variable and perform basic modular multiplication and addition, making them straightforward to code and fast on most hardware. In contrast, lagged Fibonacci generators use a recurrence involving lagged terms (e.g., x_n = (x_{n-j} + x_{n-k}) \mod m), necessitating a larger state array of size equal to the maximum lag k, which increases memory usage and initialization complexity but enables much longer sequences suitable for high-volume simulations. Additive congruential generators represent a special case of LCGs where the multiplier is set to 1, reducing the recurrence to x_{n+1} = (x_n + c) \mod m and relying solely on modular addition without multiplication. This simplification leads to poorer bit mixing, as additions preserve low-order bit patterns more predictably than the scrambling effect of a non-trivial multiplier in standard LCGs, resulting in slower diffusion of randomness across the output bits. While both share arithmetic modularity, the multiplier in LCGs accelerates mixing and improves low-dimensional uniformity, though careful parameter selection is essential for both to avoid visible correlations. All these linear generators, including LCGs, lagged Fibonacci, and additive variants, suffer from inherent linearity that exposes them to algebraic attacks, such as solving systems of linear equations to predict future states from observed outputs or detecting lattice structures in multi-dimensional projections. LCGs remain preferable for resource-constrained environments due to their minimal state and computational overhead, despite these vulnerabilities. For instance, the maximum period of an LCG is bounded by the modulus m (e.g., $2^{32} for common 32-bit implementations), whereas lagged Fibonacci generators can achieve periods up to $2^k - 1 with k lags, such as $2^{1279} - 1 for large-scale applications.

With Non-Linear PRNGs

Linear congruential generators (LCGs) exhibit inherent linearity in their recurrence relation, which facilitates predictability through algebraic inversion; given a sufficient number of consecutive outputs, future states can be computed exactly by solving the modular equation X_{n+1} = (a X_n + c) \mod m. In contrast, non-linear pseudorandom number generators (PRNGs), such as the Mersenne Twister (MT) and xorshift, employ non-linear transformations like bit-wise XOR and shifts or matrix operations over finite fields, rendering state prediction computationally intensive and resistant to simple equation-based attacks. This structural non-linearity mitigates artifacts like lattice structures in LCGs, where sequences align along low-dimensional hyperplanes, leading to detectable patterns in multidimensional projections. Compared to the Mersenne Twister, LCGs offer superior speed due to their minimal arithmetic operations but deliver inferior statistical quality, with periods typically limited to m (often $2^{32} or $2^{64}) versus MT's enormous period of $2^{19937} - 1. MT's design ensures 623-dimensional equidistribution, making it preferable for applications demanding high uniformity, such as Monte Carlo simulations in organic and biological systems, where LCGs can introduce significant biases—like up to 87% volume expansion in solvated peptide models—while MT yields results aligning closely with experimental data. For graphics and large-scale simulations, MT's enhanced randomness reduces visible artifacts in rendering or sampling, though at the cost of larger state size (19937 bits) and slightly slower generation. xorshift generators match LCGs in computational efficiency, leveraging simple bit operations for high throughput, but surpass them in quality by avoiding LCG's lattice defects through non-linear scrambling, all while using compact state sizes (e.g., 64 bits for a $2^{64} - 1 period). This enables xorshift to produce sequences with better spectral properties and lower predictability in parallel computing environments, without the modular multiplications that can bottleneck LCGs on certain hardware. LCGs suit lightweight scenarios requiring rapid seeding and low memory, such as embedded systems or initial prototyping, whereas non-linear PRNGs like MT and xorshift are favored for production use in simulations, graphics, and statistical modeling where sequence quality directly impacts accuracy and reliability.