Fact-checked by Grok 2 weeks ago

Middle-square method

The Middle-square method is a pioneering algorithm for generating pseudorandom numbers, developed by mathematician around 1946 as one of the first computational techniques for producing sequences of numbers that mimic for use in simulations and statistical computations. It functions by selecting an initial n-digit seed number (typically with n even), squaring it to produce a number with up to 2n digits, and extracting the central n digits—padding with leading zeros if necessary—to serve as the next seed, thereby iterating to generate a sequence. This iterative process aims to "scramble" the digits through the nonlinear squaring operation, creating apparent randomness without external input. Von Neumann introduced the method at a 1949 conference on techniques and elaborated on it in a 1951 publication, where he described it as a practical tool for high-speed computing amid the growing need for random digits in simulations and other probabilistic modeling. Early implementations, such as those on the computer, employed it for tasks like modeling neutron diffusion, marking it as a foundational step in computational before more advanced linear congruential generators emerged. However, himself cautioned that the sequences required rigorous statistical testing due to their potential for obscure error propagation, such as round-off amplification that could degrade precision after roughly 33 iterations in . Despite its simplicity and historical significance, the middle-square method is now considered highly flawed for modern applications, primarily because it often produces short cycles, strong correlations between successive numbers, and degeneracies—such as sequences collapsing to zero or repeating patterns like 0000 or 6100 in four-digit examples—that undermine its randomness. For instance, with 20-bit numbers, the longest cycle observed has a period of only 142, far short of the theoretical maximum of $2^{20}, making it prone to failure with poor seeds and unsuitable for cryptographic or high-precision needs without modifications. Later variants, such as those incorporating Weyl sequences, have attempted to address these issues, but the original remains a cautionary example in the evolution of pseudorandom number generation.

History

Invention and early adoption

Developed around 1946 for early computing applications such as those on the , the middle-square method was introduced by in 1949 at a on methods held June 29, 30, and July 1 in , , sponsored by the , the National Bureau of Standards, and the . In his contribution to the symposium proceedings, published in 1951, von Neumann presented the method as a simple arithmetic procedure for generating pseudorandom digits suitable for simulations of complex physical processes, such as . He emphasized its computational efficiency on early computers, noting that while the approach was imperfect, its speed made it practical for large-scale numerical experiments, famously remarking: "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin." Nicholas Metropolis, a key collaborator in the development of Monte Carlo techniques at Los Alamos, further documented the method's performance in the same 1951 proceedings, reporting that implementations using 38-bit numbers could generate sequences of up to 750,000 digits before the cycle broke down into repetition or zeros. This capability highlighted the method's viability for producing sufficiently long streams of pseudorandom numbers in resource-constrained early computing environments. The method saw initial adoption throughout the 1950s in computational projects requiring pseudorandom sequences, particularly as electronic computers like the MANIAC at Los Alamos replaced slower mechanical punch-card systems for random number generation, which were limited by manual tabulation and physical constraints. Von Neumann underscored its value for efficiency in such simulations, stating that the procedure allowed for rapid iteration without the need for external randomness sources, enabling extensive testing of statistical models in fields like nuclear physics.

Historical claims and context

One anecdotal claim attributes the invention of the middle-square method to Brother Edvin, a Franciscan friar at the of Tautra in , around 1240–1250. According to Ivar Ekeland, Edvin described the technique in a now-lost for generating random sequences to resolve disputes or make decisions, such as by squaring a four-digit number and extracting the middle four digits to produce the next value. Ekeland suggests the manuscript was later copied into archives, drawing on a reference in Jorge Luis Borges's story "" for its historical transmission. However, this attribution lacks primary sources and is considered likely apocryphal, as no contemporary of Edvin's work mention such a method, and the claim originates solely from Ekeland's popular account without corroboration in medieval mathematical texts. Scholarly histories of trace algorithmic pseudorandom techniques to the mid-20th century, with no evidence of pre-modern deterministic methods like middle-square. The middle-square method fits into the broader evolution of randomness in mathematics, which began with physical devices like ancient and astragali used by Mesopotamians and around 3000 BCE for and . By the , probability theory formalized chance with contributions from Pascal and Fermat, but deterministic generation of pseudo-random sequences emerged only in the 19th and early 20th centuries amid advances in , transitioning from empirical tools to computational aids. Its conceptual roots lie in squaring operations for digit manipulation, echoing early interests in congruences and developed without mechanical computation, as exemplified by Carl Friedrich Gauss's foundational work on quadratic residues and modular forms in 1801. Although later popularized by in 1949 for simulations, the method's simplicity highlights a bridge between classical arithmetic and modern .

Algorithm

Core procedure

The middle-square method is a deterministic for generating a sequence of pseudorandom using iterative squaring and digit extraction. It begins with an initial value X_0, which must consist of an even number of digits n to facilitate balanced extraction of the central digits from the square, typically satisfying $0 \leq X_0 < 10^n. The is treated as an n-digit , with leading zeros added if necessary to reach exactly n digits, ensuring consistent handling during . The core iterative process generates subsequent values as follows: Compute the square S = X_i^2 of the current n-digit integer X_i, which produces a non-negative integer with at most $2n digits. If the square has fewer than $2n digits, it is implicitly padded with leading zeros to reach $2n digits for extraction purposes. The next value X_{i+1} is then obtained by selecting the middle n digits of this $2n-digit representation. This extraction ensures the method remains confined to the n-digit . Mathematically, this is expressed by the formula X_{i+1} = \left\lfloor \frac{S \mod 10^{2n}}{10^n} \right\rfloor, where S = X_i^2, the modulo operation discards any digits beyond $2n, and the floor division shifts to isolate the central n digits. The even requirement for n arises because the square yields up to $2n digits, allowing symmetric extraction of exactly the middle n without ambiguity in positioning for odd lengths. To produce pseudorandom numbers in the unit interval [0, 1), each extracted integer X_i is normalized by dividing by $10^n: U_i = \frac{X_i}{10^n}. This yields a value between 0 (inclusive) and 1 (exclusive), suitable for applications requiring approximations.

Digit extraction mechanics

The middle-square method begins by squaring an n-digit seed number X_i, where $0 \leq X_i < 10^n, yielding a result X_i^2 that has at most $2n , as the maximum value satisfies (10^n - 1)^2 = 10^{2n} - 2 \cdot 10^n + 1 < 10^{2n}. If the squared value has fewer than $2n , it is implicitly padded with leading zeros to form a $2n- representation, ensuring consistent extraction; for instance, squaring a small number like 1 for n=2 produces 1, padded as 0001. The next seed X_{i+1} is then obtained by selecting precisely the middle n from this padded $2n- square. This extraction can be expressed mathematically without string manipulation, using modular arithmetic as X_{i+1} = \left\lfloor \frac{X_i^2}{10^n} \right\rfloor \mod 10^n, which isolates the central n digits by dividing by $10^n (effectively shifting right by n digits) and then taking the remainder modulo $10^n to discard any higher digits. Equivalently, one may first compute X_i^2 \mod 10^{2n} to obtain the lowest $2n digits (though unnecessary here since X_i^2 < 10^{2n}), followed by the floor division and modulo operations. For illustration with n=4, the squared value is treated as an 8-digit string with positions indexed 0 through 7 from the left (after padding); the middle 4 digits occupy positions 2 through 5, corresponding to the third through sixth digits. In computational implementations, particularly on early machines with fixed-width arithmetic, challenges arise from ensuring the squaring operation does not overflow the available integer precision, as X_i^2 requires storage for up to $2n digits while X_i fits in n digits. Von Neumann's original formulation targeted 10-digit numbers on ENIAC-era hardware, necessitating careful handling via multi-word arithmetic or modulo reductions to prevent wrap-around errors that could corrupt the digit selection; modern big-integer libraries mitigate this, but the core issue persists in resource-constrained environments.

Examples

Numerical sequence illustration

To illustrate the middle-square method, consider its application to small numbers of digits, where the sequences can be computed manually to reveal the generation process and typical short-term behavior. For an n-digit , the square is padded with leading zeros to exactly 2n digits if necessary, and the middle n digits—specifically, those in positions n/2 + 1 through 3n/2 (1-based indexing from the left)—are extracted as the next . A step-by-step example using a 4-digit of demonstrates the :
  • Square : $1234^2 = 1{,}522{,}756, padded to 01522756.
  • Extract middle 4 digits (positions 3–6): 5227.
  • Square 5227: $5227^2 = 27{,}321{,}529, middle 4 digits (positions 3–6): 3215.
  • Square 3215: $3215^2 = 10{,}336{,}225, middle 4 digits: 3362.
  • Square 3362: $3362^2 = 11{,}303{,}044, middle 4 digits: 3030.
This sequence continues without immediate repetition, but many 4-digit starting values enter short cycles or fixed points quickly. Certain seeds lead to the "zero trap," a fixed point where the sequence remains at 0000 thereafter. For instance:
  • Starting with 0000: $0000^2 = 0, padded to 00000000, middle 4 digits: 0000 (fixed).
  • Starting with 1000: $1000^2 = 1{,}000{,}000, padded to 01000000, middle 4 digits: 0000, then fixed at 0000.
Other 4-digit seeds produce fixed points, such as 2500 and 3792:
  • $2500^2 = 6{,}250{,}000, padded to 06250000, middle 4 digits: 2500 (fixed).
  • $3792^2 = 14{,}379{,}264, middle 4 digits: 3792 (fixed).
For even smaller cases, a 2-digit example starting from 24 reveals a short of 2:
SeedSquare (padded to 4 digits)Middle 2 digits (positions 2–3)Next seed
2405765757
5732492424
This cycles between 24 and 57 indefinitely. These examples highlight that while the maximum possible period of any sequence is at most $10^n due to the finite state space of n-digit numbers, actual periods are often much shorter, with many 4-digit sequences entering cycles of length 1 (fixed points) or small numbers, and the longest observed run before cycling being 111 steps.

Programming implementation

The middle-square method lends itself to straightforward implementation in modern programming languages, relying on arithmetic to square the and extract the middle digits without manipulation where possible. A function can automate the generation of the sequence, incorporating by tracking previously seen values in a set. This approach highlights the method's deterministic nature, where repeats indicate the onset of a . Here is a Python implementation for an n-digit middle-square generator:
python
def middle_square(seed, n):
    """
    Generate middle-square sequence until a cycle is detected.
    
    Args:
    seed (int): Initial n-digit seed.
    n (int): Number of digits.
    
    Returns:
    list: Sequence of values until cycle detection.
    """
    seen = set()
    sequence = []
    current = seed % (10 ** n)  # Ensure seed is within n digits
    power_low = 10 ** (n // 2)
    power_n = 10 ** n
    while current not in seen:
        seen.add(current)
        sequence.append(current)
        sq = current * current
        middle = (sq // power_low) % power_n
        current = middle
    return sequence
This code uses division and modulo operations to extract the middle n digits, equivalent to padding the square to 2n digits and slicing the central portion, as per the original procedure described by . The logic employs a set to store seen values; generation stops when a repeat occurs, allowing computation of the preperiod (transient states) and ( length) if needed—for instance, the is the distance from the first occurrence of the repeated value to the end of the sequence. For seed 1234 and n=4, the function produces the following sequence of 20 unique values before detecting the cycle: 1234, 5227, 3215, 3362, 3030, 1809, 2724, 4201, 422, 1780, 1684, 8362, 9220, 84, 70, 49, 24, 5, 0. The next value is again 0, confirming a of length 1 at the terminal state. When displaying results, values can be formatted to n digits with leading zeros (e.g., 0084 for 84) for clarity, though the set tracks numerical values. Adaptations for other languages follow similar logic; for example, in or , use Math.pow(10, n/2) cast to long for the , with and , ensuring no floating-point issues by precomputing powers of 10. For bit widths instead of digits (e.g., 32-bit integers), the extraction can analogously shift and mask bits: square the seed, then right-shift by (bit_width / 2) bits and mask the lower bit_width bits, mimicking the digit padding via bitwise operations. These -based variants avoid string conversions, improving for large n. Although the middle-square method is no longer used in production due to its limitations, its simple implementation makes it valuable for educational demonstrations of pseudorandom number generation and cycle analysis in computing courses.

Properties and limitations

Cycle behavior and period

The middle-square method operates on a finite state space of $10^n possible n-digit numbers (including leading zeros), defining a deterministic mapping from each state to the next via squaring and extraction of the middle n digits. Consequently, every sequence generated by the method is eventually , with the maximum possible bounded by the state space size of $10^n. However, the squaring operation tends to produce outputs clustered toward states with many zero digits, leading to low-entropy configurations and significantly shorter actual periods in practice. This clustering arises because the square of a number with trailing or leading zeros often preserves or amplifies such patterns, directing trajectories toward repetitive or degenerate states. A common failure mode is trapping into the all-zero state, where the sequence remains at 000...0 indefinitely, as the square of zero yields zero. For example, with n=2 digits, starting from 43 produces the sequence 43 → 84 → 05 → 02 → 00, entering the zero trap after four steps. Similarly, for n=4, the state 0100 maps to itself (since $100^2 = 10000, padded to eight digits as 00010000, with middle four digits 0100), forming a fixed point of length 1. Another frequent short for n=4 is 6100 → 2100 → 4100 → 8100 → 6100, with 4; G. E. Forsythe observed that 12 out of 16 possible four-digit starting values (considering only non-zero seeds) converge to this . These examples illustrate how even small n values yield rapid degeneracy, often within fewer than 10 steps. Empirical analyses reveal that periods are typically much shorter than the theoretical maximum, with lengths influenced by the initial and the of . Poor seed choices, such as those with many zeros, accelerate entry into short s or traps, while even (like 4 or 10) may exhibit more fixed points due to symmetric padding in the squaring process. For analogs, N. identified exactly distinct s for 20-bit states, the longest having 142—far below the $2^{20} state space. Extending to larger sizes, a 38-bit implementation generated approximately 750,000 numbers before entering a degenerate , demonstrating that while longer periods are possible with increased , they remain orders of magnitude smaller than $10^n or $2^b due to the method's inherent biases. Seed selection thus plays a critical role, as random or poorly chosen starts often lead to periods on the order of hundreds or thousands rather than millions.

Statistical flaws and analysis

The middle-square method generates pseudorandom numbers that suffer from pronounced non-uniform distributions, with outputs clustering around the central digits of squared values, resulting in biases toward specific digits such as an overrepresentation of even numbers. This arises because the squaring operation preserves certain structural properties of the input, preventing a balanced spread across the full range of possible values. For instance, empirical analyses show that the method favors numbers with middle digits that reflect residues' tendencies, leading to detectable in digit frequencies. Sequences produced by the method frequently exhibit long runs of zeros or repetitive patterns, which undermine their . Early empirical tests in the , including those conducted at on 10-digit seeds, revealed sequences entering zero cycles after as few as 17,000 steps, with streaks of identical values persisting thereafter. These characteristics cause the method to fail key statistical tests, such as the for uniformity and serial correlation tests, where correlations between consecutive outputs often exceed 0.5, indicating strong dependencies rather than . John von Neumann, who proposed the method, openly acknowledged its questionable randomness, describing arithmetical approaches to random digit generation as placing one "in a state of sin" due to their inherent deterministic limitations. Mathematical examination confirms this, as the transformation X \mapsto \left\lfloor X^2 / 10^n \right\rfloor \mod 10^n fails to be ergodic over the state space [0, 10^n - 1], resulting in poor mixing and confinement to small invariant subsets that limit exploration. Such flaws, including poor performance in spectral tests that reveal structures in the output points, render the method unsuitable for reliable simulations or other applications requiring high-quality randomness.

Variants and extensions

Weyl sequence integration

The Weyl sequence integration variant enhances the middle-square method by adding an equidistributed term after the squaring and extraction step, addressing the original algorithm's tendency toward short cycles and poor uniformity. This modification draws on Weyl's , which guarantees that sequences of the form {α i} (where {·} denotes the and α is ) are uniformly distributed 1, thereby improving the overall randomness properties when adapted to finite precision arithmetic. The procedure begins with the standard middle-square operation on the current state X_i, followed by addition of a Weyl term before modular reduction. In decimal form for n-digit numbers, it is given by X_{i+1} = \left( \text{middle-square}(X_i) + \left\lfloor \alpha \cdot i \right\rfloor \right) \mod 10^n, where \alpha is chosen as an irrational multiple (e.g., derived from \sqrt{2} or a suitable constant) to ensure equidistribution, and i is the iteration counter starting from 0 or 1. In modern binary implementations using 64-bit words, the update uses bit shifts for extraction: the state x is squared, the Weyl accumulator w is incremented by an odd constant s (e.g., s = 0x61C88647), added to the squared value, and the middle 32 or 64 bits are extracted via right- and left-shifts, yielding x = ((x \cdot x + w) \gg 32) \mod 2^{64}. This keeps the computation efficient while leveraging hardware integer operations. The benefits include a dramatically extended —at least $2^{64} in 64-bit versions, far surpassing the original method's typical short cycles of length less than $10^n—and reduced clustering, as the additive term prevents trapping in low-entropy states like zeros. These improvements stem from the Weyl sequence's full-period traversal the state space, combined with the nonlinear mixing of squaring, resulting in outputs that pass stringent statistical test suites such as TestU01's BigCrush and PractRand, making it viable for contemporary applications. This variant was proposed by Widynski in 2017 as an extension of John von Neumann's original middle-square method, introduced in 1951 for early computational simulations including techniques. The middle-product method modifies the core squaring operation of the middle-square method by replacing it with by a fixed constant c, where the next value X_{i+1} is derived from the middle digits of X_i \times c. This adaptation, proposed by George Forsythe in 1949, aims to improve sequence uniformity for carefully chosen c, as can produce longer runs compared to squaring alone, though it still fails some statistical tests for randomness. For example, using a constant like 9973 with 4-digit seeds can extend periods beyond those of pure squaring, but the method remains sensitive to the choice of c. Digit permutation variants emerged in the as refinements to disrupt short cycles in middle-square sequences, involving operations like or XORing extracted digits after the squaring step to enhance mixing and break predictable patterns. These approaches, explored amid early efforts to salvage the method's simplicity, were analyzed but found to offer only marginal improvements in period length and distribution, often requiring combination with other techniques for practical use. Birger Jansson's study highlighted such variants as inadequate for robust generation, emphasizing their limited impact on overall flaws. Bit-level adaptations tailor the middle-square method for binary computers by operating on the middle bits of the squared state rather than decimal digits, effectively translating the procedure to base-2 representations. This shift, studied by Nicholas Metropolis in 1954, revealed structural similarities to linear congruential generators, positioning bit-extraction variants as precursors to multiplicative methods while exposing cycles—for instance, only 13 distinct cycles in 20-bit implementations, with a maximum period of 142. Such binary forms facilitated early computational implementations but inherited the original method's proneness to degeneracy. Donald Knuth, in The Art of Computer Programming, Volume 2, discusses these modifications as historical stepping stones to more reliable random number generators, noting refinements like the adjusted product \lfloor X(X-1)/10^5 \rfloor \mod 10^{10} proposed by N. V. Smirnov to mitigate zero-trapping issues. Knuth underscores their role in illustrating early pitfalls, such as unintended periodicity, while crediting them for inspiring subsequent algorithms. Despite these adaptations, related modifications remain vulnerable to short periods unless integrated with external mechanisms like Weyl sequences for better equidistribution, limiting their standalone utility in modern applications.

References

  1. [1]
    [PDF] Various Techniques Used in Connection With Random Digits - MCNP
    By John von Neumann. SUJnJnary written by George E. Forsythe. In manual computing methods today random control call for these numbers as needed. The real.
  2. [2]
    Pseudo-Random Number - an overview | ScienceDirect Topics
    But for historical interest we must mention the first RNG, which was proposed by John von Neumann at a conference in 1949 (von Neumann, 1951). In his mid-square ...
  3. [3]
    The Art of Computer Programming: Random Numbers - InformIT
    Jun 23, 2014 · Von Neumann's original “middle-square method” has actually proved to be a comparatively poor source of random numbers. The danger is that ...
  4. [4]
    [PDF] The Beginning of the Monte Carlo Method - MCNP
    A much used algorithm for gener- ating such numbers is the so-called von. Neumann "middle-square digits." Here, an arbitrary n-digit integer is squared,.
  5. [5]
    [PDF] LOS ALAMOS SCIENTIFIC LABORATORY - MCNP
    E. D. Cashwell. C. J. Everett. 0. W. Rechard c. Report written by: E, D. Cashwell. C. J. Everett r. Contract W-7405-ENG. 36 with the U. S. Atomic Energy ...
  6. [6]
  7. [7]
    [PDF] HISTORY OF UNIFORM RANDOM NUMBER GENERATION - Hal-Inria
    In this article, we give a historical account on the design, implementation, and testing of uniform random number generators used for simulation. 1 INTRODUCTION.
  8. [8]
    [PDF] The Middle of the Square An amateur's outlook on computation and ...
    Aug 8, 2022 · Von Neumann came up with the middle-square algorithm circa 1947, while planning the first computer simulations of nuclear fission. This work was ...
  9. [9]
    The Middle Square Method (Generating Random Sequences VIII)
    Nov 21, 2017 · Von Neumann proposed the middle square method of generating pseudo-random numbers in 1949, in a paper published a bit later.Missing: 1951 | Show results with:1951
  10. [10]
    Bias in Pseudo-Random Numbers - jstor
    An early method, sometimes called the mid-square method, is subject to this danger. In the mid-square method we square an n-digit number and extract the ...
  11. [11]
  12. [12]
    [PDF] Middle-Square Weyl Sequence RNG - arXiv
    Mar 20, 2022 · Early in the field of computer science, John von Neumann proposed the middle-square method for generating random numbers [10]. A number is.<|control11|><|separator|>
  13. [13]
  14. [14]
    History of uniform random number generation - ACM Digital Library
    RNG with long and known period, explained that the middle-square method was unsatisfactory because it usually ends up in a very short cycle, and proposed ...
  15. [15]
    [PDF] Generalized Middle-Square Method
    Dec 28, 2022 · In this paper, we generalize John von Neumann's Middle-Square. Method (MSM) to canonical number systems (CNS). Additionally, we present some ...<|control11|><|separator|>