Probable prime
In number theory, a probable prime is a positive integer greater than 1 that satisfies a probabilistic primality test, indicating it is likely prime but not guaranteed to be so, as it may instead be a composite pseudoprime. These tests leverage mathematical properties, such as Fermat's Little Theorem, that all primes obey but which most composite numbers fail, allowing for efficient screening of candidates without exhaustive factorization.[1][2]
Probable primes are categorized by the specific test used, with key variants including Fermat probable primes, where a^{n-1} \equiv 1 \pmod{n} for a base a coprime to n; Euler probable primes, based on Euler's criterion involving the Jacobi symbol; and strong probable primes, which apply more rigorous conditions in the Miller-Rabin test by writing n-1 = 2^s t with t odd and verifying either a^t \equiv 1 \pmod{n} or a^{2^r t} \equiv -1 \pmod{n} for some 0 \leq r < s.[1][3] The strong variant, introduced in the 1970s by Gary Miller and Michael Rabin, significantly reduces the error rate, with the probability of a composite number passing a single iteration being less than 1/4, and multiple iterations yielding error probabilities as low as 4^{-k} for k trials.[4][3]
Probable primes play a vital role in computational number theory and cryptography, enabling the rapid identification of large primes essential for secure systems like RSA, where generating keys from products of two large primes relies on efficient, probabilistic testing due to the infeasibility of deterministic methods for numbers exceeding hundreds of digits.[4] Despite the existence of infinitely many composite probable primes for any fixed base, the low incidence of pseudoprimes—such as fewer than 0.0001% for base-2 tests below 25 billion—ensures high reliability in practice.[2]
Fundamentals
Definition
A probable prime, often abbreviated as PRP, is an integer that passes a specified probabilistic primality test, satisfying a condition met by all prime numbers but potentially also by certain composite numbers known as pseudoprimes. Formally, for a given set of bases a_1, a_2, \dots, a_k, a number n > 1 is a probable prime to those bases if it satisfies the test criteria for each a_i coprime to n, leading to a high likelihood that n is prime despite the small chance it is composite.[1]
Unlike provable primes (numbers rigorously proven to be prime through deterministic methods such as the AKS algorithm, which guarantees correctness without probabilistic error), probable primes carry an inherent uncertainty, as the test may erroneously classify a pseudoprime as prime. The AKS test, developed in 2002, provides a polynomial-time deterministic verification for any integer, contrasting with the greater efficiency of probabilistic approaches for very large numbers.[5]
The concept of probable primes originated in the 1970s amid the development of efficient probabilistic primality tests for handling large integers impractical for exhaustive factorization. Seminal work, such as the Solovay-Strassen test introduced in 1977, formalized these methods, enabling rapid identification of likely primes with controlled error rates. In notation, a probable prime to specific bases (e.g., a base-2 PRP) passes tests for those bases alone.[6]
Probabilistic Primality Testing
Probabilistic primality tests provide an efficient means to assess whether a large integer is prime, offering a practical alternative to deterministic methods for numbers too large for exhaustive factorization. Deterministic algorithms, such as the AKS primality test, guarantee correctness in polynomial time but incur higher computational costs, with running times on the order of O((\log n)^{6 + o(1)}), rendering them less suitable for generating the enormous primes required in modern cryptography. In contrast, probabilistic tests execute in expected polynomial time and yield a tunable level of confidence; while they often incorporate randomization and multiple iterations, deterministic variants using fixed sets of bases are commonly employed for practical sizes up to at least 2^{64}, ensuring zero error probability without randomization.[4][7]
At their core, these tests leverage number-theoretic properties that are satisfied by all primes but violated by most composites, enabling rapid discrimination. A foundational example is Fermat's Little Theorem, which asserts that if p is prime and \gcd(a, p) = 1, then a^{p-1} \equiv 1 \pmod{p}; probabilistic tests select random bases a coprime to the candidate n and verify analogous congruences modulo n. Failure of the condition immediately certifies n as composite, while success across multiple bases suggests primality, as such properties hold universally for primes due to the structure of the multiplicative group (\mathbb{Z}/n\mathbb{Z})^*. This approach exploits the rarity of composites mimicking prime behavior, ensuring high detection rates for non-primes.[8][4]
Central to the methodology are the concepts of witnesses and liars, which quantify the reliability of random base selection. A witness is a base a for which the test condition fails, conclusively revealing the compositeness of n; conversely, a liar is a base for which the condition holds despite n being composite, causing the test to erroneously pass. For the Euler and strong tests, the proportion of liars among bases coprime to a composite n is bounded: it is at most 1/2 for the Solovay-Strassen test and at most 1/4 for the Miller-Rabin test, ensuring that random sampling is likely to encounter a witness if n is composite.[4][9]
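To make the witness/liar distinction concrete, the following minimal Python sketch (the composite 91 = 7 × 13 and the helper names are our illustrative choices, not drawn from the cited sources) counts how many coprime bases pass the Fermat condition versus the strong condition for one small composite:

```python
from math import gcd

def passes_fermat(n, a):
    # Fermat condition: a^(n-1) ≡ 1 (mod n)
    return pow(a, n - 1, n) == 1

def passes_strong(n, a):
    # strong (Miller-Rabin) condition for odd n > 2: write n-1 = 2^s * d with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False

n = 91  # 7 * 13, composite, so every base that passes is a liar
coprime_bases = [a for a in range(2, n - 1) if gcd(a, n) == 1]
print(len([a for a in coprime_bases if passes_fermat(n, a)]),   # Fermat liars
      len([a for a in coprime_bases if passes_strong(n, a)]),   # strong liars
      len(coprime_bases))                                       # bases tried
```

Every strong liar is also a Fermat liar, so the second count can never exceed the first, consistent with the 1/4 bound for the Miller-Rabin test.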
The reliability of probabilistic primality testing is further enhanced by the exponential decay of error probability with additional iterations. For tests like Solovay-Strassen with error at most 1/2 per trial, or Miller-Rabin with at most 1/4, performing k independent tests with distinct random bases reduces the overall error accordingly, e.g., to at most (1/4)^k for Miller-Rabin. For instance, k = 40 yields a confidence exceeding 1 - 10^{-12} even under the looser bound, sufficient for cryptographic standards, while larger k (e.g., 100) achieves error probabilities below 10^{-30}, far surpassing practical needs. This scalability makes probabilistic tests indispensable for applications demanding frequent prime generation.[4][10]
Types
Fermat probable prime
A Fermat probable prime to base a is a positive integer n > 1 such that 1 < a < n, \gcd(a, n) = 1, and the congruence a^{n-1} \equiv 1 \pmod{n} holds.[11] This condition is tested computationally to assess the primality of n, where passing the test suggests n might be prime, though it does not guarantee it.[12]
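In code, the Fermat condition amounts to a single modular exponentiation; the minimal sketch below (function name ours) checks it with Python's built-in three-argument pow:

```python
from math import gcd

def is_fermat_prp(n, a):
    # n is a Fermat probable prime to base a if gcd(a, n) = 1 and a^(n-1) ≡ 1 (mod n)
    return gcd(a, n) == 1 and pow(a, n - 1, n) == 1

print(is_fermat_prp(97, 2))    # True: 97 is prime, so it passes for every coprime base
print(is_fermat_prp(341, 2))   # True: 341 = 11 * 31 is composite (a base-2 Fermat pseudoprime)
print(is_fermat_prp(341, 3))   # False: base 3 reveals that 341 is composite
```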
The concept derives directly from Fermat's Little Theorem, which asserts that if p is a prime number and \gcd(a, p) = 1, then a^{p-1} \equiv 1 \pmod{p}.[13] Thus, every prime n satisfies the Fermat test for any base a coprime to n, making it a necessary but not sufficient condition for primality.[14]
Despite its simplicity, the Fermat probable prime test has significant limitations due to the existence of composite numbers that pass it. Such composites are termed Fermat pseudoprimes to base a, and certain classes, like Carmichael numbers, satisfy the condition for all bases a coprime to n, resulting in a high error rate for identifying composites.[15][16]
Strong probable prime
A strong probable prime is a refinement of the Fermat probable prime concept, providing a stricter probabilistic test for primality that reduces the likelihood of certain composite numbers passing as primes. Specifically, an odd integer n > 2 is defined as a strong probable prime to base a (where 1 < a < n and \gcd(a, n) = 1) if, upon writing n-1 = 2^s d with d odd and s \geq 1, either a^d \equiv 1 \pmod{n} or there exists an integer r with 0 \leq r < s such that a^{2^r d} \equiv -1 \pmod{n}.[10] This condition ensures that a does not serve as a witness to the compositeness of n, meaning n passes the test and is considered probably prime.
The concept originates from Gary L. Miller's 1976 work, which improved upon Fermat's Little Theorem-based tests by incorporating checks for the order of a modulo n in the multiplicative group, particularly addressing vulnerabilities to Carmichael numbers and other pseudoprimes.[17] Miller's approach assumes the extended Riemann hypothesis for deterministic primality but introduces the "strong" condition to verify that a^{n-1} \equiv 1 \pmod{n} holds without nontrivial square roots of unity in intermediate subgroups, effectively probing the 2-Sylow subgroup structure. Michael O. Rabin later adapted this in 1980 into a fully probabilistic algorithm, removing the hypothesis dependency while retaining the core strong test for practical use.[10]
The full conditions break down into two primary cases for the test to pass (i.e., for n to be a strong probable prime to base a):
- Trivial case: a^d \equiv 1 \pmod{n}, where d = (n-1)/2^s is the odd part of n-1. This verifies that the order of a divides d, aligning directly with the prime case under Fermat's Little Theorem.
- Non-trivial case: If the trivial case fails, then for some 0 \leq r < s, a^{2^r d} \equiv -1 \pmod{n}. This catches the point at which squaring first yields 1 modulo n (since (-1)^2 = 1) while ruling out nontrivial square roots of 1 earlier in the chain a^d, a^{2d}, \dots, a^{2^{s-1}d}. These checks ensure no "liar" behavior from a composite n whose factors allow premature cycles.
This formulation costs no more than Fermat's test: a^d is computed once by modular exponentiation and then squared at most s-1 times, the same total work as computing a^{n-1} directly, while allowing the test to stop as soon as -1 appears.
Compared to the Fermat probable prime test (where n passes if a^{n-1} \equiv 1 \pmod{n}), the strong variant detects more composites by ruling out those with nontrivial square roots of 1 in the sequence of powers, particularly pseudoprimes that exploit the full order but fail subgroup consistency.[17] It thus offers greater reliability against specific classes of composites, such as those with small prime factors or structured factorizations, without requiring knowledge of n's factorization.[10]
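A small numerical comparison, using base-2 examples chosen here for illustration, makes the difference concrete: for 341 = 11 × 31 write n-1 = 340 = 2^2 × 85, and for 2047 = 23 × 89 write n-1 = 2046 = 2 × 1023.

```python
# 341 = 11 * 31: a Fermat probable prime to base 2, but not a strong one
print(pow(2, 340, 341))                    # 1: the Fermat condition 2^(n-1) ≡ 1 holds
print(pow(2, 85, 341), pow(2, 170, 341))   # 32, 1: neither 2^d ≡ ±1 nor 2^(2d) ≡ -1, so the strong test fails
# 2047 = 23 * 89: the smallest strong pseudoprime to base 2
print(pow(2, 1023, 2047))                  # 1: 2^d ≡ 1, so 2047 passes the strong test to base 2
```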
Other variants
A Lucas probable prime with respect to parameters P and Q is a positive integer n > 1, coprime to 2PQ, that passes a primality test based on properties of Lucas sequences defined by those parameters, where the sequences U_k(P, Q) and V_k(P, Q) satisfy specific divisibility conditions analogous to Fermat's Little Theorem but in the ring \mathbb{Z}[\sqrt{D}] with discriminant D = P^2 - 4Q.[18] This test is particularly useful for numbers of special form, such as Mersenne numbers of the form 2^p - 1, where the Lucas-Lehmer test—a deterministic special case—efficiently verifies primality by leveraging the known factorization of n+1.[19]
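For illustration only, the sketch below implements one common form of the Lucas condition, U_{n - (D/n)} \equiv 0 \pmod{n} with \gcd(n, 2QD) = 1, using the Fibonacci parameters P = 1, Q = -1 (so D = 5); the helper names are ours, and the plain linear recurrence used here is far too slow for large n, where logarithmic-time doubling formulas are used instead.

```python
from math import gcd

def jacobi(a, n):
    # Jacobi symbol (a/n) for odd n > 0, via the standard reciprocity algorithm
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def lucas_u_mod(P, Q, k, n):
    # U_k(P, Q) mod n via the defining recurrence U_{j+1} = P*U_j - Q*U_{j-1}
    if k == 0:
        return 0
    u_prev, u = 0, 1            # U_0, U_1
    for _ in range(k - 1):
        u_prev, u = u, (P * u - Q * u_prev) % n
    return u

def is_lucas_prp(n, P=1, Q=-1):
    # one common Lucas probable-prime condition: U_{n - (D/n)} ≡ 0 (mod n)
    D = P * P - 4 * Q           # the Fibonacci parameters give D = 5
    if n < 2 or gcd(n, abs(2 * Q * D)) != 1:
        return False
    return lucas_u_mod(P, Q, n - jacobi(D, n), n) == 0

# Primes coprime to 2QD always pass (e.g. 101 and 103); the composite 91 = 7 * 13 does not.
print(is_lucas_prp(101), is_lucas_prp(103), is_lucas_prp(91))
```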
An extra strong Lucas probable prime is a stricter variant of the strong Lucas probable prime test, imposing additional conditions that further limit the proportion of bases for which a composite number can pass, reducing the likelihood of false positives to at most 1/8.[20] For parameters where Q = 1, it requires either U_s \equiv 0 \pmod{n} and V_s \equiv \pm 2 \pmod{n}, or V_{2^t s} \equiv 0 \pmod{n} for some 0 \leq t < r-1, where n - \epsilon(n) = 2^r s with s odd and \epsilon(n) = (D/n).[21] This enhancement builds on the standard strong Lucas test by narrowing the acceptable residues, making it suitable for applications requiring higher confidence in fewer iterations.
In the Solovay-Strassen test, a probable prime to base a is an odd integer n > 2 for which the Jacobi symbol (a/n) is congruent to a^{(n-1)/2} \pmod{n}, matching Euler's criterion that holds for all primes but fails for at least half of the bases coprime to composite n.[22] This variant provides a probabilistic guarantee with error probability less than 1/2^k after k independent trials, offering an alternative to Fermat-based tests with comparable efficiency but different pseudoprime distributions.
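The Euler condition is also easy to check directly; the sketch below (helper names ours, with the Jacobi-symbol routine repeated so the block stands alone) compares the Jacobi symbol with a^{(n-1)/2} \bmod n:

```python
def jacobi(a, n):
    # Jacobi symbol (a/n) for odd n > 0
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def is_euler_prp(n, a):
    # Solovay-Strassen check for odd n > 2: (a/n) must be nonzero and ≡ a^((n-1)/2) (mod n)
    j = jacobi(a, n)
    return j != 0 and pow(a, (n - 1) // 2, n) == j % n

print(is_euler_prp(97, 2))    # True: every odd prime satisfies Euler's criterion for coprime bases
print(is_euler_prp(341, 2))   # False: 2^170 ≡ 1 (mod 341) while (2/341) = -1, so base 2 is a witness
```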
Absolute probable primes are rare composite numbers that pass a given probable prime test for every possible base coprime to them, rendering them indistinguishable from primes under that test alone; for instance, absolute Fermat probable primes are precisely the Carmichael numbers, while no absolute strong probable primes exist due to the test's stricter conditions on the structure of the multiplicative group.[23] Their existence (or lack thereof) for specific tests underscores the need for multiple test variants in practice, as no single probable prime definition guarantees detection of all composites.[16][24]
Properties and Analysis
Key mathematical properties
A probable prime to a given primality test is either a prime number or a pseudoprime to that test; specifically, pseudoprimes are the composite members of the set of probable primes.[25] For the Fermat test, Fermat pseudoprimes are precisely the composite Fermat probable primes.[25] Fermat probable primes encompass all odd primes and the Fermat pseudoprimes to the chosen base.[25]
The density of probable primes to a fixed base decreases with increasing number size, asymptotically matching that of the primes since pseudoprimes form a negligible subset.[3] For example, up to 25 \times 10^9, there are over 1 billion primes but only about 22,000 base-2 Fermat pseudoprimes, making their contribution to probable prime counts minimal.[25] This sparsity ensures that probable prime tests remain efficient for large cryptographic numbers, where the natural logarithmic density provides sufficient candidates.[3]
The set of probable primes to a fixed base is not closed under multiplication: the product of two distinct probable primes is composite and generally fails the test.[26] For instance, 3 and 5 are probable primes to base 2 (as primes), but their product 15 satisfies 2^{14} \not\equiv 1 \pmod{15} since 2^{14} \equiv 4 \pmod{15}.[27] However, for the bases themselves, if a composite n is a Fermat probable prime to coprime bases a and b, then it is also one to the base ab, as (ab)^{n-1} \equiv a^{n-1} b^{n-1} \equiv 1 \pmod{n}.[25]
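Both observations are easy to verify numerically; in this small sketch the composite 91 = 7 × 13 is our own illustrative example:

```python
n = 91  # 7 * 13, composite
print(pow(2, 14, 15))                  # 4, so the product 15 = 3 * 5 fails the base-2 test
for a in (3, 4, 3 * 4):
    print(a, pow(a, n - 1, n) == 1)    # True for 3, 4 and 12: liar bases multiply to liar bases
```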
Small composite probable primes include the first few Fermat pseudoprimes to base 2: 341 ($11 \times 31), 561 ($3 \times 11 \times 17), 645 ($3 \times 5 \times 43), and 1105 ($5 \times 13 \times 17).[27] These examples illustrate how pseudoprimes can mimic primes under specific tests despite being composite.[26]
Error bounds and reliability
The reliability of probable prime tests, particularly the Miller-Rabin algorithm, is quantified through error probabilities that bound the chance of incorrectly identifying a composite number as prime. In the worst case, for a composite number n, at most a quarter of the possible bases are liars (non-witnesses) in the strong probable prime test, leading to an error probability of less than \frac{1}{4} per random base selection. When performing k independent tests with random bases coprime to n, the overall error probability is thus at most 4^{-k}. This conservative bound ensures that, for example, k = 25 yields an error probability below 2^{-50}, sufficient for many applications.
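Under this worst-case bound, the number of rounds needed for a target error level 2^{-t} follows by simple arithmetic:

\Pr[\text{composite accepted after } k \text{ rounds}] \leq (1/4)^k = 2^{-2k}, \qquad 2^{-2k} \leq 2^{-t} \iff k \geq t/2,

so k = 25 rounds suffice for an error bound of 2^{-50} and k = 50 rounds for 2^{-100}, matching the worst-case iteration counts cited later in this section.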
Average-case analyses provide tighter bounds, especially for large random composites, revealing that the error probability per test is significantly lower than \frac{1}{4}. Damgård, Landrock, and Pomerance established that for a random odd composite n of L bits, the probability of a random base failing to detect compositeness is at most roughly 2^{-L/4} in certain regimes, leading to exponentially small overall error rates with few iterations. For instance, their estimates imply that for a 1024-bit composite, just 2–3 iterations suffice to achieve an error probability below 2^{-80}, far outperforming the worst-case guarantee. These results stem from the distribution of prime factors and the rarity of composites with many liar bases.
Several factors influence the reliability of these tests. The size of the number n plays a key role, as larger L reduces the average error per iteration due to the sparser occurrence of highly deceptive composites. Base selection also matters: random bases yield the probabilistic bounds above, while carefully chosen fixed bases can eliminate error entirely for n below specific thresholds (e.g., deterministic sets of 7 bases for n < 3,317,044,064,679,887,385,961,981). Additionally, the structure of composites affects detection; for example, Carmichael numbers fool all bases in the weaker Fermat test but are detected with high probability (near 1) in strong tests due to their specific factorization properties.
In cryptographic applications, achieving negligible error probabilities like 2^{-100} is standard for "provable" security levels. According to NIST guidelines in FIPS 186-5, for 2048-bit primes used in RSA, only 4 iterations of Miller-Rabin with random bases are required to reach this confidence under average-case assumptions, contrasting with the 50 iterations needed under worst-case bounds. This efficiency enables practical prime generation while maintaining rigorous security guarantees.[28]
Testing Procedures
Miller-Rabin algorithm
The Miller-Rabin primality test, introduced by Gary L. Miller in 1976 and extended to a fully probabilistic version by Michael O. Rabin in 1980, is a widely used algorithm for determining whether an odd integer n > 2 is a strong probable prime.[29][10] The test operates by verifying specific conditions derived from Fermat's Little Theorem and properties of the multiplicative group modulo a prime, ensuring that primes always pass while composites fail with high probability.
To apply the test, first express n-1 = 2^s \cdot d where d is odd and s \geq 1. For a chosen base a with 2 \leq a < n and \gcd(a, n) = 1, compute x_0 = a^d \mod n. If x_0 \equiv 1 \pmod{n} or x_0 \equiv -1 \pmod{n}, then n passes for this base. Otherwise, iteratively square: for r = 1 to s-1, set x_r = x_{r-1}^2 \mod n; if x_r \equiv -1 \pmod{n}, then n passes for this base. If no such r yields -1 and x_0 \not\equiv 1 \pmod{n}, declare n composite. The full test repeats this for multiple bases; if n passes all, it is a strong probable prime.[29][10]
The following pseudocode outlines the core procedure for a single base, using iterative squaring for efficiency:
```
function is_strong_probable_prime(n, a):
    if gcd(a, n) != 1:
        return false          // or handle separately
    write n-1 = 2^s * d with d odd
    x = pow(a, d, n)          // a^d mod n
    if x == 1 or x == n-1:
        return true
    for r in 1 to s-1:
        x = (x * x) mod n
        if x == n-1:
            return true
    return false
```
For the overall test, select k bases and return true only if n passes for every base; otherwise, declare n composite.[9]
In probabilistic mode, bases a are selected randomly from 2 to n-2, providing an error probability less than 4^{-k} for composite n.[10] For deterministic verification up to certain bounds, fixed sets of small primes serve as bases; for example, using the first 12 primes (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37) guarantees correctness for all n < 3{,}317{,}044{,}064{,}679{,}887{,}385{,}961{,}981.[30]
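As a self-contained illustration, the Python sketch below wraps the single-base check in exactly this way, using either random bases or the fixed 12-prime base set quoted above; the function names and sample inputs are our own choices.

```python
import random

SMALL_PRIME_BASES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)

def is_strong_probable_prime(n, a):
    # single-base check, following the pseudocode above
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False

def miller_rabin(n, k=20, deterministic=False):
    # True means every tested base passed (n is a strong probable prime);
    # False means a witness was found, so n is certainly composite.
    if n < 4:
        return n in (2, 3)
    if n % 2 == 0:
        return False
    if deterministic:
        # first 12 prime bases: correct for all n < 3,317,044,064,679,887,385,961,981
        bases = [a for a in SMALL_PRIME_BASES if a < n - 1]
    else:
        bases = [random.randrange(2, n - 1) for _ in range(k)]
    return all(is_strong_probable_prime(n, a) for a in bases)

print(miller_rabin(2**61 - 1, deterministic=True))   # True: 2^61 - 1 is a known (Mersenne) prime
print(miller_rabin(3215031751, deterministic=True))  # False: composite, although it fools bases 2, 3, 5 and 7
```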
The time complexity is O(k \log^3 n), dominated by k modular exponentiations each taking O(\log^3 n) time via repeated squaring and modular multiplication, making it practical for very large n such as those in cryptographic applications.[10]
Example application
To illustrate the application of the Miller-Rabin primality test, consider the number n = 561, which factors as 3 \times 11 \times 17 and is thus composite, yet it is the smallest Carmichael number, passing the Fermat primality test for all bases coprime to it.[31][32] For the Miller-Rabin test with base a = 2, first decompose n-1 = 560 = 2^4 \times 35, so s = 4 and d = 35. Compute x_0 = 2^{35} \mod 561 using modular exponentiation, which yields 263 (verified via successive squaring: 2^{10} \equiv 463, 2^{20} \equiv 67, and 2^{30} \equiv 67 \times 463 \equiv 166 \pmod{561}, since 67 \times 463 = 31021 and 31021 - 55 \times 561 = 166; then 2^{35} = 2^{30} \times 2^5 \equiv 166 \times 32 = 5312 \equiv 263 \pmod{561}, since 5312 - 9 \times 561 = 263).[32]
Since x_0 = 263 \not\equiv 1 \pmod{561} and 263 \not\equiv -1 \equiv 560 \pmod{561}, proceed by squaring successively up to s-1 = 3 times to check for -1 \pmod{561}:
- x_1 = 263^2 \mod 561 = 69169 \mod 561 = 166 (since 561 \times 123 = 69003 and 69169 - 69003 = 166), and 166 \not\equiv 560 \pmod{561}.
- x_2 = 166^2 \mod 561 = 27556 \mod 561 = 67 (since 561 \times 49 = 27489 and 27556 - 27489 = 67), and 67 \not\equiv 560 \pmod{561}.
- x_3 = 67^2 \mod 561 = 4489 \mod 561 = 1 (since 561 \times 8 = 4488 and 4489 - 4488 = 1), and 1 \not\equiv 560 \pmod{561}.
The final squaring gives x_4 = 1^2 \equiv 1 \pmod{561}, confirming 2^{560} \equiv 1 \pmod{561} (so 561 passes the Fermat test), but since no x_r \equiv -1 \pmod{561} for r = 0 to 3 and x_0 \not\equiv 1 \pmod{561}, the Miller-Rabin test identifies 561 as composite, with base 2 as a witness.[32]
The intermediate values are summarized in the following table:
| r | x_r = 2^{35 \times 2^r} \mod 561 | Notes |
|---|---|---|
| 0 | 263 | \not\equiv 1, -1 |
| 1 | 166 | \not\equiv -1 |
| 2 | 67 | \not\equiv -1 |
| 3 | 1 | \not\equiv -1 |
| 4 | 1 | (Fermat check) |
In contrast, applying the test to the prime n = 13 with base a = 2 succeeds: n-1 = 12 = 2^2 \times 3, so s = 2, d = 3; x_0 = 2^3 \mod 13 = 8 \not\equiv 1, 12 \pmod{13}; then x_1 = 8^2 = 64 \equiv 12 \equiv -1 \pmod{13}, satisfying the condition and confirming probable primality (13 is in fact prime).[32] This example highlights how Miller-Rabin can detect composites like Carmichael numbers that fool weaker tests, with 561 passing Fermat but failing the strong pseudoprimality condition for base 2.[32]
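The hand computations above can be reproduced mechanically; the short sketch below (function name ours) prints the full x_r sequence for n = 561 and n = 13 with base 2:

```python
def strong_test_trace(n, a):
    # return x_0, x_1, ..., x_s where n - 1 = 2^s * d (d odd) and x_r = a^(d * 2^r) mod n
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    x = pow(a, d, n)
    trace = [x]
    for _ in range(s):
        x = x * x % n
        trace.append(x)
    return trace

print(strong_test_trace(561, 2))   # [263, 166, 67, 1, 1]: no -1 appears, so base 2 witnesses that 561 is composite
print(strong_test_trace(13, 2))    # [8, 12, 1]: 12 ≡ -1 (mod 13), so 13 passes for base 2
```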
Applications and Examples
Role in cryptography
Probable primes play a central role in public-key cryptography by enabling the efficient generation of large prime factors required for secure key establishment and digital signatures. In the RSA cryptosystem, two large probable primes p and q, each typically around 1024 bits for a 2048-bit modulus n = p × q (with larger sizes such as 3072 or 4096 bits recommended for higher security), are generated and multiplied to form the public modulus, with security relying on the difficulty of factoring n.[33] This approach allows for asymmetric encryption and signing, where the primes must be kept secret to prevent recovery of the private key.[33]
The Miller-Rabin primality test is widely employed for verifying these probable primes due to its speed and low error rate when iterated, making it suitable for testing candidates in the 2048-bit range and beyond during RSA key generation.[34] According to NIST FIPS 186-5, multiple rounds of Miller-Rabin—such as 2 iterations for 2048-bit primes to achieve error probabilities below 2^{-128}—provide security levels of 128 bits or higher by ensuring the probability of mistaking a composite for a prime is negligible.[35] This standard applies to RSA key generation; DSA key generation is no longer approved for new systems but supported for verification of legacy keys, where a large probable prime p (2048 bits or more) and a smaller prime q dividing p-1 were historically required.[35]
To generate these primes, cryptographic systems start with random odd integers of the target bit length and apply sequential Miller-Rabin tests until a probable prime is identified, a process that is computationally feasible given the density of primes. By the prime number theorem, the probability that a random integer n is prime is asymptotically 1 / \ln n, so for a 2048-bit n (around 2^{2048}), roughly \ln(2^{2048}) \approx 1420 candidates need testing on average, and restricting attention to odd candidates halves this effort.[36][33]
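A typical generation loop, sketched below with our own function names and a plain Miller-Rabin check over random bases, simply draws random odd candidates of the target bit length until one passes:

```python
import random

def is_probable_prime(n, k=40):
    # Miller-Rabin with k random bases, as described in the testing section above
    if n < 4:
        return n in (2, 3)
    if n % 2 == 0:
        return False
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(k):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False          # a is a witness: n is certainly composite
    return True

def random_probable_prime(bits):
    # draw odd candidates with the top bit set until one passes;
    # by the prime number theorem roughly (bits * ln 2) / 2 odd candidates are needed on average
    while True:
        candidate = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate

p = random_probable_prime(1024)   # e.g. one factor of a 2048-bit RSA modulus
```

In practice, implementations typically also sieve candidates by trial division with small primes before the exponentiation-heavy test, discarding most composites cheaply.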
This methodology traces back to the 1970s, when probable primes first enabled practical public-key systems; the Diffie-Hellman key exchange, introduced in 1976, uses a large prime modulus q (e.g., 200 bits initially, now typically 2048 bits or more) over which discrete logarithms are computed for secure key agreement without shared secrets.[37] By the 1990s, the approach extended to DSA in NIST FIPS 186, standardizing probable prime generation for government-approved digital signatures with primes initially up to 1024 bits, later increased in subsequent revisions.[35]
Known pseudoprimes and pitfalls
One notable example of a pseudoprime is 341, the smallest Fermat pseudoprime to base 2, satisfying 2^{340} \equiv 1 \pmod{341} despite being 11 \times 31.[27] Another famous case is 561, the smallest Carmichael number, which is a Fermat pseudoprime to every base coprime to it and also a strong pseudoprime (in the Miller-Rabin sense) to several bases, including 50.[27] These examples illustrate how composites can mimic primes in probabilistic tests, with 561 evading detection across multiple bases due to its special square-free form, which lets it pass Fermat's Little Theorem for all coprime bases.[9]
A significant pitfall in probabilistic primality testing arises from using fixed bases, as adversaries can construct composites tailored to pass tests for those specific bases, such as strong pseudoprimes to bases 2 and 3 like 1,373,653.[9] To counter this, multiple or randomly chosen bases are essential; for instance, testing with bases 2, 3, and 5 reduces the number of undetected strong pseudoprimes below 25 \times 10^9 to just 13.[38]
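This behavior can be checked directly with a few modular exponentiations; in the sketch below (written as a plain loop, with variable names of our choosing), 1,373,653 = 829 × 1657 is tested against the fixed bases 2 and 3 and then against base 5:

```python
n = 1373653                     # = 829 * 1657, the smallest strong pseudoprime to both bases 2 and 3
d, s = n - 1, 0
while d % 2 == 0:
    d //= 2
    s += 1                      # n - 1 = 2^2 * 343413
for a in (2, 3, 5):
    x = pow(a, d, n)
    passed = x in (1, n - 1)
    for _ in range(s - 1):
        x = x * x % n
        passed = passed or x == n - 1
    print(a, passed)            # True for bases 2 and 3 (both fooled), False for base 5 (a witness)
```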
Detection challenges are eased by the absence of absolute pseudoprimes for strong tests: no composite is a strong pseudoprime to all bases coprime to it, unlike the situation for the Fermat test with Carmichael numbers, so testing sufficiently many bases always reveals compositeness.[39] Composites that are strong pseudoprimes to many bases simultaneously are exceedingly rare; the known examples that pass the Miller-Rabin test for all of the first 46 prime bases (2 through 199) are very large, such as a specially constructed 337-digit composite.[38]
To mitigate these risks and achieve higher assurance, probabilistic tests like Miller-Rabin are often combined with trial division to eliminate small factors or followed by deterministic methods such as the elliptic curve primality proving (ECPP) algorithm, which provides unconditional proofs for numbers up to thousands of digits.[9] This hybrid approach is standard in cryptographic applications to avoid pseudoprime pitfalls while maintaining efficiency.[9]