RSA problem
The RSA problem is a fundamental computational challenge in public-key cryptography, defined as the task of recovering a plaintext message m from its ciphertext c, given only the public key components: the modulus n (a product of two large primes p and q) and the encryption exponent e, where c \equiv m^e \pmod{n}.[1] This problem assumes that m is randomly chosen from the interval [0, n-1] and that n is sufficiently large, making direct computation of the e-th modular root infeasible without the corresponding private exponent d, which satisfies ed \equiv 1 \pmod{(p-1)(q-1)}.[1] The hardness of the RSA problem establishes RSA as a trapdoor one-way function, enabling secure encryption and digital signatures.[1]
The RSA cryptosystem, which relies on the presumed intractability of the RSA problem, was invented in 1977 by Ronald L. Rivest, Adi Shamir, and Leonard Adleman at MIT as a practical implementation of public-key cryptography concepts introduced by Diffie and Hellman.[2] Their method, first detailed in a 1978 Communications of the ACM paper, allows anyone to encrypt messages using a publicly available key while only the designated recipient can decrypt them using a private key derived from the factorization of n.[2] Key generation involves selecting large primes p and q, computing n = pq, choosing e coprime to \phi(n) = (p-1)(q-1), and deriving d as the modular inverse of e modulo \phi(n).[2] This innovation addressed key distribution challenges in symmetric cryptography, enabling applications like secure electronic mail and digital signatures without prior shared secrets.[2]
The security of the RSA problem is closely tied to the integer factorization problem: while factoring n into p and q allows computation of d and thus solution of the RSA problem, the converse—whether solving the RSA problem efficiently implies factoring—is an open question.[3] No polynomial-time classical algorithm is known for either problem; the best known factoring method, the General Number Field Sieve, runs in subexponential time and leaves moduli of 2048 bits or more well out of practical reach.[3] However, certain weak implementations are vulnerable; for instance, small private exponents d can be recovered via continued fraction attacks if d < n^{1/4}/3, and low public exponents like e = 3 enable attacks like Håstad's broadcast attack on multiple related ciphertexts.[3] Despite these, properly parameterized RSA remains widely deployed in protocols such as TLS for web security, SSH for remote access, and PGP for email encryption.[3]
Mathematical Foundations
Modular Arithmetic Essentials
Modular arithmetic is a system of arithmetic for integers where numbers "wrap around" upon reaching a fixed value called the modulus, enabling computations in finite sets. The fundamental relation is congruence: two integers a and b are congruent modulo n, denoted a \equiv b \pmod{n}, if n divides a - b, or equivalently, if a and b leave the same remainder when divided by n.[4] The modulo operation a \mod n yields the remainder r where 0 \leq r < n and a \equiv r \pmod{n}.[4]
The operations in modular arithmetic preserve congruence. Specifically, if a \equiv b \pmod{n} and c \equiv d \pmod{n}, then a + c \equiv b + d \pmod{n} and ac \equiv bd \pmod{n}.[4] These properties ensure that addition and multiplication can be performed by first computing the results and then reducing modulo n, maintaining consistency within the residue classes. Additionally, if \gcd(a, n) = 1, then a has a multiplicative inverse modulo n, denoted a^{-1}, such that a \cdot a^{-1} \equiv 1 \pmod{n}; this inverse exists uniquely because a and n are coprime, allowing the solution to the congruence a x \equiv 1 \pmod{n}.[5]
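A minimal Python sketch of computing such an inverse via the extended Euclidean algorithm (the function names are illustrative, not from any particular source):

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def mod_inverse(a, n):
    """Multiplicative inverse of a modulo n; requires gcd(a, n) = 1."""
    g, x, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError("a has no inverse modulo n")
    return x % n

# Example: 3 * 7 = 21 ≡ 1 (mod 10), so the inverse of 3 modulo 10 is 7.
assert mod_inverse(3, 10) == 7
```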
Modular exponentiation, computing a^e \mod n for large e, is efficiently performed using the square-and-multiply algorithm, also known as binary exponentiation, which reduces the time complexity from O(e) to O(\log e) by leveraging the binary representation of e.[6] The algorithm initializes the result to 1 and iterates over the bits of e from most to least significant, squaring the current result at each step and multiplying by a if the bit is 1, with all operations reduced modulo n. The following pseudocode illustrates the process for computing a^e \mod n:
result ← 1
for each bit b in e from MSB to LSB do
    result ← (result²) mod n
    if b = 1 then
        result ← (result * a) mod n
return result
[6]
For a small example, compute 3^5 \mod 7, where 5 in binary is 101_2. Start with result = 1. For the highest bit (1): result = 1^2 \mod 7 = 1, then result = (1 \cdot 3) \mod 7 = 3. Next bit (0): result = 3^2 \mod 7 = 2, with no multiplication. Lowest bit (1): result = 2^2 \mod 7 = 4, then result = (4 \cdot 3) \mod 7 = 5. Since 3^5 = 243 \equiv 5 \pmod{7}, this verifies the computation.[6]
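A runnable Python rendering of the pseudocode above, checked against this worked example (Python's built-in pow(a, e, n) performs the same computation):

```python
def mod_exp(a, e, n):
    """Left-to-right square-and-multiply: scan the bits of e from MSB to LSB."""
    result = 1
    for bit in bin(e)[2:]:          # binary digits of e, most significant first
        result = (result * result) % n
        if bit == '1':
            result = (result * a) % n
    return result

assert mod_exp(3, 5, 7) == 5        # matches the worked example: 3^5 = 243 ≡ 5 (mod 7)
assert mod_exp(3, 5, 7) == pow(3, 5, 7)
```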
The ring of integers modulo n, denoted \mathbb{Z}/n\mathbb{Z}, consists of the residue classes \{0, 1, \dots, n-1\} under addition and multiplication modulo n. When n = pq for distinct primes p and q, \mathbb{Z}/n\mathbb{Z} is isomorphic to the product ring \mathbb{Z}/p\mathbb{Z} \times \mathbb{Z}/q\mathbb{Z} by the Chinese Remainder Theorem, since \gcd(p, q) = 1; this decomposition allows elements to be represented as pairs (x \mod p, x \mod q) with component-wise operations.[7]
Euler's Theorem and Totient Function
Euler's totient function, denoted \phi(n), counts the number of positive integers up to n that are relatively prime to n.[8] For a general positive integer n with prime factorization n = p_1^{k_1} p_2^{k_2} \cdots p_m^{k_m}, the formula is \phi(n) = n \prod_{i=1}^m \left(1 - \frac{1}{p_i}\right). When n = pq for distinct primes p and q, this simplifies to \phi(n) = (p-1)(q-1), since each prime appears only to the first power.[8]
Euler's theorem states that if \gcd(a, n) = 1, then a^{\phi(n)} \equiv 1 \pmod{n}.[9] This result generalizes Fermat's Little Theorem to composite moduli and highlights the cyclic structure of multiplication modulo n for coprime elements.
A proof of Euler's theorem relies on group theory: the set (\mathbb{Z}/n\mathbb{Z})^\times of residue classes modulo n that are coprime to n forms a multiplicative group of order \phi(n). By Lagrange's theorem, the order k of any element a \pmod{n} in this group divides \phi(n), so a^k \equiv 1 \pmod{n}. Raising this congruence to the power \phi(n)/k yields a^{\phi(n)} \equiv 1 \pmod{n}.[9]
The Chinese Remainder Theorem (CRT) provides an isomorphism \mathbb{Z}/n\mathbb{Z} \cong \mathbb{Z}/p\mathbb{Z} \times \mathbb{Z}/q\mathbb{Z} when n = pq and p, q are distinct primes, since \gcd(p, q) = 1.[10] This ring isomorphism simplifies computations modulo composite n by reducing them to independent calculations modulo p and q, preserving addition and multiplication.[11] For instance, solving congruences or exponentiations modulo n can be decomposed via CRT, enhancing efficiency in modular arithmetic.
Consider n = 15 = 3 \times 5: \phi(15) = (3-1)(5-1) = 8.[8] For a = 2 with \gcd(2, 15) = 1, compute 2^8 = 256 \equiv 1 \pmod{15}, verifying Euler's theorem. Modular exponentiation confirms this: successive powers yield 2^1 \equiv 2, 2^2 \equiv 4, and 2^4 \equiv 1 \pmod{15}, so 2^8 = (2^4)^2 \equiv 1^2 = 1 \pmod{15}.[9]
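These checks can also be scripted; a small Python sketch verifying the totient count and Euler's theorem for n = 15, together with the CRT view modulo 3 and 5:

```python
from math import gcd

n = 15
phi = sum(1 for a in range(1, n) if gcd(a, n) == 1)   # brute-force count of units
assert phi == 8 == (3 - 1) * (5 - 1)
assert pow(2, phi, n) == 1                            # Euler's theorem for a = 2
# CRT view: 2^8 mod 15 corresponds to the pair (2^8 mod 3, 2^8 mod 5) = (1, 1)
assert (pow(2, 8, 3), pow(2, 8, 5)) == (1, 1)
```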
RSA Cryptosystem Overview
Key Generation Procedure
The key generation procedure in the RSA cryptosystem begins with the selection of two large, distinct prime numbers, p and q, which serve as the foundational secret factors. These primes are typically chosen to be odd and of comparable bit length to ensure balance in the resulting modulus, with candidates generated randomly and subjected to rigorous primality testing. Probabilistic tests such as the Miller-Rabin algorithm are employed for efficiency, where a number is declared prime if it passes multiple iterations with randomly selected bases; for cryptographic security, at least 40 rounds are recommended for 2048-bit candidates to achieve negligible error probability.[12][13]
Once p and q are verified as prime, the modulus n is computed as their product: n = p \times q. The totient function \phi(n) is then calculated as \phi(n) = (p-1)(q-1), which represents the number of integers up to n that are coprime to it. This value is kept secret and used in subsequent steps. Security standards recommend that n have at least 2048 bits in length as of 2025 to provide at least 112 bits of security strength, with longer moduli (e.g., 3072 bits) for higher assurance against factorization attacks.[14][15]
The public exponent e is selected next, commonly chosen as a small, fixed value like 65537 (which is 2^{16} + 1) for computational efficiency and resistance to certain attacks, provided that \gcd(e, \phi(n)) = 1. If the gcd condition fails, a different e is tried. The private exponent d is computed as the modular multiplicative inverse of e modulo \phi(n), satisfying e \times d \equiv 1 \pmod{\phi(n)}. This inverse is efficiently found using the extended Euclidean algorithm, which extends the Euclidean algorithm to yield coefficients solving Bézout's identity for the gcd.[14][2]
To illustrate with small values (not for practical use), suppose p = 61 and q = 53; then n = 3233 and \phi(n) = 3120. Choosing e = 17 (since \gcd(17, 3120) = 1), the extended Euclidean algorithm yields d = 2753, as 17 \times 2753 = 46801 = 15 \times 3120 + 1. The public key is the pair (n, e), while the private key is typically (n, d); for performance optimization in decryption, an enhanced private key may include p, q, and Chinese Remainder Theorem (CRT) parameters such as d_p = d \mod (p-1), d_q = d \mod (q-1), and q_{\text{inv}} = q^{-1} \mod p, enabling faster computations via parallel modular exponentiations.[14][16]
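A Python sketch of this toy key generation, using Python's built-in modular inverse for the extended-Euclidean step (real key generation would of course use cryptographically large random primes):

```python
p, q = 61, 53                      # toy primes from the example above
n = p * q                          # 3233
phi = (p - 1) * (q - 1)            # 3120
e = 17                             # public exponent, coprime to phi
d = pow(e, -1, phi)                # modular inverse via the extended Euclidean algorithm
assert d == 2753 and (e * d) % phi == 1

# CRT parameters for faster decryption
dp, dq = d % (p - 1), d % (q - 1)
q_inv = pow(q, -1, p)
public_key, private_key = (n, e), (n, d)
```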
Encryption and Decryption Mechanics
In the RSA cryptosystem, encryption transforms a plaintext message representative m, where 0 \leq m < n, into a ciphertext representative c using the public key (n, e) via the operation c = m^e \mod n.[17] This modular exponentiation ensures the ciphertext remains within the range 0 \leq c < n, preserving the size constraint of the modulus n.[2]
Decryption reverses this process using the private key, typically the exponent d, to recover the original message as m = c^d \mod n.[18] The correctness of this operation relies on the relationship ed \equiv 1 \pmod{\phi(n)}, where \phi is Euler's totient function; by Euler's theorem, for \gcd(m, n) = 1, it follows that (m^e)^d \equiv m^{ed} \equiv m \pmod{n}.[2] A Chinese Remainder Theorem argument extends this correctness to the rare case \gcd(m, n) > 1, and in practice padding schemes structure the message representative before exponentiation rather than exposing raw m.[19]
To enhance security against chosen-ciphertext attacks, raw RSA encryption is not used alone; instead, padding schemes such as PKCS#1 v1.5 or Optimal Asymmetric Encryption Padding (OAEP) are applied. PKCS#1 v1.5 prepends a structured padding string to the message before encryption, while OAEP incorporates randomization via a mask generation function and hash for provable security under standard assumptions.[20][21]
Consider a small illustrative example with n = 55, public exponent e = 3, and private exponent d = 27. For message m = 9, encryption yields c = 9^3 \mod 55 = 729 \mod 55 = 14. Decryption then computes 14^{27} \mod 55 = 9, recovering the original message.[2]
For efficiency in decryption, especially with large moduli, the Chinese Remainder Theorem (CRT) is employed when the prime factors p and q of n are known. This involves computing m_1 = c^{d_p} \mod p and m_2 = c^{d_q} \mod q (where d_p = d \mod (p-1) and d_q = d \mod (q-1)), then combining via m = m_2 + q \cdot \left((m_1 - m_2) \cdot q^{-1} \bmod p\right), where q^{-1} is the inverse of q modulo p; this reduces computational cost by approximately a factor of four compared to direct exponentiation modulo n.[22]
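A Python sketch of CRT decryption for the toy key above (n = 55 with p = 5 and q = 11, d = 27), checking that the recombination agrees with direct exponentiation:

```python
p, q, e, d = 5, 11, 3, 27
n = p * q                                   # 55
dp, dq = d % (p - 1), d % (q - 1)           # 3 and 7
q_inv = pow(q, -1, p)                       # q^{-1} mod p

def decrypt_crt(c):
    """Two half-size exponentiations plus a Garner-style recombination."""
    m1 = pow(c, dp, p)
    m2 = pow(c, dq, q)
    h = ((m1 - m2) * q_inv) % p
    return m2 + q * h                       # result already lies in [0, n)

c = pow(9, e, n)                            # encrypt m = 9, giving c = 14
assert c == 14
assert decrypt_crt(c) == 9 == pow(c, d, n)  # CRT result matches direct decryption
```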
Core Statement of the RSA Problem
The RSA problem, introduced by Rivest, Shamir, and Adleman in their seminal 1978 paper on public-key cryptosystems, formalizes the computational challenge underlying the security of the RSA encryption scheme.[23] It centers on the difficulty of extracting e-th roots modulo a composite integer n, where n is the product of two large distinct primes p and q.
Formally, the RSA problem is defined as follows: given a positive integer n = pq (with p and q distinct primes), a positive integer e such that gcd(e, φ(n)) = 1 where φ(n) = (p-1)(q-1) is Euler's totient function, and an integer c with 0 ≤ c < n, compute an integer m such that m^e ≡ c (mod n).[24] This task represents the search variant of the problem, which requires explicitly finding the root m; a decision variant, determining whether such an m exists (i.e., if c lies in the image of the exponentiation map), is less studied because, under the standard assumptions (gcd(c, n) = 1 and gcd(e, φ(n)) = 1), the map is a bijection on the multiplicative group (ℤ/nℤ)*, making existence trivial to affirm.[24]
The RSA problem arises in the context of the RSA trapdoor permutation, a family of permutations on (ℤ/nℤ)* indexed by the public parameters (n, e). The forward direction, computing m^e mod n, is efficient for any m coprime to n, but inversion—recovering m from c = m^e mod n—is computationally infeasible without knowledge of the trapdoor, which is the private exponent d satisfying ed ≡ 1 (mod φ(n)). With d, inversion is straightforward via m = c^d mod n, restoring the bijection.[25] This asymmetry makes RSA a prototypical trapdoor one-way function, where the hardness of inversion without the trapdoor underpins its cryptographic utility.[25]
In practice, parameters for the RSA problem are chosen to ensure security: n is typically at least 2048 bits in length to provide adequate resistance against current computational capabilities, with recommendations extending to 3072 bits or more for post-2030 security levels. The exponent e is often fixed to a small value like 65537 (a Fermat prime ensuring gcd(e, φ(n)) = 1 with high probability and minimizing computation), though random e coprime to φ(n) can also be used.[25]
For illustration, consider a small instance with n = 35 (p = 5, q = 7), e = 5 (coprime to φ(35) = 24), and c = 17. The task is to find m such that m^5 ≡ 17 (mod 35); one solution is m = 12, since 12^5 = 248832 and 248832 ≡ 17 (mod 35), but computing this without factoring n or knowing d is the core challenge, even in toy cases.[24]
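The toy instance is solvable by exhaustive search only because n is tiny; a short Python sketch illustrates why brute force is hopeless at cryptographic sizes:

```python
n, e, c = 35, 5, 17
solutions = [m for m in range(n) if pow(m, e, n) == c]
assert solutions == [12]    # feasible only because n has 6 bits; hopeless for 2048-bit n
```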
Variants and Generalizations
Multi-prime RSA extends the standard formulation by constructing the modulus n as the product of k > 2 distinct primes p_1, p_2, \dots, p_k, rather than just two. In this variant, the RSA problem involves computing the e-th root modulo n, where e is coprime to \phi(n) = \prod_{i=1}^k (p_i - 1), given a ciphertext c = m^e \mod n. This generalization allows for faster decryption using the Chinese Remainder Theorem across more factors, but it introduces potential security trade-offs, as the increased number of primes can make factoring n easier if the primes are not sufficiently balanced in size. For instance, with 1024-bit moduli, up to three primes maintain security comparable to two-prime RSA, but larger k values reduce the effective bit security.[26]
The strong RSA assumption posits a stronger hypothesis than the standard one: given n (a product of two large primes) and a random c \in \mathbb{Z}_n^*, it is computationally infeasible to find any integer f \in \mathbb{Z}_n^* together with an exponent e' \geq 2 such that f^{e'} \equiv c \pmod{n}.[1] Because the adversary is free to choose the exponent e', this rules out a broader class of attacks than the standard RSA problem, in which e is fixed in advance; extracting any non-trivial pair (f, e') is still presumed to require knowledge of the factorization of n. The strong RSA assumption underpins various cryptographic primitives, such as certain signature schemes, due to its robustness against exponent manipulation.[1]
RSA exhibits partial homomorphic properties specifically under multiplication: if c_1 = m_1^e \mod n and c_2 = m_2^e \mod n, then c_1 \cdot c_2 \equiv (m_1 m_2)^e \pmod{n}, enabling the encryption of the product of plaintexts without decryption. However, it lacks additively homomorphic properties, as the sum of ciphertexts does not correspond to the encryption of the sum of plaintexts. This multiplicative homomorphism arises directly from the exponentiation structure in \mathbb{Z}_n^* and is a foundational trait of the original RSA design, though it renders unpadded RSA malleable and vulnerable to certain attacks.
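A Python sketch of the multiplicative homomorphism using the toy key from the key-generation example (n = 3233, e = 17, d = 2753); padded schemes deliberately destroy this property:

```python
n, e, d = 3233, 17, 2753
m1, m2 = 42, 99
c1, c2 = pow(m1, e, n), pow(m2, e, n)
c_prod = (c1 * c2) % n                       # multiply ciphertexts only
assert pow(c_prod, d, n) == (m1 * m2) % n    # decrypts to the product of the plaintexts
```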
Generalizations of the RSA problem extend beyond \mathbb{Z}/n\mathbb{Z} to broader ring structures, such as rings with commuting ideals (CI-rings), where ideals replace prime factors. In this setting, the modulus is an ideal A = M_1 M_2 \cdots M_k formed by pairwise comaximal maximal ideals M_i, and an analogue of Euler's totient function is defined as \phi(A) = |U(R/A)|, the order of the unit group in the quotient ring R/A. The problem then requires computing e-th roots modulo A, with e coprime to \phi(A), preserving the hardness assumption when the ideal factorization is secret. Such extensions apply to non-commutative rings like Hurwitz quaternions or skew polynomial rings, maintaining the core difficulty of root extraction without revealing the ideal factors. Padded variants, such as OAEP, further generalize by incorporating randomness to enhance security against chosen-ciphertext attacks while retaining the underlying root-finding challenge.[27]
Quantum generalizations highlight the vulnerability of RSA variants to Shor's algorithm, which efficiently factors the modulus n (or its multi-prime or ideal-based analogues) in polynomial time on a quantum computer, thereby solving the root-extraction problem across these structures by recovering the secret factors. This impacts all factoring-based RSA forms uniformly, rendering them insecure at the quantum scale without modifications like lattice-based alternatives.[28]
As an illustrative example for the strong RSA assumption, consider a toy modulus n = 91 = 7 \times 13 and target c = 64. For the fixed exponent e = 3, the cube root is 4, since 4^3 = 64 \equiv 64 \pmod{91}; in the strong RSA setting the adversary may instead output any pair, such as f = 8 with e' = 2, since 8^2 = 64 \equiv 64 \pmod{91}. Toy instances like this are trivial to search by hand, but the assumption asserts that for large, properly generated moduli and random c, finding any non-trivial pair (f, e') without the factorization remains infeasible.[1]
Hardness and Assumptions
Link to Integer Factorization Problem
The RSA problem is conjectured to be computationally equivalent to the integer factorization problem of the modulus n = pq, where p and q are large primes; this equivalence has remained an open question since the RSA cryptosystem's proposal in 1978. The one-way implication is straightforward: an efficient algorithm for factoring n directly solves the RSA problem, as the factors p and q allow computation of Euler's totient \phi(n) = (p-1)(q-1), followed by the private exponent d \equiv e^{-1} \pmod{\phi(n)}, enabling decryption of any ciphertext.
The converse implication—that an efficient RSA solver yields an efficient factoring algorithm—lacks a deterministic polynomial-time proof, though strong evidence exists through probabilistic reductions and results in restricted models. In particular, the original RSA analysis, building on Gary Miller's 1976 results, establishes a probabilistic polynomial-time equivalence between computing the private exponent d and factoring n when the public exponent e is chosen randomly. Further support comes from the generic ring model, where Aggarwal and Maurer (2008) proved that any generic algorithm for breaking RSA can be transformed into a generic factoring algorithm for n, with only a polynomial overhead.[29] Methods like Coppersmith's attack provide additional evidence for specific cases, such as low-exponent RSA, but do not resolve the general equivalence.
This conjectured equivalence underpins RSA's security: the hardness of the RSA problem reduces to the assumed hardness of factoring large semiprimes, with no known algorithms solving RSA instances faster than factoring n as of 2025.[29] Despite decades of research, no proof of full equivalence exists, and no sub-exponential attacks on properly generated RSA moduli beyond factoring have emerged, reinforcing the conjecture's validity.[30]
Computational Complexity Analysis
The RSA assumption posits that the function family defined by f_{n,e}(m) = m^e \pmod{n}, where n = pq is the product of two large primes p and q, and e is a public exponent coprime to \phi(n), constitutes a one-way function when n is sufficiently large and randomly generated.[1] This means that, for a randomly chosen plaintext m in \{0, \dots, n-1\}, computing the e-th root modulo n (i.e., inverting f_{n,e}) is computationally infeasible without knowledge of the factorization of n.[31] Formally, no probabilistic polynomial-time algorithm can succeed in inversion with non-negligible probability over the random choices of n, e, and m.[1]
In terms of computational complexity, the RSA problem is believed to lie outside the class P, as no polynomial-time algorithm is known to solve it for large n, despite extensive research.[32] However, it resides in NP, since a purported solution (the plaintext m) can be verified in polynomial time by checking if m^e \equiv c \pmod{n}.[31] The hardness stems from the underlying difficulty of integer factorization, with the best classical algorithms, such as the general number field sieve, requiring subexponential but superpolynomial time.[32]
Regarding reductions, the RSA problem is polynomially reducible to integer factorization in the forward direction: given the factorization of n, one can compute the private exponent d and thus invert any ciphertext in polynomial time.[1] The converse holds only partially; for instance, an oracle that reveals the private exponent d allows n to be factored in probabilistic polynomial time, but no polynomial-time reduction from factoring to the general RSA inversion problem is known.[1]
The RSA problem exhibits random self-reducibility over the choice of ciphertext for a fixed public key (n, e): an algorithm that inverts a non-negligible fraction of random ciphertexts can be converted, via multiplicative blinding of a target ciphertext, into one that inverts any given ciphertext. Consequently, for a fixed key the problem is as hard on average over random challenges as in the worst case, although this says nothing about hardness across different moduli n.[1]
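A Python sketch of the blinding argument behind this self-reducibility; the oracle below is simulated with the private key purely for illustration, standing in for a hypothetical inverter that only succeeds on random ciphertexts:

```python
import secrets
from math import gcd

def invert_with_random_oracle(c, n, e, random_inverter):
    """Reduce inverting a specific ciphertext c to inverting a random one."""
    while True:
        r = secrets.randbelow(n)
        if gcd(r, n) == 1:
            break
    c_blinded = (c * pow(r, e, n)) % n        # uniformly random unit if c is a unit
    m_blinded = random_inverter(c_blinded)    # oracle sees only a random-looking value
    return (m_blinded * pow(r, -1, n)) % n    # unblind to recover the e-th root of c

# Toy demonstration: simulate the oracle with the private exponent.
n, e, d = 3233, 17, 2753
oracle = lambda x: pow(x, d, n)
c = pow(1234, e, n)
assert invert_with_random_oracle(c, n, e, oracle) == 1234
```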
Under the RSA assumption, provable security results exist in idealized models. In particular, the RSA-OAEP encryption scheme achieves chosen-ciphertext security in the random oracle model, reducing its security directly to the one-wayness of the RSA function family.[33]
As of 2025, no subexponential algorithmic improvements to solving the RSA problem have emerged beyond those applicable to integer factorization, maintaining its presumed classical hardness; however, post-quantum cryptographic concerns highlight vulnerabilities to advanced computational paradigms.[32]
Attacks and Developments
Historical and Theoretical Attacks
The RSA cryptosystem, introduced in 1978, initially employed relatively small key sizes, often only a few hundred bits, which were vulnerable to factoring attacks using methods available at the time, such as the continued fraction method and, later, the quadratic sieve. These early breaks, demonstrated through the 1980s and early 1990s, highlighted the need for larger moduli to ensure security; for instance, practical factorizations of roughly 400-bit RSA keys in the early 1990s underscored the risks, prompting recommendations for at least 512-bit keys and eventually 1024 bits or more. Such vulnerabilities were not theoretical flaws in the RSA problem itself but rather stemmed from insufficient computational hardness for small parameters, leading to widespread adoption of larger key sizes in standards by the mid-1990s.[3]
One of the earliest theoretical attacks exploiting poor key generation is Wiener's attack, which targets RSA instances where the private exponent d is small, specifically d < \frac{1}{3} N^{1/4}, where N = pq is the modulus with primes p and q satisfying q < p < 2q. The attack recovers d efficiently by computing the continued fraction expansion of e/N, where e is the public exponent; convergents of this expansion yield candidates for d because |e/N - k/d| is sufficiently small for some integer k, allowing verification via computation of \phi(N). The method runs in polynomial time, succeeds under the stated bound, and relies on the approximation properties of continued fraction convergents. Wiener's 1990 paper demonstrated the attack on 512-bit moduli whenever d was chosen below roughly N^{1/4}. An extension by Boneh and Durfee improved the bound to d < N^{0.292} using lattice-based techniques on the equation e d - k \phi(N) = 1, incorporating Coppersmith's method for finding small solutions to modular equations; this attack constructs a lattice from partial information on d and \phi(N), recovering the private key heuristically in time polynomial in \log N for exponents within the bound.[34][3]
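A self-contained Python sketch of Wiener's attack on a deliberately weak toy key; the primes 10007 and 10009 and the small exponent d = 31 are illustrative choices satisfying d < N^{1/4}/3:

```python
from math import isqrt

def wiener_attack(e, N):
    """Try to recover a small private exponent d from the public key (e, N)
    by testing the convergents of the continued fraction of e/N."""
    # Continued fraction expansion of e/N.
    quotients, a, b = [], e, N
    while b:
        quotients.append(a // b)
        a, b = b, a % b
    # Convergents k_i/d_i via the standard recurrence.
    k_prev, k_cur = 1, quotients[0]
    d_prev, d_cur = 0, 1
    convergents = [(k_cur, d_cur)]
    for q in quotients[1:]:
        k_prev, k_cur = k_cur, q * k_cur + k_prev
        d_prev, d_cur = d_cur, q * d_cur + d_prev
        convergents.append((k_cur, d_cur))
    for k, d in convergents:
        if k == 0 or (e * d - 1) % k:
            continue
        phi = (e * d - 1) // k
        s = N - phi + 1                      # equals p + q for the true phi
        disc = s * s - 4 * N                 # equals (p - q)^2 for the true phi
        if disc < 0:
            continue
        root = isqrt(disc)
        if root * root == disc and (s + root) % 2 == 0:
            p_cand = (s + root) // 2
            if p_cand and N % p_cand == 0:   # confirms the factorization of N
                return d
    return None

# Hypothetical vulnerable key: d = 31 is below (1/3) * N^(1/4).
p, q, d = 10009, 10007, 31
N, phi = p * q, (p - 1) * (q - 1)
e = pow(d, -1, phi)
assert wiener_attack(e, N) == d
```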
Attacks on low public exponents, such as e = 3, exploit deterministic padding or broadcast scenarios, as formalized by Håstad's broadcast attack. In this scenario, if the same plaintext M < N_{\min} (where N_{\min} is the smallest modulus) is encrypted for k \geq e recipients using distinct pairwise coprime moduli N_i and the same small e, the attacker can recover M by combining the congruences c_i \equiv M^e \pmod{N_i} via the Chinese Remainder Theorem to obtain M^e over the integers and then taking an ordinary e-th root. Håstad's 1988 result shows this recovers M in polynomial time, emphasizing the dangers of reusing messages without randomization; for e = 3, only three encryptions suffice, as shown in the sketch below. Coppersmith's theorem extends these low-exponent vulnerabilities by enabling recovery of small plaintexts or plaintexts with known partial information. Specifically, for low e, if all but a 1/e fraction of the plaintext is unknown, the method finds all small roots x_0 with |x_0| < N^{1/e - \epsilon} of a degree-e polynomial such as f(x) = (B + x)^e - c \pmod{N}, where B encodes the known portion of the plaintext, using the LLL lattice reduction algorithm on a lattice of shifted polynomials, in time polynomial in \log N and 1/\epsilon. This applies to cases like e = 3 with partially known plaintexts, breaking RSA encryption without full factorization.[35]
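A Python sketch of the broadcast attack for e = 3, using three illustrative toy moduli (not real keys); the plaintext is recovered from the ciphertexts alone via the CRT and an exact integer cube root:

```python
e = 3
moduli = [53 * 59, 71 * 83, 89 * 101]          # pairwise coprime toy moduli: 3127, 5893, 8989
m = 2024                                        # the broadcast plaintext, m < min(moduli)
ciphertexts = [pow(m, e, N) for N in moduli]    # what the attacker observes

# Combine c_i ≡ m^3 (mod N_i) with the Chinese Remainder Theorem.
x, modulus = 0, 1
for c, N in zip(ciphertexts, moduli):
    t = ((c - x) * pow(modulus, -1, N)) % N
    x, modulus = x + modulus * t, modulus * N

# Since m^3 < N_1 * N_2 * N_3, x equals m^3 over the integers; take an exact cube root.
lo, hi = 0, 1 << (x.bit_length() // 3 + 2)
while lo < hi:
    mid = (lo + hi + 1) // 2
    if mid ** 3 <= x:
        lo = mid
    else:
        hi = mid - 1
assert lo ** 3 == x and lo == m                 # plaintext recovered without any key
```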
Faulty implementations of RSA decryption using the Chinese Remainder Theorem (CRT) introduce weaknesses exploitable through induced errors. In Boneh, DeMillo, and Lipton's fault attack, if a random error occurs during the exponentiation modulo one prime (say, the computation involving d_p) while the computation modulo the other prime proceeds correctly, the resulting faulty signature S' satisfies S'^e \equiv M \pmod{q} but not modulo p, so the attacker can compute \gcd(S'^e - M, N) to reveal the prime factor q with high probability, given the message M and a single faulty signature. This 1997 attack requires no padding randomization and succeeds even for large keys, demonstrating that hardware faults can fully break CRT-based RSA from one faulty output; a sketch follows below.
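A Python sketch of this fault attack, simulating a corrupted half-computation on the toy key from the key-generation example (p = 61, q = 53, e = 17); one faulty signature plus the known message exposes a factor of N:

```python
from math import gcd

p, q, e = 61, 53, 17
N = p * q
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)
dp, dq, q_inv = d % (p - 1), d % (q - 1), pow(q, -1, p)

def sign_crt(M, faulty=False):
    """CRT signing; a single simulated fault corrupts the mod-p branch."""
    sp = pow(M, dp, p)
    if faulty:
        sp = (sp + 1) % p                    # simulated random error in one half
    sq = pow(M, dq, q)
    return sq + q * (((sp - sq) * q_inv) % p)

M = 1234
S_bad = sign_crt(M, faulty=True)
# S_bad^e matches M modulo q but not modulo p, so the gcd exposes a factor of N.
factor = gcd((pow(S_bad, e, N) - M) % N, N)
assert factor == q and N % factor == 0
```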
Theoretical models of side-channel attacks, such as timing and power analysis, reveal information leaks in RSA computations, with Bleichenbacher's padding oracle providing a seminal example. Bleichenbacher's 1998 attack targets PKCS#1 v1.5 padding in RSA encryption protocols, where an oracle (e.g., a server rejecting invalid paddings) reveals whether a ciphertext decrypts to a valid "00 02" padded block. By adaptively querying modified ciphertexts c' = c \cdot r^e \pmod{N} and using the oracle's responses (valid/invalid), the attacker narrows the plaintext interval from [1, N] to a small set containing the original M, requiring about 10^6 oracle calls to decrypt fully; this exploits the multiplicative blinding property and interval refinement via binary search-like iterations. The attack models a chosen-ciphertext scenario without direct decryption access, highlighting padding as a vulnerability in timing or error-message side-channels.
Recent Advances in Solving Instances
The General Number Field Sieve (GNFS) remains the state-of-the-art algorithm for factoring large semiprimes, the core challenge underlying the RSA problem. As of 2025, the largest RSA number factored using GNFS is RSA-250, a 250-digit (829-bit) semiprime, achieved in February 2020 by an international team using the open-source CADO-NFS implementation. This computation required approximately 2,700 core-years on modern processors, highlighting the immense resources needed for such feats.
The RSA Factoring Challenge, initiated by RSA Laboratories in 1991 and officially discontinued in 2006, spurred significant advancements in factorization techniques by offering cash prizes for solving specific semiprimes. Although the challenge ended, efforts continued, with the largest success being the factorization of RSA-768—a 232-digit (768-bit) number—in December 2009 by a collaborative team using GNFS on hundreds of machines. This marked a milestone, as it exceeded previous records and demonstrated the scalability of distributed sieving methods.
Modern factoring efforts leverage specialized hardware and distributed computing to accelerate the sieving phase of GNFS, which dominates runtime. Graphics processing units (GPUs) have been integrated into tools like CADO-NFS for parallel relation collection, providing speedups over CPU-only approaches. Projects such as NFS@Home utilize volunteer computing networks, including GPUs and ASICs, to contribute to sieving for large targets, enabling progress on numbers beyond individual capabilities.
Since 2020, no larger RSA numbers have been factored using classical methods, with RSA-250 holding the record as of November 2025; ongoing distributed projects like NFS@Home continue sieving toward larger candidates, but full factorizations of bigger RSA challenge numbers remain elusive due to escalating computational demands. Estimates suggest that factoring a 1024-bit RSA modulus would require around 500,000 core-years, roughly 200 times the effort for RSA-250.[36]
These advances have prompted reevaluations of RSA key sizes. The National Institute of Standards and Technology (NIST) deems 2048-bit RSA keys secure against classical factoring through at least 2030, but recommends migrating to 3072-bit or larger keys for extended protection beyond that horizon, or transitioning to post-quantum algorithms to counter emerging threats. Organizations like the German Federal Office for Information Security (BSI) endorse RSA keys of 2000 bits or more for transitional use.[37]
| RSA Number | Decimal Digits | Bit Length | Factored (Year) | Key Details |
|---|---|---|---|---|
| RSA-155 | 155 | 512 | August 1999 | Factored using GNFS on a distributed network of ~300 workstations; took ~5 months. |
| RSA-200 | 200 | 663 | May 2005 | GNFS on 80 AMD Opteron CPUs; ~18 months of computation. |
| RSA-768 | 232 | 768 | December 2009 | GNFS with ~1,000 machines; ~2 years total effort. |
| RSA-240 | 240 | 795 | November 2019 | CADO-NFS on supercomputers; ~900 core-years.[38] |
| RSA-250 | 250 | 829 | February 2020 | CADO-NFS with GPU acceleration; ~2,700 core-years. |
Broader Implications
Role in Modern Cryptography
The RSA problem serves as the foundational hardness assumption for several core cryptographic primitives in modern systems, particularly in digital signatures where the difficulty of inverting the RSA function ensures the unforgeability of signatures. The RSASSA-PSS (RSA Probabilistic Signature Scheme) is a widely adopted variant, providing provable security under the RSA assumption by incorporating random padding to prevent existential forgery attacks.[39] This scheme is recommended over older deterministic methods like RSA-PKCS#1 v1.5 due to its stronger security guarantees against chosen-message attacks, relying on the computational infeasibility of recovering the plaintext from a ciphertext without the private key.[40] RSA-based signatures are integral to protocols like S/MIME for email security and code signing in software distribution, where the inversion hardness directly underpins non-repudiation.
In key exchange protocols, RSA has historically enabled initial key transport in TLS/SSL handshakes by allowing clients to encrypt a premaster secret with the server's public key, facilitating secure session establishment.[41] However, its usage has declined significantly since 2015 with the rise of TLS 1.3, which eliminates static RSA key exchange in favor of ephemeral Diffie-Hellman variants to provide forward secrecy and mitigate risks from compromised long-term keys. RSA key exchange is now primarily limited to legacy TLS 1.2 deployments, reflecting a broader shift away from non-forward-secret mechanisms.[42]
RSA also plays a key role in hybrid encryption schemes, where it is used for asymmetric key wrapping of symmetric session keys, such as AES, to securely transport bulk encryption keys over insecure channels. This approach leverages RSA's public-key capabilities to protect the symmetric key, which then encrypts the actual data payload, balancing efficiency and security in applications like PGP and secure file transfers.[43] The RSA-KEM (Key Encapsulation Mechanism) standard formalizes this process, ensuring tight security reductions to the RSA problem's hardness.[44]
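A minimal RSA-KEM-style sketch in Python, reusing the toy modulus from earlier and SHA-256 as a stand-in key-derivation function (illustrative only; real deployments use moduli of at least 2048 bits and a standardized KDF):

```python
import hashlib
import secrets

p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))

def encapsulate():
    r = secrets.randbelow(n)                                  # random value to transport
    c = pow(r, e, n)                                          # wrap it under the public key
    key = hashlib.sha256(r.to_bytes(2, "big")).digest()       # derive the symmetric key
    return c, key

def decapsulate(c):
    r = pow(c, d, n)
    return hashlib.sha256(r.to_bytes(2, "big")).digest()

c, sender_key = encapsulate()
assert decapsulate(c) == sender_key      # both sides now share a key for bulk encryption
```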
Under standards like NIST FIPS 186-5 (published in 2023), RSA remains an approved algorithm for digital signatures with moduli of at least 2048 bits, paired with approved hash functions like SHA-256, supporting its continued use in federal systems for authentication and integrity verification.[45] However, due to emerging quantum threats—where Shor's algorithm could efficiently factor large RSA moduli—NIST has flagged deprecation risks, recommending migration planning by 2030 for RSA-2048 and full disallowance by 2035, and has finalized the post-quantum standards ML-KEM, ML-DSA, and SLH-DSA in FIPS 203, 204, and 205 (August 2024) as alternatives for encryption and signatures.[46] The percentage of HTTPS websites relying on RSA-based certificates has decreased significantly since 2010, underscoring RSA's transitional role as systems bridge to elliptic curve cryptography (ECC) and lattice-based post-quantum schemes like Kyber and Dilithium.[47] This hybrid adoption period allows interoperability while phasing out RSA-dependent protocols to maintain long-term security.[48]
The RSA problem is closely related yet distinct from the integer factorization problem (IFP), which requires decomposing a composite modulus n = pq into its prime factors p and q. Solving the RSA problem, that is, recovering the e-th root of a random ciphertext modulo n, is widely believed to be comparable in difficulty to factoring n: an oracle for factoring immediately yields an RSA solver, and partial results suggest a converse under certain conditions. However, full equivalence remains unproven in the general case, with some evidence suggesting that low-exponent instances of the RSA problem may be easier to break than factorization. In the generic ring model of computation, however, generic algorithms for the RSA problem imply efficient factoring, establishing a tight reduction in that setting.[49][50][29]
In contrast to the discrete logarithm problem (DLP), the foundational hardness assumption for Diffie-Hellman key exchange and elliptic curve cryptography, the RSA problem operates in the multiplicative group \mathbb{Z}_n^* of a composite-modulus ring rather than in a prime-order group. The DLP entails finding an exponent x such that g^x \equiv h \pmod{p} (or the analogous relation on an elliptic curve), a task with subexponential complexity in prime fields via index-calculus algorithms such as the number field sieve, similar to the general number field sieve for RSA. While both problems exhibit comparable classical hardness for appropriately sized parameters, their structural differences mean advances in one do not directly translate to the other, though empirical records show roughly comparable effort for factoring 768-bit RSA moduli and computing discrete logarithms in prime fields of similar size.[51]
The quadratic residuosity problem serves as a related, specialized decisional problem: deciding whether an element y \in \mathbb{Z}_n^* is a quadratic residue modulo n = pq (i.e., whether there exists x such that x^2 \equiv y \pmod{n}) without knowing the factorization. It concerns the squaring map (the case e = 2, which is not a permutation since 2 divides \phi(n)) and can be resolved efficiently given p and q, but is believed intractable otherwise. It underpins the semantic security of the Goldwasser-Micali cryptosystem, the first provably secure public-key encryption scheme, highlighting its role as a foundational yet less demanding hardness base compared to full RSA exponentiation.[52][53]
The RSA assumption is classified as a strong cryptographic primitive due to the absence of complete equivalence proofs to factoring in the standard model, unlike the Diffie-Hellman assumption, where computational Diffie-Hellman (CDH) and decisional Diffie-Hellman (DDH) are tightly equivalent to the DLP in algebraic or generic group models. Partial results show that straight-line programs for RSA are nearly as hard as factoring, but no black-box reduction exists, making RSA reliant on potentially stronger conjectures than DLP-based systems. This distinction affects provable security constructions, with DLP enabling more modular reductions while RSA often requires additional idealized models like the random oracle.[49][54]
In the post-quantum context, both the RSA problem and DLP succumb to Shor's algorithm, which solves factoring and discrete logarithms in polynomial time on a sufficiently large quantum computer, rendering classical systems like RSA and elliptic curve Diffie-Hellman obsolete. This shared vulnerability contrasts sharply with code-based problems, such as syndrome decoding in McEliece cryptosystems, which resist all known quantum attacks and form the basis for NIST-standardized post-quantum alternatives.[55]