In this series of articles, I’d like to shed light on potential vulnerabilities and exploring various techniques that adversaries might employ to compromise the RSA cryptosystem. In this first article, we start by providing a comprehensive overview of how RSA works, emphasizing its strengths as a publickey cryptosystem. Then we show a common attack that can be applied when the RSA keys are not properly generated.
In upcoming articles, we will deep dive into the details of the most sophisticated attacks.
RSA introduction
RSA is one of the most popular public key (or asymmetric) crypto system, developed by Ron Rivest, Adi Shamir and Leonard Adleman (hence the name RSA) and published in 1977 for the first time.
In RSA a pair of keys is created (the public and the private key). Each key (public or private) can be used to encrypt the data and the the other key to decrypt the data.
Let’s briefly see how the process works. First we need to generate the public and the private keys:

Generate 2 large prime numbers
p
andq
(usually at least 512 bit each one) 
Calculate the modulus
n = p * q

Calculate the Euler’s totien function Φ(n) $$ \phi(n) = (p1)* (q1) $$

Find a valid exponent
e
such that both following conditions are satisfied: $$ 2 < e < \phi(n) $$ $$ gcd(\phi(n), e) = 1 $$ Meaning that
e
and Φ(n) must be coprime, hence the inverse ofe
always exists modulo Φ(n)
 Meaning that

Calculate the private key
d
as the modular multiplicative inverse ofe
modulo Φ(n). $$ d*e \equiv 1 \bmod \phi(n) \implies d \equiv e^{1} \bmod \phi(n) $$ This can be done efficiently by using the Extended Euclidian Algorithm
d
it’s easy to calculate only if Φ(n) is known, so it must be kept secret
Now we have everything to encrypt and decrypt the data:
 To encrypt a message
m
(where 0 <=m
<n
) we can compute the following (c
is the encrypted message):
$$ c \equiv m^e \bmod n $$
 To decrypt
c
(and get the original messagem
) using the private keyd
we can compute:
$$ m \equiv c^d \bmod n $$
The values n
and e
are public (the public key is indeed the tuple (n, e)
) while d
must be secret (anyone who knows d
can decrypt the message). Notice that also knowing Φ(n) it’s possible to compute d
, so it must be kept secret as well.
All the security behind RSA is based on the assumption that is really hard to factorize the product of the 2 large prime numbers p
and q
, hence it’s really hard to determine the private key d
.
At the time of writing, the biggest RSA key factorized is RSA250 (250 decimal digits, 829 bits long) and it took approximately 2700 CPU coreyears, using a 2.1 GHz Intel Xeon Gold 6130 CPU.
Breaking RSA when p and q are too close
Having p != q
is an RSA requirements, infact if p = q
then:
$$
n = p * q = p * p = p^2 \implies p = \sqrt n
$$
so the square root of n
would be easy to calculate and an attacker can just derive the private key d
(by calculating Φ(n) and then d
by using the Extended Euclidian Algorithm).
However this is not the only case where it’s possible to factorize n
and recover p
and q
.
Let’s assume that we choose p
and q
such that they are too close to each other.
In this case, it’s possible factorize N
using the Fermat’s factorization:
$$
N = (a + b) * (a  b) = a^2  b^2
$$
Where N
must be an odd number (our modulus) and (a + b)
is p
and (a  b)
is q
(or viceversa).
It follows that:
$$
b^2 = a^2  N
$$
so, if we assume that a
is “close” to the square root of N
we can consider the integer immediately above (the ceiling function)
$$
a = \lceil\sqrt(N)\rceil \implies b^2 = (\lceil\sqrt(N)\rceil)^2  N
$$
now:
$$
if \space b^2 \space is\space a\space square \implies p = (a + b),\space q=(ab)
$$
meaning that we found p
and q
. If not, we can just add 1
to a
and repeat the process. If p
and q
are close enough, in just few iteration is possible to factorize n
, hence compute the private key d
.
The following Python program applies the Fermat’s factorization algorithm to calculate p
and q
from n
(which is a 2048 bit value). It only takes few seconds to find the prime factors of n
.
# uses SageMath to find 2 big prime numbers (close to each other)
p = random_prime(2^10241,False,2^1023)
q = p.next_prime()
# Calculate modulus n
n = p * q
# Fermat's factorization algorithm
a = ceil(isqrt(n))
while True:
b2 = a**2  n
if is_square(b2):
b = sqrt(b2)
break
a = a + 1
p1 = a + b
q1 = a  b
# Print 'True' to confirm that we found p and q
print(p1 * q1 == p * q)