This
is a Chapter from the Handbook of Applied Cryptography, by
A. Menezes, P. van
Oorschot,
and S. Vanstone, CRC Press, 1996.
For
further information, see www.cacr.math.uwaterloo.ca/hac
CRC
Press has granted the following speci_c permissions for the electronic version
of this
book:
Permission
is granted to retrieve, print and store a single copy of this chapter for
personal
use. This permission does not extend to binding multiple chapters of
the
book, photocopying or producing copies for other than personal use of the
person
creating the copy, or making electronic copies available for retrieval by
others
without prior permission in writing from CRC Press.
Except
where over-ridden by the speci_c permission above, the standard copyright
notice
from
CRC Press applies to this electronic version:
Neither
this book nor any part may be reproduced or transmitted in any form or
by
any means, electronic or mechanical, including photocopying, micro_lming,
and
recording, or by any information storage or retrieval system, without prior
permission
in writing from the publisher.
The
consent of CRC Press does not extend to copying for general distribution,
for
promotion, for creating new works, or for resale. Speci_c permission must be
obtained
in writing from CRC Press for such copying.
c
1997 by CRC Press, Inc.
3
Number-Theoretic
Reference
Problems
Contents in Brief
3.1 Introduction and overview : : : : : : : : : : : : : : : : : : : : : 87
3.2 The integer factorization
problem : : : : : : : : : : : : : : : : : 89
3.3 The RSA problem : : : : : : : : : : : : : : : : : : : : : : : : : : 98
3.4 The quadratic residuosity
problem : : : : : : : : : : : : : : : : : 99
3.5 Computing square roots in Zn : : : : : : : : : : : :
: : : : : : : 99
3.6 The discrete logarithm problem
: : : : : : : : : : : : : : : : : : 103
3.7 The Diffie-Hellman problem : : : : : : : : : : : : : : : : : : : : 113
3.8 Composite moduli : : : : : : : : : : : : : : : : : : : : : : : : : : 114
3.9 Computing individual bits : : : : : : : : : : : : : : : : : : : : : 114
3.10 The subset sum problem : : : : : : : : : : : : : : : : : : : : : : 117
3.11 Factoring polynomials over
finite fields : : : : : : : : : : : : : : : 122
3.12 Notes and further references : : : : : : : : : : : : : : : : : : : : 125
3.1
Introduction and overview
The security of many public-key
cryptosystems relies on the apparent intractability of the
computational problems studied in
this chapter. In a cryptographic setting, it is prudent to
make the assumption that the
adversary is very powerful. Thus, informally speaking, a computational
problem is said to be easy or
tractable if it can be solved in (expected)1 polynomial
time, at least for a
non-negligible fraction of all possible inputs. In otherwords, if there
is an algorithm which can solve a
non-negligible fraction of all instances of a problem in
polynomial time, then any
cryptosystem whose security is based on that problem must be
considered insecure.
The computational problems
studied in this chapter are summarized in Table 3.1. The
true computational complexities
of these problems are not known. That is to say, they are
widely believed to be
intractable,2 although no proof of this is known. Generally, the only
lower bounds known on the
resources required to solve these problems are the trivial linear
bounds, which do not provide any
evidence of their intractability. It is, therefore, of interest
to study their relative
difficulties. For this reason, various techniques of reducing one
1For simplicity,
the remainder of the chapter shall generally not distinguish between
deterministic polynomialtime
algorithms and randomized
algorithms (see x2.3.4) whose expected running
time is polynomial.
2More precisely,
these problems are intractable if the problem parameters are carefully chosen.
Problem Description
FACTORING Integer
factorization problem: given a positive integer n, find
its prime factorization; that is,
write n = pe1
1 pe2
2 : : : pek
k where
the pi are pairwise distinct primes and
each ei _ 1.
RSAP RSA problem (also known
as RSA inversion): given a positive
integer n that is a product of two distinct odd primes p and q, a
positive integer e such that gcd(e; (p - 1)(q - 1)) = 1, and an
integer c, find an integer m such that me _ c (mod n).
QRP Quadratic residuosity problem:
given an odd composite integer
n and an integer a having Jacobi symbol _a
n_ = 1, decide
whether or not a is a quadratic residue modulo n.
SQROOT Square rootsmodulo n: given a composite integer n and a 2 Qn
(the set of quadratic residues
modulo n), find a square root of a
modulo n; that is, an integer x such that x2 _ a (mod n).
DLP Discrete logarithm problem:
given a prime p, a generator _ of
Z_
p, and an
element _ 2 Z_
p, find the
integer x, 0 _ x _ p - 2,
such that _x _ _ (mod p).
GDLP Generalized discrete
logarithm problem: given a finite cyclic
group G of order n, a generator _ of G, and an element _ 2 G,
find the integer x, 0 _ x _ n - 1, such that _x = _.
DHP Diffie-Hellman problem:
given a prime p, a generator _ of Z_
p,
and elements _a mod p and _b mod p, find _ab mod p.
GDHP Generalized
Diffie-Hellman problem: given a finite cyclic group
G, a generator _ of G, and group elements _a and _b, find _ab.
SUBSET-SUM Subset sum problem:
given a set of positive integers
fa1; a2; : : : ; ang and a positive integer s, determine whether or
not there is a subset of the aj that sums to s.
Table 3.1:Some computational problems of
cryptographic relevance.
computational problem to another
have been devised and studied in the literature. These reductions
provide a means for converting
any algorithmthat solves the second problem into
an algorithm for solving the
first problem. The following intuitive notion of reducibility
(cf. x2.3.3) is used in this chapter.
3.1 Definition Let A and B be two computational problems. A is said to polytime reduce to
B, written A _P B, if there is an algorithm that solves A which uses, as a subroutine, a
hypothetical algorithm for
solving B, and which runs in polynomial time if the algorithm
for B does.3
Informally speaking, if A polytime reduces to B, then B is at least as difficult as A;
equivalently, A is no harder than B. Consequently, if A is a well-studied computational
problemthat iswidely believed to
be intractable, then proving thatA _P B provides strong
evidence of the intractability of
problem B.
3.2 Definition Let A and B be two computational problems. If A _P B and B _P A, then
A and B are said to be computationally equivalent, written A _P B.
3In the
literature, the hypothetical polynomial-time subroutine for B is sometimes
called an oracle
Informally speaking, if A _P B then A and B are either both tractable or both intractable,
as the case may be.
Chapter outline
The remainder of the chapter is
organized as follows. Algorithms for the integer factorization
problem are studied in x3.2. Two problems related to factoring, the RSA problem and
the quadratic residuosity
problem, are briefly considered in x3.3 and x3.4. Efficient algorithms
for computing square roots in Zp, p a prime, are presented in x3.5, and the equivalence
of the problems of finding square
roots modulo a composite integer n and factoring
n is established. Algorithms for the discrete logarithm problem are studied
in x3.6, and
the related Diffie-Hellman
problem is briefly considered in x3.7. The relation between the
problems of factoring a composite
integer n and computing discrete logarithms in (cyclic
subgroups of) the group Z_
n is
investigated in x3.8. The tasks of finding partial
solutions
to the discrete logarithm
problem, the RSA problem, and the problem of computing square
roots modulo a composite integer n are the topics of x3.9. The L3-lattice basis reduction
algorithm is presented in x3.10, along with algorithms for the subset sum problem and for
simultaneous diophantine
approximation. Berlekamp’s Q-matrix algorithm for factoring
polynomials is presented in x3.11. Finally, x3.12 provides references and
further chapter
notes.
3.2 The integer
factorization problem
The security of many
cryptographic techniques depends upon the intractability of the integer
factorization problem. A partial
list of such protocols includes the RSA public-key
encryption scheme (x8.2), the RSA signature scheme (x11.3.1), and the Rabin public-key
encryption scheme (x8.3). This section summarizes the current knowledge on algorithms
for the integer factorization
problem.
3.3 Definition The integer factorization
problem (FACTORING) is the following: given a
positive integer n, find its prime factorization; that is, write n = pe1
1 pe2
2 _ _ _ pek
k where the
pi are pairwise distinct primes and
each ei _ 1.
3.4 Remark (primality testing vs.
factoring) The problem of deciding whether an integer is
composite or prime seems to be,
in general, much easier than the factoring problem. Hence,
before attempting to factor an
integer, the integer should be tested to make sure that it is
indeed composite. Primality tests
are a main topic of Chapter 4.
3.5 Remark (splitting vs. factoring)
A non-trivial factorization of n is a factorization of the
form n = ab where 1 < a < n and 1 < b < n; a and b are said to be non-trivial factors
of n. Here a and b are not necessarily prime. To solve the integer factorization problem, it
suffices to study algorithms that
split n, that is, find a non-trivial
factorization n = ab. Once
found, the factors a and b can be tested for primality. The algorithmfor splitting integers can
then be recursively applied to a and/or b, if either is found to be composite. In this manner,
the prime factorization of n can be obtained.
3.6 Note (testing for perfect powers)
If n _ 2, it can be efficiently checked as follows whether
or not n is a perfect power, i.e., n = xk for some
integers x _ 2, k _ 2. For each prime
p _ lg n, an integer approximation x of n1=p is computed.
This can be done by performing
a binary search for x satisfying n = xp in the
interval [2; 2blg n=pc+1]. The entire procedure
takes O((lg3 n) lg lg lg n) bit operations. For the remainder
of this section, it will always
be assumed that n is not a perfect power. It follows that if n is composite, then n has at least
two distinct prime factors.
Some factoring algorithms are
tailored to perform better when the integer n being factored
is of a special form; these are
called special-purpose factoring algorithms. The running
times of such algorithms
typically depend on certain properties of the factors of n. Examples
of special-purpose factoring
algorithms include trial division (x3.2.1), Pollard’s rho
algorithm (x3.2.2), Pollard’s p-1 algorithm (x3.2.3), the elliptic curve algorithm (x3.2.4),
and the special number field
sieve (x3.2.7). In contrast, the running times of the so-called
general-purpose factoring algorithms depend
solely on the size of n. Examples of generalpurpose
factoring algorithms include the
quadratic sieve (x3.2.6) and the general number
field sieve (x3.2.7).
Whenever applicable,
special-purpose algorithms should be employed as they will generally
be more efficient. A reasonable
overall strategy is to attempt to find small factors
first, capitalize on any
particular special forms an integer may have, and then, if all else
fails, bring out the
general-purpose algorithms. As an example of a general strategy, one
might consider the following.
1. Apply trial division by small
primes less than some bound b1.
2. Next, apply Pollard’s rho
algorithm, hoping to find any small prime factors smaller
than some bound b2, where b2 > b1.
3. Apply the elliptic curve
factoring algorithm, hoping to find any small factors smaller
than some bound b3, where b3 > b2.
4. Finally, apply one of the more
powerful general-purpose algorithms (quadratic sieve
or general number field sieve).
3.2.1 Trial division
Once it is established that an
integer n is composite, before expending vast amounts of time
with more powerful techniques,
the first thing that should be attempted is trial division by
all “small” primes. Here, “small”
is determined as a function of the size of n. As anextreme
case, trial division can be
attempted by all primes up to
p
n. If this is done, trial division
will completely factor n but the procedure will take roughly
p
n divisions in the worst case
when n is a product of two primes of the same size. In general, if the factors
found at each
stage are tested for primality,
then trial division to factor n completely takes O(p + lgn)
divisions, where p is the second-largest prime factor of n.
Fact 3.7 indicates that if trial
division is used to factor a randomly chosen large integer
n, then the algorithmcan be expected to find some small factors of n relatively quickly, and
expend a large amount of time to
find the second largest prime factor of n.
3.7 Fact Let n be chosen uniformly at random from the interval [1; x].
(i) If 1
2 _ _ _ 1, then the probability that the
largest prime factor of n is _ x_ is
approximately 1+ln_. Thus, for example, the
probability that n has a prime factor
>
p
x is ln 2 _ 0:69.
(ii) The probability that the
second-largest prime factor of n is _ x0:2117 is about 1
2 .
(iii) The expected total number
of prime factors of n is ln ln
x+O(1). (If n = Qpei
i , the
total number of prime factors of n isPei.)
3.2.2 Pollard’s rho factoring
algorithm
Pollard’s rho algorithmis a
special-purpose factoring algorithmfor finding small factors of
a composite integer.
Let f : S -! S be a random function, where S is a finite set of cardinality n. Let
x0 be a random element of S, and consider the sequence x0; x1; x2; :
: : defined by xi+1 =
f(xi) for i _ 0. Since S is finite, the sequence must eventually cycle, and consists of a
tail of expected lengthp_n=8 followed by an endlessly repeating cycle of expected length
p_n=8 (see Fact 2.37). Aproblem that arises in some cryptanalytic tasks,
including integer
factorization (Algorithm 3.9) and
the discrete logarithm problem (Algorithm 3.60), is of
finding distinct indices i and j such that xi = xj (a collision
is then said to have occurred).
An obvious method for finding a
collision is to compute and store xi for i = 0; 1; 2; : : :
and look for duplicates. The
expected number of inputs thatmust be tried before a duplicate
is detected isp_n=2 (Fact 2.27). This method requires O(
p
n) memory and O(
p
n) time,
assuming the xi are stored in a hash table so
that new entries can be added in constant time.
3.8 Note (Floyd’s cycle-finding
algorithm) The large storage requirements in the above technique
for finding a collision can be
eliminated by using Floyd’s cycle-finding algorithm.
In this method, one starts with
the pair (x1; x2), and iteratively computes (xi; x2i) from
the previous pair (xi-1; x2i-2), until xm = x2m for some m. If the tail of the sequence
has length _ and the cycle has length _, then the first time that xm = x2m is when m =
_(1 + b_=_c). Note that_ < m _ _ + _, and consequently the expected running time of
this method is O(
p
n).
Now, let p be a prime factor of a composite integer n. Pollard’s rho algorithm for factoring
n attempts to find duplicates in the sequence of integers x0; x1; x2; : : : defined by
x0 = 2, xi+1 = f(xi) = x2
i + 1 mod p for i _ 0. Floyd’s cycle-finding algorithm is utilized
to find xm and x2m such that xm _ x2m (mod p). Since p divides n but is unknown,
this is done by computing the
terms xi modulo n and testing if gcd(xm - x2m; n) > 1.
If also gcd(xm - x2m; n) < n, then a non-trivial factor of n is obtained. (The situation
gcd(xm - x2m; n) = n occurs with negligible
probability.)
3.9 Algorithm Pollard’s rho
algorithm for factoring integers
INPUT: a composite integer n that is not a prime power.
OUTPUT: a non-trivial factor d of n.
1. Set a 2, b 2.
2. For i = 1; 2; : : : do the following:
2.1 Compute a a2 + 1 mod n, b b2 + 1 mod n, b b2 + 1 mod n.
2.2 Compute d = gcd(a - b; n).
2.3 If 1 < d < n then return(d) and terminate with success.
2.4 If d = n then terminate the algorithm with failure (see Note 3.12).
3.10 Example (Pollard’s rho algorithm for
finding a non-trivial factor of n = 455459) The
following table lists the values
of variables a, b, and d at the end of each iteration of step 2
of Algorithm 3.9.
a b d
5 26 1
26 2871 1
677 179685 1
2871 155260 1
44380 416250 1
179685 43670 1
121634 164403 1
155260 247944 1
44567 68343 743
Hence two non-trivial factors of 455459 are 743 and 455459=743 = 613. _
3.11 Fact Assuming that the function f(x) = x2 + 1mod p behaves like a random function,
the expected time for Pollard’s
rho algorithmto find a factor p of n is O(
p
p) modular multiplications.
This implies that the expected
time to find a non-trivial factor of n is O(n1=4)
modular multiplications.
3.12 Note (options upon termination with
failure) If Pollard’s rho algorithm terminates with
failure, one option is to try
again with a different polynomial f having integer coefficients
instead of f(x) = x2 + 1. For example, the polynomial f(x) = x2 + c may be used as
long as c 6= 0;-2.
3.2.3 Pollard’s p - 1 factoring algorithm
Pollard’s p-1 factoring algorithmis a special-purpose factoring algorithmthat can be used
to efficiently find any prime
factors p of a composite integer n for which p - 1 is smooth
(see Definition 3.13) with
respect to some relatively small bound B.
3.13 Definition
Let B be a positive integer. An integer n is said to be B-smooth, or smooth
with respect to a bound B, if all its prime factors are _ B.
The idea behind Pollard’s p - 1 algorithm is the following. Let B be a smoothness
bound. Let Q be the least common multiple of all powers of primes _ B that are _ n. If
ql _ n, then l ln q _ ln n, and so l _ blnn
ln q c. Thus
Q = Yq_B
qbln n= ln qc;
where the product is over all
distinct primes q _ B. If p is a prime factor of n such that p-1
is B-smooth, then p -1jQ, and consequently for any a satisfying gcd(a; p) = 1, Fermat’s
theorem (Fact 2.127) implies that
aQ _ 1 (mod p). Hence if d = gcd(aQ - 1; n), then
pjd. It is possible that d = n, in which case the algorithm
fails; however, this is unlikely to
occur if n has at least two large distinct prime factors.
3.14 Algorithm Pollard’s p - 1 algorithm for
factoring integers
INPUT: a composite integer n that is not a prime power.
OUTPUT: a non-trivial factor d of n.
1. Select a smoothness bound B.
2. Select a random integer a, 2 _ a _ n - 1, and compute d = gcd(a; n). If d _ 2
then return(d).
3. For each prime q _ B do the following:
3.1 Compute l = b ln n
ln q c.
3.2 Compute a aql
mod n (using Algorithm 2.143).
4. Compute d = gcd(a - 1; n).
5. If d = 1 or d = n, then terminate the algorithm
with failure. Otherwise, return(d).
3.15 Example (Pollard’s p - 1 algorithm for finding a non-trivial factor of n = 19048567)
1. Select the smoothness bound B = 19.
2. Select the integer a = 3 and compute gcd(3; n) = 1.
3. The following table lists the
intermediate values of the variables q, l, and a after each
iteration of step 3 in Algorithm
3.14:
q l a
2 24 2293244
3 15 13555889
5 10 16937223
7 8 15214586
11 6 9685355
13 6 13271154
17 5 11406961
19 5 554506
4. Compute d = gcd(554506 - 1; n) = 5281.
5. Two non-trivial factors of n are p = 5281 and q = n=p = 3607 (these factors are in
fact prime).
Notice that p-1 = 5280 = 25 _3 _5_11, and q -1 = 3606 = 2 _3 _601. That
is, p - 1 is 19-smooth, while q - 1 is not 19-smooth. _
3.16 Fact Let n be an integer having a prime factor p such that p - 1 is B-smooth. The running
time of Pollard’s p - 1 algorithm for finding the factor p is O(B ln n= lnB) modular
multiplications.
3.17 Note (improvements) The
smoothness boundB in Algorithm3.14 is selected
based on the
amount of time one is willing to
spend on Pollard’s p - 1 algorithm before moving on to
more general techniques. In
practice, B may be between 105 and 106. If the algorithm
terminates with d = 1, then one might try searching
over prime numbers q1; q2; : : : ; ql
larger than B by first computing a aqi mod n for 1 _ i _ l, and then computing d =
gcd(a - 1; n). Another variant is to start
with a large bound B, and repeatedly execute
step 3 for a few primes q followed by the gcd computation in step 4. There are numerous
other practical improvements of
the algorithm (see page 125).
3.2.4 Elliptic curve factoring
The details of the elliptic
curve factoring algorithm are beyond the scope of this book; nevertheless,
a rough outline follows. The
success of Pollard’s p-1 algorithmhinges on p-1
being smooth for some prime
divisor p of n; if no such p exists, then the algorithm fails.
Observe that p - 1 is the order of the group Z_
p. The elliptic
curve factoring algorithm is a
generalization of Pollard’s p - 1 algorithm in the sense that the group Z_
p is replaced by
a
random elliptic curve group over Zp. The order of such a group is
roughly uniformly distributed
in the interval [p+1-2
p
p; p+1+2
p
p]. If the order of the group
chosen is smooth
with respect to some pre-selected
bound, the elliptic curve algorithm will, with high probability,
find a non-trivial factor of n. If the group order is not smooth, then the algorithm
will likely fail, but can be
repeated with a different choice of elliptic curve group.
The elliptic curve algorithm has
an expected running time of Lp[ 1
2 ;
p
2] (see Example
2.61 for definition of Lp) to find a factor p of n. Since this running time depends on
the size of the prime factors of n, the algorithm tends to find small such factors first. The
elliptic curve algorithm is,
therefore, classified as a special-purpose factoring algorithm. It
is currently the algorithm of
choice for finding t-decimal digit prime factors, for
t _ 40, of
very large composite integers.
In the hardest case, when n is a product of two primes of roughly the same size, the
expected running time of the
elliptic curve algorithm is Ln[ 1
2 ; 1], which is the same as that
of the quadratic sieve (x3.2.6). However, the elliptic curve algorithm is not as efficient as
the quadratic sieve in practice
for such integers.
3.2.5 Random square factoring
methods
The basic idea behind the random
square family of methods is the following. Suppose x
and y are integers such that x2 _ y2 (mod n) but x 6_ _y (mod n). Then n divides
x2-y2 = (x-y)(x+y) but n does not divide either (x-y) or (x+y). Hence, gcd(x-y; n)
must be a non-trivial factor of n. This result is summarized next.
3.18 Fact Let x, y, andn be integers. If x2 _ y2 (mod n) but x 6_ _y (mod n), thengcd(x-
y; n) is a non-trivial factor of n.
The random square methods attempt
to find integers x and y at random so that x2 _ y2
(mod n). Then, as shown in Fact 3.19, with probability at least 1
2 it is the case
that x 6_ _y
(mod n), whence gcd(x - y; n) will yield a non-trivial factor of n.
3.19 Fact Let n be an odd composite integer that is divisible by k distinct odd primes. If a 2
Z_
n, then the
congruence x2 _ a2 (mod n) has exactly 2k solutions
modulo n, two of
which are x = a and x = -a.
3.20 Example Let n = 35. Then there are four solutions
to the congruence x2 _ 4 (mod 35),
namely x = 2, 12, 23, and 33. _
A common strategy employed by the
random square algorithms for finding x and y at
random satisfying x2 _ y2 (mod n) is the following. A set consisting of the first t primes
S = fp1; p2; : : : ; ptg is chosen; S is called the factor base.
Proceed to find pairs of integers
(ai; bi) satisfying
(i) a2
i _ bi (mod n); and
(ii) bi = Qt
j=1 peij
j , eij _ 0; that is, bi is pt-smooth.
Next find a subset of the bi’s whose product is a perfect
square. Knowing the factorizations
of the bi’s, this is possible by selecting
a subset of the bi’s such that
the power of
each prime pj appearing in their product is
even. For this purpose, only the parity of the
non-negative integer exponents eij needs to be considered. Thus, to
simplify matters, for
each i, associate the binary vector vi = (vi1; vi2; : : : ; vit) with the integer exponent vector
(ei1; ei2; : : : ; eit) such that vij = eij mod 2. If t + 1 pairs (ai; bi) are obtained, then the
t-dimensional vectors v1; v2; : : : ; vt+1 must be
linearly dependent over Z2. That is,
there
must exist a non-empty subset T _ f1; 2; : : : ; t+1g such thatPi2T vi = 0 over Z2, and
henceQi2T bi is a perfect
square. The set T can be found using ordinary
linear algebra over
Z2. Clearly, Qi2T a2
i is also a
perfect square. Thus setting x = Qi2T ai and y to be the
integer square root ofQi2T bi yields a pair
of integers (x; y) satisfying x2 _ y2 (mod n).
If this pair also satisfies x 6_ _y (mod n), then gcd(x - y; n) yields a non-trivial factor
of n. Otherwise, some of the (ai; bi) pairs may be replaced by some new such pairs, and
the process is repeated. In
practice, there will be several dependencies among the vectors
v1; v2; : : : ; vt+1, and with
high probability at least one will yield an (x; y) pair satisfying
x 6_ _y (mod n); hence, this last step of
generating new (ai; bi) pairs does not usually
occur.
This description of the random
square methods is incomplete for two reasons. Firstly,
the optimal choice of t, the size of the factor base, is not specified; this is addressed in
Note 3.24. Secondly, a method for
efficiently generating the pairs (ai; bi) is not specified.
Several techniques have been
proposed. In the simplest of these, called Dixon’s algorithm,
ai is chosen at random, and bi = a2
i mod n is computed. Next, trial division
by elements
in the factor base is used to
test whether bi is pt-smooth. If not, then another
integer ai is
chosen at random, and the
procedure is repeated.
The more efficient techniques
strategically select an ai such that bi is relatively small.
Since the proportion of pt-smooth integers in the interval [2; x] becomes larger as x decreases,
the probability of such bi being pt-smooth is higher. The most
efficient of such
techniques is the quadratic sieve
algorithm, which is described next.
3.2.6 Quadratic sieve factoring
Suppose an integer n is to be factored. Letm = b
p
nc, and consider the polynomial q(x) =
(x +m)2 - n. Note that
q(x) = x2 + 2mx +m2 - n _ x2 + 2mx; (3.1)
which is small (relative to n) if x is small in absolute value. The quadratic sieve algorithm
selects ai = (x + m) and tests whether bi = (x + m)2 - n is pt-smooth. Note
that
a2
i = (x + m)2 _ bi (mod n). Note also that if a prime p divides bi then (x + m)2 _ n
(mod p), and hence n is a quadratic residue modulo p. Thus the factor base need only
contain those primes p for which the Legendre symbol _n
p_is 1 (Definition 2.145). Furthermore,
since bi may be negative,-1 is included in the factor base.
The steps of the quadratic
sieve algorithm are summarized in
Algorithm 3.21.
3.21 Algorithm Quadratic sieve
algorithm for factoring integers
INPUT: a composite integer n that is not a prime power.
OUTPUT: a non-trivial factor d of n.
1. Select the factor base S = fp1; p2; : : : ; ptg, where p1 = -1 and pj (j _ 2) is the
(j - 1)th prime p for which n is a quadratic residue modulo p.
2. Compute m = b
p
nc.
3. (Collect t + 1 pairs (ai; bi). The x values are chosen in the order 0;_1;_2; : : : .)
Set i 1. While i _ t + 1 do the following:
3.1 Compute b = q(x) = (x+m)2-n, and test using trial division
(cf. Note 3.23)
by elements in S whether b is pt-smooth. If not, pick a new x and repeat step 3.1.
3.2 If b is pt-smooth, say b = Qt
j=1 peij
j , then set ai (x + m), bi b, and vi =
(vi1; vi2; : : : ; vit), where vij = eij mod 2 for 1 _ j _ t.
3.3 i i + 1.
4. Use linear algebra over Z2 to find a non-empty subset T _ f1; 2; : : : ; t + 1g such
thatPi2T vi = 0.
5. Compute x = Qi2T ai mod n.
6. For each j, 1 _ j _ t, compute lj = (Pi2T eij )=2.
7. Compute y = Qt
j=1 plj
j mod n.
8. If x _ _y (mod n), then find another non-empty
subset T _ f1; 2; : : : ; t+1g such
that Pi2T vi = 0, and go to step 5. (In the unlikely case such a subset T does not
exist, replace a few of the (ai; bi) pairs with new pairs (step 3), and go to step 4.)
9. Compute d = gcd(x - y; n) and return(d).
3.22 Example (quadratic sieve algorithm for
finding a non-trivial factor of n = 24961)
1. Select the factor base S = f-1; 2; 3; 5; 13; 23g of size t = 6. (7, 11, 17 and 19 are
omitted from S since _n
p_ = -1 for these primes.)
2. Compute m = b
p
24961c = 157.
3. Following is the data
collected for the first t + 1 values of x for which q(x) is 23-
smooth.
i x q(x) factorization
of q(x) ai vi
1 0 -312 -23 _ 3 _ 13 157 (1; 1; 1; 0; 1; 0)
2 1 3 3 158 (0; 0; 1; 0; 0; 0)
3 -1 -625 -54 156 (1; 0; 0; 0; 0; 0)
4 2 320 26 _ 5 159 (0; 0; 0; 1; 0; 0)
5 -2 -936 -23 _ 32 _ 13 155 (1; 1; 0; 0; 1; 0)
6 4 960 26 _ 3 _ 5 161 (0; 0; 1; 1; 0; 0)
7 -6 -2160 -24 _ 33 _ 5 151 (1; 0; 1; 1; 0; 0)
4. By inspection, v1 +v2 +v5 = 0. (In the notation of Algorithm 3.21, T = f1; 2; 5g.)
5. Compute x = (a1a2a5 mod n) = 936.
6. Compute l1 = 1, l2 = 3, l3 = 2, l4 = 0, l5 = 1, l6 = 0.
7. Compute y = -23 _ 32 _ 13 mod n = 24025.
8. Since 936_ -24025 (mod n), another linear dependency must be found.
9. By inspection, v3 + v6 + v7 = 0; thus T = f3; 6; 7g.
10. Compute x = (a3a6a7 mod n) = 23405.
11. Compute l1 = 1, l2 = 5, l3 = 2, l4 = 3, l5 = 0, l6 = 0.
12. Compute y = (-25 _ 32 _ 53 mod n) = 13922.
13. Now, 23405 6_ _13922 (mod n), so compute gcd(x-y; n) = gcd(9483; 24961) =
109. Hence, two non-trivial factors of 24961 are 109 and 229. _
3.23 Note (sieving) Instead of
testing smoothness by trial division in step 3.1 of Algorithm3.21,
a more efficient technique known
as sieving is employed in practice. Observe first that if p
is an odd prime in the factor
base and p divides q(x), then p also divides q(x+lp) for every
integer l. Thus by solving the equation q(x) _ 0 (mod p) for x (for example, using the
algorithms in x3.5.1), one knows either one or two (depending on the number of solutions
to the quadratic equation) entire
sequences of other values y for which p divides q(y).
The sieving process is the
following. An array Q[ ] indexed by x, -M _ x _ M, is
created and the xth entry is initialized to blg jq(x)jc. Let x1, x2 be the
solutions to q(x) _ 0
(mod p), where p is an odd prime in the factor base. Then the value blg pc is subtracted
from those entries Q[x] in the array for which x _ x1 or x2 (mod p) and -M _ x _ M.
This is repeated for each odd
prime p in the factor base. (The case of p = 2 and prime
powers can be handled in a
similar manner.) After the sieving, the array entries Q[x] with
values near 0 are most likely to be pt-smooth
(roundoff errors must be taken into account),
and this can be verified by
factoring q(x) by trial division.
3.24 Note (running time of the quadratic
sieve) To optimize the running time of the quadratic
sieve, the size of the factor
base should be judiciously chosen. The optimal selection of
t _ Ln[ 1
2 ; 1
2 ] (see Example 2.61) is derived from knowledge concerning the distribution
of smooth integers close to
p
n. With this choice, Algorithm 3.21 with sieving (Note 3.23)
has an expected running time of Ln[ 1
2 ; 1], independent of the size of the
factors of n.
3.25 Note (multiple polynomial variant)
In order to collect a sufficient number of (ai; bi) pairs,
the sieving interval must be
quite large. From equation (3.1) it can be seen that jq(x)j increases
linearly with jxj, and consequently the probability of smoothness decreases. To
overcome this problem, a variant
(the multiple polynomial quadratic sieve) was proposed
wherebymany
appropriately-chosenquadratic polynomials can be used instead of just q(x),
each polynomial being sieved over
an interval ofmuch smaller length. This variant also has
an expected running time of Ln[ 1
2 ; 1], and is the method of choice in
practice.
3.26 Note (parallelizing the quadratic
sieve) The multiple polynomial variant of the quadratic
sieve is well suited for
parallelization. Each node of a parallel computer, or each computer
in a network of computers, simply
sieves through different collections of polynomials. Any
(ai; bi) pair found is reported to a central processor. Once sufficient pairs have
been collected,
the corresponding system of
linear equations is solved on a single (possibly parallel)
computer.
3.27 Note (quadratic sieve vs. elliptic
curve factoring) The elliptic curve factoring algorithm
(x3.2.4) has the same4 expected
(asymptotic) running time as the quadratic sieve factoring
algorithm in the special case
when n is the product of two primes of equal size. However,
for such numbers, the quadratic
sieve is superior in practice because the main steps in the
algorithm are single precision
operations, compared to the much more computationally intensive
multi-precision elliptic curve
operations required in the elliptic curve algorithm.
3.2.7 Number field sieve
factoring
For several years it was believed
by some people that a running time of Ln[ 1
2 ; 1] was, in
fact, the best achievable by any
integer factorization algorithm. This barrier was broken in
1990 with the discovery of the number
field sieve. Like the quadratic sieve, the number field
sieve is an algorithm in the
random square family of methods (x3.2.5). That is, it attempts
to find integers x and y such that x2 _ y2 (mod n) and x 6_ _y (mod n). To achieve this
goal, two factor bases are used,
one consisting of all prime numbers less than some bound,
and the other consisting of all
prime ideals of norm less than some bound in the ring of
integers of a suitably-chosen
algebraic number field. The details of the algorithm are quite
complicated, and are beyond the
scope of this book.
A special version of the
algorithm (the special number field sieve) applies to integers
of the form n = re - s for small r and jsj, and has an expected running time of Ln[ 1
3; c],
where c = (32=9)1=3 _ 1:526.
The general version of the
algorithm, sometimes called the general number field sieve,
applies to all integers and has
an expected running time of Ln[ 1
3; c], where c = (64=9)1=3 _
1:923. This is, asymptotically, the fastest algorithm known for integer
factorization. The
primary reason why the running
time of the number field sieve is smaller than that of the
quadratic sieve is that the
candidate smooth numbers in the former are much smaller than
those in the latter.
The general number field sieve
was at first believed to be slower than the quadratic
sieve for factoring integers
having fewer than 150 decimal digits. However, experiments
in 1994–1996 have indicated that
the general number field sieve is substantially faster than
the quadratic sieve even for
numbers in the 115 digit range. This implies that the crossover
point between the effectiveness
of the quadratic sieve vs. the general number field sieve
may be 110–120 digits. For this
reason, the general number field sieve is considered the
current champion of all
general-purpose factoring algorithms.
3.3 The RSA
problem
The intractability of the RSA
problem forms the basis for the security of the RSA public-key
encryption scheme (x8.2) and the RSA signature scheme (x11.3.1).
3.28 Definition
The RSA
problem (RSAP) is the following: given a positive integer n that is a
product of two distinct odd
primes p and q, a positive integer e such that gcd(e; (p-1)(q-
1)) = 1, and an integer c, find an integer m such that me _ c (mod n).
In otherwords, the RSAproblemis
that of finding eth rootsmodulo a
composite integer
n. The conditions imposed on the problem parameters n and e ensure that for each integer
c 2 f0; 1; : : : ; n - 1g there is exactly one m 2 f0; 1; : : : ; n - 1g such that me _ c
(mod n). Equivalently, the function f : Zn -! Zn defined as f(m) = me mod n is a
permutation.
3.29 Remark (SQROOT vs. RSA problems)
Since p - 1 is even, it follows that e is odd. In
particular, e 6= 2, and hence the SQROOT problem (Definition 3.43) is not a special
case
of the RSA problem.
As is shown in x8.2.2(i), if the factors of n are known then the RSA problem
can be
easily solved. This fact is
stated next.
3.30 Fact RSAP _P FACTORING. That is, the RSA
problem polytime reduces to the integer
factorization problem.
It is widely believed that the
RSA and the integer factorization problems are computationally
equivalent, although no proof of
this is known.
3.4 The
quadratic residuosity problem
The security of the
Goldwasser-Micali probabilistic public-key encryption scheme (x8.7)
and the Blum-Blum-Shub
pseudorandom bit generator (x5.5.2) are both based on the
apparent
intractability of the quadratic
residuosity problem.
Recall from x2.4.5 that if n _ 3 is an odd integer, then Jn is the set of all a 2 Z_
n
having Jacobi symbol 1. Recall
also that Qn is the set of quadratic residues
modulo n and
that the set of pseudosquares
modulo n is defined by eQn = Jn - Qn.
3.31 Definition
The quadratic
residuosity problem (QRP) is the following: given an odd composite
integer n and a 2 Jn, decide whether or not a is a quadratic residue modulo n.
3.32 Remark (QRP with a prime modulus)
If n is a prime, then it is easy to decide whether
a 2 Z_
n is a quadratic
residue modulo n since, by definition, a 2 Qn if and only if _a
n_ = 1,
and the Legendre symbol _a
n_can be efficiently calculated by Algorithm 2.149.
Assume now that n is a product of two distinct odd primes p and q. It follows from
Fact 2.137 that if a 2 Jn, then a 2 Qn if and only if _a
p_ = 1. Thus, if the factorization of
n is known, then QRP can be solved simply by computing the Legendre symbol _a
p_. This
observation can be generalized to
all integers n and leads to the following fact.
3.33 Fact QRP _P FACTORING. That is, the QRP
polytime reduces to the FACTORING
problem.
On the other hand, if the
factorization of n is unknown, then there is no
efficient procedure
known for solving QRP, other than
by guessing the answer. If n = pq, then the
probability of a correct guess is
1
2 since jQnj = j eQnj (Fact 2.155). It is believed that the
QRP is as difficult as the
problem of factoring integers, although no proof of this is known.
3.5 Computing
square roots in Zn
The operations of squaring modulo
an integer n and extracting square roots modulo an integer
n are frequently used in cryptographic functions. The operation of computing
square
roots modulo n can be performed efficiently when n is a prime, but is difficult when
n is a
composite integer whose prime
factors are unknown.
3.5.1 Case (i): n prime
Recall from Remark 3.32 that if p is a prime, then it is easy to decide if a 2 Z_
p is a quadratic
residue modulo p. If a is, in fact, a quadratic residue modulo p, then the two square roots
of a can be efficiently computed, as
demonstrated by Algorithm 3.34.
3.34 Algorithm Finding square
roots modulo a prime p
INPUT: an odd prime p and an integer a, 1 _ a _ p - 1.
OUTPUT: the two square roots of a modulo p, provided a is a quadratic residue modulo p.
1. Compute the Legendre symbol _a
p_usingAlgorithm2.149. If _a
p_ = -1 then return(a
does not have a square root
modulo p) and terminate.
2. Select integers b, 1 _ b _ p - 1, at random until one is found with _b
p_ = -1. (b is
a quadratic non-residue modulo p.)
3. By repeated division by 2, write p -1 = 2st, where t is odd.
4. Compute a-1 mod p by the extended Euclidean
algorithm (Algorithm 2.142).
5. Set c bt mod p and r a(t+1)=2 mod p (Algorithm 2.143).
6. For i from 1 to s - 1 do the following:
6.1 Compute d = (r2 _ a-1)2s-i-1
mod p.
6.2 If d _ -1 (mod p) then set r r _ c mod p.
6.3 Set c c2 mod p.
7. Return(r, -r).
Algorithm 3.34 is a randomized
algorithm because of themanner inwhich the quadratic
non-residue b is selected in step 2. No deterministic polynomial-time algorithm for
finding
a quadratic non-residue modulo a
prime p is known (see Remark 2.151).
3.35 Fact Algorithm 3.34 has an expected
running time of O((lg p)4) bit operations.
This running time is obtained by
observing that the dominant step (step 6) is executed
s-1 times, each iteration involving amodular exponentiation and thus takingO((lg p)3) bit
operations (Table 2.5). Since in
the worst case s = O(lg p), the running time of O((lg p)4)
follows. When s is small, the loop in step 6 is executed only a small number of times, and
the running time of Algorithm3.34
is O((lg p)3) bit operations. This point is demonstrated
next for the special cases s = 1 and s = 2.
Specializing Algorithm3.34 to the
case s = 1yields the following simple
deterministic
algorithm for finding square
roots when p _ 3 (mod 4).
3.36 Algorithm Finding square
roots modulo a prime p where p _ 3 (mod 4)
INPUT: an odd prime p where p _ 3 (mod 4), and a square a 2 Qp.
OUTPUT: the two square roots of a modulo p.
1. Compute r = a(p+1)=4 mod p (Algorithm 2.143).
2. Return(r, -r).
Specializing Algorithm 3.34 to
the case s = 2, and using the fact that 2 is a quadratic
non-residue modulo p when p _ 5 (mod 8), yields the following simple deterministic algorithm
for finding square roots when p _ 5 (mod 8).
3.37 Algorithm Finding square
roots modulo a prime p where p _ 5 (mod 8)
INPUT: an odd prime p where p _ 5 (mod 8), and a square a 2 Qp.
OUTPUT: the two square roots of a modulo p.
1. Compute d = a(p-1)=4 mod p (Algorithm 2.143).
2. If d = 1 then compute r = a(p+3)=8 mod p.
3. If d = p - 1 then compute r = 2a(4a)(p-5)=8 mod p.
4. Return(r, -r).
3.38 Fact Algorithms 3.36 and 3.37 have
running times of O((lg p)3) bit operations.
Algorithm3.39 for finding square
rootsmodulo p is preferable to Algorithm3.34 when
p -1 = 2st with s large.
3.39 Algorithm Finding square
roots modulo a prime p
INPUT: an odd prime p and a square a 2 Qp.
OUTPUT: the two square roots of a modulo p.
1. Choose random b 2 Zp until b2 - 4a is a quadratic non-residue modulo p, i.e.,
_b2-4a
p _ = -1.
2. Let f be the polynomial x2 - bx + a in Zp[x].
3. Compute r = x(p+1)=2 mod f using Algorithm 2.227. (Note: r will be an integer.)
4. Return(r, -r).
3.40 Fact Algorithm 3.39 has an expected
running time of O((lg p)3) bit operations.
3.41 Note (computing square roots in a
finite field)Algorithms 3.34, 3.36, 3.37, and 3.39 can be
extended in a
straightforwardmanner to find square roots in any finite field Fq of odd order
q = pm, p prime, m _ 1. Square roots in finite fields of even order can also be computed
efficiently via Fact 3.42.
3.42 Fact Each element a 2 F2m has exactly one square root,
namely a2m-1 .
3.5.2 Case (ii): n composite
The discussion in this subsection
is restricted to the case of computing square roots modulo
n, where n is a product of two distinct odd primes p and q. However, all facts presented
here generalize to the case where
n is an arbitrary composite integer.
Unlike the case where n is a prime, the problem of deciding whether a given a 2 Z_
n
is a quadratic residue modulo a
composite integer n, is believed to be a difficult
problem.
Certainly, if the Jacobi symbol _a
n_ = -1, then a is a quadratic non-residue. On the other
hand, if _a
n_ = 1, then deciding whether or not a is a quadratic residue is precisely the
quadratic residuosity problem,
considered in x3.4.
3.43 Definition
The square
root modulo n problem (SQROOT) is the following: given
a composite
integer n and a quadratic residue a modulo n (i.e. a 2 Qn), find a square root of a
modulo n.
If the factors p and q of n are known, then the SQROOT problem can be solved effi-
ciently by first finding square
roots of a modulo p and modulo q, and then combining them
using the Chinese remainder
theorem (Fact 2.120) to obtain the square roots of a modulo
n. The steps are summarized in Algorithm 3.44, which, in fact, finds all of
the four square
roots of a modulo n.
3.44 Algorithm Finding square
roots modulo n given its prime factors p and q
INPUT: an integer n, its prime factors p and q, and a 2 Qn.
OUTPUT: the four square roots of a modulo n.
1. Use Algorithm 3.39 (or
Algorithm 3.36 or 3.37, if applicable) to find the two square
roots r and -r of a modulo p.
2. Use Algorithm 3.39 (or
Algorithm 3.36 or 3.37, if applicable) to find the two square
roots s and -s of a modulo q.
3. Use the extended Euclidean
algorithm(Algorithm2.107) to find integers c and d such
that cp + dq = 1.
4. Set x (rdq + scp) mod n and y (rdq - scp) mod n.
5. Return(_x mod n, _y mod n).
3.45 Fact Algorithm 3.44 has an expected
running time of O((lg p)3) bit operations.
Algorithm 3.44 shows that if one
can factor n, then the SQROOT problem is easy.
More precisely, SQROOT _P FACTORING. The converse of this
statement is also true,
as stated in Fact 3.46.
3.46 Fact FACTORING _P SQROOT. That is, the FACTORING
problem polytime reduces
to the SQROOT problem. Hence,
since SQROOT _P FACTORING, the
FACTORING
and SQROOT problems are
computationally equivalent.
Justification. Suppose that one has a
polynomial-time algorithm A for solving the SQROOT
problem. This algorithm can then
be used to factor a given composite integer n as
follows. Select an integer x at random with gcd(x; n) = 1, and compute a = x2 mod n.
Next, algorithmA is run with inputs a and n, and a square root y of a modulo n is returned.
If y _ _x (mod n), then the trial fails, and the above procedure is repeated with a new
x chosen at random. Otherwise, if y 6_ _x (mod n), then gcd(x - y; n) is guaranteed to
be a non-trivial factor of n (Fact 3.18), namely, p or q. Since a has four square roots modulo
n (_x and _z with _z 6_ _x (mod n)), the probability of success for each attempt
is 1
2 . Hence, the
expected number of attempts before a factor of n is obtained is two, and
consequently the procedure runs
in expected polynomial time. _
3.47 Note (strengthening of Fact 3.46)
The proof of Fact 3.46 can be easily modified to establish
the following stronger result.
Let c _ 1 be any constant. If there is an algorithm A
which, given n, can find a square root modulo n in polynomial time for a 1
(lg n)c fraction
of all quadratic residues a 2 Qn, then the algorithm A can be used to factor n in expected
polynomial time. The implication
of this statement is that if the problem of factoring n is
difficult, then for almost all
a 2 Qn it is difficult to find square
roots modulo n.
The computational equivalence of
the SQROOT and FACTORING problems was the
basis of the first “provably
secure” public-key encryption and signature schemes, presented
in x8.3.
3.6 The
discrete logarithm problem
The security of many cryptographic
techniques depends on the intractability of the discrete
logarithm problem. A partial list
of these includes Diffie-Hellman key agreement and its
derivatives (x12.6), ElGamal encryption (x8.4), and the ElGamal signature
scheme and its
variants (x11.5). This section summarizes the current knowledge regarding algorithms
for
solving the discrete logarithm
problem.
Unless otherwise specified,
algorithms in this section are described in the general setting
of a (multiplicatively written)
finite cyclic group G of order n with generator _ (see
Definition 2.167). For a more
concrete approach, the reader may find it convenient to think
of G as the multiplicative group Z_
p of order p - 1, where the group operation is simply
multiplication modulo p.
3.48 Definition
Let G be a finite cyclic group of order n. Let _ be a generator of G, and let
_ 2 G. The discrete logarithm of _ to the base _, denoted log_ _, is the unique integer x,
0 _ x _ n - 1, such that _ = _x.
3.49 Example Let p = 97. Then Z_
97 is a cyclic
group of order n = 96. A generator of Z_
97 is
_ = 5. Since 532 _ 35 (mod 97), log5 35 = 32 in Z_
97. _
The following are some elementary
facts about logarithms.
3.50 Fact Let _ be a generator of a cyclic group G of order n, and let _, 2 G. Let s be an
integer. Then log_(_) = (log_ _ + log_ ) mod n and log_(_s) = s log_ _ mod n.
The groups of most interest in
cryptography are the multiplicative groupF_
q of the finite
field Fq (x2.6), including the particular
cases of the multiplicative group Z_
p of the
integers
modulo a prime p, and the multiplicative group F_
2m of the finite field F2m of characteristic
two. Also of interest are the
group of units Z_
n where n is a composite integer, the group
of points on an elliptic curve
defined over a finite field, and the jacobian of a hyperelliptic
curve defined over a finite
field.
3.51 Definition
The discrete
logarithm problem (DLP) is the following: given a prime p, a
generator _ of Z_
p, and an
element _ 2 Z_
p, find the
integer x, 0 _ x _ p - 2, such that
_x _ _ (mod p).
3.52 Definition
The generalized
discrete logarithm problem (GDLP) is the following: given a
finite cyclic groupG of order n, a generator _ of G, and an element _ 2 G, find the integer
x, 0 _ x _ n - 1, such that _x = _.
The discrete logarithm problem in
elliptic curve groups and in the jacobians of hyperelliptic
curves are not explicitly
considered in this section. The discrete logarithm problem
in Z_
n is discussed
further in x3.8.
3.53 Note (difficulty of the GDLP is
independent of generator) Let _ and be two generators
of a cyclic groupG of order n, and let _ 2 G. Let x = log_ _, y = log _, and z = log_ .
Then _x = _ = y = (_z)y. Consequently x = zy mod n, and
log _ = (log_ _) (log_ )-1 mod n:
This means that any algorithm
which computes logarithms to the base _ can be used to
compute logarithms to any other
base that is also a generator of G.