This is a Chapter from the Handbook of Applied Cryptography, by A

This is a Chapter from the Handbook of Applied Cryptography, by A. Menezes, P. van

Oorschot, and S. Vanstone, CRC Press, 1996.

For further information, see www.cacr.math.uwaterloo.ca/hac

CRC Press has granted the following speci_c permissions for the electronic version of this

book:

Permission is granted to retrieve, print and store a single copy of this chapter for

personal use. This permission does not extend to binding multiple chapters of

the book, photocopying or producing copies for other than personal use of the

person creating the copy, or making electronic copies available for retrieval by

others without prior permission in writing from CRC Press.

Except where over-ridden by the speci_c permission above, the standard copyright notice

from CRC Press applies to this electronic version:

Neither this book nor any part may be reproduced or transmitted in any form or

by any means, electronic or mechanical, including photocopying, micro_lming,

and recording, or by any information storage or retrieval system, without prior

permission in writing from the publisher.

The consent of CRC Press does not extend to copying for general distribution,

for promotion, for creating new works, or for resale. Speci_c permission must be

obtained in writing from CRC Press for such copying.

c 1997 by CRC Press, Inc.

Number-Theoretic Reference

Problems

Contents in Brief

3.1 Introduction and overview : : : : : : : : : : : : : : : : : : : : : 87

3.2 The integer factorization problem : : : : : : : : : : : : : : : : : 89

3.3 The RSA problem : : : : : : : : : : : : : : : : : : : : : : : : : : 98

3.4 The quadratic residuosity problem : : : : : : : : : : : : : : : : : 99

3.5 Computing square roots in Zn : : : : : : : : : : : : : : : : : : : 99

3.6 The discrete logarithm problem : : : : : : : : : : : : : : : : : : 103

3.7 The Diffie-Hellman problem : : : : : : : : : : : : : : : : : : : : 113

3.8 Composite moduli : : : : : : : : : : : : : : : : : : : : : : : : : : 114

3.9 Computing individual bits : : : : : : : : : : : : : : : : : : : : : 114

3.10 The subset sum problem : : : : : : : : : : : : : : : : : : : : : : 117

3.11 Factoring polynomials over finite fields : : : : : : : : : : : : : : : 122

3.12 Notes and further references : : : : : : : : : : : : : : : : : : : : 125

3.1 Introduction and overview

The security of many public-key cryptosystems relies on the apparent intractability of the

computational problems studied in this chapter. In a cryptographic setting, it is prudent to

make the assumption that the adversary is very powerful. Thus, informally speaking, a computational

problem is said to be easy or tractable if it can be solved in (expected)1 polynomial

time, at least for a non-negligible fraction of all possible inputs. In otherwords, if there

is an algorithm which can solve a non-negligible fraction of all instances of a problem in

polynomial time, then any cryptosystem whose security is based on that problem must be

considered insecure.

The computational problems studied in this chapter are summarized in Table 3.1. The

true computational complexities of these problems are not known. That is to say, they are

widely believed to be intractable,2 although no proof of this is known. Generally, the only

lower bounds known on the resources required to solve these problems are the trivial linear

bounds, which do not provide any evidence of their intractability. It is, therefore, of interest

to study their relative difficulties. For this reason, various techniques of reducing one

1For simplicity, the remainder of the chapter shall generally not distinguish between deterministic polynomialtime

algorithms and randomized algorithms (see x2.3.4) whose expected running time is polynomial.

2More precisely, these problems are intractable if the problem parameters are carefully chosen.

Problem Description

FACTORING Integer factorization problem: given a positive integer n, find

its prime factorization; that is, write n = pe1

1 pe2

2 : : : pek

k where

the pi are pairwise distinct primes and each ei _ 1.

RSAP RSA problem (also known as RSA inversion): given a positive

integer n that is a product of two distinct odd primes p and q, a

positive integer e such that gcd(e; (p - 1)(q - 1)) = 1, and an

integer c, find an integer m such that me _ c (mod n).

QRP Quadratic residuosity problem: given an odd composite integer

n and an integer a having Jacobi symbol _a

n_ = 1, decide

whether or not a is a quadratic residue modulo n.

SQROOT Square rootsmodulo n: given a composite integer n and a 2 Qn

(the set of quadratic residues modulo n), find a square root of a

modulo n; that is, an integer x such that x2 _ a (mod n).

DLP Discrete logarithm problem: given a prime p, a generator _ of

p, and an element _ 2 Z_

p, find the integer x, 0 _ x _ p - 2,

such that _x _ _ (mod p).

GDLP Generalized discrete logarithm problem: given a finite cyclic

group G of order n, a generator _ of G, and an element _ 2 G,

find the integer x, 0 _ x _ n - 1, such that _x = _.

DHP Diffie-Hellman problem: given a prime p, a generator _ of Z_

and elements _a mod p and _b mod p, find _ab mod p.

GDHP Generalized Diffie-Hellman problem: given a finite cyclic group

G, a generator _ of G, and group elements _a and _b, find _ab.

SUBSET-SUM Subset sum problem: given a set of positive integers

fa1; a2; : : : ; ang and a positive integer s, determine whether or

not there is a subset of the aj that sums to s.

Table 3.1:Some computational problems of cryptographic relevance.

computational problem to another have been devised and studied in the literature. These reductions

provide a means for converting any algorithmthat solves the second problem into

an algorithm for solving the first problem. The following intuitive notion of reducibility

(cf. x2.3.3) is used in this chapter.

3.1 Definition Let A and B be two computational problems. A is said to polytime reduce to

B, written A _P B, if there is an algorithm that solves A which uses, as a subroutine, a

hypothetical algorithm for solving B, and which runs in polynomial time if the algorithm

for B does.3

Informally speaking, if A polytime reduces to B, then B is at least as difficult as A;

equivalently, A is no harder than B. Consequently, if A is a well-studied computational

problemthat iswidely believed to be intractable, then proving thatA _P B provides strong

evidence of the intractability of problem B.

3.2 Definition Let A and B be two computational problems. If A _P B and B _P A, then

A and B are said to be computationally equivalent, written A _P B.

3In the literature, the hypothetical polynomial-time subroutine for B is sometimes called an oracle

Informally speaking, if A _P B then A and B are either both tractable or both intractable,

as the case may be.

Chapter outline

The remainder of the chapter is organized as follows. Algorithms for the integer factorization

problem are studied in x3.2. Two problems related to factoring, the RSA problem and

the quadratic residuosity problem, are briefly considered in x3.3 and x3.4. Efficient algorithms

for computing square roots in Zp, p a prime, are presented in x3.5, and the equivalence

of the problems of finding square roots modulo a composite integer n and factoring

n is established. Algorithms for the discrete logarithm problem are studied in x3.6, and

the related Diffie-Hellman problem is briefly considered in x3.7. The relation between the

problems of factoring a composite integer n and computing discrete logarithms in (cyclic

subgroups of) the group Z_

n is investigated in x3.8. The tasks of finding partial solutions

to the discrete logarithm problem, the RSA problem, and the problem of computing square

roots modulo a composite integer n are the topics of x3.9. The L3-lattice basis reduction

algorithm is presented in x3.10, along with algorithms for the subset sum problem and for

simultaneous diophantine approximation. Berlekamp’s Q-matrix algorithm for factoring

polynomials is presented in x3.11. Finally, x3.12 provides references and further chapter

notes.

3.2 The integer factorization problem

The security of many cryptographic techniques depends upon the intractability of the integer

factorization problem. A partial list of such protocols includes the RSA public-key

encryption scheme (x8.2), the RSA signature scheme (x11.3.1), and the Rabin public-key

encryption scheme (x8.3). This section summarizes the current knowledge on algorithms

for the integer factorization problem.

3.3 Definition The integer factorization problem (FACTORING) is the following: given a

positive integer n, find its prime factorization; that is, write n = pe1

1 pe2

2 _ _ _ pek

k where the

pi are pairwise distinct primes and each ei _ 1.

3.4 Remark (primality testing vs. factoring) The problem of deciding whether an integer is

composite or prime seems to be, in general, much easier than the factoring problem. Hence,

before attempting to factor an integer, the integer should be tested to make sure that it is

indeed composite. Primality tests are a main topic of Chapter 4.

3.5 Remark (splitting vs. factoring) A non-trivial factorization of n is a factorization of the

form n = ab where 1 < a < n and 1 < b < n; a and b are said to be non-trivial factors

of n. Here a and b are not necessarily prime. To solve the integer factorization problem, it

suffices to study algorithms that split n, that is, find a non-trivial factorization n = ab. Once

found, the factors a and b can be tested for primality. The algorithmfor splitting integers can

then be recursively applied to a and/or b, if either is found to be composite. In this manner,

the prime factorization of n can be obtained.

3.6 Note (testing for perfect powers) If n _ 2, it can be efficiently checked as follows whether

or not n is a perfect power, i.e., n = xk for some integers x _ 2, k _ 2. For each prime

p _ lg n, an integer approximation x of n1=p is computed. This can be done by performing

a binary search for x satisfying n = xp in the interval [2; 2blg n=pc+1]. The entire procedure

takes O((lg3 n) lg lg lg n) bit operations. For the remainder of this section, it will always

be assumed that n is not a perfect power. It follows that if n is composite, then n has at least

two distinct prime factors.

Some factoring algorithms are tailored to perform better when the integer n being factored

is of a special form; these are called special-purpose factoring algorithms. The running

times of such algorithms typically depend on certain properties of the factors of n. Examples

of special-purpose factoring algorithms include trial division (x3.2.1), Pollard’s rho

algorithm (x3.2.2), Pollard’s p-1 algorithm (x3.2.3), the elliptic curve algorithm (x3.2.4),

and the special number field sieve (x3.2.7). In contrast, the running times of the so-called

general-purpose factoring algorithms depend solely on the size of n. Examples of generalpurpose

factoring algorithms include the quadratic sieve (x3.2.6) and the general number

field sieve (x3.2.7).

Whenever applicable, special-purpose algorithms should be employed as they will generally

be more efficient. A reasonable overall strategy is to attempt to find small factors

first, capitalize on any particular special forms an integer may have, and then, if all else

fails, bring out the general-purpose algorithms. As an example of a general strategy, one

might consider the following.

1. Apply trial division by small primes less than some bound b1.

2. Next, apply Pollard’s rho algorithm, hoping to find any small prime factors smaller

than some bound b2, where b2 > b1.

3. Apply the elliptic curve factoring algorithm, hoping to find any small factors smaller

than some bound b3, where b3 > b2.

4. Finally, apply one of the more powerful general-purpose algorithms (quadratic sieve

or general number field sieve).

3.2.1 Trial division

Once it is established that an integer n is composite, before expending vast amounts of time

with more powerful techniques, the first thing that should be attempted is trial division by

all “small” primes. Here, “small” is determined as a function of the size of n. As anextreme

case, trial division can be attempted by all primes up to

n. If this is done, trial division

will completely factor n but the procedure will take roughly

n divisions in the worst case

when n is a product of two primes of the same size. In general, if the factors found at each

stage are tested for primality, then trial division to factor n completely takes O(p + lgn)

divisions, where p is the second-largest prime factor of n.

Fact 3.7 indicates that if trial division is used to factor a randomly chosen large integer

n, then the algorithmcan be expected to find some small factors of n relatively quickly, and

expend a large amount of time to find the second largest prime factor of n.

3.7 Fact Let n be chosen uniformly at random from the interval [1; x].

(i) If 1

2 _ _ _ 1, then the probability that the largest prime factor of n is _ x_ is

approximately 1+ln_. Thus, for example, the probability that n has a prime factor

x is ln 2 _ 0:69.

(ii) The probability that the second-largest prime factor of n is _ x0:2117 is about 1

2 .

(iii) The expected total number of prime factors of n is ln ln x+O(1). (If n = Qpei

i , the

total number of prime factors of n isPei.)

3.2.2 Pollard’s rho factoring algorithm

Pollard’s rho algorithmis a special-purpose factoring algorithmfor finding small factors of

a composite integer.

Let f : S -! S be a random function, where S is a finite set of cardinality n. Let

x0 be a random element of S, and consider the sequence x0; x1; x2; : : : defined by xi+1 =

f(xi) for i _ 0. Since S is finite, the sequence must eventually cycle, and consists of a

tail of expected lengthp_n=8 followed by an endlessly repeating cycle of expected length

p_n=8 (see Fact 2.37). Aproblem that arises in some cryptanalytic tasks, including integer

factorization (Algorithm 3.9) and the discrete logarithm problem (Algorithm 3.60), is of

finding distinct indices i and j such that xi = xj (a collision is then said to have occurred).

An obvious method for finding a collision is to compute and store xi for i = 0; 1; 2; : : :

and look for duplicates. The expected number of inputs thatmust be tried before a duplicate

is detected isp_n=2 (Fact 2.27). This method requires O(

n) memory and O(

n) time,

assuming the xi are stored in a hash table so that new entries can be added in constant time.

3.8 Note (Floyd’s cycle-finding algorithm) The large storage requirements in the above technique

for finding a collision can be eliminated by using Floyd’s cycle-finding algorithm.

In this method, one starts with the pair (x1; x2), and iteratively computes (xi; x2i) from

the previous pair (xi-1; x2i-2), until xm = x2m for some m. If the tail of the sequence

has length _ and the cycle has length _, then the first time that xm = x2m is when m =

_(1 + b_=_c). Note that_ < m _ _ + _, and consequently the expected running time of

this method is O(

n).

Now, let p be a prime factor of a composite integer n. Pollard’s rho algorithm for factoring

n attempts to find duplicates in the sequence of integers x0; x1; x2; : : : defined by

x0 = 2, xi+1 = f(xi) = x2

i + 1 mod p for i _ 0. Floyd’s cycle-finding algorithm is utilized

to find xm and x2m such that xm _ x2m (mod p). Since p divides n but is unknown,

this is done by computing the terms xi modulo n and testing if gcd(xm - x2m; n) > 1.

If also gcd(xm - x2m; n) < n, then a non-trivial factor of n is obtained. (The situation

gcd(xm - x2m; n) = n occurs with negligible probability.)

3.9 Algorithm Pollard’s rho algorithm for factoring integers

INPUT: a composite integer n that is not a prime power.

OUTPUT: a non-trivial factor d of n.

1. Set a 2, b 2.

2. For i = 1; 2; : : : do the following:

2.1 Compute a a2 + 1 mod n, b b2 + 1 mod n, b b2 + 1 mod n.

2.2 Compute d = gcd(a - b; n).

2.3 If 1 < d < n then return(d) and terminate with success.

2.4 If d = n then terminate the algorithm with failure (see Note 3.12).

3.10 Example (Pollard’s rho algorithm for finding a non-trivial factor of n = 455459) The

following table lists the values of variables a, b, and d at the end of each iteration of step 2

of Algorithm 3.9.

a b d

5 26 1

26 2871 1

677 179685 1

2871 155260 1

44380 416250 1

179685 43670 1

121634 164403 1

155260 247944 1

44567 68343 743

Hence two non-trivial factors of 455459 are 743 and 455459=743 = 613. _

3.11 Fact Assuming that the function f(x) = x2 + 1mod p behaves like a random function,

the expected time for Pollard’s rho algorithmto find a factor p of n is O(

p) modular multiplications.

This implies that the expected time to find a non-trivial factor of n is O(n1=4)

modular multiplications.

3.12 Note (options upon termination with failure) If Pollard’s rho algorithm terminates with

failure, one option is to try again with a different polynomial f having integer coefficients

instead of f(x) = x2 + 1. For example, the polynomial f(x) = x2 + c may be used as

long as c 6= 0;-2.

3.2.3 Pollard’s p - 1 factoring algorithm

Pollard’s p-1 factoring algorithmis a special-purpose factoring algorithmthat can be used

to efficiently find any prime factors p of a composite integer n for which p - 1 is smooth

(see Definition 3.13) with respect to some relatively small bound B.

3.13 Definition Let B be a positive integer. An integer n is said to be B-smooth, or smooth

with respect to a bound B, if all its prime factors are _ B.

The idea behind Pollard’s p - 1 algorithm is the following. Let B be a smoothness

bound. Let Q be the least common multiple of all powers of primes _ B that are _ n. If

ql _ n, then l ln q _ ln n, and so l _ blnn

ln q c. Thus

Q = Yq_B

qbln n= ln qc;

where the product is over all distinct primes q _ B. If p is a prime factor of n such that p-1

is B-smooth, then p -1jQ, and consequently for any a satisfying gcd(a; p) = 1, Fermat’s

theorem (Fact 2.127) implies that aQ _ 1 (mod p). Hence if d = gcd(aQ - 1; n), then

pjd. It is possible that d = n, in which case the algorithm fails; however, this is unlikely to

occur if n has at least two large distinct prime factors.

3.14 Algorithm Pollard’s p - 1 algorithm for factoring integers

INPUT: a composite integer n that is not a prime power.

OUTPUT: a non-trivial factor d of n.

1. Select a smoothness bound B.

2. Select a random integer a, 2 _ a _ n - 1, and compute d = gcd(a; n). If d _ 2

then return(d).

3. For each prime q _ B do the following:

3.1 Compute l = b ln n

ln q c.

3.2 Compute a aql

mod n (using Algorithm 2.143).

4. Compute d = gcd(a - 1; n).

5. If d = 1 or d = n, then terminate the algorithm with failure. Otherwise, return(d).

3.15 Example (Pollard’s p - 1 algorithm for finding a non-trivial factor of n = 19048567)

1. Select the smoothness bound B = 19.

2. Select the integer a = 3 and compute gcd(3; n) = 1.

3. The following table lists the intermediate values of the variables q, l, and a after each

iteration of step 3 in Algorithm 3.14:

q l a

2 24 2293244

3 15 13555889

5 10 16937223

7 8 15214586

11 6 9685355

13 6 13271154

17 5 11406961

19 5 554506

4. Compute d = gcd(554506 - 1; n) = 5281.

5. Two non-trivial factors of n are p = 5281 and q = n=p = 3607 (these factors are in

fact prime).

Notice that p-1 = 5280 = 25 _3 _5_11, and q -1 = 3606 = 2 _3 _601. That

is, p - 1 is 19-smooth, while q - 1 is not 19-smooth. _

3.16 Fact Let n be an integer having a prime factor p such that p - 1 is B-smooth. The running

time of Pollard’s p - 1 algorithm for finding the factor p is O(B ln n= lnB) modular

multiplications.

3.17 Note (improvements) The smoothness boundB in Algorithm3.14 is selected based on the

amount of time one is willing to spend on Pollard’s p - 1 algorithm before moving on to

more general techniques. In practice, B may be between 105 and 106. If the algorithm

terminates with d = 1, then one might try searching over prime numbers q1; q2; : : : ; ql

larger than B by first computing a aqi mod n for 1 _ i _ l, and then computing d =

gcd(a - 1; n). Another variant is to start with a large bound B, and repeatedly execute

step 3 for a few primes q followed by the gcd computation in step 4. There are numerous

other practical improvements of the algorithm (see page 125).

3.2.4 Elliptic curve factoring

The details of the elliptic curve factoring algorithm are beyond the scope of this book; nevertheless,

a rough outline follows. The success of Pollard’s p-1 algorithmhinges on p-1

being smooth for some prime divisor p of n; if no such p exists, then the algorithm fails.

Observe that p - 1 is the order of the group Z_

p. The elliptic curve factoring algorithm is a

generalization of Pollard’s p - 1 algorithm in the sense that the group Z_

p is replaced by a

random elliptic curve group over Zp. The order of such a group is roughly uniformly distributed

in the interval [p+1-2

p; p+1+2

p]. If the order of the group chosen is smooth

with respect to some pre-selected bound, the elliptic curve algorithm will, with high probability,

find a non-trivial factor of n. If the group order is not smooth, then the algorithm

will likely fail, but can be repeated with a different choice of elliptic curve group.

The elliptic curve algorithm has an expected running time of Lp[ 1

2 ;

2] (see Example

2.61 for definition of Lp) to find a factor p of n. Since this running time depends on

the size of the prime factors of n, the algorithm tends to find small such factors first. The

elliptic curve algorithm is, therefore, classified as a special-purpose factoring algorithm. It

is currently the algorithm of choice for finding t-decimal digit prime factors, for t _ 40, of

very large composite integers.

In the hardest case, when n is a product of two primes of roughly the same size, the

expected running time of the elliptic curve algorithm is Ln[ 1

2 ; 1], which is the same as that

of the quadratic sieve (x3.2.6). However, the elliptic curve algorithm is not as efficient as

the quadratic sieve in practice for such integers.

3.2.5 Random square factoring methods

The basic idea behind the random square family of methods is the following. Suppose x

and y are integers such that x2 _ y2 (mod n) but x 6_ _y (mod n). Then n divides

x2-y2 = (x-y)(x+y) but n does not divide either (x-y) or (x+y). Hence, gcd(x-y; n)

must be a non-trivial factor of n. This result is summarized next.

3.18 Fact Let x, y, andn be integers. If x2 _ y2 (mod n) but x 6_ _y (mod n), thengcd(x-

y; n) is a non-trivial factor of n.

The random square methods attempt to find integers x and y at random so that x2 _ y2

(mod n). Then, as shown in Fact 3.19, with probability at least 1

2 it is the case that x 6_ _y

(mod n), whence gcd(x - y; n) will yield a non-trivial factor of n.

3.19 Fact Let n be an odd composite integer that is divisible by k distinct odd primes. If a 2

n, then the congruence x2 _ a2 (mod n) has exactly 2k solutions modulo n, two of

which are x = a and x = -a.

3.20 Example Let n = 35. Then there are four solutions to the congruence x2 _ 4 (mod 35),

namely x = 2, 12, 23, and 33. _

A common strategy employed by the random square algorithms for finding x and y at

random satisfying x2 _ y2 (mod n) is the following. A set consisting of the first t primes

S = fp1; p2; : : : ; ptg is chosen; S is called the factor base. Proceed to find pairs of integers

(ai; bi) satisfying

(i) a2

i _ bi (mod n); and

(ii) bi = Qt

j=1 peij

j , eij _ 0; that is, bi is pt-smooth.

Next find a subset of the bi’s whose product is a perfect square. Knowing the factorizations

of the bi’s, this is possible by selecting a subset of the bi’s such that the power of

each prime pj appearing in their product is even. For this purpose, only the parity of the

non-negative integer exponents eij needs to be considered. Thus, to simplify matters, for

each i, associate the binary vector vi = (vi1; vi2; : : : ; vit) with the integer exponent vector

(ei1; ei2; : : : ; eit) such that vij = eij mod 2. If t + 1 pairs (ai; bi) are obtained, then the

t-dimensional vectors v1; v2; : : : ; vt+1 must be linearly dependent over Z2. That is, there

must exist a non-empty subset T _ f1; 2; : : : ; t+1g such thatPi2T vi = 0 over Z2, and

henceQi2T bi is a perfect square. The set T can be found using ordinary linear algebra over

Z2. Clearly, Qi2T a2

i is also a perfect square. Thus setting x = Qi2T ai and y to be the

integer square root ofQi2T bi yields a pair of integers (x; y) satisfying x2 _ y2 (mod n).

If this pair also satisfies x 6_ _y (mod n), then gcd(x - y; n) yields a non-trivial factor

of n. Otherwise, some of the (ai; bi) pairs may be replaced by some new such pairs, and

the process is repeated. In practice, there will be several dependencies among the vectors

v1; v2; : : : ; vt+1, and with high probability at least one will yield an (x; y) pair satisfying

x 6_ _y (mod n); hence, this last step of generating new (ai; bi) pairs does not usually

occur.

This description of the random square methods is incomplete for two reasons. Firstly,

the optimal choice of t, the size of the factor base, is not specified; this is addressed in

Note 3.24. Secondly, a method for efficiently generating the pairs (ai; bi) is not specified.

Several techniques have been proposed. In the simplest of these, called Dixon’s algorithm,

ai is chosen at random, and bi = a2

i mod n is computed. Next, trial division by elements

in the factor base is used to test whether bi is pt-smooth. If not, then another integer ai is

chosen at random, and the procedure is repeated.

The more efficient techniques strategically select an ai such that bi is relatively small.

Since the proportion of pt-smooth integers in the interval [2; x] becomes larger as x decreases,

the probability of such bi being pt-smooth is higher. The most efficient of such

techniques is the quadratic sieve algorithm, which is described next.

3.2.6 Quadratic sieve factoring

Suppose an integer n is to be factored. Letm = b

nc, and consider the polynomial q(x) =

(x +m)2 - n. Note that

q(x) = x2 + 2mx +m2 - n _ x2 + 2mx; (3.1)

which is small (relative to n) if x is small in absolute value. The quadratic sieve algorithm

selects ai = (x + m) and tests whether bi = (x + m)2 - n is pt-smooth. Note that

i = (x + m)2 _ bi (mod n). Note also that if a prime p divides bi then (x + m)2 _ n

(mod p), and hence n is a quadratic residue modulo p. Thus the factor base need only

contain those primes p for which the Legendre symbol _n

p_is 1 (Definition 2.145). Furthermore,

since bi may be negative,-1 is included in the factor base. The steps of the quadratic

sieve algorithm are summarized in Algorithm 3.21.

3.21 Algorithm Quadratic sieve algorithm for factoring integers

INPUT: a composite integer n that is not a prime power.

OUTPUT: a non-trivial factor d of n.

1. Select the factor base S = fp1; p2; : : : ; ptg, where p1 = -1 and pj (j _ 2) is the

(j - 1)th prime p for which n is a quadratic residue modulo p.

2. Compute m = b

nc.

3. (Collect t + 1 pairs (ai; bi). The x values are chosen in the order 0;_1;_2; : : : .)

Set i 1. While i _ t + 1 do the following:

3.1 Compute b = q(x) = (x+m)2-n, and test using trial division (cf. Note 3.23)

by elements in S whether b is pt-smooth. If not, pick a new x and repeat step 3.1.

3.2 If b is pt-smooth, say b = Qt

j=1 peij

j , then set ai (x + m), bi b, and vi =

(vi1; vi2; : : : ; vit), where vij = eij mod 2 for 1 _ j _ t.

3.3 i i + 1.

4. Use linear algebra over Z2 to find a non-empty subset T _ f1; 2; : : : ; t + 1g such

thatPi2T vi = 0.

5. Compute x = Qi2T ai mod n.

6. For each j, 1 _ j _ t, compute lj = (Pi2T eij )=2.

7. Compute y = Qt

j=1 plj

j mod n.

8. If x _ _y (mod n), then find another non-empty subset T _ f1; 2; : : : ; t+1g such

that Pi2T vi = 0, and go to step 5. (In the unlikely case such a subset T does not

exist, replace a few of the (ai; bi) pairs with new pairs (step 3), and go to step 4.)

9. Compute d = gcd(x - y; n) and return(d).

3.22 Example (quadratic sieve algorithm for finding a non-trivial factor of n = 24961)

1. Select the factor base S = f-1; 2; 3; 5; 13; 23g of size t = 6. (7, 11, 17 and 19 are

omitted from S since _n

p_ = -1 for these primes.)

2. Compute m = b

24961c = 157.

3. Following is the data collected for the first t + 1 values of x for which q(x) is 23-

smooth.

i x q(x) factorization of q(x) ai vi

1 0 -312 -23 _ 3 _ 13 157 (1; 1; 1; 0; 1; 0)

2 1 3 3 158 (0; 0; 1; 0; 0; 0)

3 -1 -625 -54 156 (1; 0; 0; 0; 0; 0)

4 2 320 26 _ 5 159 (0; 0; 0; 1; 0; 0)

5 -2 -936 -23 _ 32 _ 13 155 (1; 1; 0; 0; 1; 0)

6 4 960 26 _ 3 _ 5 161 (0; 0; 1; 1; 0; 0)

7 -6 -2160 -24 _ 33 _ 5 151 (1; 0; 1; 1; 0; 0)

4. By inspection, v1 +v2 +v5 = 0. (In the notation of Algorithm 3.21, T = f1; 2; 5g.)

5. Compute x = (a1a2a5 mod n) = 936.

6. Compute l1 = 1, l2 = 3, l3 = 2, l4 = 0, l5 = 1, l6 = 0.

7. Compute y = -23 _ 32 _ 13 mod n = 24025.

8. Since 936_ -24025 (mod n), another linear dependency must be found.

9. By inspection, v3 + v6 + v7 = 0; thus T = f3; 6; 7g.

10. Compute x = (a3a6a7 mod n) = 23405.

11. Compute l1 = 1, l2 = 5, l3 = 2, l4 = 3, l5 = 0, l6 = 0.

12. Compute y = (-25 _ 32 _ 53 mod n) = 13922.

13. Now, 23405 6_ _13922 (mod n), so compute gcd(x-y; n) = gcd(9483; 24961) =

109. Hence, two non-trivial factors of 24961 are 109 and 229. _

3.23 Note (sieving) Instead of testing smoothness by trial division in step 3.1 of Algorithm3.21,

a more efficient technique known as sieving is employed in practice. Observe first that if p

is an odd prime in the factor base and p divides q(x), then p also divides q(x+lp) for every

integer l. Thus by solving the equation q(x) _ 0 (mod p) for x (for example, using the

algorithms in x3.5.1), one knows either one or two (depending on the number of solutions

to the quadratic equation) entire sequences of other values y for which p divides q(y).

The sieving process is the following. An array Q[ ] indexed by x, -M _ x _ M, is

created and the xth entry is initialized to blg jq(x)jc. Let x1, x2 be the solutions to q(x) _ 0

(mod p), where p is an odd prime in the factor base. Then the value blg pc is subtracted

from those entries Q[x] in the array for which x _ x1 or x2 (mod p) and -M _ x _ M.

This is repeated for each odd prime p in the factor base. (The case of p = 2 and prime

powers can be handled in a similar manner.) After the sieving, the array entries Q[x] with

values near 0 are most likely to be pt-smooth (roundoff errors must be taken into account),

and this can be verified by factoring q(x) by trial division.

3.24 Note (running time of the quadratic sieve) To optimize the running time of the quadratic

sieve, the size of the factor base should be judiciously chosen. The optimal selection of

t _ Ln[ 1

2 ; 1

2 ] (see Example 2.61) is derived from knowledge concerning the distribution

of smooth integers close to

n. With this choice, Algorithm 3.21 with sieving (Note 3.23)

has an expected running time of Ln[ 1

2 ; 1], independent of the size of the factors of n.

3.25 Note (multiple polynomial variant) In order to collect a sufficient number of (ai; bi) pairs,

the sieving interval must be quite large. From equation (3.1) it can be seen that jq(x)j increases

linearly with jxj, and consequently the probability of smoothness decreases. To

overcome this problem, a variant (the multiple polynomial quadratic sieve) was proposed

wherebymany appropriately-chosenquadratic polynomials can be used instead of just q(x),

each polynomial being sieved over an interval ofmuch smaller length. This variant also has

an expected running time of Ln[ 1

2 ; 1], and is the method of choice in practice.

3.26 Note (parallelizing the quadratic sieve) The multiple polynomial variant of the quadratic

sieve is well suited for parallelization. Each node of a parallel computer, or each computer

in a network of computers, simply sieves through different collections of polynomials. Any

(ai; bi) pair found is reported to a central processor. Once sufficient pairs have been collected,

the corresponding system of linear equations is solved on a single (possibly parallel)

computer.

3.27 Note (quadratic sieve vs. elliptic curve factoring) The elliptic curve factoring algorithm

(x3.2.4) has the same4 expected (asymptotic) running time as the quadratic sieve factoring

algorithm in the special case when n is the product of two primes of equal size. However,

for such numbers, the quadratic sieve is superior in practice because the main steps in the

algorithm are single precision operations, compared to the much more computationally intensive

multi-precision elliptic curve operations required in the elliptic curve algorithm.

3.2.7 Number field sieve factoring

For several years it was believed by some people that a running time of Ln[ 1

2 ; 1] was, in

fact, the best achievable by any integer factorization algorithm. This barrier was broken in

1990 with the discovery of the number field sieve. Like the quadratic sieve, the number field

sieve is an algorithm in the random square family of methods (x3.2.5). That is, it attempts

to find integers x and y such that x2 _ y2 (mod n) and x 6_ _y (mod n). To achieve this

goal, two factor bases are used, one consisting of all prime numbers less than some bound,

and the other consisting of all prime ideals of norm less than some bound in the ring of

integers of a suitably-chosen algebraic number field. The details of the algorithm are quite

complicated, and are beyond the scope of this book.

A special version of the algorithm (the special number field sieve) applies to integers

of the form n = re - s for small r and jsj, and has an expected running time of Ln[ 1

3; c],

where c = (32=9)1=3 _ 1:526.

The general version of the algorithm, sometimes called the general number field sieve,

applies to all integers and has an expected running time of Ln[ 1

3; c], where c = (64=9)1=3 _

1:923. This is, asymptotically, the fastest algorithm known for integer factorization. The

primary reason why the running time of the number field sieve is smaller than that of the

quadratic sieve is that the candidate smooth numbers in the former are much smaller than

those in the latter.

The general number field sieve was at first believed to be slower than the quadratic

sieve for factoring integers having fewer than 150 decimal digits. However, experiments

in 1994–1996 have indicated that the general number field sieve is substantially faster than

the quadratic sieve even for numbers in the 115 digit range. This implies that the crossover

point between the effectiveness of the quadratic sieve vs. the general number field sieve

may be 110–120 digits. For this reason, the general number field sieve is considered the

current champion of all general-purpose factoring algorithms.

3.3 The RSA problem

The intractability of the RSA problem forms the basis for the security of the RSA public-key

encryption scheme (x8.2) and the RSA signature scheme (x11.3.1).

3.28 Definition The RSA problem (RSAP) is the following: given a positive integer n that is a

product of two distinct odd primes p and q, a positive integer e such that gcd(e; (p-1)(q-

1)) = 1, and an integer c, find an integer m such that me _ c (mod n).

In otherwords, the RSAproblemis that of finding eth rootsmodulo a composite integer

n. The conditions imposed on the problem parameters n and e ensure that for each integer

c 2 f0; 1; : : : ; n - 1g there is exactly one m 2 f0; 1; : : : ; n - 1g such that me _ c

(mod n). Equivalently, the function f : Zn -! Zn defined as f(m) = me mod n is a

permutation.

3.29 Remark (SQROOT vs. RSA problems) Since p - 1 is even, it follows that e is odd. In

particular, e 6= 2, and hence the SQROOT problem (Definition 3.43) is not a special case

of the RSA problem.

As is shown in x8.2.2(i), if the factors of n are known then the RSA problem can be

easily solved. This fact is stated next.

3.30 Fact RSAP _P FACTORING. That is, the RSA problem polytime reduces to the integer

factorization problem.

It is widely believed that the RSA and the integer factorization problems are computationally

equivalent, although no proof of this is known.

3.4 The quadratic residuosity problem

The security of the Goldwasser-Micali probabilistic public-key encryption scheme (x8.7)

and the Blum-Blum-Shub pseudorandom bit generator (x5.5.2) are both based on the apparent

intractability of the quadratic residuosity problem.

Recall from x2.4.5 that if n _ 3 is an odd integer, then Jn is the set of all a 2 Z_

having Jacobi symbol 1. Recall also that Qn is the set of quadratic residues modulo n and

that the set of pseudosquares modulo n is defined by eQn = Jn - Qn.

3.31 Definition The quadratic residuosity problem (QRP) is the following: given an odd composite

integer n and a 2 Jn, decide whether or not a is a quadratic residue modulo n.

3.32 Remark (QRP with a prime modulus) If n is a prime, then it is easy to decide whether

a 2 Z_

n is a quadratic residue modulo n since, by definition, a 2 Qn if and only if _a

n_ = 1,

and the Legendre symbol _a

n_can be efficiently calculated by Algorithm 2.149.

Assume now that n is a product of two distinct odd primes p and q. It follows from

Fact 2.137 that if a 2 Jn, then a 2 Qn if and only if _a

p_ = 1. Thus, if the factorization of

n is known, then QRP can be solved simply by computing the Legendre symbol _a

p_. This

observation can be generalized to all integers n and leads to the following fact.

3.33 Fact QRP _P FACTORING. That is, the QRP polytime reduces to the FACTORING

problem.

On the other hand, if the factorization of n is unknown, then there is no efficient procedure

known for solving QRP, other than by guessing the answer. If n = pq, then the

probability of a correct guess is 1

2 since jQnj = j eQnj (Fact 2.155). It is believed that the

QRP is as difficult as the problem of factoring integers, although no proof of this is known.

3.5 Computing square roots in Zn

The operations of squaring modulo an integer n and extracting square roots modulo an integer

n are frequently used in cryptographic functions. The operation of computing square

roots modulo n can be performed efficiently when n is a prime, but is difficult when n is a

composite integer whose prime factors are unknown.

3.5.1 Case (i): n prime

Recall from Remark 3.32 that if p is a prime, then it is easy to decide if a 2 Z_

p is a quadratic

residue modulo p. If a is, in fact, a quadratic residue modulo p, then the two square roots

of a can be efficiently computed, as demonstrated by Algorithm 3.34.

3.34 Algorithm Finding square roots modulo a prime p

INPUT: an odd prime p and an integer a, 1 _ a _ p - 1.

OUTPUT: the two square roots of a modulo p, provided a is a quadratic residue modulo p.

1. Compute the Legendre symbol _a

p_usingAlgorithm2.149. If _a

p_ = -1 then return(a

does not have a square root modulo p) and terminate.

2. Select integers b, 1 _ b _ p - 1, at random until one is found with _b

p_ = -1. (b is

a quadratic non-residue modulo p.)

3. By repeated division by 2, write p -1 = 2st, where t is odd.

4. Compute a-1 mod p by the extended Euclidean algorithm (Algorithm 2.142).

5. Set c bt mod p and r a(t+1)=2 mod p (Algorithm 2.143).

6. For i from 1 to s - 1 do the following:

6.1 Compute d = (r2 _ a-1)2s-i-1

mod p.

6.2 If d _ -1 (mod p) then set r r _ c mod p.

6.3 Set c c2 mod p.

7. Return(r, -r).

Algorithm 3.34 is a randomized algorithm because of themanner inwhich the quadratic

non-residue b is selected in step 2. No deterministic polynomial-time algorithm for finding

a quadratic non-residue modulo a prime p is known (see Remark 2.151).

3.35 Fact Algorithm 3.34 has an expected running time of O((lg p)4) bit operations.

This running time is obtained by observing that the dominant step (step 6) is executed

s-1 times, each iteration involving amodular exponentiation and thus takingO((lg p)3) bit

operations (Table 2.5). Since in the worst case s = O(lg p), the running time of O((lg p)4)

follows. When s is small, the loop in step 6 is executed only a small number of times, and

the running time of Algorithm3.34 is O((lg p)3) bit operations. This point is demonstrated

next for the special cases s = 1 and s = 2.

Specializing Algorithm3.34 to the case s = 1yields the following simple deterministic

algorithm for finding square roots when p _ 3 (mod 4).

3.36 Algorithm Finding square roots modulo a prime p where p _ 3 (mod 4)

INPUT: an odd prime p where p _ 3 (mod 4), and a square a 2 Qp.

OUTPUT: the two square roots of a modulo p.

1. Compute r = a(p+1)=4 mod p (Algorithm 2.143).

2. Return(r, -r).

Specializing Algorithm 3.34 to the case s = 2, and using the fact that 2 is a quadratic

non-residue modulo p when p _ 5 (mod 8), yields the following simple deterministic algorithm

for finding square roots when p _ 5 (mod 8).

3.37 Algorithm Finding square roots modulo a prime p where p _ 5 (mod 8)

INPUT: an odd prime p where p _ 5 (mod 8), and a square a 2 Qp.

OUTPUT: the two square roots of a modulo p.

1. Compute d = a(p-1)=4 mod p (Algorithm 2.143).

2. If d = 1 then compute r = a(p+3)=8 mod p.

3. If d = p - 1 then compute r = 2a(4a)(p-5)=8 mod p.

4. Return(r, -r).

3.38 Fact Algorithms 3.36 and 3.37 have running times of O((lg p)3) bit operations.

Algorithm3.39 for finding square rootsmodulo p is preferable to Algorithm3.34 when

p -1 = 2st with s large.

3.39 Algorithm Finding square roots modulo a prime p

INPUT: an odd prime p and a square a 2 Qp.

OUTPUT: the two square roots of a modulo p.

1. Choose random b 2 Zp until b2 - 4a is a quadratic non-residue modulo p, i.e.,

_b2-4a

p _ = -1.

2. Let f be the polynomial x2 - bx + a in Zp[x].

3. Compute r = x(p+1)=2 mod f using Algorithm 2.227. (Note: r will be an integer.)

4. Return(r, -r).

3.40 Fact Algorithm 3.39 has an expected running time of O((lg p)3) bit operations.

3.41 Note (computing square roots in a finite field)Algorithms 3.34, 3.36, 3.37, and 3.39 can be

extended in a straightforwardmanner to find square roots in any finite field Fq of odd order

q = pm, p prime, m _ 1. Square roots in finite fields of even order can also be computed

efficiently via Fact 3.42.

3.42 Fact Each element a 2 F2m has exactly one square root, namely a2m-1 .

3.5.2 Case (ii): n composite

The discussion in this subsection is restricted to the case of computing square roots modulo

n, where n is a product of two distinct odd primes p and q. However, all facts presented

here generalize to the case where n is an arbitrary composite integer.

Unlike the case where n is a prime, the problem of deciding whether a given a 2 Z_

is a quadratic residue modulo a composite integer n, is believed to be a difficult problem.

Certainly, if the Jacobi symbol _a

n_ = -1, then a is a quadratic non-residue. On the other

hand, if _a

n_ = 1, then deciding whether or not a is a quadratic residue is precisely the

quadratic residuosity problem, considered in x3.4.

3.43 Definition The square root modulo n problem (SQROOT) is the following: given a composite

integer n and a quadratic residue a modulo n (i.e. a 2 Qn), find a square root of a

modulo n.

If the factors p and q of n are known, then the SQROOT problem can be solved effi-

ciently by first finding square roots of a modulo p and modulo q, and then combining them

using the Chinese remainder theorem (Fact 2.120) to obtain the square roots of a modulo

n. The steps are summarized in Algorithm 3.44, which, in fact, finds all of the four square

roots of a modulo n.

3.44 Algorithm Finding square roots modulo n given its prime factors p and q

INPUT: an integer n, its prime factors p and q, and a 2 Qn.

OUTPUT: the four square roots of a modulo n.

1. Use Algorithm 3.39 (or Algorithm 3.36 or 3.37, if applicable) to find the two square

roots r and -r of a modulo p.

2. Use Algorithm 3.39 (or Algorithm 3.36 or 3.37, if applicable) to find the two square

roots s and -s of a modulo q.

3. Use the extended Euclidean algorithm(Algorithm2.107) to find integers c and d such

that cp + dq = 1.

4. Set x (rdq + scp) mod n and y (rdq - scp) mod n.

5. Return(_x mod n, _y mod n).

3.45 Fact Algorithm 3.44 has an expected running time of O((lg p)3) bit operations.

Algorithm 3.44 shows that if one can factor n, then the SQROOT problem is easy.

More precisely, SQROOT _P FACTORING. The converse of this statement is also true,

as stated in Fact 3.46.

3.46 Fact FACTORING _P SQROOT. That is, the FACTORING problem polytime reduces

to the SQROOT problem. Hence, since SQROOT _P FACTORING, the FACTORING

and SQROOT problems are computationally equivalent.

Justification. Suppose that one has a polynomial-time algorithm A for solving the SQROOT

problem. This algorithm can then be used to factor a given composite integer n as

follows. Select an integer x at random with gcd(x; n) = 1, and compute a = x2 mod n.

Next, algorithmA is run with inputs a and n, and a square root y of a modulo n is returned.

If y _ _x (mod n), then the trial fails, and the above procedure is repeated with a new

x chosen at random. Otherwise, if y 6_ _x (mod n), then gcd(x - y; n) is guaranteed to

be a non-trivial factor of n (Fact 3.18), namely, p or q. Since a has four square roots modulo

n (_x and _z with _z 6_ _x (mod n)), the probability of success for each attempt

is 1

2 . Hence, the expected number of attempts before a factor of n is obtained is two, and

consequently the procedure runs in expected polynomial time. _

3.47 Note (strengthening of Fact 3.46) The proof of Fact 3.46 can be easily modified to establish

the following stronger result. Let c _ 1 be any constant. If there is an algorithm A

which, given n, can find a square root modulo n in polynomial time for a 1

(lg n)c fraction

of all quadratic residues a 2 Qn, then the algorithm A can be used to factor n in expected

polynomial time. The implication of this statement is that if the problem of factoring n is

difficult, then for almost all a 2 Qn it is difficult to find square roots modulo n.

The computational equivalence of the SQROOT and FACTORING problems was the

basis of the first “provably secure” public-key encryption and signature schemes, presented

in x8.3.

3.6 The discrete logarithm problem

The security of many cryptographic techniques depends on the intractability of the discrete

logarithm problem. A partial list of these includes Diffie-Hellman key agreement and its

derivatives (x12.6), ElGamal encryption (x8.4), and the ElGamal signature scheme and its

variants (x11.5). This section summarizes the current knowledge regarding algorithms for

solving the discrete logarithm problem.

Unless otherwise specified, algorithms in this section are described in the general setting

of a (multiplicatively written) finite cyclic group G of order n with generator _ (see

Definition 2.167). For a more concrete approach, the reader may find it convenient to think

of G as the multiplicative group Z_

p of order p - 1, where the group operation is simply

multiplication modulo p.

3.48 Definition Let G be a finite cyclic group of order n. Let _ be a generator of G, and let

_ 2 G. The discrete logarithm of _ to the base _, denoted log_ _, is the unique integer x,

0 _ x _ n - 1, such that _ = _x.

3.49 Example Let p = 97. Then Z_

97 is a cyclic group of order n = 96. A generator of Z_

97 is

_ = 5. Since 532 _ 35 (mod 97), log5 35 = 32 in Z_

97. _

The following are some elementary facts about logarithms.

3.50 Fact Let _ be a generator of a cyclic group G of order n, and let _, 2 G. Let s be an

integer. Then log_(_) = (log_ _ + log_ ) mod n and log_(_s) = s log_ _ mod n.

The groups of most interest in cryptography are the multiplicative groupF_

q of the finite

field Fq (x2.6), including the particular cases of the multiplicative group Z_

p of the integers

modulo a prime p, and the multiplicative group F_

2m of the finite field F2m of characteristic

two. Also of interest are the group of units Z_

n where n is a composite integer, the group

of points on an elliptic curve defined over a finite field, and the jacobian of a hyperelliptic

curve defined over a finite field.

3.51 Definition The discrete logarithm problem (DLP) is the following: given a prime p, a

generator _ of Z_

p, and an element _ 2 Z_

p, find the integer x, 0 _ x _ p - 2, such that

_x _ _ (mod p).

3.52 Definition The generalized discrete logarithm problem (GDLP) is the following: given a

finite cyclic groupG of order n, a generator _ of G, and an element _ 2 G, find the integer

x, 0 _ x _ n - 1, such that _x = _.

The discrete logarithm problem in elliptic curve groups and in the jacobians of hyperelliptic

curves are not explicitly considered in this section. The discrete logarithm problem

in Z_

n is discussed further in x3.8.

3.53 Note (difficulty of the GDLP is independent of generator) Let _ and be two generators

of a cyclic groupG of order n, and let _ 2 G. Let x = log_ _, y = log _, and z = log_ .

Then _x = _ = y = (_z)y. Consequently x = zy mod n, and

log _ = (log_ _) (log_ )-1 mod n:

This means that any algorithm which computes logarithms to the base _ can be used to

compute logarithms to any other base that is also a generator of G.