
Degree project

Pollard’s rho method

Author: Ida Bucic

Supervisor: Marcus Nilsson
Examiner: Per-Anders Svensson
Date: 2019-06-25

Course code: 2MA41E
Subject: Dynamical systems
Level: Bachelor


Abstract

In this work we are going to investigate a factorization method invented by John Pollard. It makes it possible to factorize medium-sized integers into a product of prime numbers. We will run a C++ program and test how different parameters affect the results. A connection will be drawn between Pollard's rho method, the birthday paradox and Floyd's cycle-finding algorithm. In the results we identify the polynomial function that gives the best effectiveness and performance for Pollard's rho method.

Acknowledgements

I would like to thank my supervisor Professor Marcus Nilsson for the exceptional guidance and knowledge he passed on to me throughout the process of writing this thesis. I am grateful for the captivating topic that combined all the fields of mathematics that interest me, and for discussions that opened my mind to new ideas. Finally, I would like to thank my family for their unconditional support.


Contents

1 Introduction 4

2 The Pollard’s rho method 4

2.1 Example . . . 5

2.2 Floyd’s cycle finding algorithm . . . 7

2.3 The Birthday paradox and the running time . . . 9

2.4 The theory behind Pollard's rho method . . . 10

3 The method 12

4 Investigating Pollard's rho method 18

4.1 Linear function . . . 18

4.2 Quadratic function . . . 21

4.3 Cubic function . . . 26

4.4 Quartic function . . . 33

4.5 Quintic function . . . 37

5 Discussion 44

References 45

A Code in C++ 46

B Code in Mathematica 49


1 Introduction

The Pollard’s rho method is a factorization method introduced for the first time by John Pollard in 1975 in the article “A Monte Carlo method for the factorization”. It was a new factorization method involving probabilistic ideas.

Pollard invented a factorization algorithm based on congruences using a polynomial function and had a fixed starting value.

In this work we will investigate how do results change when we vary the function and starting value. We would like to know if we can get better results for a certain function in sense of quicker running time which demonstrates in lower mean value which will be of interest as well as error that can occur.

In order to understand Pollard’ rho method we have to comprehend theory behind it. Birthday paradox relies on the probability theory and guarantees that algorithm will be quicker than expected.

Floyd’s cycle finding algorithm is the core tool which enables us to calculate a factor from a given integer.

We will obtain all the results from the program in C++. We will visualize results in tables and create graphs in Mathematica.

2 The Pollard’s rho method

The Pollard’s rho method is a factorization method based on congruences with foundation in probability theory.

The main use of this method is to factorize a medium-sized number $n = p \cdot q$, where $p$ and $q$ are primes such that $p \le q$. Let $f(x)$ be a polynomial function, which is set to be $x^2 + 1$. Let $x_k$ and $y_k$, $k \in \mathbb{N}$, be two sequences defined by the recursive formulas
$$x_k \equiv f(x_{k-1}) \pmod{n}, \qquad y_k \equiv f(f(y_{k-1})) \pmod{n},$$
with starting values $x_0 = 2$ and $y_0 = x_0$.

We iterate these sequences until we find $k \in \mathbb{N}$ for which the greatest common divisor of $|y_k - x_k|$ and $n$ is greater than 1, i.e.
$$(|y_k - x_k|, n) > 1.$$

If we find $x_k$ and $y_k$ such that
$$(|y_k - x_k|, n) = p \quad \text{or} \quad (|y_k - x_k|, n) = q,$$
we have factorized $n$.

2.1 Example

Let $n = 1403$, $f(x) = x^2 + 1$ and $x_0 = 2$. First we calculate $x_k$. For $k = 1$ we have
$$x_1 \equiv x_0^2 + 1 \equiv 2^2 + 1 \pmod{1403},$$
and we get $x_1 = 5$. Then we calculate $y_k$. For $k = 1$ we have
$$y_1 \equiv (y_0^2 + 1)^2 + 1 \equiv 5^2 + 1 \pmod{1403},$$
and we get $y_1 = 26$. Now we need to check whether the greatest common divisor of $|y_1 - x_1|$ and $n$ is greater than 1. We do so using the Euclidean algorithm. We get
$$(|y_1 - x_1|, 1403) = (21, 1403) = 1.$$
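For completeness, here are the Euclidean algorithm steps behind this greatest common divisor (a short worked check added here for illustration):
$$1403 = 66 \cdot 21 + 17, \qquad 21 = 1 \cdot 17 + 4, \qquad 17 = 4 \cdot 4 + 1, \qquad 4 = 4 \cdot 1 + 0,$$
so the last non-zero remainder is 1 and indeed $(21, 1403) = 1$.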

This means that we need to continue iterating $x_k$ and $y_k$ until we get a greatest common divisor larger than 1. In the same way we continue calculating $x_k$ and $y_k$ for $k = 2, 3, \ldots$. We can look at the results in the table.

k    x_k    y_k    gcd(|y_k - x_k|, n)

1 5 26 1

2 26 952 1

3 677 1090 1

4 952 78 23

Table 1

In this example $p = 23$ was found for $k = 4$, which is close to $\sqrt{p} = \sqrt{23} \approx 4.79583$. We will come back to this when explaining the theory behind Pollard's rho method and why $\sqrt{p}$ plays an important role. Now we see that $1403 = 23 \cdot 61$.
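As a quick verification of the last row of Table 1 (a check added here for illustration):
$$|y_4 - x_4| = |78 - 952| = 874 = 2 \cdot 19 \cdot 23, \qquad (874, 1403) = 23,$$
since $1403 = 23 \cdot 61$.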

Now consider the sequence $x_i$ modulo 23. It will eventually become periodic. To see this we calculate elements $x_i$ until we notice that the values modulo 23 start to repeat.

k    x_k    x_k mod 23

0 2 2

1 5 5

2 26 3

3 677 10

4 952 9

5 1370 13

6 1090 9

7 1163 13

8 78 9

Table 2: Periodicity of x_k modulo 23

In this example we see that the periodicity starts with $x_4 = 952 \equiv 9 \pmod{23}$ and $x_5 = 1370 \equiv 13 \pmod{23}$. This can be visualized by a directed graph, see Figure 1.

Figure 1

We have seen that modulo 23 the periodicity starts quite soon; now we wonder how long it would take for the other factor 61, and even for $n = 1403$ itself. From Table 3 on page 7 we can see that modulo 61 the periodicity starts with $x_3 \equiv 6$, but the loop is long and goes all the way to $x_{13}$. Modulo 1403 the periodicity starts at the same index as modulo 23, but in this case with $x_4 = 952$, and the loop closes at $x_{14}$. We can see that the periodicity started first for $p$, as $p \le q < n$. Later we will explain why this is so, and learn from probability theory that in most cases we do not have to continue the procedure until the periodicity starts for $q$.

k    x_k    x_k mod 61    x_k mod 1403

0 2 2 2

1 5 5 5

2 26 26 26

3 677 6 677

4 952 37 952

5 1370 28 1370

6 1090 53 1090

7 1163 4 1163

8 78 17 78

9 473 46 473

10 653 43 653

11 1301 20 1301

12 584 35 584

13 128 6 128

14 952 37 952

Table 3: Periodicity of x_k modulo 61 and modulo 1403

2.2 Floyd’s cycle finding algorithm

Let $A$ be a finite set with $m$ elements and let $f: A \to A$ be a map. We can visualize the map $f$ by a directed graph with vertex set $A$: there is an edge from vertex $a$ to vertex $b$ if $f(a) = b$. Let $x_0 \in A$. We can form a sequence $\{x_0, x_1, x_2, \ldots\}$, where $x_k = f(x_{k-1})$, $k \in \mathbb{N}$. Since $A$ is finite, the sequence will eventually start to repeat: it ends up in a cycle. In order to find this cycle we can use the so-called Floyd's cycle-finding algorithm.

This algorithm was invented by Robert W. Floyd in 1969. It is also called the tortoise and hare algorithm. Let's imagine a closed path over which a tortoise and a hare are moving, and let's say the hare is twice as fast as the tortoise. Floyd's cycle-finding algorithm guarantees that at some point they will meet at the same position.

Now let's explain the algorithm using the sequences; we will follow the proof from [3]. Let $x_k$ be the sequence that moves one position at a time and let $y_k$ be the sequence defined by $y_0 = x_0$ and $y_k = f(f(y_{k-1}))$. We see that $y_k$ moves two positions at a time, so $y_k = x_{2k}$. We want to know when the two sequences will have the same value modulo $p$. In order to find that out, let's take a look at Figure 2 on page 8.

Figure 2: The Pollard’s rho method graph.

The first $T$ elements in the directed graph form a tail: the elements $x_0, x_1, \ldots, x_{T-1}$ appear in the sequence without repeating. After those elements there is a loop of length $L$, such that the elements from $x_T$ onwards, namely $x_T, x_{T+1}, \ldots, x_{T+L-1}$, keep repeating.

We can define $T$ as the largest integer such that all the $x_k$, $k < T$, appear only once in the sequence, and $L$ as the smallest integer such that all the $x_k$, $T \le k < T + L$, repeat in the sequence and $x_T = x_{T+L}$.

The following equivalence is the answer to the previous question:
$$x_{2k} = x_k \iff k \ge T \text{ and } 2k \equiv k \pmod{L}.$$
This means that we need to find a $k$ that is at least $T$ and divisible by $L$. The first time this happens is when $k$ is the first multiple of $L$ that is not smaller than $T$. We have now proved the following theorem.

Theorem 2.1. Let $A$ be a finite set with $m$ elements and let $f: A \to A$ be a map. Let $x_0$ be the starting value of a sequence $\{x_0, x_1, x_2, \ldots\}$ formed by the map $f$. The numbers $x_0, x_1, \ldots, x_{T-1}$ appear only once and form a tail of length $T$; the numbers $x_T, x_{T+1}, \ldots, x_{T+L}$ repeat and form a loop of length $L$. Let $y_k = x_{2k}$. Then
$$y_k = x_k \quad \text{for some } 1 \le k < T + L \text{ with } L \mid k.$$
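To make the tortoise-and-hare idea concrete, here is a minimal C++ sketch of Floyd's cycle-finding algorithm that recovers the tail length T and the loop length L. It is only an illustration, not part of the thesis program; the map f(x) = x^2 + 1 mod 23 and the seed x0 = 2 are taken from Example 2.1.

// A minimal sketch of Floyd's cycle-finding algorithm for a map f on a finite set.
#include <iostream>

long long m = 23;                                       // size of the finite set A
long long f(long long x) { return (x * x + 1) % m; }    // the map A -> A used in Example 2.1

int main() {
    long long x0 = 2;
    // Phase 1: the tortoise moves one step and the hare two steps until they meet.
    long long tortoise = f(x0), hare = f(f(x0));
    while (tortoise != hare) {
        tortoise = f(tortoise);
        hare = f(f(hare));
    }
    // Phase 2: restart the tortoise at x0 to find the tail length T.
    long long T = 0;
    tortoise = x0;
    while (tortoise != hare) {
        tortoise = f(tortoise);
        hare = f(hare);
        ++T;
    }
    // Phase 3: walk once around the cycle to find the loop length L.
    long long L = 1;
    hare = f(tortoise);
    while (tortoise != hare) {
        hare = f(hare);
        ++L;
    }
    std::cout << "T = " << T << ", L = " << L << std::endl;   // prints T = 4, L = 2
}

For this choice it prints T = 4 and L = 2, in agreement with the periodic behaviour seen in Table 2.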


2.3 The Birthday paradox and the running time

The birthday paradox concerns the fact that, among $n$ randomly chosen people, some of them will surprisingly often have their birthday on the same date.

Naturally, if $n$ reaches 366 we are 100% certain that at least one pair of people have their birthday on the same day.

Calculating the probability for 70 people brings a rather interesting fact: the probability that some pair of them have their birthday on the same day is about 99.9%.

Furthermore, calculating the probability for 23 people gives a result of about 50%.

If $B$ is an event, by $B^C$ we denote the complement of $B$. Let $B$ be the event that at least two out of $n$ randomly chosen persons have their birthday on the same day. Then $B^C$ is the event that the $n$ randomly chosen persons all have their birthdays on different days.

Let us choose a person. The second person we choose has a probability of $1 - \frac{1}{365}$ of not having the same birthday as the first person. The third person we choose has a probability of $1 - \frac{2}{365}$ of not having the same birthday as the first and the second person, given that the first and second person have birthdays on different days. Continuing with this process we can calculate the probabilities; more about this can be read in [4]. First we calculate the probability of the complement of $B$,
$$P(B^C) = \left(1 - \frac{1}{365}\right)\left(1 - \frac{2}{365}\right)\cdots\left(1 - \frac{n-1}{365}\right),$$
and apply the function $\ln(x)$:
$$\ln P(B^C) = \sum_{i=1}^{n-1} \ln\left(1 - \frac{i}{365}\right) \approx -\sum_{i=1}^{n-1} \frac{i}{365} = -\frac{(n-1) \cdot n}{2 \cdot 365}.$$
Now using the exponential function we get
$$P(B^C) \approx e^{-\frac{(n-1) \cdot n}{2 \cdot 365}},$$
and hence
$$P(B) \approx 1 - e^{-\frac{(n-1) \cdot n}{2 \cdot 365}}.$$
Plugging in $n = 70$ we get $P(B) \approx 99.9\%$. Plugging in $n = 23$ we get $P(B) \approx 50\%$.
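The two percentages above are easy to reproduce. The following small C++ program is a sketch added for illustration (it is not part of the thesis code); it compares the exact product formula with the exponential approximation:

#include <cmath>
#include <cstdio>

int main() {
    const int ns[] = {23, 70};
    for (int n : ns) {
        double exact = 1.0;
        for (int i = 1; i < n; ++i)
            exact *= 1.0 - i / 365.0;                           // exact P(B^C)
        double approx = std::exp(-(n - 1.0) * n / (2.0 * 365.0));  // approximate P(B^C)
        std::printf("n = %d: exact P(B) = %.4f, approximation = %.4f\n",
                    n, 1.0 - exact, 1.0 - approx);
    }
}

For n = 23 both values are close to 0.50, and for n = 70 they are close to 0.999, in agreement with the calculation above.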

The importance of the birthday paradox for Pollard's rho method is as follows. We know that the numbers $x_k$ of the sequence, along with $y_k = x_{2k}$, start to repeat when we look at them as remainders modulo $p$. Without the birthday paradox we could only be certain that the numbers start to repeat within $p$ steps, because there are only $p$ different remainders modulo $p$; the length of the loop could be as large as $p$. Using the idea from the birthday paradox, probability theory lets us expect that the numbers start to repeat earlier, so that the length of the loop is shorter.

We can use Floyd’s cycle finding algorithm in Pollard’s rho method because we are certain that the numbers in a sequence modulo p will form a loop. The maximum length of this loop is p, but because of the Birthday paradox there is a great probability that it is less than p.

Floyd’s cycle finding algorithm guarantees that there will be a match for two numbers modulo p. That means that for some k

yk≡ xk (mod p).

This is important because in Pollard’s rho method we are looking for the greatest common divisor of absolute difference between ykand xk, and n which we expect to be p, i.e.

(| yk− xk|, n) = p.

Now we are interested in how many steps it will take to factorize $n$. The following theorem shows that it takes close to $\sqrt{p}$ steps.

Theorem 2.2. Let $A$ be a finite set with $m$ elements and let $f: A \to A$ be a map. Let $x_0$ be the starting value of a sequence $\{x_0, x_1, x_2, \ldots\}$ formed by the map $f$. The numbers $x_0, x_1, \ldots, x_{T-1}$ appear only once and form a tail of length $T$; the numbers $x_T, x_{T+1}, \ldots, x_{T+L}$ repeat and form a loop of length $L$. If the map $f$ is random enough, the expected value of $T + L$ is
$$E(T + L) \approx 1.2533 \cdot \sqrt{m}.$$

A proof for this theorem can be found in [3].
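For orientation, a small reformulation that may help the reader: since $1.2533 \approx \sqrt{\pi/2}$, the estimate above can equivalently be written as
$$E(T + L) \approx \sqrt{\frac{\pi m}{2}}.$$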

2.4 The theory behind Pollard's rho method

After an introduction on how Pollard's rho method works, we can look at more abstract theory to understand why it works. We want to factorize a medium-sized number $n$ into two prime numbers $p$ and $q$, $p \le q$, by forming sequences of numbers $x_k$ and $y_k$, where $y_k = x_{2k}$. Each element of the sequence is congruent to $f$ of its predecessor modulo $n$. Pollard used the function $f(x) = x^2 + 1$, but it is also possible to use a polynomial of a different degree and constant, which will be investigated in this work. Notably, a polynomial of first degree does not lean on probabilistic ideas, whilst higher degrees do.

Another parameter is the seed $x_0$ with which we begin the process of forming a sequence. Pollard used the seed $x_0 = 2$, but in this work we will investigate other cases as well.

Another characteristic of this method is that it works with minimal storage, by virtue of calculating two steps at a time. In each step we iterate $x_k$ and $y_k$ and calculate the greatest common divisor of $|y_k - x_k|$ and $n$. This way we only need to store two values at a time. We use the absolute value because of the implementation in the C++ program. Let's take a look at the algorithm.

We start by calculating $x_1$ and $y_1$,
$$x_1 \equiv f(x_0) \pmod{n}, \qquad y_1 \equiv f(f(y_0)) \pmod{n},$$
and continue this procedure for $x_k$ and $y_k$,
$$x_k \equiv f(x_{k-1}) \pmod{n}, \qquad y_k \equiv f(f(y_{k-1})) \pmod{n},$$
until we find $k \in \mathbb{N}$ for which
$$(|y_k - x_k|, n) > 1.$$

We could certainly find $(|y_k - x_k|, n) = n$, but this is not of interest, since in that case we would not have factorized $n$; that is why we treat this case as an error. Since $p \le q < n$ we will probably find such numbers for $p$ first, but in some cases it is also possible that $q$ comes first. Based on Floyd's cycle-finding algorithm we know that we will find a pair $x_k$ and $y_k$ that is congruent modulo $p$, i.e.

$$y_k \equiv x_k \pmod{p}.$$
Then
$$(|y_k - x_k|, n) = p,$$
and $k$ is the number of steps, which in Pollard's case, based on Theorem 2.2, is less than $n$; moreover, it is close to $\sqrt{p}$. We can see how the method works in Example 2.1. Now that we have obtained $p$ we can visualize the cycle modulo $p$, see Figure 1 on page 6. The residues of $x_k$ modulo $p$ form a cycle after a while. If we draw the graph, its structure reminds us of the Greek letter $\rho$.

The numbers before a cycle are called the tail of the ρ.
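As a compact illustration of the whole procedure, the following self-contained C++ sketch uses Pollard's original choices f(x) = x^2 + 1 and x0 = 2. It is only a simplified illustration (it assumes n is small enough that (n-1)^2 fits in a long long); the program actually used in this thesis is described in Section 3 and listed in Appendix A.

#include <cstdlib>
#include <iostream>

long long gcd(long long a, long long b) { return b ? gcd(b, a % b) : a; }

// Returns a factor of n found by the rho iteration, or n itself if the method fails.
long long pollard_rho(long long n) {
    long long x = 2, y = 2, d = 1;
    while (d == 1) {
        x = (x * x + 1) % n;           // tortoise: one application of f
        y = (y * y + 1) % n;           // hare: two applications of f
        y = (y * y + 1) % n;
        d = gcd(std::llabs(y - x), n);
    }
    return d;                          // d == n signals an error (failure)
}

int main() {
    std::cout << pollard_rho(1403) << std::endl;   // prints 23, as in Example 2.1
}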


3 The method

In this work our goal is to investigate how Pollard's rho method behaves when we change the function, which is originally set to $f(x) = x^2 + 1$ with the seed $x_0 = 2$. We vary the constant $c \in I_{10} = \{x \in \mathbb{Z} \mid |x| \le 10\}$, the degree of the polynomial $d \in \{1, 2, 3, 4, 5\}$ and the seed $x_0 \in I_5 = \{x \in \mathbb{Z} \mid |x| \le 5\}$. Our functions will be of the form $f(x) = x^d + c$. We will generate $l = 224115$ numbers by taking all the prime numbers from the interval $[1, 5000]$ and forming all possible products of two of them. We decided to limit the value of each factor to 5000 because of the overflow problem in the C++ program. We thus create a sample set $N = \{\, n \mid n = p \cdot q,\ p, q \text{ primes} \le 5000 \,\}$. For each such number $n_i$, $i \in \{1, \ldots, l\}$, we will run Pollard's rho method to obtain:

• $p_i$, the prime number which is a factor of $n_i$

• $k_i$, the number of steps needed to obtain $p_i$

• $e = \sum_{i=1}^{l} e_i$, the number of errors, i.e. the number of times $n_i$ was obtained instead of $p_i$; we will call this the risk of failure

We will have two parameters of interest:

1. Effectiveness. Effectiveness will be obtained by calculating the weighted mean
$$\mu = \frac{\sum_{i=1}^{l} \frac{k_i}{\sqrt{p_i}}}{l - e}.$$

2. Performance. Performance is shown through the risk of failure, which is the percentage of times the method failed out of the total number of $l$ trials,
$$\varphi = \frac{e \cdot 100}{l}.$$
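As a small illustration (a hypothetical helper, not taken from the thesis program), the two quantities above can be computed from the per-number results $k_i$, $p_i$ and $e_i$ like this:

#include <cmath>
#include <vector>

struct Summary { double mu; double phi; };

// k[i], p[i], e[i] hold the number of steps, the factor found and the error flag for n_i.
Summary summarize(const std::vector<long long>& k,
                  const std::vector<long long>& p,
                  const std::vector<int>& e) {
    double sum = 0.0;
    long long errors = 0;
    for (std::size_t i = 0; i < k.size(); ++i) {
        if (e[i] == 0)
            sum += k[i] / std::sqrt(static_cast<double>(p[i]));   // k_i / sqrt(p_i)
        else
            ++errors;                                             // n_i was returned instead of p_i
    }
    double l = static_cast<double>(k.size());
    return { sum / (l - errors),       // effectiveness  mu
             errors * 100.0 / l };     // risk of failure  phi, in percent
}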

In order to write proper C++ code we need to modify the algorithms we would usually use. We should keep in mind that some intermediate values can grow very large, so data overflow can occur. In this work we investigate changes in Pollard's rho method depending on different functions, with degrees between one and five, so it is clear that we will need to reduce the powers step by step inside the algorithm.

The first obstacle appears when calculating $x_k$ and $x_{2k}$. Pollard's great idea is that we calculate those two values simultaneously. Calculating them as
$$x_k \equiv f(x_{k-1}) \pmod{n}, \qquad y_k \equiv f(f(y_{k-1})) \pmod{n},$$
or in C++ code:

x = seed;
y = seed;
x = congruence(x, n, exponent, c);
y = congruence(congruence(y, n, exponent, c), n, exponent, c);

is fine in theory, but implementing it like this in C++ causes an overflow. By slightly modifying it we can avoid the error.

x = congruence(x, n, exponent, c);
y = congruence(y, n, exponent, c);
y = congruence(y, n, exponent, c);

Another example of adjusting code is using the fast powering algorithm.

long long int power(long long int x, long long int exponent, long long int n){
    if (exponent == 0)
        return 1;
    else if (exponent == 1)
        return x;
    else if (exponent % 2 == 0) {
        return power((x * x) % n, exponent / 2, n);
    }
    else {
        return (x * power((x * x) % n, (exponent - 1) / 2, n)) % n;
    }
}

The fast powering algorithm is important in this program because it constantly lowers the exponent, and reduces modulo $n$ at every step, while calculating the sequence $x_k$. This function is recursive and returns the value of $x$ raised to the given exponent modulo $n$; it is afterwards used by another function which computes the value of the polynomial we have chosen. The function power has the parameters x, exponent (the power $x$ is raised to) and n (the number we want to factor in Pollard's rho method), and in each step it calculates $x^{\text{exponent}} \pmod{n}$ while lowering the exponent.

1. If exponent == 0 we have $x^0 = 1$, so the function power returns 1 (and if exponent == 1 it simply returns x).

2. If exponent % 2 == 0 we have an even exponent. The function power is recursive: it calls itself with $(x \cdot x) \bmod n$ and half the exponent, and keeps halving until it reaches a base case.

3. If exponent % 2 == 1 we have an odd exponent. We first subtract 1 from the exponent, so we are in the same situation as in the second case, but because of that we have one extra factor $x$, so we multiply the recursive result by $x$ and reduce modulo $n$.
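To see the recursion in action, here is a short worked trace added for illustration, for the call power(5, 13, 1403):

power(5, 13, 1403)
  = (5 * power(25, 6, 1403)) % 1403                      // odd exponent
  = (5 * power(625, 3, 1403)) % 1403                     // even exponent, 25*25 = 625
  = (5 * ((625 * power(591, 1, 1403)) % 1403)) % 1403    // odd exponent, 625*625 mod 1403 = 591
  = (5 * ((625 * 591) % 1403)) % 1403
  = (5 * 386) % 1403
  = 527,

which indeed equals $5^{13} \bmod 1403$.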

Let’s take a look at all important functions we were using in the program.

For the function pollard we used structure rho so that the function can return multiple values.

struct rho {
    long long int k;
    long long int p;
    long long int e;
};

The function pollard has the parameters:

1. seed, which is the starting value

2. n, which is the given integer we want to factor

3. exponent, which is the degree of the polynomial function

4. c, which is the constant of the polynomial function

We can easily change the arguments when calling the function from the main function, which is important for obtaining the results. The function returns the values:

1. k, which is the number of steps it took the function to factorize n

2. p, which is one of the factors of n

3. e, which signals whether an error has occurred, i.e. e = 1 if p = n and e = 0 otherwise

The function pollard calculates $x_k$ and $y_k$ at the same time, as long as the greatest common divisor of the absolute difference is equal to 1. If the greatest common divisor of the absolute difference becomes equal to $n$, the function breaks and the error counter is increased with e++. When we find $x_k$ and $y_k$ for which the greatest common divisor of the absolute difference is neither 1 nor $n$, we set p to that value and remember for which $k$ that happened.

rho pollard(long long int seed, long long int n,
            long long int exponent, long long int c){
    long long int x;
    long long int y;
    long long int k = 0;
    long long int e = 0;
    long long int p;
    x = seed;
    y = seed;
    do {
        x = congruence(x, n, exponent, c);
        y = congruence(y, n, exponent, c);
        y = congruence(y, n, exponent, c);
        k++;
        if (gcd(absolute(x, y), n) == n) {
            e++;
            break;
        }
    } while (gcd(absolute(x, y), n) == 1);
    p = gcd(absolute(x, y), n);
    return rho{k, p, e};
}

The function congruence returns remainders modulo n; more precisely, it returns the next element $x_k$ or $y_k$.

long long int congruence(long long int x, long long int n,
                         long long int exponent, long long int c){
    return (power(x, exponent, n) + c) % n;   // returns x_k
}

The function gcd implements the Euclidean algorithm. It is a recursive function returning the greatest common divisor of the integers a and b. If b is greater than zero, the function calls itself with the arguments rearranged: b in the first place and the remainder of a divided by b in the second place. The procedure continues until b is equal to zero, which means that in the previous step a divided by b had remainder zero, i.e. we have found the greatest common divisor.

long long int gcd(long long int a, long long int b){
    if (b)
        return gcd(b, a % b);
    else
        return a;
}

The function absolute has the parameters x and y, and returns the absolute value of their difference.

long long int absolute(long long x, long long y){
    if (y >= x){
        return y - x;
    }
    else {
        return x - y;
    }
}

Now that we have all the functions for calculating factors of n, we can call the pollard function from the main function and start varying the parameters.

First we need to create the sample space of all the values of n we are going to use. Because of possible integer overflow we limit the values of p and q to at most 5000 each. We use the function prime to detect whether an integer between 1 and 5000 is prime.

bool prime(long long int x){
    long long int k = 2;
    if (x == 1) return false;
    while (k * k <= x){
        if (x % k == 0){
            return false;
        }
        k = k + 1;
    }
    return true;
}

The function checks whether the given number x is prime by checking if it has any divisors from 2 up to the square root of x.

With the following for loop we create an array of prime numbers.

for (i = 1; i < 5000; i++){
    if (prime(i)){
        b[j] = i;
        j++;
    }
}

Now that we have an array of all prime numbers up to 5000, we can create the list of all products of two such primes, which we store in a text file.

ofstream file1;
file1.open("numbers.txt");
for (i = 0; i < j; i++){
    for (k = i; k < j; k++){
        d[i] = b[i] * b[k];
        l++;
        file1 << d[i] << endl;
    }
}
file1.close();

After creating all the functions we need for Pollard’s rho method to work we can start using them from the main function. We stored all numbers from the set N = { n | n = p · q, p, q primes ≤ 5000 } in the text file "numbers.txt".

We create three for loops in order to generate the different parameters for the polynomial function f(x) = x^exponent + c and the seed x0:

1. Variation of the seed x0

2. Variation of the exponent exponent

3. Variation of the constant c

When calling the function pollard, different arguments are passed in each iteration of each for loop. To make the results easier to read, it is possible to fix one of the arguments and generate results for the others. We store the results in the text file "results.txt".

// r (struct rho), result, f, i, seed, exponent and c are assumed to be declared earlier in main.
l = 224115;
ofstream file3;
file3.open("results.txt");
long long int line;
ifstream file2;
for (seed = -5; seed <= 5; seed++){
    for (exponent = 1; exponent <= 5; exponent++){
        for (c = -10; c <= 10; c++){
            // the sample file is re-opened here so that every parameter
            // combination reads the numbers from the beginning
            file2.open("numbers.txt");
            for (i = 0; i < l; i++){
                file2 >> line;
                r = pollard(seed, line, exponent, c);
                f = f + r.e;
                if (r.e == 0){
                    result = result + r.k / sqrt(r.p);
                }
            }
            file3 << result / (l - f) << " ";
            file3 << f * 100 / l << " ";
            result = 0;
            f = 0;
            file2.close();
        }
        file3 << endl;
    }
}

4 Investigating Pollard's rho method

In this section we are going to show our results regarding the variation of functions in Pollard's rho method. We use polynomial functions of the form $x \mapsto x^d + c$. We represent the results in tables and visualize them with graphs. For some parameters we encountered errors, so in some cases we were unable to calculate the mean; therefore some effectiveness graphs are discontinuous. The performance graphs were purposely interrupted at points where the error was equal to 100%, in order to give a better visualization.

4.1 Linear function

Using a polynomial of first degree, $f(x) = x + c$, to find a factor $p$ of a composite number $n$ is not effective, but the risk of failure is minimized. In the linear case the results are independent of the seed, so Table 4 and Figure 3 on page 20 represent the data for all seeds, i.e. $x_0 \in I_5$. This is explained in the proof of Theorem 4.1.

Theorem 4.1. Let $f(x) = x + c$ be a polynomial of first degree, let $x_0 = s$ be the seed and $n = p \cdot q$, where $p \le q$. Suppose $(c, n) = 1$. Then the number of steps needed to factorize $n$ is equal to $p$.

Proof. Let’s start from the definition of the sequence xi, i ∈ N:

x1≡ x0+ c (mod n) ⇐⇒ x1≡ s + c (mod n) x2≡ x1+ c (mod n) ⇐⇒ x2≡ s + 2c (mod n)

...

xk≡ xk−1+ c (mod n) ⇐⇒ xk≡ s + k · c (mod n)

To find the factor of n that is p, we need to find the greatest common divisor of | x2k− xk| and n for k ∈ N.

| x2− x1| =| s + 2c − s − c |=| c |

| x4− x2| =| s + 4c − s − 2c |=| 2c | ...

| x2k− xk | =| s + 2k · c − s − k · c |=| k · c |

We can see that the number of steps k does not depend on seed x0= s since we lose s when subtracting x2k and xk. Now we see that

(| x2k− xk|, n) = (| k · c |, n).

By the Floyd’s cycle finding algorithm we know that x2k and xk will be equal modulo p (also modulo n, but modulo p will happen first) so we know that for some k, the equation

(| x2k− xk|, n) = p is true. Now we wonder for which k this is happening.

Since (c, n) = 1 i.e. c and n are relatively prime, then c and p are relatively

(19)

prime as well. So the only candidate that can be divisible by p is k. In fact k must be equal to p, k cannot be a multiple of p because this is the first time we have found the match.

To find the number of steps needed to factor $n$, let's take a look at the following congruence:
$$x_k \equiv x_0 \pmod{p} \iff s + k \cdot c \equiv s \pmod{p}.$$
As stated before, $k = p$, so $p$ divides $k \cdot c$, and we can see that the number of steps really is $p$, because $x_k$ is the first element in the sequence $\{x_1, \ldots, x_k\}$ for which this is true.

Sometimes, in very few cases, it can happen that $(c, n) \neq 1$, i.e. $c$ and $n$ are not relatively prime. We would like to know how effective the method is and how large $k$ can get in that case. Then
$$(c, n) = p \implies c = a \cdot p \implies (k \cdot a \cdot p,\; p \cdot q) = p \implies k = 1, \quad a \in \mathbb{N}.$$
We know $k = 1$ because this happens already in the first step: $|x_2 - x_1| = |c|$ and $(|x_2 - x_1|, n) = p$.

We would have the same reasoning for $q$.

In case $a = q$ we have the following situation. Let's take a look at Table 4 and Figure 3. For the polynomial of first degree $f(x) = x + c$, where $c \in I_{10}$ and the seed $x_0 \in I_5$, we have the following results. For constants with non-prime absolute value, i.e. $c \in \{\pm 4, \pm 6, \pm 8, \pm 9, \pm 10\}$, the risk of failure is not zero. Here is what is happening: the constant matches exactly one $n_i$, and that is why neither $p$ nor $q$ can be calculated for that number. Let $x_0 = s$; we have
$$x_1 \equiv s + c \pmod{n_i}, \qquad x_2 \equiv s + 2c \pmod{n_i}, \qquad (|x_2 - x_1|, n_i) = (|c|, n_i) = n_i.$$

We can see from the effectiveness results for the linear function that the mean value is high, whereas the error is almost zero. In conclusion, the linear function is not good for use in Pollard's rho method, because it takes too long to factorize the given number $n$ into prime numbers. It would be simpler to use trial division.
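As a concrete check of Theorem 4.1 (a small worked example added here): take $n = 1403 = 23 \cdot 61$ and $c = 1$. Then $|x_{2k} - x_k| = k$, and the first $k$ for which
$$(k, 1403) > 1$$
is $k = 23 = p$, so the factor is found after exactly $p$ steps, independently of the seed.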

Seed x_0 ∈ I_5

Const.   Mean      Risk (%)
-10      34.5225   0.000446199
-9       34.5265   0.000446199
-8       34.5279   0.000446199
-7       34.5231   0
-6       34.5244   0.000446199
-5       34.5245   0
-4       34.5279   0.000446199
-3       34.5264   0
-2       34.5277   0
-1       34.5298   0
0        -         100
1        34.5298   0
2        34.5277   0
3        34.5264   0
4        34.5279   0.000446199
5        34.5245   0
6        34.5244   0.000446199
7        34.5231   0
8        34.5279   0.000446199
9        34.5265   0.000446199
10       34.5225   0.000446199

Table 4: Table of f(x) = x + c, x_0 ∈ I_5

(a) Effectiveness (b) Performance

Figure 3: Graphs of f (x) = x + c, x0∈ I5


4.2 Quadratic function

In this section we consider the polynomial of second degree $f(x) = x^2 + c$, where the constant $c \in I_{10}$ and the seed $x_0 \in I_5$.

In this case we get the same results for the seeds $x_0 = a$ and $x_0 = -a$. This is of course because we are using a quadratic function, which is an even function. Let's take a look at the tables and figures.

Looking at the risk of failure in the tables and figures for the quadratic function, we can see that the algorithm fails in a few cases, and for those we were unable to calculate the mean value. The risk of failure was 100%, which means we could not obtain the factor $p$; we were getting $n$ instead. This happens because $x_i$ starts to repeat at the very beginning and the algorithm falls into a loop. This can be fixed by changing the seed $x_0$.

Looking at the mean value, effectiveness is best when the mean is as low as possible: that means we have factorized $n$ in the fewest possible steps $k$. We should not consider the cases where the algorithm has failed.

We can see from Table 6 on page 24 and Figure 8 on page 25 that the best result for the mean is obtained for $x_0 \in \{-4, 4\}$, $c = -5$, and it is $\mu = 0.689403$.

The risk of failure is likewise best when it is as low as possible. The best result for the risk of failure is obtained for $x_0 \in \{-4, 4\}$, $c = -2$, and it is $\varphi = 0.99681$. We can now compare our best results with Pollard's chosen function, $f(x) = x^2 + 1$ with seed $x_0 = 2$. From Table 5 on page 22 and Figure 6 on page 23 we can see that its mean is $\mu = 0.744417$, which is higher than our best mean. However, its risk is $\varphi = 1.18644$, which is lower than the risk obtained for our best mean result.

Seed 0 1 2

Const. Mean(0) Risk(0) Mean(1) Risk(1) Mean(2) Risk(2) -10 0.792087 1.08337 0.756351 1.14227 0.756478 1.08516 -9 0.816192 1.1445 0.766845 1.19314 0.761913 1.23419 -8 0.731411 1.18064 0.704027 1.26141 0.753825 1.17841 -7 0.747895 1.27925 0.764458 1.26988 0.447214 99.7015 -6 0.773843 1.14986 0.774578 1.17395 - 100 -5 0.780828 1.18109 0.708133 1.25694 0.730283 1.28996 -4 0.744161 1.31093 0.754948 1.27568 0.779367 1.28996 -3 0.782553 1.27167 0.57735 99.7015 0.57735 99.7015

-2 0.707107 99.7019 - 100 - 100

-1 - 100 - 100 0.777832 1.16012

0 - 100 - 100 1.53791 1.28862

1 0.801209 1.19894 0.769188 1.21099 0.744417 1.18644 2 0.808359 1.21277 0.800752 1.1686 0.787183 1.19537 3 0.807505 1.24311 0.754518 1.25248 0.797933 1.19581 4 0.817556 1.11862 0.834975 1.08025 0.779986 1.08917 5 0.829791 1.20518 0.772939 1.17261 0.794189 1.2034 6 0.764032 1.29175 0.766711 1.24891 0.739942 1.23508 7 0.77213 1.16547 0.798088 1.1271 0.774138 1.10211 8 0.804717 1.24356 0.788333 1.17306 0.747466 1.28059 9 0.78485 1.11505 0.759822 1.19358 0.756704 1.13959 10 0.799535 1.11148 0.753802 1.11416 0.751427 1.22303

Table 5: Table of f (x) = x2+ c, x0∈ {0, 1, 2}


(a) Effectiveness (b) Performance

Figure 4: Graphs of f (x) = x2+ c, x0= 0

(a) Effectiveness (b) Performance

Figure 5: Graphs of f (x) = x2+ c, x0= 1

(a) Effectiveness (b) Performance

Figure 6: Graphs of f (x) = x2+ c, x0= 2


Seed 3 4 5

Const. Mean(3) Risk(3) Mean(4) Risk(4) Mean(5) Risk(5) -10 0.783265 1.13335 0.755133 1.08426 0.767172 1.19983 -9 0.840004 1.18332 0.756325 1.25114 0.742966 1.20608 -8 0.730385 1.31629 0.729675 1.17975 0.732192 1.26631 -7 0.447214 99.7015 0.737124 1.20563 0.729439 1.23776

-6 - 100 0.75151 1.22794 0.749924 1.2266

-5 0.710078 1.25739 0.689403 1.25873 0.756627 1.16994 -4 0.730447 1.24311 0.717599 1.29666 0.705765 1.21054 -3 0.763236 1.25337 0.770243 1.21143 0.738579 1.20608 -2 1.18825 1.04991 1.21383 0.99681 1.16348 1.03608 -1 0.751582 1.14673 0.743151 1.13379 0.75842 1.15923 0 1.53672 1.20072 1.53433 1.27435 1.53358 1.29576 1 0.795063 1.07668 0.786792 1.12844 0.72753 1.19314 2 0.783195 1.14093 0.773397 1.13736 0.768649 1.22259 3 0.784506 1.20786 0.736326 1.27524 0.836537 1.19849 4 0.798175 1.1155 0.791002 1.0972 0.810508 1.06731 5 0.78923 1.20295 0.791119 1.10881 0.804779 1.19715 6 0.773011 1.20028 0.698214 1.25873 0.757465 1.28372 7 0.777249 1.11996 0.766199 1.14316 0.793618 1.09765 8 0.801383 1.22259 0.743982 1.25337 0.775377 1.244 9 0.77913 1.17931 0.736612 1.15744 0.765779 1.15387 10 0.772206 1.14004 0.753407 1.19626 0.764834 1.26498

Table 6: Table of f (x) = x2+ c, x0∈ {3, 4, 5}


(a) Effectiveness (b) Performance

Figure 7: Graphs of f (x) = x2+ c, x0= 3

(a) Effectiveness (b) Performance

Figure 8: Graphs of f (x) = x2+ c, x0= 4

(a) Effectiveness (b) Performance

Figure 9: Graphs of f (x) = x2+ c, x0= 5


4.3 Cubic function

In this section we consider the polynomial of third degree $f(x) = x^3 + c$, where the constant $c \in I_{10}$ and the seed $x_0 \in I_5$. By looking at the tables and figures we can notice a symmetry between $c$ and $-c$ about the mean value axis. Here is what is happening.

Let's separate $f(x)$ into $f_1(x) = x^3 + c$ and $f_2(x) = x^3 - c$. Now
$$f_1(-x) = -x^3 + c = -(x^3 - c) = -f_2(x)$$
when $f_1$ is started from the seeds $\{-5, -4, -3, -2, -1, 0\}$ and $f_2$ from the seeds $\{0, 1, 2, 3, 4, 5\}$, and
$$f_2(-x) = -x^3 - c = -(x^3 + c) = -f_1(x)$$
when $f_2$ is started from the seeds $\{-5, -4, -3, -2, -1, 0\}$ and $f_1$ from the seeds $\{0, 1, 2, 3, 4, 5\}$.

This is all due to the odd symmetry of the cubic function. But here, instead of the value of the function $f(x)$, we are looking at the value of the mean $\mu$. The way to see it is that, already before the computation of $\mu$ starts, the iterations produce the same values (up to sign) for the corresponding constants and seeds.

The algorithm fails for $c = 6$, $x_0 = -2$; for $c = 0$, $x_0 \in \{-1, 0, 1\}$; and for $c = -6$, $x_0 = 2$. This happens because for those parameters the algorithm falls into a loop and cannot proceed with the calculations.

We can see that the best result for the mean is obtained for $x_0 \in \{-4, 4\}$, $c = 0$, and it is $\mu = 1.32419$, which can be seen in Figure 11 on page 28 and Figure 19 on page 32. The risk of failure is lowest for $x_0 = -2$, $c = 4$ and for $x_0 = 2$, $c = -4$, where it is $\varphi = 0.519822$, which can be seen in Figure 13 on page 28 and Figure 17 on page 30.
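A slightly more explicit way to see this symmetry (an added remark following the same idea): if $x_k$ denotes the sequence generated by $x \mapsto x^3 + c$ from the seed $s$, then $z_k := -x_k$ is exactly the sequence generated by $x \mapsto x^3 - c$ from the seed $-s$, since
$$z_k = -x_k \equiv -\left(x_{k-1}^3 + c\right) = (-x_{k-1})^3 - c = z_{k-1}^3 - c \pmod{n},$$
and $|z_{2k} - z_k| = |x_{2k} - x_k|$, so both parameter choices give the same greatest common divisors, the same number of steps and hence the same mean.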


Seed -5 -4 -3 -2

Const. Mean(-5) Risk(-5) Mean(-4) Risk(-4) Mean(-3) Risk(-3) Mean(-2) Risk(-2) -10 3.80466 0.542132 3.59034 0.580952 3.97583 0.535439 3.90421 0.543025 -9 3.61649 0.587199 3.63042 0.579167 3.39783 0.587645 3.71239 0.554626 -8 3.92209 0.571582 4.01371 0.539455 3.94723 0.596569 4.00365 0.591661 -7 4.1822 0.539901 4.16058 0.585414 3.85387 0.55418 3.86361 0.568458 -6 4.04816 0.573366 4.00952 0.574705 3.8978 0.598354 4.00457 0.544363 -5 3.99173 0.669299 3.81298 0.630926 3.89176 0.605939 4.00395 0.539009 -4 3.92987 0.561765 3.91824 0.613524 3.91146 0.547933 4.1035 0.553287 -3 3.87929 0.568458 3.7393 0.58943 3.79232 0.663052 3.763 0.569797 -2 3.98651 0.568012 3.99087 0.544363 3.92298 0.567566 3.848 0.661268 -1 3.95045 0.527408 3.98704 0.546148 3.84844 0.538563 3.9751 0.627356 0 1.4887 1.09631 1.32419 1.13959 1.50338 1.1039 1.40353 1.1628 1 3.82584 0.551056 4.13201 0.59523 4.07108 0.53187 3.74117 0.568904 2 4.02797 0.576936 4.00085 0.557749 3.59417 0.55998 3.94021 0.569797 3 3.91545 0.545702 3.8376 0.605493 3.75428 0.573813 3.93232 0.546148 4 3.91363 0.546594 4.08732 0.537224 4.14739 0.53187 4.10548 0.519822 5 4.08356 0.631372 3.79248 0.573366 4.00194 0.55998 4.01538 0.560873

6 4.10283 0.568458 3.93032 0.597907 4.16432 0.574705 - 100

7 3.84828 0.562658 3.89962 0.545256 3.95624 0.568012 3.83729 0.559088 8 3.92404 0.596123 3.7002 0.594784 3.92799 0.610847 3.71652 0.66573 9 3.50122 0.555072 3.90982 0.592553 3.80997 0.599246 3.858 0.576936 10 3.785 0.544363 3.92379 0.55998 3.82397 0.573813 3.92147 0.580059

Table 7: Table of f (x) = x3+ c, x0∈ {−5, −4, −3, −2}


(a) Effectiveness (b) Performance

Figure 10: Graphs of f (x) = x3+ c, x0= −5

(a) Effectiveness (b) Performance

Figure 11: Graphs of f (x) = x3+ c, x0= −4

(a) Effectiveness (b) Performance

Figure 12: Graphs of f (x) = x3+ c, x0= −3

(a) Effectiveness (b) Performance

Figure 13: Graphs of f (x) = x3+ c, x0= −2


Seed -1 0 1 2

Const. Mean(-1) Risk(-1) Mean(0) Risk(0) Mean(1) Risk(1) Mean(2) Risk(2) -10 3.67301 0.578275 3.89801 0.63628 3.9905 0.566673 3.92147 0.580059

-9 3.84526 0.609509 3.71225 0.640742 3.78468 0.527854 3.858 0.576936 -8 3.99668 0.558196 3.6976 0.642081 3.77843 0.57649 3.71652 0.66573 -7 3.77434 0.581398 3.91072 0.634496 3.82626 0.546148 3.83729 0.559088 -6 3.71156 0.544363 4.17645 0.601031 4.06236 0.557303 - 100 -5 3.70693 0.548379 4.00636 0.655467 3.82481 0.600138 4.01538 0.560873 -4 3.94069 0.579613 3.9336 0.616202 3.92248 0.544363 4.10548 0.519822 -3 3.75772 0.576044 3.81216 0.66573 3.77787 0.565781 3.93232 0.546148 -2 3.93059 0.555965 3.86087 0.656359 3.94914 0.556411 3.94021 0.569797 -1 3.98326 0.606831 3.99984 0.622894 4.02406 0.63048 3.74117 0.568904

0 - 100 - 100 - 100 1.40353 1.1628

1 4.02406 0.63048 3.99984 0.622894 3.98326 0.606831 3.9751 0.627356 2 3.94914 0.556411 3.86087 0.656359 3.93059 0.555965 3.848 0.661268 3 3.77787 0.565781 3.81216 0.66573 3.75772 0.576044 3.763 0.569797 4 3.92248 0.544363 3.9336 0.616202 3.94069 0.579613 4.1035 0.553287 5 3.82481 0.600138 4.00636 0.655467 3.70693 0.548379 4.00395 0.539009 6 4.06236 0.557303 4.17645 0.601031 3.71156 0.544363 4.00457 0.544363 7 3.82626 0.546148 3.91072 0.634496 3.77434 0.581398 3.86361 0.568458 8 3.77843 0.57649 3.6976 0.642081 3.99668 0.558196 4.00365 0.591661 9 3.78468 0.527854 3.71225 0.640742 3.84526 0.609509 3.71239 0.554626 10 3.9905 0.566673 3.89801 0.63628 3.67301 0.578275 3.90421 0.543025

Table 8: Table of f (x) = x3+ c, x0∈ {−1, 0, 1, 2}


(a) Effectiveness (b) Performance

Figure 14: Graphs of f (x) = x3+ c, x0= −1

(a) Effectiveness (b) Performance

Figure 15: Graphs of f (x) = x3+ c, x0= 0

(a) Effectiveness (b) Performance

Figure 16: Graphs of f (x) = x3+ c, x0= 1

(a) Effectiveness (b) Performance

Figure 17: Graphs of f (x) = x3+ c, x0= 2


Seed 3 4 5

Const. Mean(3) Risk(3) Mean(4) Risk(4) Mean(5) Risk(5) -10 3.82397 0.573813 3.92379 0.55998 3.785 0.544363

-9 3.80997 0.599246 3.90982 0.592553 3.50122 0.555072 -8 3.92799 0.610847 3.7002 0.594784 3.92404 0.596123 -7 3.95624 0.568012 3.89962 0.545256 3.84828 0.562658 -6 4.16432 0.574705 3.93032 0.597907 4.10283 0.568458 -5 4.00194 0.55998 3.79248 0.573366 4.08356 0.631372 -4 4.14739 0.53187 4.08732 0.537224 3.91363 0.546594 -3 3.75428 0.573813 3.8376 0.605493 3.91545 0.545702 -2 3.59417 0.55998 4.00085 0.557749 4.02797 0.576936 -1 4.07108 0.53187 4.13201 0.59523 3.82584 0.551056 0 1.50338 1.1039 1.32419 1.13959 1.4887 1.09631 1 3.84844 0.538563 3.98704 0.546148 3.95045 0.527408 2 3.92298 0.567566 3.99087 0.544363 3.98651 0.568012 3 3.79232 0.663052 3.7393 0.58943 3.87929 0.568458 4 3.91146 0.547933 3.91824 0.613524 3.92987 0.561765 5 3.89176 0.605939 3.81298 0.630926 3.99173 0.669299 6 3.8978 0.598354 4.00952 0.574705 4.04816 0.573366 7 3.85387 0.55418 4.16058 0.585414 4.1822 0.539901 8 3.94723 0.596569 4.01371 0.539455 3.92209 0.571582 9 3.39783 0.587645 3.63042 0.579167 3.61649 0.587199 10 3.97583 0.535439 3.59034 0.580952 3.80466 0.542132

Table 9: Table of f (x) = x3+ c, x0∈ {3, 4, 5}


(a) Effectiveness (b) Performance

Figure 18: Graphs of f (x) = x3+ c, x0= 3

(a) Effectiveness (b) Performance

Figure 19: Graphs of f (x) = x3+ c, x0= 4

(a) Effectiveness (b) Performance

Figure 20: Graphs of f (x) = x3+ c, x0= 5


4.4 Quartic function

In this section we consider the polynomial of fourth degree $f(x) = x^4 + c$, where the constant $c \in I_{10}$ and the seed $x_0 \in I_5$.

In this case we get the same results for the seeds $x_0 = a$ and $x_0 = -a$. This is of course because we are using a quartic function, which is an even function.

Let's take a look at the tables and figures. From Table 10 on page 34, Figure 21 and Figure 22 on page 35, looking at the risk of failure we can see that the algorithm fails for the seed $x_0 = 0$ with constants $c \in \{-1, 0\}$, and for the seed $x_0 = 1$ with $c \in \{-2, -1, 0\}$. This happens because $x_i$ starts to repeat at the very beginning and the algorithm falls into a loop. This can be fixed by changing the seed $x_0$.

Looking at the mean value, effectiveness is best when the mean is as low as possible: that means we have factorized $n$ in the fewest possible steps $k$. We should not consider the cases where the algorithm has failed.

We can see from Table 11 on page 36 and Figure 26 on page 37 that the best result for the mean is obtained for $x_0 \in \{-5, 5\}$ (here we also take $x_0 = -5$ because $f(x) = x^4 + c$ is an even function), $c = -2$, and it is $\mu = 0.536115$.

This is the best mean out of all combinations of polynomial degrees, constants and seeds. However, it is not that simple to call it the best result overall, because we should also look at the risk of failure, which in this case is $\varphi = 1.48763$. The second best result for the mean, obtained for the quadratic function with $x_0 \in \{-4, 4\}$, $c = -5$, is $\mu = 0.689403$ with a risk of failure of $\varphi = 1.25873$, which is lower than in the previous case.

The risk of failure is likewise best when it is as low as possible. The best result for the risk of failure is obtained for $x_0 \in \{-2, 2\}$, $c = 6$, and it is $\varphi = 1.36359$, which is shown in Table 10 on page 34 and Figure 23 on page 35.

Seed 0 1 2

Const. Mean(0) Risk(0) Mean(1) Risk(1) Mean(2) Risk(2) -10 0.578034 1.67816 0.586773 1.56081 0.554619 1.56348 -9 0.621947 1.50815 0.580252 1.49075 0.598627 1.62461 -8 0.574618 1.55054 0.57976 1.52065 0.570831 1.53805 -7 0.595238 1.59695 0.582752 1.52734 0.579523 1.55813 -6 0.625751 1.46309 0.578917 1.61703 0.589953 1.60944 -5 0.641293 1.50191 0.585796 1.57285 0.582277 1.49566 -4 0.599996 1.51931 0.576663 1.58802 0.592209 1.53448 -3 0.588939 1.48763 0.580316 1.49031 0.561086 1.51039

-2 0.589504 1.47603 - 100 0.565 1.46666

-1 - 100 - 100 0.590706 1.43721

0 - 100 - 100 0.837491 1.4323

1 0.635563 1.46175 0.606084 1.45327 0.58699 1.42828 2 0.615039 1.50771 0.60194 1.43944 0.587998 1.48138 3 0.60384 1.50235 0.588478 1.45506 0.603049 1.4149 4 0.610959 1.54608 0.622819 1.48406 0.604183 1.47781 5 0.603176 1.52779 0.595418 1.45238 0.558111 1.52734 6 0.606354 1.42338 0.593368 1.4729 0.577539 1.36359 7 0.587421 1.62684 0.588739 1.57062 0.575562 1.5202 8 0.635579 1.56973 0.594476 1.50191 0.621188 1.48674 9 0.60956 1.51128 0.592025 1.54564 0.578189 1.49075 10 0.594957 1.56527 0.565728 1.53716 0.564386 1.52422

Table 10: Table of f (x) = x4+ c, x0∈ {0, 1, 2}


(a) Effectiveness (b) Performance

Figure 21: Graphs of f (x) = x4+ c, x0= 0

(a) Effectiveness (b) Performance

Figure 22: Graphs of f (x) = x4+ c, x0= 1

(a) Effectiveness (b) Performance

Figure 23: Graphs of f (x) = x4+ c, x0= 2


Seed 3 4 5

Const. Mean(3) Risk(3) Mean(4) Risk(4) Mean(5) Risk(5) -10 0.566437 1.55991 0.56414 1.52645 0.55713 1.58669 -9 0.567135 1.57732 0.618809 1.55144 0.587541 1.53359 -8 0.603329 1.47692 0.5665 1.59427 0.577955 1.43765 -7 0.561957 1.4149 0.551719 1.51395 0.573612 1.52065 -6 0.592711 1.56795 0.576693 1.56348 0.557261 1.57642 -5 0.600676 1.4564 0.561785 1.53582 0.610813 1.45238 -4 0.554431 1.55768 0.572916 1.52243 0.570336 1.50191 -3 0.560809 1.47871 0.57599 1.45149 0.569463 1.60855 -2 0.564293 1.47335 0.550878 1.50592 0.536115 1.48763 -1 0.581284 1.3917 0.566848 1.46398 0.584453 1.46577 0 0.839376 1.36894 0.836215 1.42739 0.829052 1.44122 1 0.561622 1.60766 0.564308 1.58802 0.549468 1.52154 2 0.583707 1.41713 0.605355 1.49075 0.582768 1.52199 3 0.57757 1.47871 0.566143 1.48941 0.570705 1.46978 4 0.588525 1.40821 0.582647 1.49878 0.596442 1.45818 5 0.590758 1.50815 0.570855 1.4439 0.584836 1.50994 6 0.606903 1.40642 0.58053 1.50414 0.610313 1.3792 7 0.598092 1.46666 0.559539 1.57999 0.57385 1.64112 8 0.583466 1.60989 0.570144 1.61435 0.583243 1.49298 9 0.593907 1.48763 0.569294 1.57285 0.575975 1.52109 10 0.58438 1.51306 0.585347 1.48495 0.578035 1.58356

Table 11: Table of f (x) = x4+ c, x0∈ {3, 4, 5}


(a) Effectiveness (b) Performance

Figure 24: Graphs of f (x) = x4+ c, x0= 3

(a) Effectiveness (b) Performance

Figure 25: Graphs of f (x) = x4+ c, x0= 4

(a) Effectiveness (b) Performance

Figure 26: Graphs of f (x) = x4+ c, x0= 5

4.5 Quintic function

In this section we consider the polynomial of fifth degree $f(x) = x^5 + c$, where the constant $c \in I_{10}$ and the seed $x_0 \in I_5$. In this case, like for the polynomial of third degree, there is a symmetry between $c$ and $-c$ about the mean value axis. From Table 13 on page 40 we can see that for the quintic function the algorithm fails only for $c = 0$ and $x_0 \in \{-1, 0, 1\}$. The best result for the mean is obtained for $x_0 \in \{-4, 4\}$, $c = 0$, and it is $\mu = 1.33542$, which can be seen in Table 12 on page 38, Table 14 on page 42, Figure 28 on page 39 and Figure 36 on page 43. The risk of failure is lowest for $x_0 = -1$, $c = 10$ and for $x_0 = 1$, $c = -10$, where it is $\varphi = 0.3052$, which can be seen in Table 13 on page 40, Figure 31 and Figure 33 on page 41.

Seed -5 -4 -3 -2

Const. Mean(-5) Risk(-5) Mean(-4) Risk(-4) Mean(-3) Risk(-3) Mean(-2) Risk(-2) -10 8.05605 0.33465 7.82874 0.314124 8.16875 0.343127 8.11903 0.325279 -9 8.25223 0.323495 8.59994 0.33465 8.48745 0.340004 8.30781 0.360975 -8 7.87359 0.327064 8.11443 0.314571 8.23999 0.330634 8.09445 0.314571 -7 8.57486 0.311447 7.65584 0.361868 7.64465 0.355175 7.89439 0.363206 -6 8.11402 0.33108 8.06713 0.330188 8.17419 0.328403 8.3993 0.313232 -5 7.55322 0.437276 7.33845 0.349374 7.64857 0.367668 7.31571 0.353836 -4 7.42707 0.323048 7.83382 0.425228 8.07086 0.323048 8.19274 0.343574 -3 8.11225 0.323048 7.79628 0.337327 7.87295 0.435491 8.18423 0.341343 -2 7.94896 0.332419 7.84806 0.344912 8.56072 0.33465 8.36567 0.450215 -1 7.90995 0.342681 8.33117 0.331972 7.83927 0.365884 7.90843 0.46137

0 1.42819 1.10881 1.33542 1.26899 1.44949 1.17395 1.36185 1.22437 1 7.91991 0.313232 7.56784 0.338665 7.65367 0.329741 7.94223 0.335988 2 7.92568 0.343127 7.58059 0.32751 7.84405 0.314571 8.14193 0.326618 3 8.21397 0.325279 7.71775 0.349374 8.05457 0.335542 8.34516 0.346697 4 7.99645 0.35696 7.76634 0.33465 7.75271 0.33465 8.11959 0.319479 5 7.63373 0.360083 7.42459 0.306985 7.5268 0.330634 7.68637 0.340896 6 8.23288 0.368561 8.09798 0.373469 8.00834 0.322156 8.18855 0.340004 7 8.12628 0.346697 7.59103 0.34982 8.18183 0.355175 7.85946 0.351159 8 8.23855 0.34045 7.514 0.338665 8.17329 0.30877 7.85251 0.355175 9 8.18732 0.341343 8.33238 0.339112 7.99145 0.363206 7.893 0.338219 10 7.73273 0.335988 7.87298 0.321264 7.92102 0.329741 8.04261 0.335542

Table 12: Table of f (x) = x5+ c, x0∈ {−5, −4, −3, −2}


(a) Effectiveness (b) Performance

Figure 27: Graphs of f (x) = x5+ c, x0= −5

(a) Effectiveness (b) Performance

Figure 28: Graphs of f (x) = x5+ c, x0= −4

(a) Effectiveness (b) Performance

Figure 29: Graphs of f (x) = x5+ c, x0= −3

(a) Effectiveness (b) Performance

Figure 30: Graphs of f (x) = x5+ c, x0= −2


Seed -1 0 1 2

Const. Mean(-1) Risk(-1) Mean(0) Risk(0) Mean(1) Risk(1) Mean(2) Risk(2) -10 7.84458 0.339558 7.69684 0.446646 8.23873 0.3052 8.04261 0.335542

-9 8.13953 0.351159 8.24881 0.441738 8.05033 0.352051 7.893 0.338219 -8 7.86784 0.343574 7.97742 0.457801 7.6714 0.310109 7.85251 0.355175 -7 7.52651 0.371684 7.81877 0.457354 8.02505 0.323495 7.85946 0.351159 -6 8.53853 0.336881 7.72168 0.469848 8.11871 0.316355 8.18855 0.340004 -5 7.65756 0.332865 7.56109 0.446646 7.34377 0.348928 7.68637 0.340896 -4 7.43475 0.33465 7.84367 0.427459 8.07838 0.314124 8.11959 0.319479 -3 7.80518 0.332865 7.87954 0.430583 8.1955 0.341343 8.34516 0.346697 -2 8.56858 0.335096 8.37585 0.472079 8.57747 0.341343 8.14193 0.326618 -1 7.91385 0.460478 7.92403 0.44843 7.93524 0.44843 7.94223 0.335988

0 - 100 - 100 - 100 1.36185 1.22437

1 7.93524 0.44843 7.92403 0.44843 7.91385 0.460478 7.90843 0.46137 2 8.57747 0.341343 8.37585 0.472079 8.56858 0.335096 8.36567 0.450215 3 8.1955 0.341343 7.87954 0.430583 7.80518 0.332865 8.18423 0.341343 4 8.07838 0.314124 7.84367 0.427459 7.43475 0.33465 8.19274 0.343574 5 7.34377 0.348928 7.56109 0.446646 7.65756 0.332865 7.31571 0.353836 6 8.11871 0.316355 7.72168 0.469848 8.53853 0.336881 8.3993 0.313232 7 8.02505 0.323495 7.81877 0.457354 7.52651 0.371684 7.89439 0.363206 8 7.6714 0.310109 7.97742 0.457801 7.86784 0.343574 8.09445 0.314571 9 8.05033 0.352051 8.24881 0.441738 8.13953 0.351159 8.30781 0.360975 10 8.23873 0.3052 7.69684 0.446646 7.84458 0.339558 8.11903 0.325279

Table 13: Table of f (x) = x5+ c, x0∈ {−1, 0, 1, 2}


(a) Effectiveness (b) Performance

Figure 31: Graphs of f (x) = x5+ c, x0= −1

(a) Effectiveness (b) Performance

Figure 32: Graphs of f (x) = x5+ c, x0= 0

(a) Effectiveness (b) Performance

Figure 33: Graphs of f (x) = x5+ c, x0= 1

(a) Effectiveness (b) Performance

Figure 34: Graphs of f (x) = x5+ c, x0= 2


Seed 3 4 5

Const. Mean(3) Risk(3) Mean(4) Risk(4) Mean(5) Risk(5) -10 7.92102 0.329741 7.87298 0.321264 7.73273 0.335988

-9 7.99145 0.363206 8.33238 0.339112 8.18732 0.341343 -8 8.17329 0.30877 7.514 0.338665 8.23855 0.34045 -7 8.18183 0.355175 7.59103 0.34982 8.12628 0.346697 -6 8.00834 0.322156 8.09798 0.373469 8.23288 0.368561 -5 7.5268 0.330634 7.42459 0.306985 7.63373 0.360083 -4 7.75271 0.33465 7.76634 0.33465 7.99645 0.35696 -3 8.05457 0.335542 7.71775 0.349374 8.21397 0.325279 -2 7.84405 0.314571 7.58059 0.32751 7.92568 0.343127 -1 7.65367 0.329741 7.56784 0.338665 7.91991 0.313232 0 1.44949 1.17395 1.33542 1.26899 1.42819 1.10881 1 7.83927 0.365884 8.33117 0.331972 7.90995 0.342681 2 8.56072 0.33465 7.84806 0.344912 7.94896 0.332419 3 7.87295 0.435491 7.79628 0.337327 8.11225 0.323048 4 8.07086 0.323048 7.83382 0.425228 7.42707 0.323048 5 7.64857 0.367668 7.33845 0.349374 7.55322 0.437276 6 8.17419 0.328403 8.06713 0.330188 8.11402 0.33108 7 7.64465 0.355175 7.65584 0.361868 8.57486 0.311447 8 8.23999 0.330634 8.11443 0.314571 7.87359 0.327064 9 8.48745 0.340004 8.59994 0.33465 8.25223 0.323495 10 8.16875 0.343127 7.82874 0.314124 8.05605 0.33465

Table 14: Table of f (x) = x5+ c, x0∈ {3, 4, 5}


(a) Effectiveness (b) Performance

Figure 35: Graphs of f (x) = x5+ c, x0= 3

(a) Effectiveness (b) Performance

Figure 36: Graphs of f (x) = x5+ c, x0= 4

(a) Effectiveness (b) Performance

Figure 37: Graphs of f (x) = x5+ c, x0= 5


5 Discussion

In this work our goal was to find a polynomial function, with an adequate seed, that gives the most favourable results. In the case of Pollard's rho method we cannot speak of a single best result, because we are always balancing effectiveness against performance. Let's take a look at the table of the best results:

Degree   Best mean   Risk (%)      Best risk (%)   Mean
1        34.5225     0.000446199   0               34.5231
2        0.689403    1.25873       0.99681         1.21383
3        1.32419     1.13959       0.519822        4.10548
4        0.536115    1.48763       1.36359         0.577539
5        1.33542     1.26899       0.3052          8.23873

Table 15: Table of the best results

We can see clearly that the best, i.e. lowest, mean is obtained for the quartic function: it equals $\mu = 0.536115$, obtained for $x_0 \in \{-5, 5\}$ and $c = -2$. The corresponding risk is $\varphi = 1.48763$, which is higher than the risk for the best quadratic result. We also have to take into consideration that computing with a function of higher degree takes longer, i.e. more operations are needed in order to obtain the results.

The best, i.e. lowest, risk is obtained for the linear function. With the linear function, however, we have the problem of a high mean value, in this case $\mu = 34.5231$, which is much higher than any other mean value for the functions we investigated. The linear case can be considered when almost 100% certainty of no error is needed. Nevertheless, this function is not recommended for use in Pollard's rho method, because a key feature of the method is that the iterating function should generate numbers that can be considered random. The running time is so long that it is preferable to use other, simpler factorization methods, even trial division.

The quadratic function gives well-balanced results. The mean $\mu = 0.689403$ is the second best, and the corresponding risk of failure is $\varphi = 1.25873$, which lies in the middle of the risks obtained at the best mean values. It also has the fastest running time among the functions considered. The best result for the quadratic function is obtained for $x_0 \in \{-4, 4\}$ and $c = -5$, i.e. the function $f(x) = x^2 - 5$. This is an interesting result, given that we started the investigation from Pollard's chosen function $f(x) = x^2 + 1$ with seed $x_0 = 2$; the results for that case, $\mu = 0.744417$ and $\varphi = 1.18644$, are close to our best results for the quadratic function.

We can see that the mean values are lower for even-degree functions, while the risk is a little higher. This can be explained by the fact that polynomial functions of even degree are not injective.

When analysing the values of $x_k \equiv (x_{k-1})^d + c \pmod{n}$ we notice that it is possible to get $x_k \equiv a$ and $x_{k+h} \equiv -a \pmod{n}$ for some $h \in \mathbb{N}$, or vice versa. In the case of even $d$ this means that in the next step we get $x_{k+1} = x_{k+1+h} \equiv a^d + c \pmod{n}$. This can help us reach a cycle earlier and therefore factorize $n$ faster. As a result, the mean value will be lower.

The results for the cubic and quintic functions are similar. The best mean for the cubic function is $\mu = 1.32419$, for $x_0 \in \{-4, 4\}$ and $c = 0$, while its risk of failure is lowest for $x_0 = -2$, $c = 4$ and $x_0 = 2$, $c = -4$, where it is $\varphi = 0.519822$. For the quintic function the best mean is $\mu = 1.33542$, for $x_0 \in \{-4, 4\}$ and $c = 0$, while the risk of failure is lowest for $x_0 = -1$, $c = 10$ and $x_0 = 1$, $c = -10$, where it is $\varphi = 0.3052$.

In conclusion, it is better to use polynomial functions of even degree for Pollard's rho method: they are more effective, as they factor the given number faster. Among the even-degree functions it is better to choose one of lower degree, in order to use fewer operations while computing. We should also keep in mind the relationship between effectiveness and performance, since those values are inversely correlated.

References

[1] John M. Pollard. A Monte Carlo method for factorization. Kluwer Academic Publishers, 1975.

[2] Kenneth H. Rosen. Elementary Number Theory and Its Applications. AT&T Laboratories and Kenneth H. Rosen, 2005.

[3] Jeffrey Hoffstein, Jill Pipher, and Joseph H. Silverman. An Introduction to Mathematical Cryptography. Springer Science+Business Media, New York, 2008.

[4] Gunnar Blom. Probability and Statistics: Theory and Applications. Springer-Verlag New York Inc., 1989.
