Academic year: 2021

Probabilistic Pseudo-random Number Generators

Probabilistiska Pseudoslumptalsgeneratorer

AMINO, ROBERT (AMINO@KTH.SE)
BAITAR, JONI (BAITAR@KTH.SE)

Degree Project in Computer Science, DD143X
Supervisor: Austrin, Per
Examiner: Ekeberg, Örjan


Abstract

Random numbers are essential in many computer applications and games. The goal of this report is to examine two of the most commonly used random number generators and try to determine some of their strengths and weaknesses. These generators are the Linear Congruential Generator (LCG) and the Mersenne Twister (MT). The main objective is to determine which of the two is the better choice for low-intensity usage and everyday work.

Although some of the test results were inconclusive, there were indications that MT is the better pseudo-random number generator (PRNG) and therefore the preferred one. Be aware, however, that this is not a general guideline and that some implementations may differ from it. The final verdict was thus that MT is the more favourable option for everyday work (mainly due to its speed), on both a practical and a theoretical level, should a choice arise between the two.


Referat

Probabilistiska Pseudoslumptalsgeneratorer

Random numbers are an important component of many computer games, simulations and other programs. Two of the most common random number generators are the Linear Congruential Generator (LCG) and the Mersenne Twister (MT). The main question to be answered in this report is whether, for everyday use, one generator is preferable to the other. A number of tests will be performed in an attempt to find possible strengths and weaknesses of each generator.

Based on a small number of tests, MT is preferable to LCG, which agrees well with theory. Note, however, that this does not always hold, and differences between implementations of the two alternatives may contradict this statement. The conclusion is nevertheless that MT is recommended over LCG, mainly because of MT's fast generation speed.


Contents

1 Introduction
  1.1 Statement of Collaboration
  1.2 Extent
  1.3 Purpose
  1.4 Problem Statement
  1.5 Definitions
  1.6 Overview

2 Background
  2.1 What is a PRNG?
  2.2 How does a PRNG work?
  2.3 Properties of a RNG
  2.4 Initial seed value
  2.5 Specification
    2.5.1 LCG
    2.5.2 MT
  2.6 Statistical tests
    2.6.1 Chi-square test
    2.6.2 Kolmogorov-Smirnov test

3 Method
  3.1 Considerations
  3.2 Procedure
  3.3 Chi-square goodness-of-fit test
  3.4 Kolmogorov-Smirnov test

4 Results
  4.1 KS-test
    4.1.1 Sequence of 50 randomly generated numbers
  4.2 Chi-square test
    4.2.1 Null hypothesis rejection rates
    4.2.2 Large scale random number sequence rejection rates
  4.3 Time measurement tests on large number sequences
    4.3.1 Table of collected data

5 Discussion
  5.1 LCG
  5.2 MT
  5.3 Weaknesses in the method

6 Conclusion

Bibliography

List of Figures

Appendices
A
  A.1 Kolmogorov-Smirnov normal distribution tests on small sequence numbers
    A.1.1 Sample size 3
    A.1.2 Sample size 5
    A.1.3 Sample size 10
    A.1.4 Sample size 50
    A.1.5 Sample size 100
    A.1.6 Sample size 1000
B
  B.1 Chi-square divergence test
C
  C.1 Larger sequence random numbers
    C.1.1 Sample size 100
    C.1.2 Sample size 150
    C.1.3 Sample size 300
    C.1.4 Sample size 1000


Chapter 1

Introduction

Random number generators, commonly abbreviated RNG, have countless practical uses in basically any computing environment. Generating true random numbers, however, is quite difficult and sometimes nearly impossible, especially under high performance requirements or constraints. For this reason it is more convenient to generate numbers that merely appear to be random. This follows naturally from the deterministic nature and finite state of computers, which eventually give rise to patterns. To an unknowing human being the numbers appear random, but with enough information about the generator, and on closer inspection, they are in fact not truly random. Given enough information about a given state of a sequence of random numbers, the upcoming states can therefore be predicted. Trading randomness for performance is important for many applications and systems where speed is vital to the operation or procedure of a specific system, such as card games. Generators of this kind, which produce seemingly random numbers, are called pseudo-random number generators, or PRNG for short. Henceforth, to minimize needless repetition, pseudo-random number generators will be referred to as random number generators and vice versa.

1.1 Statement of Collaboration

This project has been a collaboration between Joni and Robert. The workload has been roughly evenly divided between the two. The authors have both edited and added sections independently of one another and at different times. The final version is thus composed of all the small pieces of text that were written along the way. Most of the testing was performed in the computer labs with both parties present.

1.2 Extent

Since there are a multitude of random number generators, all with different qualities, the extent of this paper will only cover two of the most common random number generators: the Linear Congruential Generator (LCG) and the Mersenne Twister (MT).

1.3 Purpose

The main purpose of this paper is to answer the question whether one of the random number generators is preferable to the other for ordinary, everyday desktop work. Tests will be run to determine whether there is sufficient evidence favouring one random number generator over the other.

1.4 Problem Statement

Both random number generators are excellent, each in its respective manner, and both are widely used as the default choice in many applications. But which one should you use, given the choice between the two? And which one produces the most random sequence of numbers, that is, the sequence of highest quality and greatest unpredictability?

• Given the choice, which of the two should you use?

• Which one produces the highest quality sequence of random numbers?

1.5 Definitions

Some phrases will be used repeatedly in this paper to refer to certain keywords. The most important ones follow.

RNG - Random Number Generator (used interchangeably with PRNG).
PRNG - Pseudo-random Number Generator (used interchangeably with RNG).
LCG - Linear Congruential Generator.
MCG - Multiplicative Congruential Generator.
MT - Mersenne Twister.


1.6 Overview

Chapter 2 Background - General information about the subject.
Chapter 3 Method - Configuration and procedure.
Chapter 4 Results - The raw data that was collected during testing.
Chapter 5 Discussion - Evaluation of the results.
Chapter 6 Conclusion - The outcome.
Bibliography - Sources will be listed here.


Chapter 2

Background

2.1 What is a PRNG?

A pseudo-random number generator (PRNG) is a deterministic algorithm for generating a sequence of random numbers. These numbers will appear to be completely random, but are in truth not so.[1] There are random number generators and methods that generate true random numbers, but they usually require additional hardware for gathering the numbers from the physical world, where randomness is just about everywhere. These latter generators are called hardware random number generators; they offer true random numbers, but usually at the cost of performance.[13]

2.2 How does a PRNG work?

A sequence of random numbers for LCG is generated by the formula listed below in equation (2.1).[11] LCG, which is one of the simplest and most common RNGs, uses this algorithm to generate its random number sequence. A single number in itself is not random; it is the sequence of numbers that is defined as random.

X_{k+1} = (a * X_k + b) mod M    (2.1)

First a starting number, X_k, called the seed value, is picked. For more information on this, please refer to section "2.4 Initial seed value". This value is then multiplied by a and incremented by b, two large primes. Lastly, a modulo operation is performed with the value M.[8] This value determines the cycle, or period, of the random number sequence. For LCG this value is usually set to 2^32. When this value is reached, it is said that a full cycle has been completed. From there the generator will repeat the same sequence once again until the seed value is reached, and then keep on repeating. This is obviously not very random. The operations up to this point give a new value to X_{k+1}, which is the next number in the sequence. This number is then fed back as input for the following iteration. The values of a, b and M are vital for the quality of the random sequence. If they are chosen poorly, then the sequence will be weak.
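To make the recurrence concrete, here is a minimal sketch in Python (used instead of Matlab purely for illustration). The default constants a = 16807, b = 0 and M = 2^31 - 1 are the classic Park-Miller parameters, the same ones behind the mcg16807 generator used later in the Method chapter; they are example values, not the only valid choice:

```python
def lcg(seed, a=16807, b=0, m=2**31 - 1):
    """Generate an endless stream of pseudo-random numbers in [0, 1).

    Implements X_{k+1} = (a*X_k + b) mod M from equation (2.1).
    """
    x = seed
    while True:
        x = (a * x + b) % m
        yield x / m  # scale the integer state down to the unit interval

gen = lcg(seed=1)
first = [next(gen) for _ in range(3)]

# The same seed always reproduces the exact same sequence:
gen2 = lcg(seed=1)
assert [next(gen2) for _ in range(3)] == first
```

Note how the entire state of the generator is the single integer x, which is why an LCG is so cheap, and also why observing one output reveals the whole future of the stream.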

2.3 Properties of a RNG

The most important properties of a random number generator can be grouped into three distinct categories.[6]

First Property Quality of the random number sequence

This is perhaps the most important aspect since the random number sequence should be as unpredictable as possible, that is random.

Second Property Speed

Applications where speed is critical and the randomness quality is of minor importance, for example a card game, require fast random number generators.

Third Property Periodicity and Cycles

This concerns the periodicity of the sequence, as in how long it will take before the sequence starts repeating itself.

2.4 Initial seed value

The seed value of a RNG determines the level of randomness of a sequence. In most computers, this value is usually set to the current value of the UNIX timer, which counts the number of seconds that have elapsed since 00:00:00 GMT on January 1, 1970. This offers a relatively unique seed value for consecutive sequence generations. If the same seed value is selected, then the random number sequence will be identical, given the same generator. The seed is the state of the sequence (along with the values of a and b), and from it the sequence of random numbers can be predicted.[5]

2.5 Specification

2.5.1 LCG

The Linear Congruential Generator (LCG) is one of the oldest, simplest and most common PRNGs. It relies on the formula previously presented in equation (2.1). It is the default generator behind Java's java.util.Random.[7] It computes each number based on the previous number, starting from a seed. When a full cycle has been reached, it starts over and repeats the same sequence of numbers produced up to that point. The short cycle is one of the inherent flaws of the LCG, whose cycle is 2^32.[2]


2.5.2 MT

The Mersenne Twister (MT) is a PRNG that was created by Makoto Matsumoto and Takuji Nishimura in 1997.[10] It is a linear feedback shift register generator and was created specifically to address the shortcomings of the older linear congruential generators. It features a very long period of 2^19937 - 1 and is both very fast and produces random number sequences of high quality. It is the default PRNG in Matlab and is also one of the most commonly used PRNGs overall. It is much more complex than LCG and requires more memory to run, primarily due to its large internal state. Unlike LCG, it uses an algorithm that performs series of bitwise AND, XOR and shift operations. Due to its complexity, MT will not be covered in further detail.
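Python's standard random module also uses MT19937 as its default generator, which makes the larger memory footprint mentioned above easy to observe: getstate() exposes the generator's 624 words of state plus a position index, in contrast to an LCG's single word of state:

```python
import random

rng = random.Random(2014)
version, state, gauss_next = rng.getstate()

# MT19937 carries 624 32-bit words of state plus one position index,
# which is why MT needs noticeably more memory than an LCG.
print(len(state))  # 625
```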

2.6 Statistical tests

There are multiple ways to test the properties of random number generators and their sequences.[4] Frequency tests, such as the Kolmogorov-Smirnov test, check the observed versus expected frequency of numbers in a sequence and compare them with a known probability distribution. These tests will serve as indicators of how random different sequences of numbers really are. It would take a lot of time and computing power to exhaust some of the extremely large cycles of the PRNGs, such as that of MT; it is therefore these tests that will serve as guidelines for determining the results. Since small sequences are the primary interest of this report, the statistical tests are all the more suitable for the task.

2.6.1 Chi-square test

The chi-square test is a statistical test that counts the number of occurrences of each number range and places them in a selection of bins.[12] This is then compared with the expected frequency in each bin. Below in equation (2.2) is the formula for the chi-square test:

χ² = Σ_{i=1}^{n} (O_i - E_i)² / E_i    (2.2)

Chi-square tests are more suitable for larger samples, because a balanced distribution across all bins is needed to produce a reliable result.
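As a concrete illustration (plain Python rather than Matlab's chi2gof, using the 10 equal-width bins on [0, 1) that the Method chapter adopts), the statistic can be computed directly:

```python
def chi_square_uniform(samples, bins=10):
    """Chi-square statistic for uniformity of values in [0, 1).

    O_i = observed count in bin i, E_i = expected count n/bins,
    following equation (2.2).
    """
    observed = [0] * bins
    for x in samples:
        observed[min(int(x * bins), bins - 1)] += 1
    expected = len(samples) / bins
    return sum((o - expected) ** 2 / expected for o in observed)

# A perfectly balanced sample puts 10 values in each of the 10 bins,
# giving a statistic of exactly 0.
balanced = [i / 100 for i in range(100)]
print(chi_square_uniform(balanced))  # 0.0
```

Larger statistics mean larger disagreement with the expected distribution; the test rejects the null hypothesis when the statistic exceeds the critical value for the chosen significance level.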

2.6.2 Kolmogorov-Smirnov test

The Kolmogorov-Smirnov test is a statistical frequency test that compares the empirical distribution with a cumulative distribution.[12] By comparing the generated random number sequence with a known probability distribution function, it can be determined whether the generated data mimics some specific distribution. Kolmogorov-Smirnov tests are more suitable for smaller samples. Below in equation (2.3) is the empirical distribution function used by the Kolmogorov-Smirnov test:

F_n(x) = (1/n) Σ_{i=1}^{n} 1{X_i ≤ x}    (2.3)

Chapter 3

Method

Matlab is a powerful tool for running mathematical and technical calculations and simulations. The latest version of Matlab will be used to generate and control random number streams through the RandStream class. The choice of Matlab comes naturally, since it has several PRNGs to choose from and its support for controlling them is sufficient for the planned test scenarios. The RandStream class supports features such as stream and substream control, and state and seed control.[12] Substream support is, however, not available for the two PRNGs that will be tested.

3.1 Considerations

Reaching and testing some of the very large cycles of the random number generators is not an option. Due to limited time, the statistical tests will instead serve as guidelines for determining the qualities of each PRNG. The default seed value 0 will be used for both generators, and the default ziggurat pseudo-random number sampling algorithm will be used for the transform. In these tests a special case of LCG called the Multiplicative Congruential Generator (MCG)[3] will be used, where, as can be seen in equation (3.1), the value of b is set to 0.

X_{k+1} = (a * X_k + b) mod M,  with b = 0    (3.1)

MCG is used because Matlab does not offer any option for changing the value of b. In any case, this should not affect the results, since the two generators are very similar. There are a multitude of options and configurations to choose from, and the authors do not claim that these tests will be sufficient or exhaustive. Due to computational limitations and time constraints, only a few configurations will be analysed.


3.2 Procedure

The tests will be run on Matlab version R2013a. The random number generators with the following designations will be used: mt19937ar (Mersenne Twister, MT) and mcg16807 (Multiplicative Congruential Generator, MCG).[12] The Matlab stopwatch timer tic-toc will be used to measure the time to generate very large sequences of random numbers.

rand - Creates a random number sequence of uniform distribution.
randn - Creates a random number sequence of normal distribution.
chi2gof - Chi-square goodness-of-fit test.
kstest - Kolmogorov-Smirnov test.
tic - Stopwatch timer.

3.3 Chi-square goodness-of-fit test

The built-in Matlab chi-square goodness-of-fit test will be used on each of the random number generators. It works by taking every single number of a sequence of random numbers and grouping them into a set of bins.[12] Each bin corresponds to a specific range of values: one bin for values from 0 to 0.1, another for 0.1 to 0.2, and so forth, up to 0.9 to 1.0. The default of 10 bins will be used; this was deemed appropriate since each generated number ranges from 0 to 1. The test then compares the observed versus the expected distribution of numbers across all the bins. The null hypothesis, that the data follows the expected distribution, is rejected at the critical level of 5% if the observed counts deviate too much from it. For the null rejection tests, a number of random number streams (specified by n) will be generated and tested using the chi-square goodness-of-fit test. A positive deviation indicates a higher rejection rate for LCG and a lower one for MT, and vice versa.
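One way to realize the bookkeeping described above can be sketched in Python; this is an illustrative stand-in for the Matlab loop, not the actual test code, and the threshold 16.919 is the standard 5% chi-square critical value for 9 degrees of freedom (10 bins minus 1):

```python
import random

# 5% critical value of the chi-square distribution with 9 degrees of
# freedom; a stream whose statistic exceeds it is rejected.
CRITICAL_9DOF_5PCT = 16.919

def chi_sq(samples, bins=10):
    """Chi-square statistic of equation (2.2) for uniformity on [0, 1)."""
    counts = [0] * bins
    for x in samples:
        counts[min(int(x * bins), bins - 1)] += 1
    expected = len(samples) / bins
    return sum((c - expected) ** 2 / expected for c in counts)

def rejection_deviation(lcg_streams, mt_streams):
    """+1 for each rejected LCG stream, -1 for each rejected MT stream,
    so a positive total means more rejections for LCG and fewer for MT,
    mirroring the sign convention of figure 4.2."""
    deviation = 0
    for stream in lcg_streams:
        if chi_sq(stream) > CRITICAL_9DOF_5PCT:
            deviation += 1
    for stream in mt_streams:
        if chi_sq(stream) > CRITICAL_9DOF_5PCT:
            deviation -= 1
    return deviation

rng = random.Random(0)
streams_a = [[rng.random() for _ in range(100)] for _ in range(50)]
streams_b = [[rng.random() for _ in range(100)] for _ in range(50)]
print(rejection_deviation(streams_a, streams_b))
```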


3.4 Kolmogorov-Smirnov test

The Matlab function kstest will be used to run a one-sample Kolmogorov-Smirnov test by comparing each vector or sequence of numbers with a reference vector of expected values. Kstest performs the test at the default significance level of 5% (95% confidence), which will also be used here. Unlike the chi-square goodness-of-fit test, kstest does not group the numbers into bins but tests each single number in a sequence on its own.[4] The formula for the kstest is listed below in equation (3.2):[12]

D = max_x | F̂(x) - G(x) |    (3.2)

where F̂(x) is the empirical cumulative distribution function and G(x) is the expected cumulative distribution function.


Chapter 4

Results

Most of the graphs and figures can be found in the appendices at the end. Only the most interesting results (that is, those that were not inconclusive) will be presented in this section. For a more comprehensive list of all the tests, please refer to appendices A-C.

4.1 KS-test

A comparison of LCG and MT under the Kolmogorov-Smirnov test. In figure (4.1) below, the blue line indicates the empirically observed data, and the red line shows the theoretically expected distribution that the numbers should adhere to. In this sample, MT appears to be more unpredictable, since it does not follow the red line (the normal distribution) as closely as LCG; MT is therefore the more random generator for this sample size.

4.1.1 Sequence of 50 randomly generated numbers

[Figure: two panels, "Kolmogorov-Smirnov Test For LCG (n=50)" and "Kolmogorov-Smirnov Test For MT (n=50)", each plotting the observed CDF against the expected CDF (cumulative probability versus numbers from a normal distribution).]

Figure 4.1. Kolmogorov-Smirnov frequency tests for a sequence of 50 numbers. Cumulative distribution function is denoted as CDF


4.2 Chi-square test

Chi-square tests for null hypothesis rejection and for large sequences. The number of times the tests were run in succession is specified for each one. A significance level of 5% (95% confidence) was used for the chi-square test. For each MT sequence that was rejected by the chi-square goodness-of-fit test, the value shown below in figure (4.2) was decreased; if it was not rejected, the value was increased instead. The same applies, with opposite sign, for LCG. Figure (4.2) thus shows the relationship between the rejection rates of LCG and MT: values larger than 0 indicate fewer rejections for MT and more for LCG, while values smaller than 0 indicate higher rejection rates for MT and lower for LCG. Figure (4.2) illustrates that there were considerably more rejections for LCG and far fewer for MT, indicating that MT wins this test.

4.2.1 Null hypothesis rejection rates

[Figure: "Chi-squared null hypothesis test plot (n=1)", deviation plotted against bin number.]

Figure 4.2. Deviation between LCG and Mersenne Twister for the chi-square test on a single stream. Positive deviation means fewer rejections for MT and more for LCG.


4.2.2 Large scale random number sequence rejection rates

Figure (4.3) below illustrates the number of rejected chi-square goodness-of-fit tests for each PRNG over a very large number of sequences. The outcome of the test shows that there were far fewer rejections for MT than for LCG, making MT the winner of this test.

Figure 4.3. Number of rejected chi-square null hypothesis tests for LCG vs MT at 50k sequences, each 100 random numbers long. Light blue is MT and dark blue shows LCG.

4.3 Time measurement tests on large number sequences

Table (4.1) below lists the time, in seconds, it takes to generate a single, very large sequence of random numbers. Notice how the elapsed time increases roughly in proportion to the sample size. The presented data demonstrates the faster generation time of MT compared to LCG, once again making MT the winner.

4.3.1 Table of collected data

Elapsed Time (s)

n       MT     LCG
10^5    0.5    0.7
10^6    6.0    6.5
10^7    50.1   53.7

Table 4.1. The time, in seconds, it takes to generate sequences of size n using the uniform distribution and the default ziggurat transform. Based on these values, it would take over 13 hours to reach a full cycle using the same computer.
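A timing harness in the spirit of Matlab's tic/toc can be sketched as follows; the function name and parameters here are illustrative, and absolute timings will of course differ from table 4.1 depending on hardware and implementation:

```python
import random
import time

def time_generation(n, rng=random.random):
    """Return the seconds taken to draw n numbers from rng (cf. tic/toc)."""
    start = time.perf_counter()
    for _ in range(n):
        rng()
    return time.perf_counter() - start

# Growing n by a factor of 10 should grow the elapsed time by roughly
# the same factor, as table 4.1 shows.
for n in (10**5, 10**6):
    print(f"n = {n}: {time_generation(n):.3f} s")
```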


Chapter 5

Discussion

It is generally difficult to test random number generators properly, and the test results were not as decisive as expected. For some of the tests, primarily those on small sequences of numbers, the outcome was inconclusive. Other tests, such as the pattern-checking tests, also gave indecisive results, mainly due to the small sequences and limited resources. If the tests had reached larger data sets, the differences would have been more noticeable. The timing test for the 50k sample sequences yielded some interesting results: MT was noticeably faster than LCG in generating its sequences, though this could be due to the Matlab implementations of MT and LCG. Sequences larger than this could not be run with the limited resources available, as they caused the computer to freeze. With a midrange desktop PC it would take approximately 13 hours to reach a full cycle for LCG. Prior to performing our tests we theorized that MT would perform considerably better than LCG in terms of efficiency and speed; as can be seen in table 5.1, our results confirm this conjecture.

Elapsed Time (s)

n       MT     LCG
10^5    0.5    0.7
10^6    6.0    6.5
10^7    50.1   53.7

Table 5.1. The time, in seconds, it takes to generate sequences of size n using the uniform distribution and the default ziggurat transform.

The Kolmogorov-Smirnov tests also yielded some very interesting results. Notice how, over time, the numbers start to resemble the normal distribution: as can be seen in figure (4.1), the sequences of randomly generated numbers adhered to the normal distribution. If a human were to look at a generated stream of random numbers, no definitive patterns would be recognized. But if we were to run and compare very large sequences with the help of the statistical tests, obvious patterns would surface.[9] Over a large-scale number sequence, the true nature of the sequence becomes apparent. The point of the test is to show that even though these numbers are generated as random numbers in a sequence, they are not random at all. The results were the same for both LCG and MT, as the two graphs are practically indistinguishable from one another. The same applies to any other distribution, such as the uniform distribution, the Weibull distribution and so on. Theoretically, MT is the superior RNG in all aspects except for requiring more resources due to its large cycle.[7] But this could not be proved for the small sequences, or in all aspects, based on the acquired data.

5.1 LCG

LCG performed similarly to MT in most of the small-sequence tests, with the exception of a certain few. There were, however, no clear indications in any of the tests that would point to LCG being the superior RNG. Larger sequences that reach the cycle limit of the generator could not be produced. Even for smaller sequences, LCG was noticeably slower, as can be seen in section 4.3. This could in part be due to the Matlab implementation.

5.2 MT

The positive deviation between LCG and MT for the chi-square test on a single stream can be seen in section 4.2.1 and figure 4.2. This test indicated that the null hypothesis rejection rates were much lower for MT than for LCG on all sequences. MT also performed better in the 50k large-scale rejection test in section 4.2.2, figure 4.3. The certainty of these results can, however, be questioned: the margins are too small to draw any definitive conclusions, but they are still there. It could be theorized that they would be more evident for even larger sequences that go beyond a full cycle. Nevertheless, based on these findings MT is still considered to be the superior PRNG, and the same conclusion was reached in a different study on the same subject.[7]


5.3 Weaknesses in the method

Due to limited time and computing power, this study was confined to statistical tests only. There are, however, some apparent weaknesses in using only Matlab and its predefined random number generators and streams. First, the results cannot be applied in a general sense, due to unknown details of the Matlab implementation. Speed and efficiency were a major concern throughout the testing phase and presented a substantial obstacle: running larger sequences resulted in very long runtimes, which limited the scope of the study. Second, the validity of some of the tests and results, such as the null hypothesis rejection test, can be questioned. In some instances the quality of the streams was a major concern, since there was the possibility of qualitative deterioration due to multiple running instances. This meant that the state of each sequence could not be initiated from a common reference point, as there is no substream support for either of the tested generators. This exemplifies some of the apparent difficulties of testing random number generators.


Chapter 6

Conclusion

Generally, both generators performed very close to one another on most of the tests. The conclusion is still that MT is the preferred PRNG over LCG, mainly due to the periodicity and the speed of MT. This is true at least for the Matlab implementations of both generators, but it should generally hold for other implementations as well. In the problem statement we defined two questions.

• Given the choice, which of the two should you use?

• Which one produces the highest quality sequence of random numbers?

The answer to the first question was successfully acquired from the generated results. Based on these findings, it is recommended to use the Mersenne Twister instead of the Linear Congruential Generator. Not only is it very fast, but it also has an extremely large cycle before it repeats itself, and thus a much greater margin compared to LCG. It is also theoretically the better choice, and these findings confirm some of those premises. As for the quality of the random number sequences, it could not be ascertained from the tests that were performed. For short sequences of random numbers, both LCG and MT seemed to produce random numbers of reasonable quality, with no noticeable patterns or deviations. This question can unfortunately not be answered based on our findings, mainly due to the inconclusive tests, and in part due to the difficulty of testing and defining what a good, qualitative random number sequence really is.


Bibliography

[1] Park, Stephen K., Miller, Keith W. Random Number Generators: Good Ones Are Hard To Find. 1988. Available at http://www.cems.uwe.ac.uk/~irjohnso/coursenotes/ufeen8-15-m/p1192-parkmiller.pdf [Retrieved 13/04/2014]

[2] Dwyer, Gerald, Williams, K. B. Portable Random Number Generators. 1999. Available at http://www.jerrydwyer.com/pdf/randomsh.pdf [Retrieved 13/04/2014]

[3] Downham, D. Y., Roberts, F. D. K. Multiplicative Congruential Pseudo-random Number Generators. 1967. Available at http://comjnl.oxfordjournals.org/content/10/1/74.full.pdf [Retrieved 13/04/2014]

[4] Dwyer, Jerry, Williams, K. B. Testing Random Number Generators. 1996. Available at http://www.drdobbs.com/testing-random-number-generators/184403185 [Retrieved 13/04/2014]

[5] Hellekalek, P. Good random number generators are (not so) easy to find. 1998. Available at http://www.ics.uci.edu/~smyth/courses/ics178/random_number_generators_article.pdf [Retrieved 13/04/2014]

[6] Kroese, Dirk P. Monte Carlo Methods. 2011. Available at http://www.maths.uq.edu.au/~kroese/mccourse.pdf [Retrieved 13/04/2014]

[7] Albrecht, J. A Comparison of Mersenne Twister and Linear Congruential Random Number Generators. 2007. Available at http://people.cs.pitt.edu/~jsa8/math_project.pdf [Retrieved 13/04/2014]

[8] Moler, C. Random Numbers. 2013. Available at http://www.mathworks.com/moler/random.pdf [Retrieved 13/04/2014]

[9] James, F. A review of pseudorandom number generators. 1990. Available at http://lammps.sandia.gov/threads/pdfFowF57Qu9A.pdf [Retrieved 13/04/2014]

[10] Matsumoto, M., Nishimura, T. Mersenne Twister: A 623-dimensionally equidistributed uniform pseudo-random number generator. 1996. Available at http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/mt.pdf [Retrieved 13/04/2014]

[11] Knuth, D. The Art of Computer Programming, Volume 2: Seminumerical Algorithms. Addison-Wesley. 1997. Chapter 3, pp. 1-170.

[12] Mathworks. MATLAB Documentation Center. 2014. Available at http://www.mathworks.se/help/matlab/index.html [Retrieved 13/04/2014]

[13] Haahr, M. Introduction to Randomness and Random Numbers. 2014. Available at http://www.random.org/randomness/


List of Figures

4.1 Kolmogorov-Smirnov frequency tests for a sequence of 50 numbers. Cumulative distribution function is denoted as CDF
4.2 Deviation between LCG and Mersenne Twister for chi-square test on a single stream. Positive deviation means fewer rejections for MT and more for LCG.
4.3 Number of rejected chi-square null hypothesis tests for LCG vs MT at 50k sequences and 100 random numbers long. Light blue is MT and dark blue shows LCG.
A.1 Kolmogorov-Smirnov frequency tests for a sequence of three numbers
A.2 Kolmogorov-Smirnov frequency tests for a sequence of five numbers
A.3 Kolmogorov-Smirnov frequency tests for a sequence of ten numbers
A.4 Kolmogorov-Smirnov frequency tests for a sequence of 50 numbers
A.5 Kolmogorov-Smirnov frequency tests for a sequence of 100 numbers
A.6 Kolmogorov-Smirnov frequency tests for a sequence of 1000 numbers
B.1 Chi-square test for substreams specified as n. Positive deviation indicates fewer rejected null hypotheses for MT and more for LCG
C.1 Random numbers from a uniform distribution
C.2 Random numbers for different sample sizes
C.3 Random numbers for different sample sizes
C.4 Random numbers for different sample sizes
C.5 Random numbers for different sample sizes


Appendix A

A.1 Kolmogorov-Smirnov normal distribution tests on small sequence numbers

A.1.1 Sample size 3

[Figure A.1: Kolmogorov-Smirnov frequency tests for a sequence of three numbers; panels "Kolmogorov-Smirnov Test For LCG (n=3)" and "Kolmogorov-Smirnov Test For MT (n=3)", observed vs. expected CDF.]


A.1.2 Sample size 5

[Plots: observed vs. expected CDF, Kolmogorov-Smirnov test for LCG (n=5) and MT (n=5).]

Figure A.2. Kolmogorov-Smirnov frequency tests for a sequence of five numbers

A.1.3 Sample size 10

[Plots: observed vs. expected CDF, Kolmogorov-Smirnov test for LCG (n=10) and MT (n=10).]

Figure A.3. Kolmogorov-Smirnov frequency tests for a sequence of ten numbers


A.1.4 Sample size 50

[Plots: observed vs. expected CDF, Kolmogorov-Smirnov test for LCG (n=50) and MT (n=50).]

Figure A.4. Kolmogorov-Smirnov frequency tests for a sequence of 50 numbers

A.1.5 Sample size 100

[Plots: observed vs. expected CDF, Kolmogorov-Smirnov test for LCG (n=100) and MT (n=100).]

Figure A.5. Kolmogorov-Smirnov frequency tests for a sequence of 100 numbers


A.1.6 Sample size 1000

[Plots: observed vs. expected CDF, Kolmogorov-Smirnov test for LCG (n=1000) and MT (n=1000).]

Figure A.6. Kolmogorov-Smirnov frequency tests for a sequence of 1000 numbers


Appendix B

B.1 Chi-square divergence test
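A minimal sketch of the kind of chi-square uniformity statistic compared in Figure B.1. The bin count, sample size, and LCG parameters below are illustrative assumptions, not the exact configuration used in the report; CPython's `random.Random` again stands in for the Mersenne Twister.

```python
import random

def lcg(seed, a=1103515245, c=12345, m=2**31):
    """Minimal LCG with illustrative (glibc-style) parameters."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x / m

def chi_square_uniform(sample, bins=10):
    """Chi-square statistic for uniformity on [0, 1) with equal-width bins."""
    counts = [0] * bins
    for u in sample:
        counts[min(int(u * bins), bins - 1)] += 1
    expected = len(sample) / bins
    return sum((c - expected) ** 2 / expected for c in counts)

gen = lcg(123)
lcg_sample = [next(gen) for _ in range(10000)]
rng = random.Random(123)  # Mersenne Twister in CPython
mt_sample = [rng.random() for _ in range(10000)]

# With 10 bins there are 9 degrees of freedom; the 5% critical value
# is about 16.92, so a statistic above that rejects uniformity.
print(chi_square_uniform(lcg_sample), chi_square_uniform(mt_sample))
```

Repeating this over many sub-streams and counting rejections for each generator gives the kind of deviation plotted in Figure B.1.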

[Plots: chi-squared null hypothesis test plots for n=1, n=2, n=3 and n=4; bin number on the x-axis, deviation on the y-axis.]

Figure B.1. Chi-square test for sub streams specified as n. Positive deviation indicates less rejected null hypotheses for MT and more for LCG


Appendix C

C.1 Larger sequence random numbers
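The uniform-distribution scatter plots in this appendix can be approximated by pairing consecutive outputs of each generator. The helper names and parameters below are illustrative only; plotting the resulting pairs (e.g. with MATLAB's scatter or any plotting library) reveals, at large sample sizes, the lattice structure that consecutive LCG outputs are known to fall into, while MT pairs show no such visible pattern.

```python
import random

def lcg(seed, a=1103515245, c=12345, m=2**31):
    """Minimal LCG with illustrative (glibc-style) parameters."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x / m

def mt_stream(seed):
    """Uniforms from CPython's random.Random, a Mersenne Twister."""
    rng = random.Random(seed)
    while True:
        yield rng.random()

def pairs(gen, n):
    """n consecutive (u_i, u_{i+1}) pairs for a 2-D uniformity scatter."""
    return [(next(gen), next(gen)) for _ in range(n)]

lcg_pairs = pairs(lcg(7), 150)
mt_pairs = pairs(mt_stream(7), 150)
```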

C.1.1 Sample size 100

[Plots: scatter of uniformly distributed random numbers from LCG and MT.]

Figure C.1. Random numbers from a uniform distribution


[Plots: uniform-distribution scatter for LCG and MT, and Kolmogorov-Smirnov test plots (n=100) of observed vs. expected CDF.]

Figure C.2. Random numbers for different sample sizes


C.1.2 Sample size 150

[Plots: uniform-distribution scatter for LCG and MT, and Kolmogorov-Smirnov test plots (n=150) of observed vs. expected CDF.]

Figure C.3. Random numbers for different sample sizes


C.1.3 Sample size 300

[Plots: uniform-distribution scatter for LCG and MT, and Kolmogorov-Smirnov test plots (n=300) of observed vs. expected CDF.]

Figure C.4. Random numbers for different sample sizes


C.1.4 Sample size 1000

[Plots: uniform-distribution scatter for LCG and MT, and Kolmogorov-Smirnov test plots (n=1000) of observed vs. expected CDF.]

Figure C.5. Random numbers for different sample sizes


C.1.5 Sample size 10000

[Plots: uniform-distribution scatter for LCG and MT, and Kolmogorov-Smirnov test plots (n=10000) of observed vs. expected CDF.]

Figure C.6. Random numbers for different sample sizes
