PSEUDO-RANDOM NUMBER GENERATORS
(REPORT I)
A report submitted in partial fulfilment of the requirements for the degree of Bachelor of Science
by Z. O'Connell (950786535)
Department of Information Systems and Computing,
Brunel University
January 1999
Table of Contents
Chapter 1 - Introduction 3
1.1 Problem Definition 3
1.2 Study Scope 3
1.3 Study Objectives 3
Chapter 2 - Available generators 5
Introduction 5
2.1 Von Neumann mid-square 7
2.2 Congruential Generators 7
2.3 Feedback Shift Registers 9
2.4 Algorithm M 10
2.5 Lagged-Fibonacci 11
Chapter 3 - Analysis of random sequences 12
Introduction 12
3.1 Statistical frequency tests 14
3.2 Non-statistical empirical tests 15
3.3 Marsaglia "Stringent" Tests 17
Chapter 4 - Conclusion 19
References 20
Appendix A: Evaluation 22
A.1 Project Overview 22
A.2 The Approach 22
A.3 Possible Future Work In The Area 23
There are many known algorithms for generating numbers which are, or at least appear to be, random. There are also a large number of tests to check that what is produced by such generators does closely resemble a random stream. There is not, however, a currently freely available suite of generators and testing routines allowing the user to eaisly define new generators and testers for experimental purposes. This report reviews the sources required for implementing such a program and presents some background on the more fundimental aspects of random number generation and testing.
This project studies the speed and efficiency of pseudo-random number generators and tests used to eliminate "flawed" generators. Methods for converting "flat" integer distributions into standard real number distributions are also briefly examined. The ultimate aim of the project is to produce a set of random numbers generators of various types and a battery of tests which can be used to "weed out" any flawed generator, hopefully arriving at a single generator or set of generators for various applications that are both fast and as random as possible.
Cryptanalysis of random number generators in general is too wide a field to be able to study in detail in this project, so this project will concentrate on the algorithmic and simulation modeling applications of random numbers. Despite this, much of the source material comes from cryptography primarily because most of the recent work on PRNGs has been in that area.
The generators we study produce, in general, equally distributed output between zero and a known value. Although mathematical functions exist to convert this output into other distributions using real numbers, such techniques are also not examined here.
The aim of this report is to briefly summarise the sources of information required to produce a suite of random numbers. It is not intended that there should be enough information contained here to do any theoretical analysis of generators or tests or to implement the more complicated tests as such a report would be several times the size of this one.
Generators discussed here are those directly relating to the main area of study - uses of pseudo-random numbers in simulation modeling and probabilistic algorithms. It is intended to analyse more generators than those presented here in the second part of the project in order to compare these generators to near-perfect or "secure" (Unpredictable) but slower cryptograpic and hardware generators, however such generators do not need to be implemented as source code or generator output is freely available.
Knuth [1] offers the only detailed published non-cryptographic discussion on what a random number actually is, although much of the discussion is now out of date. The discussion is, in any case, mathematical and doesn't help us understand what a random number is at a fairly basic level. Most notably, his discussion makes no reference to the use of random numbers for encryption, which is now one the main area of research connected to random numbers.
A good definition [22] of a "True" random number generator is that is should be "Unbiased, unpredictable and unreproducable". As we are concentrating on probabilistic algorithms and simulation modeling applications, we are primarily concerned with the unbiased aspects of random number streams. A pseudo-random number generator will always be predictable and reproducable given knowledge of the algorithm and initial seed value used. In both fields, we are not worried about the predictability of an unbiased algorithm, and unreproducability can be a very undesirable trait if a particular run needs to be repeated.
A computer, being a purely mathematical machine cannot generate truly random number sequences. Instead, it has to rely on mathematical 'tricks' to produce a sequence that appears to be random. There are electronic devices, such as ERNIE and Rairfield et al's integrated circuit [3] based on "...the frequency instability of a free running oscillator". These systems can generate random number sequences which are dependent on entropy in the environment and as they are not practically predictable, are suitable for use in cryptographic systems or where predictable numbers are undesirable, such as lotteries. They are not studied here however as we are interested primarily in generators which are implementatable in software. Cryptographic generators are also not looked at as they sacrifice much of their speed in the interests of security. Similarly, cryptographic hashes of semi-random sources, such as the SHA hash of entropy gained from hardware events used by Linux (See [11], also /usr/src/linux/drivers/char/random.c on any Linux machine) are also not discussed as they require OS-level access to the hardware to generate the initial entropy and as such can not be implemented portably.
It is very easy for anyone to propose a random number generator without properly understanding what they are doing, as Knuth [1] mentions, ("...random numbers should not be generated with a method chosen at random. Some theory is required.") so any generator which doesn't at least have an analysis using the stnadard tests isn't discussed here. For our non-cryptographic purposes, we require generators that are far quicker than these cryptographic ones. There are many combination generators and "shuffling" tecniques which are also mostly not discussed, such as the McGill "Super-Duper" generator mentioned in [21], as there are so many that it would take a long time to study and discuss the relative merits of each. These methods are in general much slower than a single generator and designed to "defeat" cryptographic attacks rather than increase randomness.
One of the most important aspects of a generator is it's peroid and it's tendancy to "degenerate" to sequences such as all zeros. The period of a generator is how long a sequence it generates without repeating. This is mainly a problem with generators, such as the Von Neuman middle-squared method, that use the output of one iteration as the "seed" for the next one and don't maintain any hidden "state". With such a system, a particular seed will always produce the same output sequence, and if that value reoccurs in the output sequence the generator repeats. Evaluating the period of a generator with internal state of more than 32 bits algorithmically requires a prohibitive amount of memory (512Mb, assuming states that have occurred are stored in an array of 232 single-bit boolean values) and knowledge of the internals of the generator, so it is difficult to do. Fortunatly, the serial test picks up on this well and other tests tend to produce poorer results for generators that have a low period or degenerate quickly so we do not need to test for these attributes directly.
The earliest proposed generator, by Von Neuman, is simple and relatively quick but no longer considered suitable for any application because of it's relatively low period and tendancy to degenerate rapidly. The basis of the generator is to take a seed value, square it and take as many digits from the middle of the number as required. Knuth [1] describes this method in details and points out that in most (not all) cases that a Von Neumann generator will loop after a relatively small number of iterations. Although he gives a method for stopping a Von Neumann generator from looping and gives cases with a very long period, there are generators which are almost as fast which are for more random.
Paul & Balmer [15] discuss a modified form of Von Neumann where rather than squaring a number, the product of two numbers is used and the middle digits are taken but this "mid-product" method is also flawed and biased in the same way as Von Neumann, although with longer peroid and less tendancy to degenerate.
Congruential generators, those which in some way use the modulus function, are by far the most common generators in use and are discussed in every book on the topic. There are several types of congruential generator available, however.
Also called a "mixed congruential generator", several texts describe linear congruential methods in detail, although they were originally proposed by Lehmer in 1948, although not published until 1951. [17] Knuth (As usual) presents the best basic discussion on this method, which is based on the formula f(x+1)=(a f(x)+c)modm, where a, c and m are constants chosen to give good speed and period. For this generator, r=m, but as Knuth shows it is a mistake to choose m to be a "convenient" value such as 65536. Knuth [1] and Banks et al. [13] go into quite some detail over the choice of a, c and m which are shown to be very important to choose carefully to get a good generator. In practice these values are usually fixed in a particular implementation as the potential to produce a very poor generator by accidently choosing the wrong vaules is very high. When analysing linear congruential generators each set of choices for values is generally considered as a separate generator as properties such as period and even the overall randomness of the output vary wildly based on them. Schneier [8] gives a table of "good" values which pass the spectral test, a partly-theoretical test used to analyse the choice of values for linear congruential generators. Because of their simplicity, linear congruential generators with good choices of values are still considered good sources of random numbers for non-cryptographic applications. The period of a well-chosen LCG on a 32-bit computer is only 2D109 [13] however, so other generators are required for many uses.
Lehmer, when he originally proposed a congruential generator, actually gave a linear congruential generator but with c=0. This "multiplictive congruential generator" is not often used because computationally c@0 is trivial and greatly increases the randomness of the generator. [1] It is used as the basis for other combined generators however, most notably Wichmann & Hill [5], although they don't say why they chose multiplicative over linear congruential generators.
Additive congruential generators, sometimes called "Fibonacci" generators, of the form f(x)=(f(x-1)+f(x-k))mod m are mentioned briefly by Kleijnen and Groenendaal [17] although are no longer considered good generators in most cases. (The name Fibonacci is due to the similarity of the generator to the fibonacci sequence with k=2 rather than any link with the mathematician of the same name.) The origins of this form of generator are unclear but Knuth mentions it as having been proposed in 1950's with k=2 and then analysed with k@2 by Green, Smith and Klem in 1959. m=235 and k=97 is mentioned in [17] as being a good version of this generator. If these values give a good period and pass the standard randomness tests then this is a very good generator as it is faster then even Von Neumann - it is a single addition operation and a modulus that, on a 64-bit computer, can be implemented in a single AND instruction.
Schneier [8] and Knuth [1] mention, unfortunately without references for their source, polynomial congruential generators of the form f(x+1)=(a f(x)n+b f(x)n-1...+c)modm. Without knowing where these come from it is impossible to make a guess as to good values for the generator. As with all congruential generators, theory is required to select values to produce a good generator. Although the spectral test (See section 3.4) could probably be adapted to do this, such work is beyond the scope of this project.
Wichmann & Hill [5] produced an combined generator based on three multiplictive congruential sources with different periods. (There is no reason not to use more source generators, but Weichman & Hill state that they found three adequate to produce good randomness) Although their original paper is very short (Just over 3 pages with spacing similar to this report) and was initially only an attempt to produce a fast, portable generator for Pascal they managed to find a new form of generator that produces good random numbers. The main attraction of this generator is it's speed as it is designed with microcomputers in mind. (Specifically, the PDP-11) For a congruential generator, it also has a very large period (Wichmann & Hill claim 2.78D1013) which should be sufficient for most applications. Unusually, this generator produces real numbers between 0 and 1 rather than integers. The production of real numbers appears too closely tied in with the theory behind the generator to be removed, and this fact could slow the generator down under certain circumstances as discussed in the next section. Unfortunately, Wichmann & Hill don't say what the resolution of the output is (I.e. How many decimal places can be regarded as random) which we will need to know to do accurate testing, so some theoretical analysis of this generator is probably needed. A serial test conducted on a generator with an assumption of more resolution than is actually present will probably fail.
L'Ecuyer [7] produced what is now considered the best method of combining linear congruential generators, and is the only "non-cryptographic" generator Schneier [8] bothers giving code for and the only combined generator discussed by Banks et al. [13] L'Ecuyer's generator has a far more detailed theoretical base than Wichmann & Hill present. Although more complex than W&H's combined generator, it is still easy to implement and relatively fast compared to some other generators. Schneider claims a period for his implementation of "somewhere in the neighborhood of 1018". This brings us onto the interesting question of speed versus portability. Although L'Ecuyers period is much greater than W&Hs period for most applications W&H is adequate and if it passes randomness tests adequately there seems no reason to use L'Ecuyer if it is slower. L'Ecuyer is more complicated however, so likely to be slower but W&H relys on floating-point calculations, the speed of which can vary wildly between different processor types. Therefor, when considering the speed of generators during implementation they will need to be tested on different platforms to gauge the difference in performance caused by such details.
Although ICGs, which are essentially LCGs that use the multiplicative inverse of f(x-1) rather than the raw value, are usually ignored as they fail the Diehard tests [19], Marsaglias implementation in the diehard suite did not use the best values, [26] which highlights the importance of choosing the correct values for a generator and the fact that a specific implementation should be tested before use and not blindly relied on because it implements a "known good" generator. There appears to have been no "fixed" version of this generator tested using the Marsaglia/Diehard tests so it is not known if this generator performs well or not.
The other major type of generator is the feedback shift register, sometimes called Tausworthe generators. The origin of this generator is unclear - some sources say it was proposed by Selmer in 1965, others by Tausworthe in the same year. The generator works by taking a single bit from the right of a n-bit register, shifting the register right by one bit then calculating the left-most bit based on the contents of the whole register. The main attraction of this style of generator lies with the cryptographic security of certain versions of FSRs and in it's ease of implementation in hardware. However, they are also very easy to implement in software and fast so they are also valuable for other applications. As with congruential generators, there is more than one version of this generator.
The fastest form of FSR is the linear feedback shift register, which uses two or more bits from the register XORed together to give the new bit. Schneier [8] gives a detailed discussion of this method, including details on how to choose which bits to XOR for maximum effect. Schneier claims that a LFSR is non-random if large numbers are generated although doesn't give a convincing argument as to why this should be and Kleijnen and Groenendaal [11] don't consider the LFSR flawed compared to congruential generators. There are certain problems with this style of generator, most notably that the shift register must be large compared to the size of output word to approximate random output. It is also relatively easy for the output to degenerate into all-zeros or other repeating patterns in a similar manner to Von Neumann generators.
Feedback with carry shift register generators are relatively new and were devised for cryptographic uses so there is no study on their practical use in non-cryptographic applications. The basic feedback function is the sum of all the "tapped" bits plus the carry register - that value mod 2 becomes the new bit, and div 2 becomes the new carry register. [8] Like congruential generators with differing values for a, c and m, FCSRs need to have any initial value tested before use to check that they have reasonable period and don't degenerate. Schneier gives a table of "tap sequences" (Which bits in the FSR need to be used) to give maximum periods - the maximum periods are typically in the range 2n where n is the size of the feedback shift register. This gives a maximum period of 3.4D1038 for a 128-bit generator, which is certainly acceptable for most applications.
Mersenne Twisters are a form of generator proposed by Matsumoto and Nishimura in [24] that, they claim, have the rather amazing period 219937-1 and pass "several stringent statistical tests, including diehard". They are based on "Twisted" Feedback-shift-register generators that use Mersenne Primes to produce longer periods (Most generators, particularly congruential and combination generators use normal primes or relatively-prime numbers to achieve a longer period) This generator is very new (Less that one year at the time of writing) and although apparently well received on the Internet, not discussed in other works as yet. The algorithm on initial examination appears to be quite slow but actually lends itself quite well to computer implementations and general Internet research (Comments made on newsgroups) suggest that it probably compares favorably with other non-cryptographic generators.
The above generators are by no means the only FSR generators available. However, they are the only two fast enough to be considered usable for general applications. For example, the FSR-based Blum, Blum & Shub generator is often used in cryptography and considered very good but is far too slow for most uses. [18, NEWS2/94061801.HTM]
Algorithm M is a method originally proposed by MacLaren and Marsaglia for combining two random sequences in a computing-friendly (I.e. fast) way to produce a "considerably more random" output. [1] It is named after it's discussion in Knuth where it is referred to as just "Algorithm M". Unlike combination generators discussed earlier this generator is designed to work with any sequence and not just congruential generators. Knuth mentions that if the two source generators periods are relatively prime with respect to one another that the period of the product of both generators, and this is true in the case of most combination generators. Knuth doesn't actually offer a proof of this, it is left as a (relatively easy) exercise for the reader.
The Lagged-Fibonacci generator is dealt with in great detail by Marsaglia, [21] and as the name implies is closely related to the additive congruential generator discussed above. The generator has the formula f(x)=f(x-r)¿f(x-s) where r and s are the "lag" constants for the generator and ¿ is a mathematical operation. The additive congruential generator follows this formula where r=1, s=k and ¿ is addition followed by mod m. Lagged-Fibonacci generators are typically dealt with separately from congruential generators as the theory behind choosing the correct values is radically different. Marsaglia goes on to show that Lagged-fibonacci generators using multiplication are the only non-combination generator he studies that passes all the "diehard" [19] tests. However, it is not clear whether the comments in [26] were made before or after this comment - see section 2.5.2.
Marsaglia also gives two combination generators in [21], COMBO and NCOMBO, based on two specific cases of the Lagged-Fibonacci combined using simple subtraction. Marsaglia shows that the result is indeed more random as long as neither generator degenerates, based on work done by Brown and Solomon and Marshall and Olkin, however the proof is rather complicated. The theory is applicable to generators other than Lagged-Fibonacci, and Marsaglia also analyses the McGill "Super-duper" MCR/LFSR combination generator in the same paper.
A cross between the lagged fibonacci generator and the Feedback-with-Carry-Shift-Register generators are the "with-carry" generators, and although usually considered separately from Lagged-Fibonacci generators are essentially of the same class. Add-with-carry generators are basic additive congruential generators that add in the concept of carry as used by FCSR generators. Subtract-with-borrow and multiply-with-carry generators follow along the same lines using subtraction and multiplication instead of addition. Add-with-carry and subtract-with-borrow are not particularly powerful generators, however multiply-with-carry has been shown to perform well in tests and have a long period as well as being very fast. These generators have been studied extensively by Marsaglia in [18], [19] and [25]. Knechtel writes: [26] "The best of [the multiply with carry generators] seem to pass all the Diehard tests (See section 3.3), even tests that many common pseudorandom number generators fail." Little independent study of these generators exist yet, however Marsaglia himself is bullish: "I predict it will be the system generator for many future computers", and claims a period of 2 f(x-1)+c mod 232, given if a is chosen "from a large set of easily found integers".
Unfortunately, the comments above are not dated and the diehard suite [19] has evolved over time; the tests the latest version employs are no longer the same as the ones discussed in [21]. As mentioned in [25], Marsaglia didn't examine this generator in paper [21] which may mean he was unaware of it at the time.
The descriptions of tests given here are necessarily brief - most of them would take at least two full pages each to describe in detail. The information here is meant to be sufficient to understand the comments and discussion on the generator and to give the reader some understanding of the analysis routines that will be written as the implementation part of this project. When applying these tests, it is important to remember that it is almost always possible to construct a non-random algorithm which a specific test will not pick up on as non-random - such an approach is not a proof that the test does not work.
Formulae for the statistical tests and values of S(x) (Needed for statistical analysis of the output results) for tests are given, as these are often wildly different between texts, particularly in the case of Kolmogorov-Smirnov, so "standardised" versions are necessary for implementation. Where this isn't possible, due to the complexity of the equations or a requirement for lengthy lookup tables, a specific work with good examples is cited. Most of the tests here are "standard" tests used and discussed by almost every basic textbook on the subject, so specific citation is only given to unique points of note.
Knuth [1] points out that there is no such thing as a "Random Number", as you cannot say whether a single number on it's own is random or not. For example, is the number 3 random or not? We cannot answer this question. It is assumed here that all the generators produce integer values from 0 to r, as this is true in all but one case for the generators we have looked at. Following on from this, is the sequence 3, 3, 3, 3 random? Possibly, as it is just as likely as any other sequence of four numbers. When testing random numbers, we test probabilities - a generator that produces many threes and few other numbers over a long period of time is highly unlikely to be unbiased and, in statistical terms, we "reject the hypothesis" that it is unbiased. As Paul and Balmer [15] note, "You can try to be too perfect. Genuine random sequences contain subsequences that are highly systematic".
There are several tests not considered here. There has been much work published on the spectral test (Also known as Fourier analysis) in many different forms however it has a theoretical base closely related to choosing good values for linear congruential generators. Theoretical tests have also not been discussed as they require human (rather than algorithmic) analysis on the structure of the generator. It is somewhat surprising to see that almost all the tests have their origins in work done by Kendall and Babington-Smith in the 1940s and that Knuth's [1] work in 1969 is still probably the most valuable text on empirical testing - Marsaglia actually refers to the standard tests as "Knuth's" (See quote in section 3.3) rather than Kendall and Babington-Smith's. I have limited statistical tests to just the three most common which measure different properties of the generator as any common statistical test could be applied.
In the interests of readability, the following notation is used throughout, regardless of the notation used in the source documents as almost every book uses different notation.
n - Length of random sequence being analysed. (I.e. Number of values being analysed) "Total number of independent observations" in statistical terminology.
r - Number (Mnemonic: Range) of possible outputs. (For example, r=65536 if possible integer outputs are in the range 0-65535 inclusive)
Or - Observed frequency of value r.
Er - Expected frequency of value r.
F(x) - PRNG as a function, 0Tx<n
S(x) - C.d.f of expected output
p - Probability of a given distribution output, calculated using S(x)
The Chi-squared (æ2)
test verifies that a random sequence is distributed as we would
expect. In all the generators discussed here, the output is initially
evenly distributed real numbers between two values. To do this, a
count of the number of occurrences of each value s
is taken, referred to as Os.
Using the equation
we
can measure the "goodness of fit" of a particular
generator. In all cases studied here, Es=n/r as
the generators raw output are all evenly distributed between two
numbers. Once we have the æ2
value, the value of p can be
calculated to find out how likely such a value is and therefor
if the value is "suspect" or not. Is it possible to
approximate appropriate values of p for æ2,
(p cannot be calculated
exactly) however the formula to do this is very complicated
and most mathematical libraries include code to do this as standard.
If only a "Fail" or "Pass" indication is
required, it is faster to use a table of critical values, such as
those given in most statistical texts, Most books on the subject
assume this is what is required, however for our purposes we will
need the actual values to implement Knuths suggestion of running the
Kolmogorov-Smirnov test on the outputs of many æ2
tests, as discussed below.
Knuth points out that applying this test to a very large sample of output has it's problems however, as although it picks up general long-term bias in the output, it won't always detect what Knuth terms "Locally non-random behavior", for example often-repeated sequences of numbers. These could be picked up if they are frequent enough and biased heavily towards one set of numbers, but they could easily be balanced out long-term by other locally non-random behavior. Although we are constrained by the general requirement that EsU5 for æ2, [1] it is best to keep n as small as possible and repeat the test several times. This applies to all the other tests discussed here as well, although some tests such as the serial test produce a small amount of output for a large r so there is only likely to be enough data in even a large sample for a single æ2 test.
When applying this test many times on one generator, we need a method of measuring the output more formal than simply the standard statistical "reject the hypothesis" method, as with a large enough number of independent tests this is likely to occur at least once. This also checks that the distribution of p is reasonable - i.e. the generator is "globally random". To do this Knuth proposes applying the Kolmogorov-Smirnov test to the values of p. (See section 3.1.2) The æ2 test is inappropriate for this task as it is not well suited to handling relatively small n with infinitely variable real, as opposed to a fixed number of integer, values. Other authors unfortunately assume a single test and don't discuss this.
It is important to note that æ2 requires independent tests. For example, if applying æ2 to the serial test in section 3.3, if c=2 and a=2, x should be sampled as 0, 1, 4, 5, 8, 9... to avoid using any values twice. However, Marsaglia [21] has produced modified versions of many of the standard empirical tests which are suitable for running with "overlapping" values.
The Kolomogorov-Smirnov (KS) test is a measure of the maximum difference between the expected and actual distribution. It is more suited to evaluating real numbers rather than discrete integer values, however it can detect non-randomness that æ2 fails to, so it is a useful test. This gives a formula for KS of D=n½maxsF(x)-S(x)s, which like æ2 produces a values which can be calculated to give p to see if the value is too high or low. Fortunately, p can be approximated for large n using the formula S(x)=1-e(-2x2). Knuth suggests n=1,000, although this only gives 2 decimal places of accuracy. With modern computing hardware, n=100,000 gives a better compromise between speed and accuracy. (3 d.p.) This does mean that a large number of KS/æ2 tests will need to be carried out for each generator unless a table of critical values is used. (The same formula is used for applying KS to KS results) There are many versions of the KS test, so care should be taken to use the correct tables/formula. Knuths formulae from [1] are used here primarily because his descriptions on KS are better and he presents the formula for S(x), although the form of equations he uses are not the most common-version.
Algorithmically, KS is best calculated by ordering the data from X1 to Xn (Ascending order if S(x) increases as x increases and vise-versa) and finding max(max(j/n-S(Xj), max(S(Xj)-(j-1)/n)), 1TjTn. (The c.d.f. for evenly distributed random numbers from 1 to r-1 is S(x)=x/r, 0Tx<r) The requirement to order the data and the use of floating point operations means that the test is slower than æ2, particularly for very large n. There is no reason however to use the same subdivisions of data that æ2 uses, so it is more efficient to divide the data into smaller segments then apply KS again to the results.
Sometimes referred to as the "serial correlation test" or "autocorrelation test" when applied to random number theory, unlike æ2 and KS the correlation test (Specifically, the Pearson Product Moment Correlation Coefficient) checks the relationship between successive values - that is, it is dependent on the ordering of the output values within the test set. The previous two statistical tests are independent of ordering so won't detect non-random sequences like 1, 2, 3... as long as the set has the expected distribution. Banks et al. [13] and Kleijnen and Groenendaal [16] give a full description of this test when applied to random number sets, and fortunately PPMCC is a standard statistical test and details of it's calculation don't vary between texts. The correlation test is not often applied however as it measures the same properties as the serial test discussed in section 3.2.1. (No source referenced considers both the correlation test and the serial test)
The serial tests applies the statistical frequency tests in section 3.2 to tuples of numbers, (F(x), F(x+a), F(x+2a)...F(x+ca)). This is equivalent to evaluating a generator with r=rgeneratorc, so is unfeasible for c>2 for most generators. (Assuming an 8 bit generator, if c =4 then æ2 tests on 21 billion values would be required to satisfy EsU5.) Although even books as recently as 1993 [15] regard the serial test even with c=2 as computationally too difficult to carry out generally, modern hardware should be capable of running tests in a reasonable amount of time.
Marsagla [21] proposes the "Overlapping M-Tuple test", which is a variation of the basic serial test with the m in the title being the c of the serial test, using PPMCC to "modify" the output to be appropriate for æ2 tests despite not being independent values. Marsaglia analyses data as bit-streams rather than discrete numbers and takes only three bits of a number at a time for this test which makes use of a larger c feasible.
The gap test checks the gap between numbers follows the expected distribution, which Banks et al. [13] show is S(x)=1-0.9x+1. As S(x) is non-linear, it is necessary to vary the size of the frequency counts to satisfy EsU5. For example, if n=10,000, E71=0.56, so E71+=5.1 should be used instead. Counting the frequency of each gap size is algorithmically very easy and fast compared to some other tests. Kendall and Babington-Smith's original version only counted the gap between zero digits [5] although with modern computing it should be possible to apply this test to all possible values.
There are two ways to conduct a poker test. Knuth's [1] method involves testing for two, three, four or five of the same number recurring in sets of five random numbers. With a large r, this is not really feasible however. (If r=65536 and n=100000 EpairY1) The more commonly discussed method, such as in Banks [13] counts the reoccurrence of the same digit within a random number. It is important to note however that if analysing in decimal a number produced by a binary generator, the first digit may not be in the range 1-10 and the expected distribution should be adjusted as appropriate. Although only mentioned in an exercise in [1] and not discussed in detail anywhere, the same test could be carried out using hexadecimal, octal or even binary digits instead of decimal.
The coupon collectors test evaluates how long a run is required to get at least one occurrence of every possible number. We cannot apply this test to all generators, as some may always degenerate or loop before they generate every possible value and also requires a very large sample set compared to other tests and may be very time-consuming on slower generators. (For one output value, at least r random numbers will have been generated) Knuth [1] discusses this test in detail but other books (Which all cite Knuth as a reference) have not mentioned it. This is probably due to the fact that the coupon collectors test is difficult to implement and, as Wichmann & Hill [5] point out, measures the same properties as the gap test.
The run test as used by Knuth checks that the length of "runs up" and "runs down" of random numbers matches the expected distribution. The formula to do this is somewhat complicated however, as successive values are not independent of each other so æ2 cannot be used. Knuth[1] gives formula, which are quite long and involve tables so are not given here, which allow the calculation of a statistic which can be subjected to æ2. Kleijnen and Groenendall [16] state "We again use a æ2 statistic to test whether the generator is acceptable. The standard æ2 test, however, does not apply because of the dependence between the successive [run lengths]" but do not give the test which is used.
Kleijnen and Groenendall also have a variation on this test of "Runs above and below the average" which functions exactly as the name implies - checking that the distribution of runs above and below the expected mean. (0.5r in our case) This picks up generators that tend to produce long runs of numbers higher or lower than average, something most other tests are only slightly sensitive to.
Knuth [1] gives an algorithm for counting the frequency of orderings of random numbers. For example, if using sets of three random numbers (a, b, c) together this test counts the frequency of each possible combination of the three numbers when ordered, of which there are six (3!) combinations. (a<b<c, a<c<b, b<a<c, b<c<a, c<a<b and c<b<a) This test, along with the Maximum-of-t test, isn't even mentioned in any other books or papers on the subject which may mean that this test is no longer considered useful. Despite this, I can find no reference that actually shows this specific test to be a poor test. As with the overlapping-permutation test discussed below, it may simply be that this test does not pick up any non-random attributes not found by other tests such as the run test and serial test.
Marsaglia's version of this test is the overlapping-permutation test, which as the name implies is a variation of the permutation test with overlapping sets of data. Marsaglia himself doesn't regard this test as particularly good: "[Overlapping permutation] tests...are not very stringent; most generators seem to pass them..."
The maximum-of-t test, again only discussed by Knuth [1] tests the distribution of the maximum of t random numbers is distributed as expected. This is a simple and quick test to implement although again there is a strange absence of this test from other books.
Although often referred to as a "statistical" test, Maurers test presented in [11] is not a general statistical functions like KS or æ2. It is geared more towards hardware cryptograpical uses, but it does fail more modern flawed software generators that the earlier tests don't pick up, so it still useful in non-cryptographic software situations. Maurer himself states that although the test is designed to defeat hardware generators, a good software generator should certainly pass his test. The basis for the test is trying to detect whether a generator has detectable "memory" (That is, that later values are based in some way on earlier ones) Maurer claims his test will fail any generator that fails another test presented here and although Maurer presents no empirical or theoretical proof of this, although it is fairly likely to do this based on the nature of the test. Being a comparatively recently developed test, there are few good discussions on it's relative merits, although it is generally regarded as a good test.
In [19] and [21], Marsaglia presents a suite of tests apparently of his own devising rather more complicated than those previously studied, which are "more difficult to pass than [Knuth's] tests that have become standard". Although there is little independent discussion of these tests in texts, primarily because they are significantly more difficult to analyse that other tests, they are considered worthwhile tests. Each test is discussed briefly, although the fact the formula presented by Marsaglia are very complicated and the lack of independent discussion on them means that descriptions are somewhat brief. Two of his tests, the M-Tuple test and Overlapping-permutations are modifications of the standard tests and have already been discussed in section 3.2.
Although proof of the tests is given in most cases, Marsaglia doesn't discuss why this particular group of tests as a whole is useful, a discussion which is lacking in most books as well. As more powerful extensions of the "standard" tests they should perform well in practice and passing these tests is considered a very good indication of the suitability of a generator for non-cryptographic uses but it remains to be seen if any of the more obscure generators and particularly Maurer pick out poor generators that these "stringent" tests fail to.
OPSO is an unusual test that tried to overcome the problems of requiring large amounts of memory and a large number of tests to do serial tests with large r and c=2.
This test actually applies a probabilistic algorithm of "parking cars" in an n-dimensional space and compares the result with one from a known-good random source. (There is no theory to solve this problem that is not probabilistic) Although suggested elsewhere, this is the only actual discussion about a specific implementation of a probabilistic problem.
The Birthday spacings test as presented in Marsaglias paper appears to be related to the standard gap test, although the description presented is somewhat confusing and incomplete. Marsaglia appears not to have published a proof or detailed description of it yet, only source code. Apparently, "...a detailed discussion of the test will appear elsewhere" although no indication is given as to when or where this might be.
Marsaglia's description of the "ranks of binary number matricies" test is also unclear, and his test results suggest that this is not a powerful test as it does not fail a generator that is not failed by several other tests. It is, apparently, designed with simulation modeling in mind and the availability of source code in [19] may make this test implementable, even though the paper appears to give insufficient information for this.
At this stage, without any hard empirical results to refer to, it is difficult to draw any conclusions, particularly with regard the the suitability of the tests. All the tests discussed above are "accepted" by the community as valid tests so, barring implementation problems, none should spuriously fail otherwise valid generators. There is some obvious overlap in the generators however, particularly with regard to the standard "Knuth" tests and Marsaliga's work. No suite of tests would be complete without those listed in [1] however, so they have been included nonetheless.
Most of the basic congruential and linear-feedback generators are now considered fairly poor, so the most likely "winner" (The fastest generator that passes all tests) will probably be a form of Lagged-Fibonacci-with-carry generator or one of the two combined congruential generators. Additive Congruential generators have been neglected in recent studies and although likely to fail Maurers test may just prove to be a front-runner despite their simplicity. Lagged-Mersenne Twisters are too new to say much about, although they have been reported to pass all the tests given here except perhaps Maurers.
[1] Knuth, D.E., 1969, The Art of Computer Programming, Volume 2 - Seminumerical Algorithms, Addison-Wesley
[2] Reif, J.H., Tygar, J.D., 1985, Efficient Parallel Pseudo-Random Number Generation, pp.433-446, Advances in Cryptology - Proceedings of CRYPTO '85.
[3] Rairfield, R.C., Mortenson, R.L., Coulthart, K.B., 1984, An LSI Random Number Generator, pp.203-230, Advances in Cryptology - Proceedings of CRYPTO '84.
[4] Newman, T.G., Odell, P.L., 1971, The Generation of Random Variates, Griffin.
[5] Wichmann, B.A., Hill, I.D., 1982, A Pseudo-Random Number Generator, NPL Report DITC 6/82, National Physical Laboratory.
[6] Neave, H., 1972, Random Number Package, University of Nottingham.
[7] L'Ecuyer, P., 1988, Efficient and Portable Combined Random Number Generators, pp.742-749, 774, Communications of the ACM Volume 31, Number 6.
[8] Schneier, B., 1996, Applied Cryptography, John Wiley & Sons.
[9] Kelsey, J., Schneier, B., Wagner, D., Hall, C., 1988, Cryptanalytic Attacks on Pseudorandom Number Generators, pp. 168-188, Fast Software Encryption, Fifth International Workshop Proceedings (March 1998), Springer-Verlag.
[10] Harel, D., 1992, Algorithmics, The Spirit of Computing, 2nd edition, Addison-Wesley.
[11] Maurer, U.M., 1992, A Universal Statistical Test for Random Bit Generators, pp.89-105, Journal of Cryptology, Vol.5, No. 2.
[12] Eastlake 3rd, D., Crocker, S., Schiller, J., 1994, Request for Comments 1750: Randomness Recommendations for Security, Network Working Group.
[13] Banks, J., Carson 2nd, J., Nelson, B., 1996, Discrete-Event System Simulation, Prentice-Hall.
[14] Fishman, G., 1973, Concepts and Methods in Discrete Event Digital Simulation, John Wiley & Sons.
[15] Paul, R., Balmer, D., 1993, Simulation Modelling, Chartwell Bratt.
[16] Kleijnen, J., Groenendaal, W., 1992, Simulation - A Statistical Perspective, John Wiley & Sons.
[17] Lehmer
[18] Ritter, T., Ciphers by Ritter, http://www.io.com/~ritter/
[19] DIEHARD, http://stat.fsu.edu/~geo/diehard.html
[20] Haahr, M., 1998, random.org - introduction, http://www.random.org/essay.html
[21] Marsaglia, G., 1984, A current view of Random Number Generators, Computer Science and Statistics: The Proceedings of the 16th Symposium on the Interface, Atlanta, Elsevier Press.
[22] Author unknown, Random number generaton, RNGs and PRNGs, http://www.helsbreth.org/random/
[23] Moreau, T., 1996, A practical 'perfect' pseudo-random number generator, article submitted to Computers in Physics.
[24] M. Matsumoto and T. Nishimura, 1998, Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator, ACM Transactions on Modeling and Computer Simulation, Vol. 8, No. 1, pp.3-30.
[25] Marsaglia, G., Zaman, A., 1991, A new class of random number generators, Annals of applied probability Vol. 1, No. 3, pp.462-480.
[26] Sullivan, J., 1998, Numerical Analysis & Associated Fields Resource Guide, http://net.indra.com/~sullivan/q10.html
Much of the time taken (Well over one quarter) was spend tracking down appropriate books and papers, for which the Internet has proved an invaluable tool. Unfortunately much of the early work done by Von Neumann and Kendall & Babington-Smith is no longer available and even very large libraries such as the British Library and Library of Congress have no record of the books. The tendency for work in this field to be published purely as conference papers and in journals rather than in books also proved a hinderance. This seems partly due to the success of Knuth's work [1] as no-one seems to have produced any widely-used text on the subject because his work is regarded as being "definitive", despite being considerably outdated. Much confusion was also caused by the lack of standard naming for many tests and generators and wildly varying notation - it is often not possible to tell that two complex tests or generators are identical until some considerable research has been carried out.
Almost all of the time in the initial few weeks was spent narrowing down the areas covered as it was clear from a fairly early stage that certain topics, discussed below in section A.2, were not appropriate or too complex bearing in mind the ultimate implementation aim of the project.
Most of the remainder of the time was spent reading and attempting to understand some of the more complicated tests and generators and following up references given in some of the more useful works. Some of the maths involved required revising old mathematical courses or researching mathematical concepts (Such as Mersenne Primes used by the Mersenne Twister generator) new to the author. References are not given for books used for this purpose as they are generally outside the scope of this project and not useful to the reader.
Once the bulk of the topic had been understood, the actual writing of the report took relatively little time however, as is to be expected, this was hardly a linear effort and new ideas or problems were introduced with almost every writing session, entailing more research.
Many areas, most notably theoretical analysis of generators, were too complicated and would not be useful topics to study for producing a suite of programs for generating and analysing random numbers. Cryptographic areas of study were also dismissed fairly early as being too complicated. More time was initially spent looking at converting the uniformly distributed output produced by the generators here to patterns following other distributions however this is really outside the scope of the project as described in the initial project proposal and was dropped. Although useful in a few specific types of simulation modelling, in general an equally distributed integer generator is sufficient for most applications. Also, most of the work in this area is primarily mathematical and not computer-science based.
The most obvious extension of the work discussed here is theoretical analysis of many of the newer generators, particularly Mersenne Twisters. This is a topic best pursued by a mathematician rather than a computer scientist, however, as proper theoretical study is not a case of simple algorithmic work but requires at times some fairly complex mathematics.
Another approach would be an actual suite of generation and analysis tools, which is the eventual aim of the implementation phase. The only suite currently freely available seems to be Diehard [19], however the choice of tests used appears, in the currently available version, to be limited to tests devised by the author himself. However, the suite is widely referenced and does seem to be regarded as something of a "benchmark" in the field. The basic design is not particularly expandable, having been converted by and large automatically from Fortran source and as such does not lend itself well to experimentation by other researchers.
A somewhat more time-consuming and exhaustive topic would be a full-blown article or book on the subject of non-cryptographic random numbers, rather than just a literary review, including analysis of some of the more recent developments such as work by Fishman, Maurer and Marsaglia - essentially an updated version of Knuth's work in [1]. This is still the most widely used reference in the field but only discusses congruential generators and the Kendall & Babbington-Smith tests and is sadly out of date in many areas.