For this reason, having just acquired a copy of PGP 2.6.3ia (that is 2.6.3 with Ståle's patch) and being of a paranoid disposition (just as Phil recommends we should be), I decided to inspect the source code for the Key Generation part to see if it contained anything that might compromise the security of the keys that are generated. For the record, I found nothing of the sort. The code makes strenuous (even paranoid) attempts to be utterly fair. However, I did uncover a few features which, under a false pretence of adding extra security, actually made my task of finding the true security much harder. So I thought it might be useful to document the whole process, pointing out the (mis-)features that caused me concern.

**The whole security of the system depends on the fact that the program
must be capable of generating each and every one of these 2**254 numbers with
more-or-less equal probability.**

Next, from each random number, you generate a prime number. Starting from the
given number, you look at each successive number to see if it might be prime.
Well, obviously you only look at the odd ones, and in fact you only look at
every 4th number, because the code tries to create "Blum" numbers for some reason or
other. It uses a simple sieve to eliminate obvious cases, using a table of the
first 1028 prime numbers. Anything that survives the sieve is subjected to
Fermat's test (if `((x**(p-1)) mod p) != 1`, then `p` is
not prime) using `x = 2,3,5,7` (not a very original selection!).
See the comments in `genprime.c` for why this is considered more than
a safe test. It insists that the two prime numbers generated differ by at
least 1 part in 128.
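
The sieve-then-Fermat search can be sketched in C. Everything here is an illustrative stand-in: PGP's `genprime.c` works on multiprecision numbers and sieves with its much larger prime table, whereas this sketch uses 64-bit arithmetic and a tiny sieve, so it is only valid for small numbers.

```c
#include <stdint.h>

/* modular exponentiation: (base**exp) mod m, for small moduli */
static uint64_t pow_mod(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t result = 1;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

/* Fermat's test: if ((x**(p-1)) mod p) != 1 then p is not prime.
 * Uses the same bases 2, 3, 5, 7 as the text mentions; assumes
 * p > 7 so that the bases are coprime to p. */
static int fermat_probably_prime(uint64_t p)
{
    static const uint64_t bases[] = { 2, 3, 5, 7 };
    for (int i = 0; i < 4; i++)
        if (pow_mod(bases[i], p - 1, p) != 1)
            return 0;
    return 1;
}

/* Search upward in steps of 4 (candidates congruent to 3 mod 4,
 * in the Blum style), sieving with a few small primes first. */
static uint64_t next_candidate_prime(uint64_t n)
{
    static const uint64_t sieve[] = { 3, 5, 7, 11, 13, 17, 19, 23 };
    n |= 3;                               /* force n == 3 (mod 4) */
    for (;; n += 4) {
        int survives = 1;
        for (int i = 0; i < 8; i++)
            if (n % sieve[i] == 0 && n != sieve[i]) {
                survives = 0;
                break;
            }
        if (survives && fermat_probably_prime(n))
            return n;
    }
}
```

For example, starting the search at 90 skips 91 (= 7 × 13), 95 and 99 in the sieve and lands on 103, which is indeed congruent to 3 mod 4.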

Next, you multiply your two (alleged) primes P and Q to give the modulus N,
and from P, Q and N the RSA public and private keys can be derived in the
standard manner. I estimate that for a 512-bit key there are
`8*10**73` possible primes P and Q, giving `1.6*10**147`
possibilities for N. For a 1024-bit key, these numbers are increased to
`7*10**150` and `1.4*10**301`. Which shows why a
brute-force attack on an RSA encryption is not feasible.
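
The "standard manner" of deriving the keys can be shown with a toy example. The textbook-sized primes, the choice of e = 17 and the helper names below are mine, purely for illustration; PGP of course does all this with multiprecision arithmetic.

```c
#include <stdint.h>

/* modular exponentiation: (b**e) mod m, for small moduli */
static uint64_t pow_mod(uint64_t b, uint64_t e, uint64_t m)
{
    uint64_t result = 1;
    b %= m;
    while (e > 0) {
        if (e & 1)
            result = (result * b) % m;
        b = (b * b) % m;
        e >>= 1;
    }
    return result;
}

/* modular inverse of a mod m via the extended Euclidean algorithm;
 * this is how the private exponent d is obtained from e and phi */
static uint64_t inv_mod(uint64_t a, uint64_t m)
{
    int64_t t = 0, newt = 1;
    int64_t r = (int64_t)m, newr = (int64_t)a;
    while (newr != 0) {
        int64_t q = r / newr, tmp;
        tmp = t - q * newt; t = newt; newt = tmp;
        tmp = r - q * newr; r = newr; newr = tmp;
    }
    return (uint64_t)(t < 0 ? t + (int64_t)m : t);
}
```

With P = 61 and Q = 53: N = 3233, phi = (P-1)(Q-1) = 3120, and with public exponent e = 17 the private exponent is d = inv_mod(17, 3120) = 2753. Then pow_mod(65, 17, 3233) = 2790 encrypts the message 65, and pow_mod(2790, 2753, 3233) recovers it.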

The system uses an array of (currently) 96 32-bit words (i.e. 384 bytes or
3072 bits) known as `randPool`, initially set to zero. The '96' is
required for technical reasons to be a multiple of 4; otherwise I should much
have preferred a prime number (such as 97). It proceeds by XOR-ing supposedly
random bytes into successive positions; in practice always a multiple of 4
bytes at a time (this is a Bad Thing) derived from some 32-bit word. Some of
these words are quite predictable, some are well correlated with other such
words, and some are more-or-less independent. **The crucial question is
whether a sufficient number of them are truly independent**, so as to
ensure the required number of truly random bits.
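
The accumulation step can be sketched as below; `poolPos` and the helper name are my own illustrative choices, but the pool size and the XOR-with-wrap behaviour are as described above. The sketch also makes the self-cancellation worry concrete: a word XOR-ed onto itself after exactly one wrap cancels out (which is why the stirring between cycles matters).

```c
#include <stdint.h>

#define POOLWORDS 96                 /* 96 x 32 bits = 3072 bits */

static uint32_t randPool[POOLWORDS]; /* initially all zero */
static unsigned poolPos;             /* next word to receive noise */

/* XOR one 32-bit word of (alleged) noise into the pool, wrapping
 * around to the start when the end is reached. */
static void add_noise_word(uint32_t noise)
{
    randPool[poolPos] ^= noise;
    poolPos = (poolPos + 1) % POOLWORDS;
}
```

If the same word arrives again exactly POOLWORDS positions later, `randPool[0] ^= x; ... randPool[0] ^= x;` leaves that slot back at zero.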

A certain pattern of such words (examined in detail below) is XOR-ed in
repeatedly (typically for 73 cycles when generating a 512-bit key). When
`randPool` has been filled in it wraps around. But this could mean
that the same positions in the pattern are being XOR-ed on top of themselves,
which is why I would have preferred a prime number for the size of
`randPool`, and why I would have preferred the pattern itself not to
have been a multiple of 4 bytes. Now actually this problem should not matter,
because the `randPool` is "stirred" between each cycle, and if the
stirring is as effective as claimed it makes all the bits totally "forget"
where they were. But I believe that this is a case where "not only should
security be achieved, but security should be *seen* to be achieved".
Stirring is described below. Typically, in a UNIX system, the
`randPool` will be stirred and wrapped 11 times when generating a
512-bit key.

Different systems behave differently at this point, but they all assume that some form of "tick" is generated by the operating system for use for timing processes, giving the time-of-day, etc. In UNIX systems, there are two ticks: one fine tick (possibly every microsecond) derived from the real-time quartz clock for time-of-day purposes and one coarse tick (typically every 10 milliseconds) for timing processes (each time this tick occurs there is an interrupt, and whatever process is running at that moment is credited with 10msec of CPU time - or something like that). Since the random number generation works by counting ticks, it is important to know how coarse the ticks are.

All systems start off the same way (see `noise.c`):

- A.
- A call of `clock()`. Gives (supposedly) the number of microseconds of CPU used in this process. But it usually relies on a coarse tick (in Solaris it is always a multiple of 10000). Moreover it increases monotonically with successive calls, and successive calls frequently return the same value (this part of the process is not CPU-intensive). NOT a good source of randomness.
- B.
- A call of `time()`. Gives the number of seconds since Jan. 1st 1970. Even if the wily eavesdropper is not online to know the exact time you generated your key, the date on which you generated it is indeed public knowledge. Again, it increases monotonically and slowly (this phase of the process should take no more than half a minute all told). NOT a good source of randomness.
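
A small experiment (not from PGP; the function name is mine) shows why `clock()` is so poor a noise source: its readings are monotonic and, on a coarse-tick system, mostly identical from one call to the next.

```c
#include <time.h>

/* Count how many of n back-to-back clock() readings simply repeat
 * the previous value - on a coarse-tick system, nearly all of them. */
static int repeated_clock_readings(int n)
{
    int repeats = 0;
    clock_t prev = clock();
    for (int i = 0; i < n; i++) {
        clock_t cur = clock();
        if (cur == prev)
            repeats++;
        prev = cur;
    }
    return repeats;
}
```

On a system whose CPU clock ticks every 10 milliseconds, a thousand back-to-back calls will normally all return the same value.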

- MSDOS
- C.
- A call of
`pctimer0()`. Seemingly this gives an unsigned (hopefully 32 bits) taken from a clock ticking every .84 µsec (but you will not get .84 µsec resolution). But, apart from being monotonic, the values returned should be as random as your typing.

- MAC
- C.
- A call of
`TMTicks()`. Again an unsigned long and presumably as random as your typing.

- Win32 with the Microsoft compiler
- C.
- A call of
`QueryPerformanceCounter()`. An unsigned (hopefully 32 bits) and presumably as random as your typing.

- VMS
- C.
- A call of
`SYS$GETTIM()`. An unsigned long divided by 100000 (supposedly the ticks per update). Presumably as random as your typing, but you will not get much randomness unless you type rather slowly.

- AMIGA
- C.
- A call of either `ReadEClock()` (the low-order 32 bits) or `am_GetSysTime()` (32 bits of seconds - presumably since 1970 - and 32 bits of microseconds). With `am_GetSysTime()` the seconds since 1970 are pretty useless, since we already had those bits under 'A.'. I could believe the microseconds or the output of `ReadEClock()`.
- D.
- 16 bits of additional noise from the video beam position. Neat!
- E.
- The ExecBase dispatch count, whatever that is.

- ATARI
- C.
- A system call giving the 32-bit output from a 200Hz counter and 32
bits from a 50/60/70Hz Vertical BLank counter. Under the Pure C compiler,
only the second of these is taken (the first is said to give the same as
the earlier call of
`clock()`). Coarse as these ticks are, they can still give true randomness if called often enough.

- UNIX

This is the system I have studied most intensively. It calls every system call it can think of (and would have called the kitchen sink too, had it been available). As I will show, this just creates FUD rather than randomness.

- C.
- A call of `gettimeofday()`. This returns two 32-bit words. The first is just the seconds since 1970 which we have already had, so it adds nothing new. The second is the number of microseconds since the last full second. This should be based on a fine tick size, and is therefore as random as your typing, but it is the only source of true randomness that I would trust from the UNIX systems.
- D.
- A call of `times()`. This gives four 32-bit words:
- D1.
- The number of coarse ticks in this process. Typically this is one byte increasing monotonically but slowly. Useless as a source of randomness.
- D2.
- The number of coarse system ticks on behalf of this process. Increases perhaps twice as fast as the previous, but still useless.
- D3.
- The number of coarse ticks by children of this process. Always zero, because PGP never forks.
- D4.
- The number of coarse system ticks by children of this process. Also zero.

- E.
- Also from the same call of `times()`, the number of elapsed coarse ticks since boot time. The bottom byte changes somewhat faster than in the previous case (it is measuring elapsed rather than process time), but it is totally correlated with the call of `clock()` already made, so it adds nothing new.

It will be seen that the only source of true randomness in all the systems described above is the call labeled 'C.' in each case (also the actual key struck). Everything else is more-or-less predictable FUD. This call measures the instant at which each keystroke occurred (possibly modulo 1000000 microseconds) using as fine a tick size as the system affords. Where the call gives a time in microseconds rather than a tick count (systems AMIGA and UNIX) it endeavours to determine the ticksize by experiment. It usually overestimates it (which is safe) because the length of the measuring loop is longer than the ticksize (for example, on a SPARC1+ running Solaris2 it computes a ticksize of 11 microseconds, whereas the true value is believed to be 1 microsecond).

It now computes a time delta, modulo any computed ticksize, between this call
and the previous one. Now this is a truly random estimate of the time between
keystrokes but, oddly, this value is not itself incorporated into `randPool`
(though it is represented there by the absolute time of the keystroke, which is
included). Since some systems (e.g. VMS and the ATARI) use an extremely coarse
tick, this delta may contain only a few bits; so the bits in the delta are
counted, and it carries on looking at keystrokes until the cumulative value of
these bits is equal to the length of the key being generated (plus a few extra
random bits needed for other purposes). But the maximum number of bits taken
from any one keystroke is 8, so with fast systems it always demands more
keystrokes than it truly needs.

I believe that when counting the bits in delta it counts one more than it
should (the most significant bit should not be included in the count in my
opinion, since it is by definition always '1', and only serves to tell how many
bits there are). Note also that the value of delta will have a minimum (it is
impossible for the time between two keystrokes to be absolutely zero), which
is a further reason to discount one bit when counting the bits of delta (see
`trueRandEvent()`).
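
The bit-accounting described above can be sketched as follows. The names are illustrative, not PGP's; the cap of 8 bits per keystroke is as stated in the text, and the comment marks the spot where the text argues the count should be one less.

```c
/* Count the significant bits in an inter-keystroke delta. */
static int count_bits(unsigned long delta)
{
    int n = 0;
    while (delta) {
        n++;
        delta >>= 1;
    }
    return n;   /* the text argues for n - 1: the top bit is by
                 * definition always '1', so it carries no entropy */
}

/* Entropy credited for one keystroke: the bit count, capped at 8. */
static int credit_for_keystroke(unsigned long delta)
{
    int n = count_bits(delta);
    return n > 8 ? 8 : n;
}
```

Keystrokes are then consumed, accumulating these credits, until the total reaches the key length plus the few extra bits needed for other purposes; with the cap at 8 bits, a fast fine-tick system is charged more keystrokes than it strictly needs.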

Its modus operandi is to encrypt the whole `randPool` using the MD5
algorithm (that's the one used for generating signatures). That is why the
length of the `randPool` had to be a multiple of 128 bits. It starts
from a zero key (I told you it was going to be deterministic). By the time it
has got to the end of the `randPool`, the last bit deposited is a
function of every preceding bit. It then encrypts it all again, so that every
bit in the new `randPool` now potentially depends on every bit in the
original `randPool`. If you start with an empty `randPool`,
it will now appear (to the naked eye) as a complete random mess. If you start
with a `randPool` containing just one bit, it will appear as a
different random mess.

Finally, it takes the first 128 bits of the new `randPool` as the
starting key for the next stirring.
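
The shape of the stirring (though not MD5 itself) can be sketched as follows. The 32-bit mixing function is a deliberately toy stand-in for MD5, an assumption made purely to keep the sketch short; what it shows is the two-pass feedback structure described above, by which every output word comes to depend on every input word.

```c
#include <stdint.h>

#define POOLWORDS 96

/* Toy 32-bit mixing step standing in for MD5.  Like a real hash it
 * is a bijection of its state and carries a round constant, so an
 * all-zero input still produces non-zero output. */
static uint32_t mix(uint32_t h, uint32_t w)
{
    uint32_t x = (h + w + 0x9E3779B9u) * 0x85EBCA6Bu;
    return x ^ (x >> 13);
}

/* Two full feedback passes over the pool, starting from a zero key:
 * after the first pass the last word deposited is a function of every
 * preceding word; after the second, every word potentially depends on
 * every word of the original pool. */
static void stir_pool(uint32_t pool[POOLWORDS])
{
    uint32_t h = 0;                       /* deterministic zero key */
    for (int pass = 0; pass < 2; pass++)
        for (int i = 0; i < POOLWORDS; i++) {
            h = mix(h, pool[i]);
            pool[i] = h;                  /* feed back into the pool */
        }
}
```

Stirring two pools that differ in a single bit yields two entirely different results, which is the "different random mess" effect noted above.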

Note that the purpose of stirring is not to generate extra randomness. It is
merely to redistribute the randomness that is already there (derived from your
keystrokes). So it matters not whether the `randPool` is completely
filled by your keystroking, nor that the two sets of 256 bits that you extract
from it in order to make your 512-bit key are not the actual bits that you
randomly generated. Every bit that you take out is affected by every random
bit that you put in (as well as by the many non-random, predictable bits that
the UNIX systems in particular insert). So it is still possible to generate
each and every one of the possible keys, even when starting from a known time
of day and a known number of ticks since boot time.

But I would have been much happier to see a more straightforward system. It
took me three days and much work examining the operation of the program with a
debugger to convince myself that most of the information being put into
`randPool` was mere FUD, and to identify exactly where the true
required number of random bits was actually coming from. In view of the
excellent performance of `randPoolStir()`, I cannot see that any of
the FUD serves any useful purpose. Well, I can see a little merit in
including the time and date just once, in case any extremely regular typist
manages to get *exactly* the same intervals between each keystroke
during her interpretation of William Tell, or unless some other flaw is found
in the system (which is also why I would like to see the wrap-around of the
`randPool` a little less regular, as already mentioned).

So I have at last felt it safe to generate my PGP key. And as a final check against any wily eavesdropper who may have been compiling a complete set of all the popular 1024-bit keys (will the universe last long enough for him to complete his task?) I have set my key length to 820 bits. How's that for paranoia?