Cryptographic Randomness

[8 May 2001]


Cryptographic operations call for randomness, not only for the generation of keys, but also for some protocols.  In each case, the true goal is to have a value that the adversary cannot predict.  If we can get a value that no one (even the proper user) can predict, then we have met that goal.  A truly random value is one no one can predict and would therefore meet our needs in all cases.  Unfortunately, getting true randomness can be extremely difficult.

Many sources offer randomness but in English one tends to use the word ``random'' in a variety of ways.  Technically, these have radically different meanings.  For example, one might define five classes of randomness (being only a little cynical):

  1. Quantum randomness - to the best of our knowledge, this is true randomness.  As far as we know, quantum events are driven by some truly random process.  This could, of course, amount to some lesser form of randomness (one of the next categories), but at the moment we believe these events are truly random.
  2. Secret entropy - this is entropy that is not random but that is hopefully impossible for an enemy to capture and predict.  Such entropy might be elements of the path of a user's mouse when the user is requested to move it around ``randomly''.  It might also include audio samples taken in a room where the enemy has no chance to plant a microphone or to turn some existing device into a microphone.  The LINUX /dev/random driver samples various operating system device states that have been determined (or assumed) difficult for an enemy to sample and uses those states as secret entropy.
  3. Obscure complexity - this is entropy that appears random to a designer, not because it really is quantum randomness or secret entropy but rather because the designer doesn't understand the process that generates it.  Such might include operating system states that appear to be changing arbitrarily only because the designer doesn't know how the operating system works.  To one who does understand the process producing these values, they might not be obscure at all.
  4. random() function - this represents a source of algorithmically generated values that are declared random by a battery of statistical tests.  This is usually provided as a library call in some programming language and is advertized as a source of random numbers.  These values are suitable for things like Monte Carlo simulations but have no cryptographic value.  A random() function can be thought of as a PRBG of no cryptographic value, usually using a very small seed (16 or 32 bits, in many cases).
  5. Gee-wow complexity - this is complexity that looks random to a naive user.  The most obvious example in this category is a chaos equation producing images of astounding complexity.  Some have viewed those images, reacted with the thought that the image is surely random, and then proposed that the equations generating those images must be a source of randomness.  Chaos equations are examples of particularly insecure and rapidly searched PRBGs that were not even designed to be statistically random.
There are a variety of hardware devices one can purchase (either separately or as part of a cryptographic processor) that tap quantum randomness.  Such sources should probably still be conditioned. There are many examples of techniques for gathering secret entropy.  To name a few: Items 3 and above in the list of ``randomness'' types are not worth citing, except to warn people against.

Pseudo-randomness

Pseudo-randomness is algorithmically produced and is therefore subject to von Neumann's famous quote (cited by Knuth in his volume 2: Semi-numerical Algorithms)

``Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin.''

A good PRBG (pseudo-random bit generator) does not create any randomness.  Rather, it generates a stream of bits that a computationally limited adversary can not distinguish from truly random bits.  However, the PRBG needs to be initialized with some seed value or key and that value needs to be truly random or at least secret from the adversary.  Any adversary can be assumed to know all PRBG algorithms and to try them with any known initial values.  If the adversary can find a way to know (or search for) the PRNG seed, then he or she can trivially predict the future output of the PRNG.  Similarly, if a PRBG key is reused, the adversary can be expected to discover the repetition of ``random'' values and therefore detect non-randomness and predict future values.

A PRBG is not without value.  If one can get a long enough seed that the adversary cannot search the seed space and that seed is at least secret entropy, one should be able to use a cryptographically strong PRBG's output as if it were truly random.  The problem here is that even that PRBG hasn't solved the original problem, which was to find a source of secret entropy.  What it can often do is provide a string of usable bits at a rate higher than the available source of secret entropy can supply.
 



Carl M. Ellison; cme@acm.org