Ultimate Guide to Randomness

20 Things You Need to Know About Random Numbers and Entropy

  1. Why should I care about random numbers?
  2. Where do random numbers come from?
  3. What’s wrong with how most random numbers are made?
  4. What problems does Whitewood solve?
  5. What’s the difference between random number generators?
  6. Should I be concerned about random number generation in virtual machines?
  7. What’s the challenge when using /dev/random?
  8. What’s the danger in using /dev/urandom?
  9. What’s the issue with pseudo random number generators (PRNGs)?
  10. What’s the difference between randomness and entropy?
  11. Isn’t entropy all around us – what’s the problem?
  12. Whose job is it to generate good random numbers?
  13. How do I know if I have a problem with random numbers?
  14. What standards relate to the issue of generating random numbers?
  15. What’s the benefit of centralizing entropy generation?
  16. How does Whitewood safely deliver entropy or random numbers over a network?
  17. What applications need the best random numbers?
  18. Can I rely on HSMs to generate random numbers?
  19. Why is quantum mechanics the best source of randomness?
  20. Isn’t quantum years away – is it ready for prime time?

1. Why should I care about random numbers? Random numbers are used throughout computer systems for many purposes: creating process IDs, shuffling data and adding texture to graphics are just a few examples. In most cases it doesn’t really matter if the random numbers aren’t truly random. But in other situations, such as statistical modelling, gaming and security applications, random numbers need to be truly random. The most obvious case where randomness is critical is cryptography. Random numbers are used to make keys, and keys need to be perfectly random. Any pattern within a key gives the attacker clues and makes the key easier to crack. Making perfectly random numbers is much harder than you would expect, and checking that the process is working correctly can be the difference between crypto that is safe and crypto that isn’t. Learn more about the impact on crypto security in this blog.

2. Where do random numbers come from? Good question! Most people don’t know how random numbers are generated; they are largely taken for granted. In practice, most random numbers are generated by the operating system. All applications running on a given system tend to get random numbers from the same place. It doesn’t matter whether an application needs just a few bytes or many megabytes of either perfect randomness or ‘nearly’ randomness; it all comes from the same place. Just like in your own home, everything that uses water generally gets it from the same water supply. While simple, this approach forces critical security applications to compete with more mundane tasks for randomness, which can be a serious issue if randomness becomes a scarce resource. Learn more about random number generation in Linux.

3. What’s wrong with how most random numbers are made? Most random numbers are made by software. The problem is that software doesn’t act randomly. Software algorithms need to be supplied with random ‘seeds’ to create numbers that are sufficiently random. This raises obvious questions. How do you prove that the seeds being used are actually random? Do they come from a reliable source? How often does the software need to be ‘reseeded’ in order to stay random? And perhaps most importantly, how can you tell if something goes wrong and the random numbers stop being random? Unfortunately, the answer to all these questions is “it’s difficult or impossible,” which is not the right answer if you face a security audit or want a reliable and secure system. Learn more about how random numbers are made.

4. What problems does Whitewood solve? Whitewood solves two fundamental problems – how to generate sufficient, high-quality random numbers (enough to supply an entire data center) and how to deliver those random numbers in a way that applications can easily consume (without needing the apps to be modified). Whitewood’s solutions give security and operations staff confidence that all applications have consistent access to the best quality random numbers. Without Whitewood, each application instance can only rely on whatever randomness its local environment can deliver, which in the case of a virtual machine is not very much. With Whitewood, the challenge of creating randomness is taken off the table and applications are more secure because they are inoculated against entropy starvation. Learn more about the network delivery of random numbers.

5. What’s the difference between random number generators? The random number generators (RNGs) on the market today vary enormously, but most fall into one of two broad categories. The first category includes RNGs that take a relatively small amount of random data, often only a few hundred bits, and use an algorithm to extrapolate that randomness into a much larger volume of ‘random’ numbers. These software-based RNGs are known as Pseudo-Random Number Generators (PRNGs) or Deterministic Random Bit Generators (DRBGs). The other category is known as True Random Number Generators (TRNGs) or Non-deterministic Random Bit Generators (NRBGs). These RNGs can be software or hardware based and generate random numbers entirely from freshly made random data; no data extrapolation algorithms are used. Within this class of TRNGs there are two subclasses: devices that capture random events they are able to sample in their local environment (e.g. keyboard usage, network traffic etc.) and devices that create their own randomness using a dedicated source of entropy. The Whitewood Entropy Engine falls into this latter category: a true random number generator with a dedicated source of quantum-derived entropy. Learn more about the Whitewood Entropy Engine.

6. Should I be concerned about random number generation in virtual machines? In a word, yes. Generating random numbers relies on having access to a strong source of randomness. Software alone cannot generate randomness since it is fundamentally deterministic. Randomness comes from the physical world, where noise signals, user activity or natural events can be sampled and analyzed to create random data. Unfortunately, the process of virtualization breaks the connection between applications and the real world. While this abstraction has many advantages in terms of scalability and flexibility when deploying applications, it creates a virtual firewall for randomness, and there is very little randomness in the virtual world. Worse still, VMs that are replicated can contain copies of the same entropy and internal state, which means the random numbers they generate are no longer independent. Learn more about Whitewood’s approach for supplying randomness to the virtual world.
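
The replication hazard can be illustrated with a small Python sketch. It uses the standard library's random.Random as a stand-in for a generator's internal state. A real VM snapshot duplicates kernel memory rather than a Python object, so treat this as an analogy and not a model of any particular hypervisor; the seed value and the clone_generator helper are hypothetical.

```python
import random

def clone_generator(original: random.Random) -> random.Random:
    """Copy a generator's internal state, much as a VM snapshot copies memory."""
    clone = random.Random()
    clone.setstate(original.getstate())
    return clone

parent = random.Random(2016)  # hypothetical seed, chosen only for illustration
parent.random()               # the "parent VM" runs for a while before the snapshot
child = clone_generator(parent)

# After the snapshot, both "VMs" draw what they believe are fresh random values.
parent_output = [parent.getrandbits(32) for _ in range(4)]
child_output = [child.getrandbits(32) for _ in range(4)]
print(parent_output == child_output)  # identical state yields identical output
```

Until one of the two copies receives entropy the other does not, their outputs stay in lockstep, which is why replicated VMs need fresh seed material after cloning.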

7. What’s the challenge when using /dev/random? Probably the most secure source of random numbers that is native to the Linux operating system is /dev/random. Confidence in /dev/random comes from the fact that it will only provide random numbers if Linux believes that it has sufficient entropy to generate them securely. Even though /dev/random is based on a pseudo-random number generator algorithm it can still deliver high-quality random numbers because it is re-seeded with fresh entropy every time a random number is requested. As a safety feature /dev/random will freeze and provide no output if Linux believes it has insufficient entropy to generate a random number. This blocking behavior can severely impact application performance. Whitewood’s solution ensures that Linux always has sufficient entropy to provide random numbers via /dev/random – effectively making it a non-blocking service. Learn more about random number generation in Linux.
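
An application can detect the blocking condition instead of freezing. The sketch below assumes Linux and Python's os.getrandom, which wraps the getrandom system call; the GRND_RANDOM flag selects the same pool as /dev/random and GRND_NONBLOCK asks the kernel to fail fast when it would otherwise wait. The function name read_random_nonblocking is hypothetical, and many recent kernels rarely or never refuse such a request once the pool has been initialized.

```python
import os

def read_random_nonblocking(n: int):
    """Try to read up to n bytes from the /dev/random pool without waiting.

    Returns the bytes obtained (possibly fewer than n), or None if the
    kernel would have blocked.
    """
    if not hasattr(os, "getrandom"):
        # Assumption: a non-Linux platform; fall back to the OS default source.
        return os.urandom(n)
    try:
        return os.getrandom(n, os.GRND_RANDOM | os.GRND_NONBLOCK)
    except BlockingIOError:
        # The kernel judged its entropy insufficient and refused to wait.
        return None

data = read_random_nonblocking(16)
print("would block" if data is None else f"got {len(data)} bytes")
```

A caller that receives None can retry later or report the shortage, which makes entropy starvation visible rather than leaving it as an unexplained stall.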

8. What’s the danger in using /dev/urandom? The most widely-used source for random numbers in the Linux operating system is called /dev/urandom. The ‘u’ stands for ‘unblocking’. While /dev/random will freeze if Linux has insufficient entropy, /dev/urandom will supply ‘random’ numbers regardless of the level of available entropy. The fact that /dev/urandom is always available makes it a popular, but potentially unwise choice for developers. Whitewood’s solution addresses this vulnerability by supplying sufficient entropy to Linux such that the /dev/urandom pseudo-random number generator algorithm can be frequently re-seeded to refresh the randomness in the system and overcome any concern that attackers can predict the random number output. Learn more about random number generation in Linux.

9. What’s the issue with pseudo random number generators (PRNGs)? PRNGs (also referred to as deterministic random bit generators or DRBGs) are an essential part of most cryptosystems. PRNGs are software programs that use well-known algorithms to extrapolate relatively small amounts of entropy (random ‘seeds’) into the larger amounts of ‘random’ data needed by applications. No matter how well a PRNG is designed, it cannot create entropy or randomness. If a PRNG is only seeded with low entropy seeds then the output of the PRNG, no matter how large the data set might be, will also have a low level of entropy. Dealing a pack of playing cards provides a useful analogy. The PRNG is equivalent to the process of dealing the cards whereas providing the random seed to the PRNG is equivalent to the process of shuffling the deck; the greater the number of cards being dealt, the greater the need to shuffle the pack. PRNGs may suffer from error-prone implementation or even intentional backdoors but even if they are built correctly, their core security depends entirely on the randomness of the seeding process. Learn more about creating good PRNG seeds across the datacenter.
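
A short Python sketch shows why a low-entropy seed undermines any amount of PRNG output. It assumes a deliberately weak seed space of 16 bits and uses the standard library's random.Random (a Mersenne Twister, not a cryptographic DRBG) purely as a convenient deterministic generator; the same argument applies to any deterministic algorithm. The names make_key, recover_seed and the seed value are hypothetical.

```python
import random

SEED_BITS = 16  # assumption: a deliberately weak seed space of 65,536 values

def make_key(seed: int, length: int = 16) -> bytes:
    """Stretch a seed into `length` bytes with a deterministic PRNG."""
    prng = random.Random(seed)
    return bytes(prng.getrandbits(8) for _ in range(length))

def recover_seed(key: bytes):
    """Brute-force the seed by trying every value in the seed space."""
    for candidate in range(2 ** SEED_BITS):
        if make_key(candidate, len(key)) == key:
            return candidate
    return None

victim_seed = 48879            # hypothetical low-entropy seed
key = make_key(victim_seed)    # looks like a 128-bit key
recovered = recover_seed(key)  # needs at most 65,536 guesses
print(make_key(recovered) == key)
```

The key is 128 bits long, but an attacker who knows the algorithm only has to search the seed space, so its effective strength is 16 bits.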

10. What’s the difference between randomness and entropy? Entropy is the statistical measure of disorder within a set of data. Data with the highest levels of entropy has the lowest levels of structure or correlation. For data to be truly random it has to have a high level of entropy, usually measured in bits of entropy per bit of data in the sample. For example, an 8-bit number might contain only 7.2 bits of effective entropy. But entropy is only part of the quest for randomness. The most demanding applications, such as cryptography, require random numbers that are not just statistically random but also unpredictable. For example, the digits of the number Pi (3.1415926535897932384… etc.) may appear random and contain statistical entropy but are easily predicted. Proving that data is random goes beyond measuring entropy. Proving randomness requires knowledge of how the data was originally generated. Randomness is something that needs to be architected into a system from the outset, not something that can be measured retrospectively. Learn more by reading our technical whitepaper on strengthening the crypto infrastructure.
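
The gap between statistical entropy and unpredictability can be made concrete with a minimal Python sketch. It computes the Shannon entropy of the byte frequencies in a sample, which is one common statistical estimate and is offered here only as an illustration; the function name and the sample data are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy_bits_per_byte(data: bytes) -> float:
    """Estimate entropy from byte frequencies alone (between 0 and 8 bits)."""
    counts = Counter(data)
    total = len(data)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# A simple counter 0, 1, 2, ... 255 repeated: trivially predictable.
counter_sequence = bytes(range(256)) * 4
# The same byte repeated: no variation at all.
constant_sequence = bytes(1024)

print(shannon_entropy_bits_per_byte(counter_sequence))
print(shannon_entropy_bits_per_byte(constant_sequence))
```

The counter sequence scores the maximum of 8 bits per byte because every value appears equally often, yet anyone can predict the next byte. Like the digits of Pi, it looks random to a frequency count while offering no security.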

11. Isn’t entropy all around us – what’s the problem? Yes, it’s true that we live in a world that is full of entropy. But harnessing that entropy to create random numbers in computer systems is not so simple. Application developers have invented numerous ways to scavenge natural entropy, even resorting to using video cameras, microphones and radio receivers to pick up cosmic radiation. But this patchwork of tricks and tools leads to huge inconsistency across platforms and, worse still, in virtualized environments these natural sources are simply not available. In many cases the only sources of entropy are internal timing signals and jitter created by software processes, few of which are truly random. The best solution is to deploy purpose-built sources of randomness. Learn more about the Whitewood Entropy Engine.

12. Whose job is it to generate good random numbers? Even though it is a crucial activity in all cryptographic systems, random number generation rarely has any clear point of ownership. Most random number generators are buried deep in the operating system, and the quality of their entropy sources is almost always hardware and environment dependent. They are blind to the needs at the application level. Similarly, applications are often confined to containers or abstracted from the hardware by layers of virtualization, which makes it impossible to validate the quality of the random numbers they receive. Attesting to the quality of random numbers requires knowledge of the entire IT stack, from the physical environment through the hardware, hypervisor and OS to the application. Few security or operations professionals have that level of visibility or control across a modern data center. This is cause for concern in many IT departments. Learn more about the challenge of controlling random number generation.

13. How do I know if I have a problem with random numbers? Unfortunately that’s a very difficult question to answer – you probably wouldn’t know until it’s too late. Some tools do exist to track the generation and consumption of entropy in the operating system but these tools are unreliable and are difficult to use on an operational basis. It is possible to measure the statistical randomness of keys but that only tells part of the story. The critical issue is assessing their unpredictability, something that is almost impossible to do without full knowledge and control of the entropy sources that are being used and quality of the entropy they gather. The reality is that all random numbers look the same. If you’re worried about generating weak random numbers then prevention is more likely to be successful than detection. Random number generators have to be architected correctly from the ground up. Learn more about measuring entropy and randomness.
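
One of the simplest tracking tools is the entropy estimate that Linux exposes through the proc filesystem. The sketch below assumes a Linux host where /proc/sys/kernel/random/entropy_avail is readable; the function name is hypothetical, and how the kernel computes the figure varies between kernel versions, so a single reading should not be taken as proof of quality.

```python
from pathlib import Path

# Linux-specific location of the kernel's entropy estimate, in bits.
ENTROPY_AVAIL = Path("/proc/sys/kernel/random/entropy_avail")

def kernel_entropy_estimate(path: Path = ENTROPY_AVAIL):
    """Return the kernel's entropy estimate, or None if it is unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Not Linux, no permission, or unexpected file contents.
        return None

estimate = kernel_entropy_estimate()
print("unavailable" if estimate is None else f"{estimate} bits")
```

The number reports how much entropy the kernel believes it holds, not how unpredictable the underlying sources really are, which is the limitation described above.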

14. What standards relate to the issue of generating random numbers? Many tests for measuring the statistical randomness of data have evolved over the years, some more stringent than others, but none are perfect and all fail to provide a complete assessment of the effective security of a random number generator. Proving randomness is so tricky that few formal standards exist; even well-established crypto product certifications such as FIPS 140 regard the sources of entropy to be out of scope. The National Institute of Standards and Technology (NIST) in the US has proposed a suite of three standards to cover both deterministic and non-deterministic or true random number generators: SP 800-90 A/B/C, the first of which has already been finalized; the others are expected to follow by the end of 2016 and will likely become part of certification testing soon after. Learn more by reading our blog on key generation.
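
To show what a statistical test does and does not prove, here is a minimal Python sketch of the frequency (monobit) test as described in NIST SP 800-22, a separate NIST test suite from the SP 800-90 series mentioned above. The 0.01 pass threshold follows that suite's convention; the function name and sample inputs are hypothetical.

```python
import math

def monobit_p_value(data: bytes) -> float:
    """Frequency (monobit) test: are ones and zeros roughly balanced?"""
    if not data:
        raise ValueError("data must not be empty")
    n = len(data) * 8
    ones = sum(bin(byte).count("1") for byte in data)
    s_n = 2 * ones - n                 # +1 for each one bit, -1 for each zero bit
    s_obs = abs(s_n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))

alternating = bytes([0b01010101]) * 128   # 0101...: perfectly predictable
all_ones = bytes([0xFF]) * 128            # 1111...: obviously biased

print(monobit_p_value(alternating) >= 0.01)  # balanced bits pass this test
print(monobit_p_value(all_ones) >= 0.01)     # heavy bias fails this test
```

The alternating pattern passes because the test only counts bits. This is why a suite of tests is used in practice, and why even a full suite cannot stand in for knowledge of the entropy source.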

15. What’s the benefit of centralizing entropy generation? Entropy generation has traditionally been considered a ‘local’ issue, something that is handled locally on each host machine. As systems become more distributed across virtualized environments and on consumer devices, the concern over consistency and security increases. In cloud or hosted environments there is often no control over the physical hardware or local environment, and remote delivery of entropy or random numbers is likely to become essential – the concept of ‘bring your own entropy.’ Across private data centers and shared or hosted environments entropy services might one day be considered an essential ‘utility’ service. Entropy would be universally available in the same way that time and date services are delivered to servers and network appliances today. From a security point of view, entropy generation is too important to be left up to the client device alone. Learn more about network delivery of true random numbers.

16. How does Whitewood safely deliver entropy or random numbers over a network? It’s a reasonable concern. Whenever sensitive data is delivered over a network there is the risk of introducing new vulnerabilities and points of attack. To minimize this risk, Whitewood’s netRandom product secures the client-server connection so that an eavesdropper cannot observe the entropy provided to the client application, substitute rogue entropy into the system or spoof the entropy server. Whitewood’s netRandom Client combines entropy provided over the network with entropy created locally to ensure that, even if the netRandom Server has been completely compromised, that compromise is not extended to the Clients and the applications they support. Learn more about secure network delivery of entropy.
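
The idea of combining network-delivered and local entropy can be sketched in a few lines of Python. This is not Whitewood's netRandom implementation, whose internals are not described here; it is a generic illustration that assumes SHA-256 as the mixing function, and the mix_entropy name, labels and sample inputs are hypothetical.

```python
import hashlib
import os

def mix_entropy(remote: bytes, local: bytes) -> bytes:
    """Hash remote and local entropy together into a 32-byte seed.

    Each input is labelled and length-prefixed so that different
    (remote, local) pairs cannot produce the same hash input.
    """
    h = hashlib.sha256()
    for label, chunk in ((b"remote", remote), (b"local", local)):
        h.update(label)
        h.update(len(chunk).to_bytes(8, "big"))
        h.update(chunk)
    return h.digest()

# Hypothetical worst case: an attacker forces the server to send all zeros.
compromised_remote = bytes(32)
seed_a = mix_entropy(compromised_remote, os.urandom(32))
seed_b = mix_entropy(compromised_remote, os.urandom(32))
print(seed_a != seed_b)  # local entropy still makes each seed different
```

Because the attacker cannot see the local contribution, controlling the remote input alone does not reveal the resulting seed. The same holds in reverse when the local source is weak but the network source is sound.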

17. What applications need the best random numbers? Applications that encrypt data before it leaves your control are critical. In these cases, the only thing that separates the attacker from your data is the quality of your keys. Any applications that incorporate SSL/TLS for protecting internet or VPN traffic and those that protect data at rest through file or disk encryption should have access to true random numbers. Other applications that involve the generation of long-term keys should always be a priority since the impact of compromise and cost of replacing keys can be considerable. PKI-based applications that involve the issuance of credentials and creation of digital signatures fall into this category. Finally, there are specialist applications such as payments, gaming and cryptocurrencies, many of which will be regulated and subject to compliance requirements. Learn more about deployment scenarios for true random number generators.

18. Can I rely on hardware security modules (HSMs) to generate random numbers? HSMs contain hardware-based random number generators; however, those random numbers are typically only available to applications that have access to the HSM. It’s important to remember that HSMs are tamper-resistant devices that are primarily intended to provide a safe physical environment in which to conduct high security crypto operations, such as digital signing or payment processing. They tend to be isolated and expensive devices that are used by only a very small subset of the applications in an organization. However, random number generation is a challenge for almost all security applications, particularly those in virtualized environments where HSMs are sometimes a poor fit. Often a hybrid approach will make sense. Take SSL/TLS, for example: even if an HSM is used to protect the RSA private key, the AES session keys that actually protect the data in transit will almost always be generated using random numbers provided by the operating system software, not the HSM. A strategy to upgrade random number generation across the data center is complementary to your HSM strategy, and together they can address your broad key management and crypto security goals. Learn more about enterprise strategies for random number generation.

19. Why is quantum mechanics the best source of randomness? There are many ways to capture and generate entropy. Some capture events such as mouse clicks and keystrokes or analyze images or sounds. Some measure timing jitter and sample electrical noise. But these signals and events are not perfectly random; they contain patterns, correlations and bias. They all require data processing to extract whatever true randomness may exist in the signal, which is an imperfect process. Quantum-based entropy sources exploit random behavior at the sub-atomic level. This behavior is fundamentally random and unpredictable by any attacker, even one with unlimited resources. This resistance to attack is in stark contrast to other sources of entropy that face the risk of manipulation and subversion. For these reasons, many argue that quantum-derived entropy is the nearest you can get to perfect randomness and therefore the best source for a true random number generator. Learn more about quantum random number generation.

20. Isn’t quantum years away – is it ready for prime time? Yes, it’s ready! But let’s make sure we’re talking about the same thing. It’s easy to confuse the use of quantum mechanics to generate entropy with the topics of quantum computing and quantum cryptography. They are all at very different stages of technical maturity. Quantum computers have the potential to revolutionize computing and raise specific threats to existing crypto algorithms. However, they are at least a decade away and are still confined to research labs. Quantum cryptography is, in many ways, the security antidote to quantum computing. Quantum crypto represents next-generation algorithms and key establishment processes that can withstand an attack using a quantum computer. Although a small number of technologies are commercially available, adoption has been limited and security claims are still being validated. Quantum-derived entropy and random numbers, on the other hand, have now been proven and certification schemes are close to finalization. Quantum-based random number generators are widely used to protect existing applications in traditional data centers and IoT devices. Quantum random number generators are completely independent of quantum crypto and quantum computing. Learn more about Whitewood’s quantum random number generator.

Whitewood is a subsidiary of Allied Minds Federal Innovations, the division of Allied Minds dedicated to commercializing U.S. federal intellectual property. Allied Minds is an innovative U.S. science and technology development and commercialization company. Operating since 2006, Allied Minds forms, funds, manages and builds products and businesses based on innovative technologies developed at leading U.S. universities and federal research institutions. Allied Minds serves as a diversified holding company that supports its businesses and product development with capital, central management and shared services. More information about the Boston-based company can be found at www.alliedminds.com.