By Sandra Henry-Stocker, Unix Dweeb

Unix: How random is random?

How-To
Jul 17, 2017 | 7 mins
Linux, Security

On Unix systems, random numbers are generated in a number of ways and random data can serve many purposes. From simple commands to fairly complex processes, the question “How random is random?” is worth asking.

EZ random numbers

If all you need is a casual list of random numbers, the RANDOM shell variable is an easy choice. Type “echo $RANDOM” and you’ll get a number between 0 and 32,767 (the largest value a signed two-byte integer can hold).

$ echo $RANDOM
29366
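
Because $RANDOM always falls between 0 and 32,767, a common trick is to fold it into a smaller range with modulo arithmetic. Here’s a minimal sketch (the 1-to-6 die roll is just an illustration, and the modulo step slightly skews any range that doesn’t divide 32,768 evenly):

$ echo $(( RANDOM % 6 + 1 ))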

Of course, this process actually provides a “pseudo-random” number. As anyone who thinks much about random numbers will tell you, values generated by a program have a built-in limitation: programs follow carefully crafted steps, and those steps aren’t even close to being truly random. You can vary the sequence that RANDOM produces by seeding it (i.e., setting the variable to some initial value). Some people just use the current process ID (via $$) for that. Note that for any particular starting point, the subsequent values that $RANDOM provides are completely predictable, as the two identical runs below demonstrate.

$ RANDOM=$$;echo $RANDOM; echo $RANDOM;echo $RANDOM
7424
28301
30566
$ RANDOM=$$;echo $RANDOM; echo $RANDOM;echo $RANDOM
7424
28301
30566

If you reseed fairly frequently, a seed that changes more often than your process ID will work better. Here we’re using the number of seconds since the Unix epoch, so each run starts the sequence somewhere new (provided at least a second has passed).

$ RANDOM=`date +%s`;echo $RANDOM;echo $RANDOM;echo $RANDOM
32077
1397
32029
$ RANDOM=`date +%s`;echo $RANDOM;echo $RANDOM;echo $RANDOM
16116
16487
11588
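
If the epoch-seconds seed still feels too guessable, one hedged alternative is to seed RANDOM from /dev/urandom, a device we’ll look at shortly; the arithmetic expansion simply strips the whitespace that od adds:

$ RANDOM=$(( $(od -An -N2 -tu2 < /dev/urandom) )); echo $RANDOM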

You can also use the shuf command to generate pseudo-random numbers. In the command below, we’re generating 10 numbers between 0 and 32,767. The shuf command should start each sequence with a different number (no need for seeding).

$ shuf -i 0-32767 -n 10
32157
16611
24087
28301
9088
4662
12780
30518
7549
12830
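
The shuf command isn’t limited to numeric ranges, either. With GNU coreutils, it will shuffle whatever arguments or lines you hand it, and its -r option samples with replacement, so commands like these should work (the color list is just an illustration):

$ shuf -e red green blue yellow
$ shuf -r -i 1-6 -n 5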

More complex random data

For more serious uses of random data, such as encryption, something closer to truly random data comes into play. The /dev/random and /dev/urandom files get beyond the predictability of ordinary programs by collecting environmental noise from device drivers and other system sources and storing it in an “entropy pool”.

Pseudo-random number generation on Unix systems (the generator itself is often referred to as a “PRNG”) makes use of these two files. In a long listing of /dev, they look like this:

crw-rw-rw- 1 root root 1, 8 Jun 18 13:24 random
crw-rw-rw- 1 root root 1, 9 Jun 18 13:24 urandom

Like most, if not all, of the files in /dev, these are zero-length character device files that, like /dev/null, provide a special service that isn’t obvious from a file listing. The /dev/random and /dev/urandom files can be used to generate values that at least approximate true randomness, and unpredictable random numbers are key to encrypting content so that it can’t be guessed.

Using the stat command, you can get a more descriptive listing than ls provides for either of these files. Here’s the listing for /dev/urandom. Notice the zero length and the date stamps. This file was generated when the system last booted.

$ stat /dev/urandom
  File: /dev/urandom
  Size: 0               Blocks: 0          IO Block: 4096   character special file
Device: 6h/6d   Inode: 1056        Links: 1     Device type: 1,9
Access: (0666/crw-rw-rw-)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-07-15 13:24:49.719736172 -0400
Modify: 2017-07-15 13:24:49.719736172 -0400
Change: 2017-07-15 13:24:49.719736172 -0400
 Birth: -

Examining entropy

To get an idea of how much entropy is available on a system, you can look at the special file named “entropy_avail” (/proc/sys/kernel/random/entropy_avail, to be more precise). Note that this file lives in the /proc file system, not a file system like those we generally work in, but a virtual file system that reflects the kernel and running processes. The entropy_avail file will look like it’s empty, but displaying its contents tells you what you need to know.

-r--r--r-- 1 root root 0 Jul 15 16:01 entropy_avail

To get a feel for how much entropy is currently available in your pool, you can run this command:

$ cat /proc/sys/kernel/random/entropy_avail
2684

The number shown appears to represent the number of bits of entropy that have been collected. Even 2,684 might not seem like much in a world in which we routinely speak in terms of terabytes, but numbers above 100 are said to be a good sign. In addition, the number will change frequently. Check three times in a row, and you might see something like this.

$ cat /proc/sys/kernel/random/entropy_avail
2683
$ cat /proc/sys/kernel/random/entropy_avail
2684
$ cat /proc/sys/kernel/random/entropy_avail
2493

The two files, /dev/random and /dev/urandom, draw on the same entropy pool and work nearly the same way, with one important distinction: /dev/random will block when it runs out of entropy, which might halt a process, while /dev/urandom never blocks but might return data backed by less entropy. The /dev/urandom file appears to be the more reliable choice today.
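
One rough way to see the difference for yourself is to give each device a few seconds to deliver the same small amount of data. On kernels where /dev/random still blocks, the first command may come up short if the pool is drained, while the second should report 512 bytes almost instantly. This is just a sketch, not a benchmark:

$ timeout 5 head -c 512 /dev/random | wc -c
$ timeout 5 head -c 512 /dev/urandom | wc -c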

Randomness vs. entropy

Now that the word “entropy” has entered the discussion, let’s consider the relationship between randomness and entropy. While tightly related, they don’t mean exactly the same thing. Entropy, as in a coin toss, is a measure of the uncertainty of an outcome, while randomness describes how outcomes are drawn from a probability distribution. For computer folk, the terms are often used as if they mean exactly the same thing.

Generating files with random data

You can create a file of pseudo-random data if you need one. In this command, we create a 1 gigabyte file called “myfile” and then examine the first line with an od command just to get a feel for what was created.

Creating the file:

$ head -c 1G /dev/urandom > myfile

Looking at the file:

$ ls -l myfile
-rw-rw-r-- 1 shs shs 1073741824 Jul 14 15:10 myfile
$ head -1 myfile | od -bc
0000000 210 365 102 233 332 203 075 262 302 064 255 110 265 372 365 176
        210 365   B 233 332 203   = 262 302   4 255   H 265 372 365   ~
0000020 274 243 116 012
        274 243   N  \n
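
The dd command offers another common way to carve a file of a specific size out of /dev/urandom. Here’s a quick sketch that writes ten 1-megabyte blocks to a hypothetical file called “myfile2”:

$ dd if=/dev/urandom of=myfile2 bs=1M count=10
$ ls -l myfile2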

Generating random numbers

You can use /dev/urandom to generate pseudo-random numbers on the command line like this.

$ od -vAn -N4 -tu < /dev/urandom

Commands like this that pull data from /dev/urandom and use od to process it can generate nearly random numbers. Run the same command numerous times and you’ll see that you get a range of numbers.

$ od -vAn -N4 -tu < /dev/urandom

Two of the options used with these od commands are particularly interesting. The -N option controls how many bytes are read and, therefore, how large the result can be. So, -N4 means the resulting number is built from four bytes. That doesn’t mean it can’t be a small number like 12, just that it is drawn from four bytes, so the largest values you will see have ten digits. Switch to -N5 and you’ll get two numbers: one using four bytes and one using a single byte. Omit the -N option, and you’ll get a continuous stream of numbers, at least until you get tired of looking at them and hit ^C.

$ od -vAn -N5 -tu < /dev/urandom
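
That ten-digit ceiling comes straight from the four-byte width: the largest unsigned 32-bit value is 2^32 - 1, which the shell will happily confirm.

$ echo $(( 2**32 - 1 ))
4294967295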

Do this same thing with /dev/random and you’re likely to run out of steam fairly quickly.

$ od -vAn -tu < /dev/random

If the command hangs when the pool is drained, a ^Z will suspend it (and ^C will kill it). Also keep in mind that the entropy pool gets used up and then regenerates.
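
To wrap all of this up into something reusable, a small shell function can pull four bytes from /dev/urandom and fold them into whatever range you need. This is only a sketch (the name “urand” is my own, and the range reduction relies on the same modulo trick shown earlier with $RANDOM):

$ urand() { echo $(( $(od -vAn -N4 -tu4 < /dev/urandom) % $1 )); }
$ urand 100        # a value between 0 and 99; your output will vary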

Beyond /dev/urandom

Given the limitations of /dev/random and /dev/urandom, there are some interesting alternatives. More than a dozen hardware random number generators (also known as “true random number generators”, often referred to by the acronym “TRNG”) are available today. In addition, “Entropy as a Service” (EaaS) is emerging as an option that could dramatically change the nature of randomness on systems close to you. More on this soon!

To get a quick introduction to EaaS, check out NIST’s introduction—and stay tuned for some additional insights here on NetworkWorld.

Sandra Henry-Stocker, Unix Dweeb

Sandra Henry-Stocker has been administering Unix systems for more than 30 years. She describes herself as "USL" (Unix as a second language) but remembers enough English to write books and buy groceries. She lives in the mountains in Virginia where, when not working with or writing about Unix, she's chasing the bears away from her bird feeders.

The opinions expressed in this blog are those of Sandra Henry-Stocker and do not necessarily represent those of IDG Communications, Inc., its parent, subsidiary or affiliated companies.