WARNING   The author of this routine has been writing random-number generators for many years and has never been known to write one that worked.
Unix Third Edition (V3) manual page

What?

Select a Uniformly Distributed random number from interval , and this interval may change dynamically.

We will assume that RNG will output a uniformly distributed random number in

To do it correctly in C++.

uint32_t bounded_rand(rng_t& rng, uint32_t range) {
    std::uniform_int_distribution<uint32_t> dist(0, range-1);
 
    return dist(rng);
}

Biased Approach

Classic Modulo

uint32_t bounded_rand(rng_t& rng, uint32_t range) {
    return rng() % range;
}
  • Biased because rand() produces numbers in the range [0..2^32), for numbers that can not perfectly divide 2^32, some of them have a lower change to be selected.
  • Eg: 52 does not perfectly divide 2^32, and 2^32 mod 52 = 48, so 49..52 have a lower change to be selected.
  • One other concern we might have is that some generators have weak low-order bits. That is, the lower bit does not pass the statistical tests (for every bit it must appear 50/50). When we perform % 52, we are essentially passing the lowest bit straight through the output.

Floating Point Multiply

static uint32_t bounded_rand(rng_t& rng, uint32_t range) {
    double zeroone = 0x1.0p-32 * rng();
    return range * zeroone;
}
  • Note that 0x1.0p-32 is a binary floating point constant for  
  • This approach is just as biased as the classic modulo approach, but the bias manifests itself differently. For example, if we were choosing numbers in the range [0..52), the numbers 0, 13, 26 and 39 would appear once less often than the others.
  • That is, it does not uniformly map input to an even output.

Int Multiply

uint32_t bounded_rand(rng_t& rng, uint32_t range) {
    uint32_t x = rng();
    uint64_t m = uint64_t(x) * uint64_t(range);
    return m >> 32;
}
  • It might appear that this version requires 64-bit arithmetic, but on x86 CPUs, a good compiler will compile this code to a 32-bit mult instruction (which yields two 32-bit outputs, one of which is the return value). We should expect this version to be fast, but biased in exactly the same way as the floating-point multiplication method.
  • Biased for the exact same reason as the floating-point multiplication method.

Unbiased

The most common method to achieve fairness is through rejection.

Division with Rejection

Instead of calculating x * range / 2**32, we calculate x / (2**32 / range), then reject the uneven condition, essentially reject the last skewed biased (last 48 values).

For example, in the case of drawing a card using a 32-bit PRNG, we can generate a 32-bit number and divide it by 2^32/52 = 82,595,524 to choose our card. This method works when the random value from the 32-bit PRNG is less than 52 × 82595524 = 2^32/32 – 48. Since division do not have remainder.

If the random value from the PRNG is one of the final 48 values at the top end of the range of the generator, it must be rejected and another sought.

uint32_t bounded_rand(rng_t& rng, uint32_t range) {
    // calculates divisor = 2**32 / range
    uint32_t divisor = ((-range) / range) + 1;
    if (divisor == 0) // overflow, it's really 2**32
        return 0;
    for (;;) {
        uint32_t val = rng() / divisor;
        if (val < range)
            return val;
    }
}
  • Note that range is unsigned. For an unsigned integer, the negation of a number n is .
  • Dividing this by the range gives a value just less than , so we add 1 to it to get the correct divisor.

Debiased Modulo (Twice)

  • Reject the first 48 value (that are biased)
uint32_t bounded_rand(rng_t& rng, uint32_t range) {
    // calculates 2**32 % range
    uint32_t t = (-range) % range;
    for (;;) {
        uint32_t r = rng();
        if (r >= t)
            return r % range;
    }
}

The code for OpenBSD’s arc4random_uniform (which is also used on OS X and iOS) adopts this strategy.

Debiased Modulo (Once)

Java adopts a different approach to generating a number in a range that manages to use only one modulo operation except in the relatively rare occasions when rejection occurs. The code is

static uint32_t bounded_rand(rng_t& rng, uint32_t range) {
    uint32_t x, r;
    do {
        x = rng();
        r = x % range;
    } while (x - r > (-range));
    return r;
}

It may take a little thought to figure out why this variant works. Unlike the previous version based on remainders, which removes bias by removing some of the lowest values from the underlying generation engine, this version first uses x - r to make it multiple of range, then filter out value that lands in 2**32 - range as before to make it unbiased.

Debiased Integer Multiplication — Lemire’s Method

Much as we removed bias from the modulo technique, we can also remove bias from the integer multiplication method. This method was devised by Lemire.
The highlight is that it does not require any modulo arithmetic, which makes it use less CPU cycles and should be fast enough. This is also the algorithm used by Go’s rand package.

uint32_t bounded_rand(rng_t& rng, uint32_t range) {
    uint32_t t = (-range) % range;
    do {
        uint32_t x = rng();
        uint64_t m = uint64_t(x) * uint64_t(range);
        uint32_t l = uint32_t(m);
    } while (l < t);
    return m >> 32;
}

Bitmask with Rejection (Unbiased) — Apple’s Method

Our final approach avoids division and remainder operations entirely. Instead, it uses a simple masking operation to get a random number in the range [0..2^k) where k is the smallest value such that 2^k is greater than the range. If the value is too large for our range, we discard and try another. Code below:

uint32_t bounded_rand(rng_t& rng, uint32_t range) {
    uint32_t mask = ~uint32_t(0); // all bits to 1
    --range; // Decrement the range by 1 because the range [0..range-1] needs to be inclusive.
	// clz: count leading zero
	// range|1 so we have at least 1 bit mask
	// right shift effectively creating a mask that matches the bit length of range.
    mask >>= __builtin_clz(range|1); 
    uint32_t x;
    do {
	    // since rng() return bits uniformly, x should also have uniformly distributed bits
        x = rng() & mask;
    } while (x > range);
    return x;
}

This approach was adopted by Apple when (in the macOS Sierra release) they made their own revision to the code for arc4random_uniform.

Real World Cases

Go: Seeding Responsibility

Long story short, for modern implementation we should seed it automatically.

  • Long story:

A bigger problem with Seed is that responsibility for seeding the global generator was unclear. Most users don’t use Source and Rand directly. Instead, the math/rand package provides a global generator accessed by top-level functions like Intn. Following the C standard library, the global generator defaults to behaving as if Seed(1) is called at startup. This is good for repeatability but bad for programs that want their random outputs to be different from one run to the next. The package documentation suggests using rand.Seed(time.Now().UnixNano()) in that case, to make the generator’s output time-dependent, but what code should do this?

Probably the main package should be in charge of how math/rand is seeded: it would be unfortunate for imported libraries to configure global state themselves, since their choices might conflict with other libraries or the main package. But what happens if a library needs some random data and wants to use math/rand? What if the main package doesn’t even know math/rand is being used? We found that in practice many libraries add init functions that seed the global generator with the current time, “just to be sure”.

Library packages seeding the global generator themselves causes a new problem. Suppose package main imports two packages that both use math/rand: package A assumes the global generator will be seeded by package main, but package B seeds it in an init func. And suppose that package main doesn’t seed the generator itself. Now package A’s correct operation depends on the coincidence that package B is also imported in the program. If package main stops importing package B, package A will stop getting random values. We observed this happening in practice in large codebases.

In retrospect, it was clearly a mistake to follow the C standard library here: seeding the global generator automatically would remove the confusion about who seeds it, and users would stop being surprised by repeatable output when they didn’t want that.

Linear Congruential Generator

Linear Congruential Generator

  • Benefits:
    • Constants can be chosen such that they emit every possible output value once before repeating
  • Problems:
    • The high bits of the state do not affect the low bits at all

Algorithm

Link to original

Random Shuffle: Knuth Shuffle

Knuth Shuffle

It shuffles an array of size so that possible permutations are equiprobable.

Require: array A made of n elements indexed from 0 to n − 1
1: for i = (n − 1)..1 do
2:     j ← random integer in [0, i]
3:     exchange A[i] and A[j]
4: end for
  • We count i down because we need a new seed every time we call rand

image.png

Proof

  • We start with an array x = [1, 2. ..., N]
    • There are possible permutation.
  • We consider an arbitrary array x' = [b1, b2, ..., bN], where b1, ..., bN] are distinct integers between 1 and n.
    • Intuition: Every element gets shuffled
    • What’s the Probability that x transferred to x’ ?
    • QED

Optimize?

It’s a nicer? simpler? solution to the shuffling problem:

  1. Loop through each card in the deck.
  2. Swap the current card with another randomly chosen card.
// CAVET: BIASED
for (int i = 0; i < cards.Length; i++)
{
  int n = rand.Next(cards.Length);
  Swap(ref cards[i], ref cards[n]);
}

The evil lies in rand.Next, where it is always rand.Next(3), giving us results. But it should only have results. Compared to the original method.

for (int i = cards.Length - 1; i > 0; i--)
{
  int n = rand.Next(i + 1);
  Swap(ref cards[i], ref cards[n]);
}

image

Link to original

PRNG VS RNG

  • Pseudo: the input space (seed) is limited. Thus, we can predict the result.
  • CPRNG: crypto-strength version of PRNG. It takes a 256-bit seed rather than a 32-bit seed.