ryvnf - 11 months ago 45

C Question

I was reading the C FAQ and found out in a question that it recommends me to use

`rand() / (RAND_MAX / N + 1)`

`rand() % N`

The reasoning for that is that when

`N`

`rand() % N`

`rand()`

I tested the different approaches with

`N`

`2`

`#include <stdio.h>`

#include <stdlib.h>

#include <time.h>

#define N 2

int main(void)

{

srand(0);

printf("rand() %% N:\n");

for (int i = 0; i < 40; ++i) {

printf("%d ", rand() % N);

}

putchar('\n');

srand(0);

printf("rand() / (RAND_MAX / N + 1):\n");

for (int i = 0; i < 40; ++i) {

printf("%d ", rand() / (RAND_MAX / N + 1));

}

putchar('\n');

return 0;

}

The output is this (on my gnu/linux machine):

`rand() % N:`

1 0 1 1 1 1 0 0 1 1 0 1 0 1 1 0 0 0 0 0 1 0 1 1 0 0 0 1 1 1 1 0 0 0 1 1 1 0 1 0

rand() / (RAND_MAX / N + 1):

1 0 1 1 1 0 0 1 0 1 0 1 0 1 1 1 1 1 0 1 0 0 0 1 0 0 0 0 1 0 1 1 1 0 1 1 0 1 0 1

Both alternatives seem perfectly random to me. It even seems like the second approach is worse than

`rand % N`

Should I use

`rand() % N`

`rand() / (RAND_MAX / N + 1)`

Answer Source

If `N`

is a power of two, using the remainder technique is usually safe (`RAND_MAX`

is usually a power of two minus 1, so the entire range has a power of two length). More generally, `N`

has to divide the range of `rand()`

in order to avoid the bias.

Otherwise, you run into this problem, regardless of the quality of `rand()`

. In short, the problem is that you're chopping that range into a number of "parts" each of length `N`

, if `N`

does not divide the range then the last part will not be complete. The numbers that got "cut off" from that part are therefore less likely to occur, since they have one fewer "part" they can be generated from.

Unfortunately `rand() / (RAND_MAX / N + 1)`

is also broken (in almost the same way), so the real answer is: don't use either of them.

The problem as outlined above is really fundamental, there is no way to evenly distribute X different values over Y results unless Y divides X. You can fix it by rejecting a part of the random samples, to make Y divide the new X.