lulyon - 22 days ago
C Question

glibc rand function implementation

I'm reading the implementation of the C standard library rand() function in the glibc source code.
The relevant code is in stdlib/random_r.c, line 359:

int
__random_r (buf, result)
     struct random_data *buf;
     int32_t *result;
{
  int32_t *state;

  if (buf == NULL || result == NULL)
    goto fail;

  state = buf->state;

  if (buf->rand_type == TYPE_0)
    {
      int32_t val = state[0];
      val = ((state[0] * 1103515245) + 12345) & 0x7fffffff;
      state[0] = val;
      *result = val;
    }
  else
    {
      int32_t *fptr = buf->fptr;
      int32_t *rptr = buf->rptr;
      int32_t *end_ptr = buf->end_ptr;
      int32_t val;

      val = *fptr += *rptr;
      /* Chucking least random bit. */
      *result = (val >> 1) & 0x7fffffff;
      ++fptr;
      if (fptr >= end_ptr)
        {
          fptr = state;
          ++rptr;
        }
      else
        {
          ++rptr;
          if (rptr >= end_ptr)
            rptr = state;
        }
      buf->fptr = fptr;
      buf->rptr = rptr;
    }
  return 0;

 fail:
  __set_errno (EINVAL);
  return -1;
}


I don't understand how random_r generates a random number when buf->rand_type != TYPE_0. Could anyone please explain? Thanks.

Answer

glibc rand() has two different generator implementations:

  1. A simple linear congruential generator (LCG), defined by the following equation:

    val = ((state * 1103515245) + 12345) & 0x7fffffff

    (& 0x7fffffff clears the most significant bit, so the result is always a non-negative 31-bit value)

    This is a very simple LCG with a single word of state. It has some drawbacks. The most important one is that, because the returned value is the entire state, successive rand() calls are not independent draws: the generator simply walks through the whole range (2^31 values) in a pseudorandom order. This has a practical implication: once you obtain some number, you will not obtain it again within the current period. You will see it again exactly 2^31 rand() calls later, no sooner and no later.

    This generator is called the TYPE_0 in the glibc source.

  2. A slightly more advanced additive feedback generator. It keeps many words of state, so it does not have the "traversal" property described above: you can get the same number twice (or more) within the same period.

    You can find an excellent description of that algorithm here.

    This generator is used for TYPE_1, TYPE_2, TYPE_3 and TYPE_4 in the glibc source.

    Coming back to your question, this is how it generates values:

    seeding_stage() // (code omitted here, see the description from above link)

    for (i=344; i<MAX; i++)
    {
        r[i] = r[i-31] + r[i-3];            // sum of the values 31 and 3 steps back
        val = ((unsigned int) r[i]) >> 1;   // drop the lowest bit; val is the returned number
    }
    

    The code after the else in your question is simply the above code, but written in a different way (using pointers to the array containing previous values).
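
To make that equivalence concrete, here is a minimal sketch (my own code, not glibc's) that runs the recurrence both ways: once on a long array as in the pseudocode above, and once on a 31-word ring with two moving indices, which is what the fptr/rptr code in the question does. The seeding step is my reading of glibc's srandom_r and only handles seeds between 1 and 2^31 - 2, so treat it as an assumption. The names seed_words, additive_array, ring_t, ring_init and ring_next are made up for the example.

```c
#include <stdint.h>

#define DEG    31   /* long lag:  r[i-31] */
#define SEP    3    /* short lag: r[i-3]  */
#define FIRST  344  /* index of the first value that is returned */
#define MAXOUT 64   /* arbitrary cap for the array form */

/* Assumed seeding: r[0] = seed, then r[i] = 16807 * r[i-1] mod (2^31 - 1).
   Only valid for 0 < seed < 2^31 - 1. */
static void seed_words(uint32_t seed, uint32_t *r)
{
    r[0] = seed;
    for (int i = 1; i < DEG; i++)
        r[i] = (uint32_t)((16807ULL * r[i - 1]) % 2147483647ULL);
}

/* Form 1: the recurrence on one long array, as in the pseudocode. */
static void additive_array(uint32_t seed, uint32_t *out, int n)
{
    static uint32_t r[FIRST + MAXOUT];
    if (n > MAXOUT)
        n = MAXOUT;
    seed_words(seed, r);
    for (int i = DEG; i < DEG + SEP; i++)
        r[i] = r[i - DEG];               /* wrap-around copies of r[0..2] */
    for (int i = DEG + SEP; i < FIRST + n; i++) {
        r[i] = r[i - DEG] + r[i - SEP];  /* unsigned, wraps mod 2^32 */
        if (i >= FIRST)
            out[i - FIRST] = r[i] >> 1;  /* drop the lowest bit */
    }
}

/* Form 2: a 31-word ring; f and b stand in for fptr and rptr. */
typedef struct {
    uint32_t state[DEG];
    int f;
    int b;
} ring_t;

static uint32_t ring_next(ring_t *g)
{
    uint32_t val = (g->state[g->f] += g->state[g->b]);
    if (++g->f >= DEG) g->f = 0;         /* same job as the end_ptr checks */
    if (++g->b >= DEG) g->b = 0;
    return val >> 1;
}

static void ring_init(ring_t *g, uint32_t seed)
{
    seed_words(seed, g->state);
    g->f = SEP;
    g->b = 0;
    for (int i = DEG + SEP; i < FIRST; i++)
        (void)ring_next(g);              /* discard the first 310 values */
}
```

Calling additive_array(1, a, 8) and then ring_next() eight times after ring_init(&g, 1) should produce the same eight values. The ring form just never stores more than the last 31 words, which is why the glibc code only needs two pointers.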

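Going back to the first generator for a moment: the "every value exactly once per period" behaviour can be checked on a scaled-down version. This is only a sketch. The mask parameter is hypothetical (glibc hard-codes 0x7fffffff); I added it so the cycle is short enough to count quickly. The function names are made up too.

```c
#include <stdint.h>

/* One step of the TYPE_0 recurrence with the constants from the question.
   Unsigned arithmetic wraps mod 2^32, which gives the same low bits as the
   int32_t code in glibc. */
static uint32_t lcg_step(uint32_t state, uint32_t mask)
{
    return (state * 1103515245u + 12345u) & mask;
}

/* Count how many steps it takes to come back to the starting state.
   mask must be of the form 2^k - 1. */
static uint32_t lcg_cycle_length(uint32_t start, uint32_t mask)
{
    uint32_t first = start & mask;
    uint32_t s = first;
    uint32_t steps = 0;
    do {
        s = lcg_step(s, mask);
        steps++;
    } while (s != first);
    return steps;
}
```

With mask 0xffff the cycle length comes out as 65536, i.e. all 2^16 states are visited once before anything repeats. The same reasoning applies with the real mask, only the count is 2^31.
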
Which generator is used depends on the size of the initial state set with the initstate() function. The first (LCG) generator is used only when the state is smaller than 32 bytes (8 bytes is the minimum allowed). For larger states, the second generator is used. When you set your seed using srand(), the state is 128 bytes by default, so the second generator is used. All of this is described in the comments of the glibc source file you referenced in your question.
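
As a rough illustration of that size-based choice, here is a sketch of the selection logic. The thresholds are the BREAK_0 to BREAK_4 values as I remember them from the glibc comments, so please check them against your glibc version; pick_rand_type is a made-up name, not a glibc function.

```c
#include <stddef.h>

/* State-size thresholds in bytes (assumed from the glibc source comments). */
enum { BREAK_0 = 8, BREAK_1 = 32, BREAK_2 = 64, BREAK_3 = 128, BREAK_4 = 256 };

/* Returns 0..4 for TYPE_0..TYPE_4, or -1 if the buffer is too small.
   Only type 0 is the LCG; types 1..4 are the additive feedback generator
   with progressively more state. */
static int pick_rand_type(size_t n)
{
    if (n < BREAK_0) return -1;
    if (n < BREAK_1) return 0;
    if (n < BREAK_2) return 1;
    if (n < BREAK_3) return 2;
    if (n < BREAK_4) return 3;
    return 4;
}
```

So pick_rand_type(8) gives 0 (the LCG), while the default 128-byte state gives 3, which is the r[i-31] + r[i-3] variant shown above.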