DevNull DevNull - 3 months ago 17
C Question

Generating (very) large non-repeating integer sequence without pre-shuffling

Background



I have a simple media client/server I've written, and I want to generate a non-obvious time value I send with each command from the client to the server. The timestamps will have a fair bit of data in them (nano-second resolution, even if it's not truly accurate, due to limitations of timer sampling in modern operating systems), etc.

What I'm trying to do (on Linux, in C), is to generate a one-to-one sequence of n-bit values (let's assume data is store in 128bit array-of-int elements for now) with no overlapping/colliding values. I would then take a pseudo-random 128bit value/number as a "salt", apply it to the timestamp, and then start sending off commands to the server, incrementing the pre-salted/pre-hashed value.

The reason the timestamp size is so large is because the timestamp may have to accommodate a very large duration of time.




Question



How could I accomplish such a sequence (non-colliding) with an initial salt value? The best approach that sounds along the lines of my goal is from this post, which notes:


If option 1 isn't "random" enough for you, use the CRC-32 hash of said
global (32-bit) counter. There is a 1-to-1 mapping (bijection) between
N-bit integers and their CRC-N so uniqueness will still be guaranteed.


However, I do not know:


  • If that can (efficiently) be extended to 128-bit data.

  • If some sort of addition-to/multiplication-by salt-value to provide the initial seed for the sequence would disrupt it or introduce collisions.






Follow-up



I realize that I could use a 128bit random hash from
libssl
or something similar, but I want the remote server, using the same salt value, to be able to convert the hashed timestamps back into their true values.

Thank you.

Answer

You could use a linear congruential generator. With the right parameters, it is guaranteed to produce non-repeating sequences [unique] sequences with a full period (i.e. no collisions).

This is what random(3) uses in TYPE_0 mode. I adapted it for a full unsigned int range and the seed can be any unsigned int (See my sample code below).

I believe it can be extended to 64 or 128 bits. I'd have a look at: https://en.wikipedia.org/wiki/Linear_congruential_generator to see about the constraints on parameters to prevent collisions and good randomness.

Following the wiki page guidelines, you could produce one that can take any 128 bit value as the seed and will not repeat until all possible 128 bit numbers have been generated.

You may need to write a program to generate suitable parameter pairs and then test them for the "best" randomness. This would be a one time operation.

Once you've got them, just plug these parameters into your equation in your actual application.


Here's some code of mine that I had been playing with when I was looking for something similar:

// _prngstd -- get random number
static inline u32
_prngstd(prng_p prng)
{
    long rhs;
    u32 lhs;

    // NOTE: random is faster and has a _long_ period, but it _only_ produces
    // positive integers but jrand48 produces positive _and_ negative
#if 0
    rhs = jrand48(btc->btc_seed);
    lhs = rhs;
#endif

    // this has collisions
#if 0
    rhs = rand();
    PRNG_FLIP;
#endif

    // this has collisions because it defaults to TYPE_3
#if 0
    rhs = random();
    PRNG_FLIP;
#endif

    // this is random in TYPE_0 (linear congruential) mode
#if 0
    prng->prng_state = ((prng->prng_state * 1103515245) + 12345) & 0x7fffffff;
    rhs = prng->prng_state;
    PRNG_FLIP;
#endif

    // this is random in TYPE_0 (linear congruential) mode with the mask
    // removed to get full range numbers
    // this does _not_ produce overlaps
#if 1
    prng->prng_state = ((prng->prng_state * 1103515245) + 12345);
    rhs = prng->prng_state;
    lhs = rhs;
#endif

    return lhs;
}