groenhen - 6 months ago 65

C Question

I already have a 64-bit hash function in a library (written in C), but I only need 48 bits. I need to trim the 64-bit hash value down to a 48-bit value, and it has to be done in a way that minimizes collisions.

The hash function is a very good 64-bit hash function. It has been tested with SMHasher (the "DieHarder" of hash testing) and proved better than Murmur2. According to my colleagues, the algorithm implemented in the lib for 64-bit hashing is xxHash, which was tested with SMHasher and got a Q.Score of 10! For those who want to see it, the source code for xxHash is available on GitHub: github.com/Cyan4973/xxHash/releases/latest.

The basic idea is to have all bits in the 64-bit hash value (or part of them) have an effect on the resulting 48-bit hash value. Is there any way to do that?

Answer

If the 64-bit hash is good, then selecting any 48 of its bits will also give a good hash, as @Lee Daniel pointed out. Of course, information is lost and the operation is not reversible.

```
unsigned long long Mask48 = 0xFFFFFFFFFFFFu;
unsigned long long hash48 = hash64 & Mask48;
```
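Since "any 48 bits" covers more than the low bits, here is a minimal sketch of the same idea with fixed-width types. The helper names `hash48_low` and `hash48_high` are hypothetical, not part of xxHash or of the asker's library, and it assumes `<stdint.h>` is available.

```c
#include <stdint.h>

/* Hypothetical helper: keep the low 48 bits of a 64-bit hash. */
static uint64_t hash48_low(uint64_t hash64)
{
    return hash64 & UINT64_C(0xFFFFFFFFFFFF);
}

/* Hypothetical helper: keep the high 48 bits instead by
 * shifting out the low 16 bits. */
static uint64_t hash48_high(uint64_t hash64)
{
    return hash64 >> 16;
}
```

For a well-mixed 64-bit hash, either selection should behave equally well; using `uint64_t` just avoids depending on the width of `unsigned long long`.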

If the 64-bit hash function is weak, then take it modulo the largest prime just under `pow(2,48)`. Some buckets will be lost: the 59 values from that prime up to `pow(2,48) - 1` are never produced. This will not harm a good hash, yet will certainly make weak hashes better.

```
unsigned long long LargestPrime48 = 281474976710597u; // FFFFFFFFFFC5
unsigned long long hash48 = hash64 % LargestPrime48;
```
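The modulo reduction can also be wrapped in a small function, sketched here with fixed-width types. The names `LARGEST_PRIME_48` and `hash48_mod` are hypothetical and chosen only for illustration; the constant is the same prime as above, 2^48 - 59.

```c
#include <stdint.h>

/* 2^48 - 59 = 281474976710597 = 0xFFFFFFFFFFC5 */
#define LARGEST_PRIME_48 UINT64_C(281474976710597)

/* Hypothetical helper: reduce a 64-bit hash to the range
 * [0, LARGEST_PRIME_48), which fits in 48 bits. Unlike masking,
 * the remainder depends on every bit of the input. */
static uint64_t hash48_mod(uint64_t hash64)
{
    return hash64 % LARGEST_PRIME_48;
}
```

Because the result depends on all 64 input bits, this matches the asker's wish that every bit have an effect, at the cost of a 64-bit division per hash.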