Tyler Gotto - 4 months ago 38

C Question

I am using the djb2 hash function for c, when I run a name through it I am getting hash numbers in the hundreds of thousands, I would like to get to be able to put this in a hash table using an array of a few thousand or something smaller at least inside a long. I am confused about how to get the function to give me smaller hashes while still having the integrity of the hash. Also I am confused about how to decide on the proper size of array to use for my hash table. Thank you in advance.

unsigned long hash(char* str)

{

unsigned long hash = 5381;

int c;

`for (int i = 0; i < strlen(str); ++i)`

{

c = (int) str[i];

hash = ((hash << 5) + hash) + c;

}

return hash;

}

Answer

Assuming that your version of `djb2`

returns an `unsigned long`

(call the return variable `foo`

, say), taking the modulus of that result modulo `n`

using the expression

`foo % n`

will constrain the result from `0`

to and including `n - 1`

. This ought to have similar desirable statistical properties to the original hash value, and ought to be superior to a result obtained by integer division.