Tyler Gotto - 1 year ago 134
C Question

Hash Function giving me extremely large numbers

I am using the djb2 hash function for c, when I run a name through it I am getting hash numbers in the hundreds of thousands, I would like to get to be able to put this in a hash table using an array of a few thousand or something smaller at least inside a long. I am confused about how to get the function to give me smaller hashes while still having the integrity of the hash. Also I am confused about how to decide on the proper size of array to use for my hash table. Thank you in advance.

unsigned long hash(char* str)
{
unsigned long hash = 5381;
int c;

``````    for (int i = 0; i < strlen(str); ++i)
{
c = (int) str[i];
hash = ((hash << 5) + hash) + c;
}
return hash;
}
``````

Assuming that your version of `djb2` returns an `unsigned long` (call the return variable `foo`, say), taking the modulus of that result modulo `n` using the expression
`foo % n`
will constrain the result from `0` to and including `n - 1`. This ought to have similar desirable statistical properties to the original hash value, and ought to be superior to a result obtained by integer division.