Java Question

How to Develop a Hash function for traffic license numbers?

Develop a hash function to generate an index value between 0-4999 inclusive for a given traffic license number. Your hash function should generate as few as possible collisions. Hash function should use the properties of license numbers. Hash method should take the license number as a single String and return an index value. We assume that the license numbers to be in the following format: City code is a number between 10 and 99 inclusive. Three letters are any letter combination from English alphabet with 26 chars. Two digits number is a number between 10 and 99 inclusive.

I wrote something about this question but, collisions are a lot (1800 for 5k)

static long printValue(String s) {
long result = 0;
for (int i = 0; i < s.length(); i++) {
result += Math.pow(27, MAX_LENGTH - i - 1) * (1 + s.charAt(i) - 'A');
}

result = result % 5009;


return (int) result;
}

public int hashF(String str) {

String a = str.substring(0, 2);
String b = str.substring(5, 7);
String middle = str.substring(2, 5);

int q = (int) printValue(middle);

String last = a + q + b;

int index = Integer.parseInt(last);
index = index % 5009;
return index;




}


Link for orjinal file of licence numbers.

These are some examples from file of traffic licence number. Collisions must be 300 (maximum).

65HNM25
93DTV23
94WPX23
31RKK46
15YXX90
31MDV74
45BOG99
65JRM50
77VXR55
39TKY41
80MJU73
63QYE57
38FCO80
45ORI16
17CHN73
70SXR63
87CVM74
27EEE85
32PFJ91
50PBA66
70TVK72
15YLS20
80MPM74
21ZRN20
36VVE84
58IDW24
77VDC89
19BVK93
28SUF63

Answer

Your problem is not your code, but mathematics. Even a (perfect for you, but not very useful) hash code that produces consecutive hashes that are then mod 5000, ie

10AAA10 -> 0
10AAA11 -> 1
... etc
99ZZZ99 -> 600 (90 * 26 * 26 * 26 * 90) % 5000

will statistically produce over 1800 collisions and is no better than the simplest implementation, which is to use String's hashCode:

 int hash = Math.abs(number.hashCode() % 5000);

It's a silly exercise, as it has no real world use.