Jane Wayne Jane Wayne - 2 months ago 10
Java Question

How does overriding equals and hashCode impacts storing data in a Map in Java with a bad hashing function?

I have a class,

Employee
, let's say, and my
hashCode
function for this class is really bad (let's say it always return a constant). My code looks like the following.

public class Employee {
private String name;

public Employee(String name) {
this.name = name;
}

@Override
public int hashCode() { return 1; }

@Override
public boolean equals(Object object) {
if(null == object || !(object instanceof Employee)) {
return false;
}
Employee other = (Employee)object;
return this.name.equals(other.name);
}
}


Let's say I want to use
Employee
as the key in a
Map
, and so I can do something like the following.

public static void main(String[] args) {
Map<Employee, Long> map = new HashMap<>();
for(int i=0; i < 1000; i++) {
map.put(new Employee("john"+i, 1L));
}
System.out.println(map.size());
}


How come when I run this code, I always get 1,000 as the size?

Using
Employee
as a key seems to be "good" in the following sense.


  • It is immutable

  • Two employees that are
    equals
    always generate the same hash code



What I expected was that since the output of
hashCode
is always 1, then
map.size()
should always be 1. But it is not. Why? If I have a
Map<Integer,Integer>
, and I do
map.put(1, 1)
followed by
map.put(1, 2)
, I would only expect the size to be 1.

The
equals
method must somehow be coming into play here, but I'm not sure how.

Any pointers are appreciated.

Answer

Your loop

for(int i=0; i < 1000; i++) {
    map.put(new Employee("john"+System.currentTimeMillis(), 1L));
}

executes within a couple of milliseconds, so System.currentTimeMillis() will be returning the same value for the vast majority of the iterations of your loop. So, several hundred of your johns will have the exact same name + number.

Then, we have java's retarded Map which does not have an add() method, (which one would reasonably expect to throw an exception if the item already exists,) but instead it only has a put() method which will either add or replace items, without failing. So, most of your johns get overwritten by subsequent johns, without any increase in the map size, and without any exception being thrown to give you a hint about what you are doing wrong.

Furthermore, you seem to be a bit confused as to exactly what the effect of a bad hashCode() function is on a map. A bad hashCode() simply results in collisions. Collisions in a hashmap do not cause items to be lost; they only cause the internal structure of the map to not be very efficient. Essentially, a constant hashCode() will result in a degenerate map which internally looks like a linked list. It will be inefficient both for insertions and for deletions, but no items will be lost due to that.

Items will be lost due to a bad equals() method, or due to overwriting them with newer items. (Which is the case in your code.)

Comments