Himanshu Himanshu - 12 days ago 8
Java Question

String IdentityHashMap vs HashMap performance

Identity HashMap is special implementation in java which compares the objects reference instead of equals and also uses identityHashCode instead of hashCode. In addition it uses

linear-probe hash table
instead of
Entry list
.

Map<String,String> map = new HashMap<String,String>();

Map<String,String> iMap = new IdentityHashMap<String,String>();


Does that mean for the String keys IdentifyHashMap will be usually faster if tune correctly ?

Added some basic code

public class Dictionary {

public static void main(String[] args) throws IOException {

BufferedReader br = new BufferedReader(new FileReader("/usr/share/dict/words"));

String line;
ArrayList<String> list = new ArrayList<String>();


int index=0;
while( (line = br.readLine()) != null){
list.add(line);
}
System.out.println("list.size() = " + list.size());
Map<String,Integer> iMap = new IdentityHashMap<String,Integer>(list.size());
Map<String,Integer> hashMap = new HashMap<>(list.size());

long iMapTime=0,hashMapTime=0;

long time=0;
for(int i=0; i< list.size(); i++){
time= System.currentTimeMillis();
iMap.put(list.get(i),i);
time = System.currentTimeMillis()-time;
iMapTime += time;
time= System.currentTimeMillis();
hashMap.put(list.get(i),i);
time = System.currentTimeMillis()-time;
hashMapTime += time;
}

System.out.println("iMapTime = " + iMapTime + " hashMapTime = " +hashMapTime);

}


}

Tried very basic performance check. I am reading dictionary words (235K) & pushing into the both maps. It prints below.

list.size() = 235886
iMapTime = 101 hashMapTime = 617


I think this is very good improvment to ignore, unless I am doing something wrong here.

Answer

How does IdentityHashMap<String,?> work?

To make IdentityHashMap<String,?> work for arbitrary strings, you'll have to String.intern() both the keys you put() and potential keys you pass to get(). (Or use an equivalent mechanism.)

Note: unlike stated in @m3th0dman's answer, you don't need to intern() the values.

Either way, interning a string ultimately requires looking it up in some kind of hash table of already interned strings. So unless you had to intern your strings for some other reason anyway (and thus already paid the cost), you won't get much of an actual performance boost out of this.

So why does the test show that you can?

Where your test is unrealistic is that you keep the exact list of keys you used with put() and you iterate across them one by one in list order. Note (the same could be achieved by inserting the elements into a LinkedHashMap and simply calling iterator() on its entry set.

What's the point of IdentityHashMap then?

There are scenarios where it is guaranteed (or practically guaranteed) that object identity is the same as equals(). Imagine trying to implement your own ThreadLocal class for example, you'll probably write something like this:

public final class ThreadLocal<T> {
   private final IdentityHashMap<Thread,T> valueMap;
   ...
   public T get() {
       return valueMap.get( Thread.currentThread() );
   }
}

Because you know that threads have no notion of equality beyond identity. Same goes if your map keys are enum values and so on.