AbstractConcurrentMap is a core class in Groovy, it is used for storing dynamic properties added to Groovy classes during runtime. I'm using Grails 2.1.2 with Groovy 1.8.8, but I think the issue is present in all Groovy versions (the linked source code is for Groovy version 2.4.3).
The problem happens in inner class Segment's put() method (line 105):
- When the current count is greater then map's threshold a rehash() happens. Now the tricky part is, that the Map holds soft references to objects and rehash() validates those references. So when GC discards soft references, the resulting segment doesn't expand (as it is assumed in the put() method).
- in the last line of rehash() the Segmen't internal counter is updated
(which is the number of "alive" undiscarded references and can be smaller than previous count as described above)
count = newCount
- after rehash() is done, the put() method continues, however the buggy part is, that it disregards the previous setting of the internal
and it sets the previous count+1 value in every case on lines 124, 143 and 159
So the following steps are happening:
- Map state:
threshold = 786432; count=786432
- New element is inserted into the map:
count = 786433; threshold = 786432
- since new count would be greater than threshold rehash() happens
- rehash() finds out, that most of the objects are garbage collected, thus it doesn't increase the size of the Segment, however anyway it copies all the objects from one table to another (System.arrayCopy()).
- rehash() sets the internal count to new value, which is smaller, because many objects were garbage collected (soft references), lets say:
count = 486 000
- The put() continues, disregards the
and sets the count to
count = 486 000
count = 786433
- Another element is inserted, however in this state, the count is still greater than threshold so the rehash happens again
- From now on every element added to map will trigger a rehash(), which has a huge performance impact.
When this happens in multithreaded environment, all the other threads are waiting (parked) for lock() until the rehash() and put() is done (and then the next one is doing the rehash() again). You can imagine what performance impact this is...
I don't understand how this bug could survive so many versions with nobody noticing despite the class is extensively used. Maybe I'm missing something ?
Update the c variable after rehash is done.
Between lines 105 and 106 add:
c = count + 1