Karussell Karussell - 2 months ago 5
Java Question

Concurrent byte array access in Java with as few locks as possible

I'm trying to reduce the memory usage for the lock objects of segmented data. See my questions here and here. Or just assume you have a byte array and every 16 bytes can (de)serialize into an object. Let us call this a "row" with row length of 16 bytes. Now if you modify such a row from a writer thread and read from multiple threads you need locking. And if you have a byte array size of 1MB (1024*1024) this means 65536 rows and the same number of locks.

This is a bit too much, also that I need much larger byte arrays, and I would like to reduce it to something roughly proportional to the number of threads. My idea was to create a

ConcurrentHashMap<Integer, LockHelper> concurrentMap;


where
Integer
is the row index and before a thread 'enters' a row it puts a lock object in this map (got this idea from this answer). But no matter what I think through I cannot find an approach that is really thread-safe:

// somewhere else where we need to write or read the row
LockHelper lock1 = new LockHelper();
LockHelper lock = concurrentMap.putIfAbsent(rowIndex, lock1);
lock.addWaitingThread(); // is too late
synchronized(lock) {
try {
// read or write row at rowIndex e.g. writing like
bytes[rowIndex/16] = 1;
bytes[rowIndex/16 + 1] = 2;
// ...
} finally {
if(lock.noThreadsWaiting())
concurrentMap.remove(rowIndex);
}
}


Do you see a possibility to make this thread-safe?

I have the feeling that this will look very similar like the
concurrentMap.compute
contstruct (e.g. see this answer) or could I even utilize this method?

map.compute(rowIndex, (key, value) -> {
if(value == null)
value = new Object();
synchronized (value) {
// access row
return value;
}
});
map.remove(rowIndex);


Is the value and the 'synchronized' necessary at all as we already know the compute operation is atomically?

// null is forbidden so use the key also as the value to avoid creating additional objects
ConcurrentHashMap<Integer, Integer> map = ...;

// now the row access looks really simple:
map.compute(rowIndex, (key, value) -> {
// access row
return key;
});
map.remove(rowIndex);


BTW: Since when we have this compute in Java. Since 1.8? Cannot find this in the JavaDocs

Update: I found a very similar question here with userIds instead rowIndices, note that the question contains an example with several problems like missing
final
, calling
lock
inside the
try-finally
-clause and lack of shrinking the map. Also there seems to be a library JKeyLockManager for this purpose but I don't think it is thread-safe.

Update 2: The solution seem to be really simple as Nicolas Filotto pointed out how to avoid the removal:

map.compute(rowIndex, (key, value) -> {
// access row
return null;
});


So this is really less memory intense BUT the simple segment locking with
synchronized
is at least 50% faster in my scenario.

Answer

Is the value and the synchronized necessary at all as we already know the compute operation is atomically?

I confirm that it is not needed to add a synchronized block in this case as the compute method is done atomically as stated in the Javadoc of ConcurrentHashMap#compute(K key, BiFunction<? super K,? super V,? extends V> remappingFunction) that has been added with BiFunction since Java 8, I quote:

Attempts to compute a mapping for the specified key and its current mapped value (or null if there is no current mapping). The entire method invocation is performed atomically. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this Map.

What you try to achieve with the compute method could be totally atomic if you make your BiFunction always returns null to remove the key atomically too such that everything will be done atomically.

map.compute(
    rowIndex, 
    (key, value) -> {
        // access row here
        return null;
    }
);

This way you will then fully rely on the locking mechanism of a ConcurrentHashMap to synchronize your accesses to your rows.