kms333 kms333 - 2 months ago 10
Java Question

Java thread executing remainder operation blocks all other threads

The following code snippet executes two threads, one is a simple timer logging every second, the second is an infinite loop that executes a remainder operation:

public class TestBlockingThread {
private static final Logger LOGGER = LoggerFactory.getLogger(TestBlockingThread.class);

public static final void main(String[] args) throws InterruptedException {
Runnable task = () -> {
int i = 0;
while (true) {
i++;
if (i != 0) {
boolean b = 1 % i == 0;
}
}
};

new Thread(new LogTimer()).start();
Thread.sleep(2000);
new Thread(task).start();
}

public static class LogTimer implements Runnable {
@Override
public void run() {
while (true) {
long start = System.currentTimeMillis();
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
// do nothing
}
LOGGER.info("timeElapsed={}", System.currentTimeMillis() - start);
}
}
}
}


This gives the following result:

[Thread-0] INFO c.m.c.concurrent.TestBlockingThread - timeElapsed=1004
[Thread-0] INFO c.m.c.concurrent.TestBlockingThread - timeElapsed=1003
[Thread-0] INFO c.m.c.concurrent.TestBlockingThread - timeElapsed=13331
[Thread-0] INFO c.m.c.concurrent.TestBlockingThread - timeElapsed=1006
[Thread-0] INFO c.m.c.concurrent.TestBlockingThread - timeElapsed=1003
[Thread-0] INFO c.m.c.concurrent.TestBlockingThread - timeElapsed=1004
[Thread-0] INFO c.m.c.concurrent.TestBlockingThread - timeElapsed=1004


I don't understand why the infinite task blocks all other threads for 13.3 seconds. I tried to change thread priorities and other settings, nothing worked.

If you have any suggestions to fix this (including tweaking OS context switching settings) please let me know.

Answer

After all the explanations here (thanks to Peter Lawrey) we found that the main source of this pause is that safepoint inside the loop is reached rather rarely so it takes a long time to stop all threads for JIT-compiled code replacement.

But I decided to go deeper and find why safepoint is reached rarely. I found it a bit confusing why the back jump of while loop is not "safe" in this case.

So I summon -XX:+PrintAssembly in all its glory to help

-XX:+UnlockDiagnosticVMOptions \
-XX:+TraceClassLoading \
-XX:+DebugNonSafepoints \
-XX:+PrintCompilation \
-XX:+PrintGCDetails \
-XX:+PrintStubCode \
-XX:+PrintAssembly \
-XX:PrintAssemblyOptions=-Mintel

After some investigation I found that after third recompilation of lambda C2 compiler threw away safepoint polls inside loop completely.

UPDATE

During the profiling stage variable i was never seen equal to 0. That's why C2 speculatively optimized this branch away, so that the loop was transformed to something like

for (int i = OSR_value; i != 0; i++) {
    if (1 % i == 0) {
        uncommon_trap();
    }
}
uncommon_trap();

Note that originally infinite loop was reshaped to a regular finite loop with a counter! Due to JIT optimization to eliminate safepoint polls in finite counted loops, there was no safepoint poll in this loop either.

After some time, i wrapped back to 0, and the uncommon trap was taken. The method was deoptimized and continued execution in the interpreter. During recompilation with a new knowledge C2 recognized the infinite loop and gave up compilation. The rest of the method proceeded in the interpreter with proper safepoints.

There is a great must-read blog post "Safepoints: Meaning, Side Effects and Overheads" by Nitsan Wakart covering safepoints and this particular issue.

Safepoint elimination in very long counted loops is known to be a problem. The bug JDK-5014723 (thanks to Vladimir Ivanov) addresses this problem.

The workaround is available until the bug is finally fixed.
You can try using -XX:+UseCountedLoopSafepoints (it will cause overall performance penalty and may lead to JVM crash JDK-8161147). After using it C2 compiler continue keeping safepoints at the back jumps and original pause disappears completely.

Or you can rewrite your code by adding safepoint manually. For example Thread.yield() call at the end of cycle will also fix pause.

Comments