mlecz mlecz - 2 months ago 6
Java Question

java code execution yields to different results in debug without breakpoints and normal run. Is ExecutorService broken?

TL:DR

ExecutorService executorService = Executors.newFixedThreadPool(8);

in debug runs concurrent, but in normal runtime it starts concurrent, but later runs in single thread.

I have some code where I start 4 different tasks in
ExecutorService
. Two of those tasks should finish almost instantly, the other two should run for a while.

Those tasks return execution time in seconds in
Future<Double>
.

This code is responsible for task execution and measurement:

public Future<Double> measure(int[] arr, ProcessIntArray processIntArray, ExecutorService es) {
Callable<Double> task = () -> {
long start = System.nanoTime();
processIntArray.process(arr);
long end = System.nanoTime();
return (end - start) / 1000000000.0;
};
return es.submit(task);
}


Later on, after starting those tasks, I print them in the order of execution time from previous runs for the same input size.

Future<Double> bubbleSortTime = measure(bubbleSortArray, Solution::bubbleSort, executorService);
Future<Double> insertionSortTime = measure(insertionSortArray, Solution::insertionSort, executorService);
Future<Double> quickSortTime = measure(quickSortArray, Solution::quickSort, executorService);
Future<Double> mergeSortTime = measure(mergeSortArray, Solution::mergeSort, executorService);


System.out.println();
System.out.println("array size: " + size);
System.out.println("quick sort: " + quickSortTime.get() + "s");
System.out.println("merge sort: " + mergeSortTime.get() + "s");
System.out.println("insertion sort: " + insertionSortTime.get() + "s");
System.out.println("bubble sort: " + bubbleSortTime.get() + "s");


When I run code in debug mode, 2 results are printed immidiately, and I have to wait a while for 3rd result (I do not bother waiting for the 4th one).

After starting in debug (correct and expected):

array size: 1000000
quick sort: 0.186892839s
merge sort: 0.291950604s
insertion sort: 344.534256723s


Normal execution differs, It looks like in method measure
long start = System.nanoTime();
is executed, then thread falls asleep, and after insertionSort is done, quickSort is back in execution, which yields to output:

array size: 1000000
quick sort: 345.893922141s
merge sort: 345.944023095s
insertion sort: 345.871908569s


Which is wrong. All of those threads should be running at the same time, as from newFixedThreadPool javadoc.

/**
* Creates a thread pool that reuses a fixed number of threads
* operating off a shared unbounded queue. At any point, at most
* {@code nThreads} threads will be active processing tasks.
* If additional tasks are submitted when all threads are active,
* they will wait in the queue until a thread is available.
* If any thread terminates due to a failure during execution
* prior to shutdown, a new one will take its place if needed to
* execute subsequent tasks. The threads in the pool will exist
* until it is explicitly {@link ExecutorService#shutdown shutdown}.
*
* @param nThreads the number of threads in the pool
* @return the newly created thread pool
* @throws IllegalArgumentException if {@code nThreads <= 0}
*/
public static ExecutorService newFixedThreadPool(int nThreads)


I attach source code:

import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.*;

class ThreadedSortingComparsion {
Random random = new Random(System.currentTimeMillis());

void popul(int[] array) {
for (int i = 0; i < array.length; i++) {
array[i] = random.nextInt();
}
}

interface ArraySorter {
void sort(int[] array);
}

public Future<Double> measureTime(int[] array, ArraySorter arraySorter, ExecutorService executorService) {
Callable<Double> task = () -> {
long start = System.nanoTime();
arraySorter.sort(array);
long end = System.nanoTime();
return (end - start) / 1000000000.0;
};
return executorService.submit(task);
}


public void start() throws ExecutionException, InterruptedException {
ExecutorService executorService = Executors.newFixedThreadPool(8);

int size = 1000 * 1000;
int[] quickSortArray = new int[size];
popul(quickSortArray);
int[] bubbleSortArray = Arrays.copyOf(quickSortArray, quickSortArray.length);
int[] mergeSortArray = Arrays.copyOf(quickSortArray, quickSortArray.length);
int[] originalArray = Arrays.copyOf(quickSortArray, quickSortArray.length);
int[] insertionSortArray = Arrays.copyOf(quickSortArray, quickSortArray.length);

Future<Double> bubbleSortTime = measureTime(bubbleSortArray, ThreadedSortingComparsion::bubbleSort, executorService);
Future<Double> insertionSortTime = measureTime(insertionSortArray, ThreadedSortingComparsion::insertionSort, executorService);
Future<Double> quickSortTime = measureTime(quickSortArray, ThreadedSortingComparsion::quickSort, executorService);
Future<Double> mergeSortTime = measureTime(mergeSortArray, ThreadedSortingComparsion::mergeSort, executorService);


System.out.println();
System.out.println("array size: " + size);
System.out.println("quick sort: " + quickSortTime.get() + "s");
System.out.println("merge sort: " + mergeSortTime.get() + "s");
System.out.println("insertion sort: " + insertionSortTime.get() + "s");
System.out.println("bubble sort: " + bubbleSortTime.get() + "s");
executorService.shutdown();

for (int i = 0; i < quickSortArray.length; i++) {
if (quickSortArray[i] != bubbleSortArray[i] || quickSortArray[i] != mergeSortArray[i] || quickSortArray[i] != insertionSortArray[i]) {
throw new RuntimeException(Arrays.toString(originalArray));
}
}
}


public static void mergeSort(int[] ar) {
if (ar.length < 5) {
bubbleSort(ar);
return;
}
int middle = ar.length / 2;
int[] arrayLeft = new int[middle];
int[] arrayRight = new int[ar.length - middle];
for (int i = 0; i < ar.length; i++) {
if (i < middle) {
arrayLeft[i] = ar[i];
} else {
arrayRight[i - middle] = ar[i];
}
}
mergeSort(arrayLeft);
mergeSort(arrayRight);
int indexLeft = 0;
int indexRight = 0;
int inputArrayIndex = 0;
while (true) {
int whatToPutInAR = 0;
if (indexLeft != arrayLeft.length && indexRight != arrayRight.length) {
if (arrayLeft[indexLeft] < arrayRight[indexRight]) {
whatToPutInAR = arrayLeft[indexLeft];
indexLeft++;
} else {
whatToPutInAR = arrayRight[indexRight];
indexRight++;
}
} else if (indexLeft != arrayLeft.length) {
whatToPutInAR = arrayLeft[indexLeft];
indexLeft++;
} else if (indexRight != arrayRight.length) {
whatToPutInAR = arrayRight[indexRight];
indexRight++;
}
if (inputArrayIndex == ar.length) return;
ar[inputArrayIndex++] = whatToPutInAR;
}
}

private static void quickSort(int[] ar) {
quickSort(ar, 0, ar.length);
}

static public void quickSort(int[] array, int start, int end) {
boolean changed = false;
if (end == 0) return;
int pivot = array[end - 1];
int partitionCandidate = start;
for (int i = start; i < end; i++) {
if (array[i] < pivot) {
swap(array, partitionCandidate++, i);
changed = true;
} else if (pivot < array[i]) {
swap(array, end - 1, i);
changed = true;
}
}
if (start < partitionCandidate) {
quickSort(array, start, partitionCandidate);
}
if (partitionCandidate < end) {
if (partitionCandidate != start || changed) quickSort(array, partitionCandidate, end);
}
}


public static void swap(int[] ar, int from, int to) {
int old = ar[from];
ar[from] = ar[to];
ar[to] = old;
}

public static void bubbleSort(int[] array) {
for (int i = 0; i < array.length; i++) {
for (int j = 0; j < array.length - 1; j++) {
if (array[j] > array[j + 1]) {
swap(array, j + 1, j);
}
}
}
}

private static void insertionSort(int[] ar) {
for (int i = 0; i < ar.length; i++) {
for (int j = i; j >= 1; j--) {
boolean breaker = true;
if (ar[j] < ar[j - 1]) {
breaker = false;
swap(ar, j - 1, j);
}
if (breaker) break;
}
}
}

public static void main(String[] args) throws ExecutionException, InterruptedException {
ThreadedSortingComparsion s = new ThreadedSortingComparsion();
s.start();
}
}


Edit:
when I run it in Ideone, the code works fine.
http://ideone.com/1E8C51
Ideone has java version
1.8.0_51

I tested in on
1.8.0_91
,
1.8.0_92
. And
1.8.0_45
.
Why does it work on ideone, but not on 2 other PC's I tested?

When I do thread dump, while executing not in the debug, then I wait a while, and after thread dump is printed, also results are printed. So the thread dump is done after finishing insertion sort.

"C:\Program Files\Java\jdk1.8.0_45\bin\java" -Xmx8G -Xss1G -Didea.launcher.port=7533 "-Didea.launcher.bin.path=C:\Program Files (x86)\JetBrains\IntelliJ IDEA Community Edition 2016.1.3\bin" -Dfile.encoding=UTF-8 -classpath "C:\Program Files\Java\jdk1.8.0_45\jre\lib\charsets.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\deploy.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\ext\access-bridge-64.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\ext\cldrdata.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\ext\dnsns.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\ext\jaccess.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\ext\jfxrt.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\ext\localedata.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\ext\nashorn.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\ext\sunec.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\ext\sunjce_provider.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\ext\sunmscapi.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\ext\sunpkcs11.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\ext\zipfs.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\javaws.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\jce.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\jfr.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\jfxswt.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\jsse.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\management-agent.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\plugin.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\resources.jar;C:\Program Files\Java\jdk1.8.0_45\jre\lib\rt.jar;C:\Users\Tomasz_Mielczarski\IdeaProjects\untitled\out\production\untitled;C:\Program Files (x86)\JetBrains\IntelliJ IDEA Community Edition 2016.1.3\lib\idea_rt.jar" com.intellij.rt.execution.application.AppMain ThreadedSortingComparsion

array size: 1000000
2016-07-15 13:45:22
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.45-b02 mixed mode):

"pool-1-thread-4" #15 prio=5 os_prio=0 tid=0x00000000696bd000 nid=0x560 runnable [0x00000002fffee000]
java.lang.Thread.State: RUNNABLE
at ThreadedSortingComparsion.mergeSort(ThreadedSortingComparsion.java:77)
at ThreadedSortingComparsion.mergeSort(ThreadedSortingComparsion.java:78)
at ThreadedSortingComparsion.mergeSort(ThreadedSortingComparsion.java:78)
at ThreadedSortingComparsion.mergeSort(ThreadedSortingComparsion.java:78)
at ThreadedSortingComparsion.mergeSort(ThreadedSortingComparsion.java:77)
at ThreadedSortingComparsion$$Lambda$5/81628611.sort(Unknown Source)
at ThreadedSortingComparsion.lambda$measureTime$0(ThreadedSortingComparsion.java:21)
at ThreadedSortingComparsion$$Lambda$2/1023892928.call(Unknown Source)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

"pool-1-thread-3" #14 prio=5 os_prio=0 tid=0x00000000696bb800 nid=0x2634 runnable [0x00000002bffee000]
java.lang.Thread.State: RUNNABLE
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:126)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:126)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:126)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:126)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:126)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:126)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:126)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:126)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:126)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:123)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:126)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:126)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:126)
at ThreadedSortingComparsion.quickSort(ThreadedSortingComparsion.java:105)
at ThreadedSortingComparsion$$Lambda$4/1989780873.sort(Unknown Source)
at ThreadedSortingComparsion.lambda$measureTime$0(ThreadedSortingComparsion.java:21)
at ThreadedSortingComparsion$$Lambda$2/1023892928.call(Unknown Source)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

"pool-1-thread-2" #13 prio=5 os_prio=0 tid=0x00000000696b7800 nid=0x1c70 waiting on condition [0x000000027ffef000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000719d72480> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

"pool-1-thread-1" #12 prio=5 os_prio=0 tid=0x00000000696b6800 nid=0x478 runnable [0x000000023ffee000]
java.lang.Thread.State: RUNNABLE
at ThreadedSortingComparsion.bubbleSort(ThreadedSortingComparsion.java:139)
at ThreadedSortingComparsion$$Lambda$1/990368553.sort(Unknown Source)
at ThreadedSortingComparsion.lambda$measureTime$0(ThreadedSortingComparsion.java:21)
at ThreadedSortingComparsion$$Lambda$2/1023892928.call(Unknown Source)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

"Monitor Ctrl-Break" #11 daemon prio=5 os_prio=0 tid=0x0000000068d3d000 nid=0x2f3c runnable [0x00000001fffee000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:170)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
- locked <0x00000007156892b8> (a java.io.InputStreamReader)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:161)
at java.io.BufferedReader.readLine(BufferedReader.java:324)
- locked <0x00000007156892b8> (a java.io.InputStreamReader)
at java.io.BufferedReader.readLine(BufferedReader.java:389)
at com.intellij.rt.execution.application.AppMain$1.run(AppMain.java:93)
at java.lang.Thread.run(Thread.java:745)

"Service Thread" #10 daemon prio=9 os_prio=0 tid=0x0000000068c81000 nid=0x2d6c runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"C1 CompilerThread3" #9 daemon prio=9 os_prio=2 tid=0x0000000068bea800 nid=0x1ad0 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"C2 CompilerThread2" #8 daemon prio=9 os_prio=2 tid=0x0000000068be4000 nid=0x17d0 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" #7 daemon prio=9 os_prio=2 tid=0x0000000068bdd800 nid=0x3238 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #6 daemon prio=9 os_prio=2 tid=0x0000000068bda000 nid=0x1824 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"Attach Listener" #5 daemon prio=5 os_prio=2 tid=0x0000000068bd8800 nid=0x910 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" #4 daemon prio=9 os_prio=2 tid=0x0000000068bd7800 nid=0x31f8 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"Finalizer" #3 daemon prio=8 os_prio=1 tid=0x0000000043229800 nid=0x2810 in Object.wait() [0x00000000fffee000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x0000000719d707e0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked <0x0000000719d707e0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

"Reference Handler" #2 daemon prio=10 os_prio=2 tid=0x0000000043223000 nid=0xd48 in Object.wait() [0x00000000bffef000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x0000000719d78370> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:157)
- locked <0x0000000719d78370> (a java.lang.ref.Reference$Lock)

"main" #1 prio=5 os_prio=0 tid=0x000000000311d800 nid=0x2ed0 waiting on condition [0x000000004311e000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000719d58fe0> (a java.util.concurrent.FutureTask)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at ThreadedSortingComparsion.start(ThreadedSortingComparsion.java:48)
at ThreadedSortingComparsion.main(ThreadedSortingComparsion.java:162)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)

"VM Thread" os_prio=2 tid=0x0000000056348800 nid=0x2984 runnable

"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x0000000043147000 nid=0x27e0 runnable

"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x0000000043148800 nid=0x20b4 runnable

"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x000000004314a000 nid=0x1da4 runnable

"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x000000004314c000 nid=0x29e0 runnable

"GC task thread#4 (ParallelGC)" os_prio=0 tid=0x000000004314e000 nid=0xa04 runnable

"GC task thread#5 (ParallelGC)" os_prio=0 tid=0x0000000043150000 nid=0x14b8 runnable

"GC task thread#6 (ParallelGC)" os_prio=0 tid=0x0000000043153800 nid=0xf00 runnable

"GC task thread#7 (ParallelGC)" os_prio=0 tid=0x0000000043154800 nid=0x243c runnable

"VM Periodic Task Thread" os_prio=2 tid=0x0000000068c82000 nid=0x22d8 waiting on condition

JNI global references: 224

Heap
PSYoungGen total 76288K, used 13755K [0x0000000715580000, 0x000000071aa80000, 0x00000007c0000000)
eden space 65536K, 4% used [0x0000000715580000,0x0000000715874910,0x0000000719580000)
from space 10752K, 99% used [0x0000000719580000,0x0000000719ffa348,0x000000071a000000)
to space 10752K, 0% used [0x000000071a000000,0x000000071a000000,0x000000071aa80000)
ParOldGen total 175104K, used 33211K [0x00000005c0000000, 0x00000005cab00000, 0x0000000715580000)
object space 175104K, 18% used [0x00000005c0000000,0x00000005c206ed30,0x00000005cab00000)
Metaspace used 4277K, capacity 4790K, committed 4992K, reserved 1056768K
class space used 484K, capacity 535K, committed 640K, reserved 1048576K

quick sort: 355.579434803s
merge sort: 355.629940032s
insertion sort: 355.532578023s

Answer

TL;DR HotSpot JIT optimization to eliminate safepoint checks in counted loops played a sick joke.

This is a very interesting question: a simple Java test reveals a non-trivial problem deep inside JVM.

The fact that a thread dump hasn't appeared immediately indicates that the problem is not in Java code, but is somehow related to JVM. Thread dumps are printed at safepoints. The delay means that VM wasn't able to reach a safepoint in short time.

Background

Certain VM operations (GC, Deoptimization, Thread dump and some other) are executed at stop-the-world pauses when no Java threads are running. But Java threads could not be stopped at any arbitrary point, they can pause only at certain places called safepoints. In JIT-compiled code the safepoints are usually placed at method exits and at backward branches, i.e. inside loops.

The safepoint checks are relatively cheap in terms of perfomance, but not free. That's why JIT compiler tries to reduce number of safepoints where possible. One such optimization is to eliminate safepoint checks in counted loops, i.e. loops with an integer counter that are known to have finite number of iterations.

Verify the theory

Let's get back to our test and check if safepoints are reached in time.

Add -XX:+SafepointTimeout -XX:SafepointTimeoutDelay=1000 JVM options. This should print a debug message whenever VM fails to reach a safepoint in 1000 ms.

# SafepointSynchronize::begin: Timeout detected:
# SafepointSynchronize::begin: Timed out while spinning to reach a safepoint.
# SafepointSynchronize::begin: Threads which did not reach the safepoint:
# "pool-1-thread-2" #12 prio=5 os_prio=0 tid=0x0000000019004800 nid=0x1480 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

# SafepointSynchronize::begin: (End of list)

Yes, it printed that the thread pool-1-thread-2 could not stop in 1000 ms. This is the second thread of the Executor's pool, which should run insertionSort algorithm.

insertionSort has two very long nested counted loops, and it looks like JVM has eliminated safepoint checks inside them. So, if this method runs in compiled mode, JVM cannot stop it until the method completes. If a stop-the-world pause is requested while the method is running, all other threads will also wait.

What to do?

This issue has been known for a long time. Here is the related JVM bug: JDK-5014723. It's not a high-priority one, since the problem rarely appears in real-life applications.

A new JVM flag appeared in JDK 8u92 to work around the issue.
-XX:+UseCountedLoopSafepoints will always put safepoint checks inside loops.

Another solution would be to transform a long counted loop into generic one by modifying a counter variable inside the loop.

E.g. if you replace if (breaker) break; with if (breaker) j = 0; the problem will also disappear.

Why does it work in debug mode then?

When JVM is started with the debugger on, some JIT optimizations are disabled to make the debug info available. In this case the compiled code has all the safepoint checks.

Comments