bsiamionau bsiamionau - 3 months ago 12
Java Question

Is LinkedList really faster than ArrayList in the case of insertion in the middle of list?

- What is the difference between

LinkedList
and
ArrayList
? When is it preferable to use a
LinkedList
?


I think every Java developer has heard this question at interview at least once.

- Linked list is preferable if you want to be able to insert items in the middle of the list.

It is a common answer to this question. Everybody knows it. Every time you ask a question about the difference between List implementations you get such answers as:


When should I use LinkedList? When do you need efficient removal in between elements or at the start?


From here


Forgot to mention insertion costs. In a LinkedList, once you have the correct position, insertion costs
O(1)
, while in an ArrayList it goes up to
O(n)
- all elements past the insertion point must be moved.


From here


Linked lists are preferable over arrays when you want to be able to insert items in the middle of the list (such as a priority queue).


From here


ArrayList is slower because it needs to copy part of the array in order to remove the slot that has become free. LinkedList just has to manipulate a couple of references.


From here

And more...

But have you ever tried to reproduce it by yourself? I tried yesterday and got these results:

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class Test {
public static void main(String... args) {
final int MAX_VAL = 10000;
List<Integer> linkedList = new LinkedList<Integer>();
List<Integer> arrayList = new ArrayList<Integer>();
for(int i = 0; i < MAX_VAL; i++) {
linkedList.add(i);
arrayList.add(i);
}
long time = System.nanoTime();
for(int i = 0; i < MAX_VAL; i++) {
linkedList.add(MAX_VAL/2, i);
}
System.out.println("LL time: " + (System.nanoTime() - time));
time = System.nanoTime();
for(int i = 0; i < MAX_VAL; i++) {
arrayList.add(MAX_VAL/2, i);
}
System.out.println("AL time: " + (System.nanoTime() - time));
}
}


Output:


LL time: 114098106

AL time: 24121889


So what is it? Why does LinkedList suck so much? Maybe we should try the remove operation instead of add? Ok, let's try:

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class Test {
public static void main(String... args) {
final int MAX_VAL = 10000;
List<Integer> linkedList = new LinkedList<Integer>();
List<Integer> arrayList = new ArrayList<Integer>();
for(int i = 0; i < MAX_VAL; i++) {
linkedList.add(i);
arrayList.add(i);
}
long time = System.nanoTime();
for(int i = 0; i < MAX_VAL/2; i++) {
linkedList.remove(MAX_VAL/2);
}
System.out.println("LL time: " + (System.nanoTime() - time));
time = System.nanoTime();
for(int i = 0; i < MAX_VAL/2; i++) {
arrayList.remove(MAX_VAL/2);
}
System.out.println("AL time: " + (System.nanoTime() - time));
}
}


Output:


LL time: 27581163

AL time: 3103051


Oh, ArrayList is still faster than LinkedList. What is the reason? Was this myth busted? Or maybe I'm wrong?

enter image description here

Answer

BUSTED

Not really. Here

for(int i = 0; i < MAX_VAL; i++) {
    linkedList.add(MAX_VAL/2, i);
}

you don't just insert the item; you pay the cost of iterating from the beginning to i every time. Naturally, that's O(i).

On the other hand, the list must be quite large before you'll actually witness the performance benefit of mid-list insertion. System.arraycopy is a blazing-fast operation and, on the other end, each insertion into a LinkedList requires the allocation of a node instance.

In summary, ArrayList is a better choice for 99% or more of real-world cases, and leveraging the narrow advantage of a LinkedList requires great care.

General notes on microbenchmarking the JVM

I should also warn you that your benchmarking code is badly deficient. There is quite a sizable checklist of things to watch out for when microbencharking on the JVM, for example:

  • always warm up the code to let the JIT compiler get to it;
  • be very careful about interpreting nanoTime results due to accuracy/precision issues. Make the reading grow at least into milliseconds (millions of nanoseconds) to ensure reliability;
  • control the spurious side-effects of the Garbage Collector;
  • etc.

Therefore the advice is to use a ready-made microbenchmarking framework such as OpenJDK's jmh.

Comments