ouphi ouphi - 1 month ago 8
C Question

CPU Cache disadvantages of using linked lists in C

I was wondering what were the advantages and disadvantages of linked-list compared to contiguous arrays in C. Therefore I read a wikipedia article about linked-lists.
https://en.wikipedia.org/wiki/Linked_list#Disadvantages

According to this article, the disadvantages are the following:

• They use more memory than arrays because of the storage used by their pointers
• Nodes in a linked list must be read in order from the beginning as linked lists
are inherently sequential access.
• Difficulties arise in linked lists when it comes to reverse traversing.
For instance, singly linked lists are cumbersome to navigate backwards and while
doubly linked lists are somewhat easier to read, memory is wasted in allocating.

• Nodes are stored incontiguously, greatly increasing the time required to access
individual elements within the list, especially with a CPU cache.


I understand the first 3 points but I am having a hard time with the last one:

Nodes are stored incontiguously, greatly increasing the time required to access individual elements within the list, especially with a CPU cache.

The article about CPU Cache does not mention anything about non contiguous memory arrays. As far as I know CPU Caches just caches frequently used adresses for a total 10^-6 cache miss.

Therefore, I do not understand why the CPU cache should be less efficient when it comes to non contiguous memory arrays.

Answer

CPU caches actually do two things.

The one you mentioned is caching recently used memory.

The other however is predicting which memory is going to be used in near future. The algorithm is usually quite simple - it assumes that the program processes big array of data and whenever it accesses some memory it will prefetch few more bytes behind.

This doesn't work for linked list as the nodes are randomly placed in memory.

Additionally, the CPU loads bigger blocks of memory (64, 128 bytes). Again, for the int64 array with single read it has data for processing 8 or 16 elements. For linked list it reads one block and the rest may be wasted as the next node can be in completely different chunk of memory.

And last but not least, related to previous section - linked list takes more memory for its management, the most simple version will take at least additional sizeof(pointer) bytes for the pointer to the next node. But it's not so much about CPU cache anymore.