Comment by pushrax

5 years ago

If your primary operation is inserting at a random location in the list, linked lists are faster than arrays at large sizes. You avoid having to move all the memory after the index you are modifying (to make space for the inserted element).

A linked list also avoids copying the entire array when you need to insert more elements than you have allocated space for.

> If your primary operation is inserting at a random location in the list, linked lists are faster than arrays at large sizes. You avoid having to move all the memory after the index you are modifying (to make space for the inserted element).

This is false in practice. Big O notation says it should be true, and you'll get marked wrong if you say arrays are faster in your algorithms & data structures final, but when you're running on actual hardware the array is faster at all sizes of n, and as n becomes larger, so does the gap in performance.
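This is easy to sanity-check yourself. Here's a rough Python sketch (function names are mine, and Python's list is a dynamic array of object pointers, so timings only approximate the C++ picture): both builders perform the same sequence of random-position insertions, one into a contiguous array and one into a singly linked list that must walk to the insertion point.

```python
import random
import time

class Node:
    __slots__ = ("value", "next")
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def array_random_inserts(n):
    # Contiguous array: insert shifts everything after the index.
    arr = []
    for i in range(n):
        arr.insert(random.randrange(len(arr) + 1), i)
    return arr

def list_random_inserts(n):
    # Singly linked list with a dummy head: walk to the index, then relink.
    head = Node(None)
    size = 0
    for i in range(n):
        node = head
        for _ in range(random.randrange(size + 1)):
            node = node.next
        node.next = Node(i, node.next)
        size += 1
    return head

if __name__ == "__main__":
    n = 2_000
    for name, fn in [("array", array_random_inserts), ("list", list_random_inserts)]:
        start = time.perf_counter()
        fn(n)
        print(f"{name}: {time.perf_counter() - start:.4f}s")
```

On typical hardware the array version tends to win, consistent with the talks linked above, but measure on your own machine before trusting anyone's graphs.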

Here is a talk[0] by Bjarne Stroustrup (Creator of C++) that even includes imaginary graphs demonstrating this phenomenon. If you want a visual for what the missing graph was supposed to look like, here's a similar one.[1]

Here's another video[2] by Scott Meyers (Author of Effective C++) that goes into more detail about why this happens.

[0]: https://www.youtube.com/watch?v=YQs6IC-vgmo

[1]: https://airspeedvelocity.files.wordpress.com/2015/08/pasted_...

[2]: https://www.youtube.com/watch?v=WDIkqP4JbkE

  • Summary of the Stroustrup video: inserting or removing an item at a point in an array of 100k items needs a linear scan from the start of the data structure to get to the point. An array does that very quickly; a linked list is much slower, with lots of random memory indirection and a pointer for every list node blowing out the cache. In practice this linear scan to the change point dominates the runtime, and arrays come out much faster.

    While the array change does need on average 50k items to be shuffled up to make room or close the gap, modern caches are very good at that.

    If the array is sorted it can be binary searched to get to the change point, which improves its performance even more, linked lists can’t do that.
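The binary-search point above can be sketched with Python's standard bisect module (values illustrative): the change point in a sorted array is found in O(log n) comparisons, where a linked list would have to walk node by node.

```python
import bisect

# A sorted array supports binary search for the insertion point;
# a linked list cannot, because it has no O(1) random access.
sorted_arr = [10, 20, 30, 40, 50]
pos = bisect.bisect_left(sorted_arr, 35)  # O(log n) search for the spot
bisect.insort(sorted_arr, 35)             # search, then shift the tail
```

After this, `pos` is 3 and the array is `[10, 20, 30, 35, 40, 50]`; the shift of the tail is the memmove-style copy that modern caches handle well.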

    Interesting.

    • I can definitely see array scanning being faster than pointer chasing on a modern CPU, but I wouldn't have expected that advantage to survive an insertion that moves 50k elements. Wow!

      And if you're not doing ordered insertion you wouldn't have to move the data in the array anyway; you would keep track of the size and jump to the end. So I'm not sure I understand the binary search comment.

      The next question is at what level of growth the waste of empty space in the array becomes too much. Some kind of data structure (a tree or linked list) with largish chunks (whatever size is applicable for a modern CPU), as probably mentioned in other comments, does seem like the most versatile approach while keeping the performance. Or perhaps the handling of that data structure might overwhelm the array's advantage?

      3 replies →

  • I think it's also worth noting that this is specific to (common) CPU architecture where cache is sufficiently big and orders of magnitude faster than main memory, and where main memory reads and writes have similar cost.

    The tradeoff in moving memory at the end of the array vs scanning to the point of linked list insertion could be different on some existing and some hypothetical systems where these concepts may be relevant. For example, storage write cycle minimization could be a priority in rare cases.

    That said this is an excellent point about the majority of use cases. I hadn't considered the dominance of the time it takes to locate the element to modify. I've never had a reason to benchmark a linked list since even a basic idea of the effect of cache/prefetch precludes its use in every use case I've actually come across.

    Probably useful for CS education to start comparing linked lists to bubble sort. There's always a better algorithm, but here's a simple one to learn so you understand how to compare data structure performance.

  • Doesn’t this depend on the size of each element? What if each element is 1 MiB?

    • Are you reading the whole 1 MiB in linear order? If so, then the prefetcher should help in exactly the same way.

      If you're not, then why is all of that data packed together? Is there an alternate layout you could use instead where you iterate over the exact values you need? If performance is important to you in that context it might be worth it.

      2 replies →

Depends on element size and some other stuff, but if it is a singly linked list with a truly random insertion location, iterating to that location is N/2 on average, while inserting and copying the rest of the array is also N/2. With small elements the array could still be much faster, since they are all prefetched during the copy, versus jumping all over memory during the list iteration and potentially stalling for ~200 instructions on each node while waiting on main memory.
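That N/2-vs-N/2 symmetry can be verified by just counting operations. A small Python sketch (names hypothetical) tallies, for the same random insertion sequence, the nodes a linked list would traverse and the elements an array would shift:

```python
import random

def insertion_costs(n, seed=1):
    # walk_steps: nodes a singly linked list traverses to reach each spot.
    # shift_steps: elements an array shifts to open a gap at each spot.
    random.seed(seed)
    walk_steps = shift_steps = 0
    size = 0
    for _ in range(n):
        pos = random.randrange(size + 1)
        walk_steps += pos
        shift_steps += size - pos
        size += 1
    return walk_steps, shift_steps
```

Both totals come out near n*(n-1)/4, i.e. ~N/2 work per insertion either way; the difference is that the array's N/2 is a sequential copy the prefetcher loves, while the list's N/2 is dependent pointer loads.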

  • You can imagine a linked list whose “API” is “take this node and insert after it”.

    • If a bunch of insertions happen like that, driven by some other data structure, and iteration only happens periodically, you could also imagine an array version that batches all the inserts in another simple array and applies them at once before iterating.
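For the "take this node and insert after it" API mentioned above, a minimal Python sketch (names hypothetical): when some other structure already hands you the node, insertion is O(1) pointer relinking, with no traversal and no shifting — the one case where the linked list genuinely wins.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def insert_after(node, value):
    # O(1): relink two pointers; cost is constant regardless of list length.
    node.next = Node(value, node.next)
    return node.next

# Usage: splice "b" between "a" and "c" given a handle to the "a" node.
head = Node("a", Node("c"))
insert_after(head, "b")  # list is now a -> b -> c
```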