Comment by ashleyn

5 years ago

I suspect linked lists are not used because they are notorious for wrecking cache performance.

Never head of that, though I don't use C much. Are you referring to the CPU cache?

  • Yeah, linked lists are bad for the data cache since each element is in some totally random area of memory and thus less likely to be in a cache. Whereas for a linear array the data is next to each other and can be cached effectively and accesses can be easily predicted.

    • Some 20 years ago, when I was a kernel developer with high performance requirements, my favorite data structure was a linked list of smallish arrays. Each array was small enough to be cheap to completely rewrite on insert/delete, if you needed to e.g. preserve insertion order; most of the time, it was enough to preserve order only across the linked list, treat each array like a hash table slot, or sort at consumption time.

      4 replies →

    • You can also force-align (with padding if necessary) your structures to be along whatever boundaries make sense for your processor's cache lines.

      Ensuring an oft-loaded structure always fits inside a single cache line (instead of splaying across ~1.9 cache lines on average) is good not only for fitting lots of those structures into your L1 but for not blowing out cache bandwidth (read _and_ write, if you're modifying them).

      20 replies →

    • Expanding on this: usually linked lists allocate each element with a separate malloc() on the heap, so they end up all over the place in memory.

      You could, in principle, allocate your linked list out of contiguous memory in such a way that sequentially reading the list doesn't cause so many page accesses. That would itself be a complicated data structure to maintain, so simply using a linear array is often the right tradeoff.

      2 replies →