Comment by skavi
5 years ago
I’ve considered making a vector sorta thing implemented as a linked list of arrays. Assuming you sized the arrays to fit well inside the CPU cache, are there really many tradeoffs?
5 years ago
I’ve considered making a vector sorta thing implemented as a linked list of arrays. Assuming you sized the arrays to fit well inside the CPU cache, are there really many tradeoffs?
Even if the array overflows the cache, it's not too bad for most access patterns. CPUs will easily prefetch data on a linear scan, so the "second half of the array" will most definitely be in cache by the time you get there. And a linear scan-once shouldn't even evict too much other stuff out of the cache.
In practice, I would let benchmarks decide the array size. Modern CPUs are surprisingly good at prefetching even linked lists, and x86 works weird out-of-order magic and speculation while waiting for memory. My rules of two thumbs were 1) where performance really matters, don't trust your intuition, program and benchmark alternative implementations 2) don't trust microbenchmarks, e.g. total icache pressure changes whether inlining is good or not.
That's how Clojure implements vectors except its a tree of 8-element arrays. This has nice functional programming properties because it makes it easy to make immutable copies of large vectors with good performance and memory characteristics.
It sounds like you are talking about an unrolled linked list[1]. I'd argue in most cases that while they are better than linked lists in just about every way (except data objects larger than the cache line), in most cases there is a better data structure to use. I'd find that the hashed array tree[2] would likely be a better/more versatile data structure solely due to the improved lookup time.
[1]: https://en.wikipedia.org/wiki/Unrolled_linked_list
[2]: https://en.wikipedia.org/wiki/Hashed_array_tree