← Back to context

Comment by yencabulator

5 years ago

Even if the array overflows the cache, it's not too bad for most access patterns. CPUs will easily prefetch data on a linear scan, so the "second half of the array" will most definitely be in cache by the time you get there. And a linear scan-once shouldn't even evict too much other stuff out of the cache.

In practice, I would let benchmarks decide the array size. Modern CPUs are surprisingly good at prefetching even linked lists, and x86 works weird out-of-order magic and speculation while waiting for memory. My rules of two thumbs were 1) where performance really matters, don't trust your intuition, program and benchmark alternative implementations 2) don't trust microbenchmarks, e.g. total icache pressure changes whether inlining is good or not.