Comment by coolThingsFirst

5 hours ago

Why doesn’t the machine fill up the other cache lines as well why is 64 bytes only and then a miss?

They will absolutely do that (prefetching, they can even eagerly load what’s on the other side of a pointer).

However it requires additional hardware to recognize patterns which benefit from prefetching, and every time the CPU prefetches data which ends up not being used it has both burned energy and memory bandwidth, and evicted data from the cache which might be needed (cache pollution).

A cache line is simply the unit of data a CPU cache works with (generally 64 bytes, because someone somewhere has probably determined that that is the best line size for general use), much like there are units of data like bytes (8 bits nowadays, but there have been weird ones historically), pages (varies between hardware as well, and may be OS-configurable), etc.

As TFA mentions, a CPU does some predictions about what cache lines to prefetch, e.g. when you do sequential reads. Moreover, the x86_64 instruction set provides a prefetch instruction through which you are able to give the CPU a hint "hey, I'm gonna be using this soon, prepare accordingly, pretty please".

Still, the utility of prefetching is diminished if you only use a single byte from each cache line, because the mechanism generally depends on you doing other work while the next cache line is being fetched. So really the best case scenario is to take as much time as possible to work with what is already fetched, so that there is time for the next unit of data to be fetched in the meantime.

It might sometimes prefetch the surrounding lines as well, but ultimately cache space is limited, so there is a trade-off. Every time you fill a line, you are throwing away something else that was cached there previously, which you may need again in the near future.