Comment by brabel

3 hours ago

Doing maths is extremely fast. You need a lot of maths to take as long as a single memory access that misses both L1 and L2.

And you need to burn even more cycles before you’ve amortized the cost of using a cache line that could have benefitted some other work.
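
To put a rough number on that, here is a back-of-the-envelope sketch. The figures are my own illustrative assumptions, not measurements: about 100 ns for an access that goes all the way to DRAM, a 3 GHz clock, and 4 simple arithmetic ops retired per cycle. The helper name `ops_per_miss` is made up for this example.

```python
def ops_per_miss(latency_ns: float, clock_ghz: float, ops_per_cycle: float) -> int:
    """Arithmetic ops a core could retire in the time one memory access takes."""
    # A clock of N GHz is N cycles per nanosecond, so ns * GHz gives cycles.
    cycles = latency_ns * clock_ghz
    return round(cycles * ops_per_cycle)


# Assumed, illustrative figures (not measured on any particular machine).
DRAM_LATENCY_NS = 100.0  # access that misses every cache level
CLOCK_GHZ = 3.0
OPS_PER_CYCLE = 4.0      # simple integer/FP ops retired per cycle

print(ops_per_miss(DRAM_LATENCY_NS, CLOCK_GHZ, OPS_PER_CYCLE))  # prints 1200
```

Under those assumptions, recomputing a value is cheaper than loading it unless the computation costs more than roughly a thousand simple operations. Real numbers vary a lot by CPU and by access pattern, so treat this as an order-of-magnitude estimate only.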