Comment by furyofantares

3 days ago

FWIW, which may not be much - I had Codex CLI try to verify the results. On my M2 MacBook Air, only the first example (False Sharing) did anything - a 23x speedup compared to the article's 6x speedup. None of the others produced any speedup at all.

Of course I didn't verify the results I got either - I'm not about to spend hours trying to figure out if this is just slop. But I think it is.

Could you share the benchmark source code of the first example?