Comment by furyofantares
3 days ago
FWIW, which may not be much - I had Codex CLI try to verify the results. On my M2 MacBook Air only the first example (False Sharing) showed any effect - a 23x speedup compared to the article's 6x speedup. None of the others produced any speedup at all.
Of course I didn't verify the results I got either - I'm not about to spend hours trying to figure out if this is just slop. But I think it is.
Could you share the benchmark source code of the first example?
Here's the one that showed a lot more speedup than the article:
https://pastebin.com/v9tczpus
Looks like the LLM invented a somewhat different test for it than the article had. I tried again and got this one, with the same data structure as in the article:
https://pastebin.com/SDdcchZG
That gave similar results to the article.
All the other tests still give little-to-no speedup on my machine.
Many thanks for providing the source. It also works on my machine.
TIL.