Comment by diggan
1 day ago
Statements like these are useless unless you share exactly which models you've tried. Sonnet beats O1 Pro Mode, for example? Not in my experience, but I haven't tried the latest Sonnet versions, only the one before, so I wouldn't claim O1 Pro Mode beats everything out there.
Besides, it's so heavily context-dependent that you really need your own private benchmarks to make heads or tails of this whole thing.