Comment by CSMastermind
14 hours ago
Man, I don't know if I'm living in a crazy bubble or something, but GPT 5.5 is light-years better than Opus 4.8 for me, to the point where I'm honestly wondering how you're evaluating them or what kind of work you're doing.
There are specific tasks that Opus does better on, like frontend dev and design, but for anything else 5.5 just laps it.
Yeah, I’ve been consistently underwhelmed by Anthropic models, but then I don’t use their harness, so maybe that’s it.
In my experience, for more mechanical refactoring work (like splitting a big source code file into multiple smaller ones), GPT 5.5 runs way faster than any of the Claude models. But for other tasks that require deeper reasoning, it's not that clear who is the winner.
It's just too funny to see people arguing about "no, it's my religion that's the right one!" on Hacker News.
You guys are all a lost cause.
How is attempting to benchmark LLMs like religion?
Re-read the comment I'm replying to. It's not talking about benchmarks, just models.