Comment by nostrebored
2 hours ago
Hmm, I don’t know, maybe the fact that 4.6, 4.7, 5.3, 5.4, 5.5, 3.0, 3.1 are all marginal improvements?
I think people's opinion of "marginal improvement" is based on their relative ability. A 2000 Elo chess player is going to think the jump from 500 to 1000 is marginal. They're both floundering around not doing anything resembling common sense. A 1000 Elo chess player is going to find the jump from 2000 to 2500 marginal. They're both playing far better moves for incomprehensible reasons, and the only reason you know the 2500 player is better is due to benchmarking. It is only when you are evaluating systems about at your level that you can feel the improvement.
I, personally, found the past two years to be a much larger improvement than the previous two years.
I think this is a pretty ridiculous take. 2024-2025 was filled with huge improvements. 2025-2026 has not been, outside of open source.
The idea that we’re at the point where it’s surpassed our ability to tell just makes no sense. I’ll be happy if we can get to a point where I don’t have to tell Claude not to tail every bash command or make a job that writes throughout instead of once at the end. I’ll be happy if “continue this interaction naturally, you are taking over from an independent subagent” works.
But I’m not holding my breath. It’s still really cool that any of this stuff is possible.
Equally marginal?
No, the Anthropic releases have felt marginally negative.