Comment by gallerdude

17 hours ago

The weirdest thing about this AI revolution is how smooth and continuous it is. If you look closely at differences between 4.6 and 4.5, it’s hard to see the subtle details.

A year ago today, Sonnet 3.5 (new), was the newest model. A week later, Sonnet 3.7 would be released.

Even 3.7 feels like ancient history! But in the gradient of 3.5 to 3.5 (new) to 3.7 to 4 to 4.1 to 4.5, I can’t think of one moment where I saw everything change. Even with all the noise in the headlines, it’s still been a silent revolution.

Am I just a believer in an emperor with no clothes? Or, somehow, against all probability and plausibility, are we all still early?

If you've been using each new step is very noticeable and so have the mindshare. Around Sonnet 3.7 Claude Code-style coding became usable, and very quickly gained a lot of marketshare. Opus 4 could tackle significant more complexity. Opus 4.6 has been another noticable step up for me, suddenly I can let CC run significantly more independently, allowing multiple parallel agents where previously too much babysitting was required for that.

  • I think this is where there's a huge distinction between ability/performance/benchmark figures and utility. You can have smooth improvements to performance, but marked step changes in utility as they cross thresholds where you're able to use them for new tasks.

In terms of real work, it was the 4 series models. That raised the floor of Sonnet high enough to be "reliable" for common tasks and Opus 4 was capable of handling some hard problems. It still had a big reward hacking/deception problem that Codex models don't display so much, but with Opus 4.5+ it's fairly reliable.

I had not used Claude much until an hour ago since probably before GPT5. I had only been using Gemini the last 3 months.

Sonnet 4.6 extended on the free plan is just incredible. I am just complete floored by it. The conversation I just had with it was nuts. It was from Dario mentioning something like a 20% chance Claude is conscious or something crazy like that. I have always tried that conversation with previous models but it got boring so fast.

There is something with the way it can organize context without getting lost that completely blows Gemini away.

Maybe even more so that it was the first time it felt like a model pushed back a little and the answers were not just me ultimately steering it into certain answers. For the free plan that is nuts.

In terms of being conscious, it is the first time I would say I am not 100% certain it is just a very useful, very smart , stochastic parrot. I wouldn't want to say more than that but 15-20% doesn't sound so insane to me as it did 2 hours ago.

Honestly, 4.5 Opus was the game changer. From Sonnet 4.5 to that was a massive difference.

But I'm on Codex GPT 5.3 this month, and it's also quite amazing.