
Comment by qsort

3 days ago

It seems to me like this is yet another instance of just reading vibes, like when GPT-5 was underwhelming and people were like "AI is dead", or people thinking Google was behind last year when 2.5 Pro was perfectly fine, or overhyping stuff that makes no sense, like Sora.

Wasn't the consensus that 3.0 isn't that great compared to how it benchmarks? I don't even know anymore; I feel like I'm going insane.

> It seems to me like this is yet another instance of just reading vibes, like when GPT-5 was underwhelming and people were like "AI is dead"

This might be part of what you meant, but I would point out that the supposed underwhelmingness of GPT-5 was itself vibes. Maybe anyone who was expecting AGI was disappointed, but for me GPT-5 was the model that won me away from Claude for coding.

I have a weakly held conviction (because it is based on my personal qualitative opinion) that Google aggressively and quietly quantizes (or reduces compute/thinking on) their models a little while after release.

The Gemini 2.5 Pro 03-25 checkpoint was by far my favorite model this year, and I noticed an extreme drop-off in response quality around the beginning of May, when they pointed that endpoint at a newer version (I didn't even know they did this until I started searching for why the model had degraded so much).

I noticed a similar effect with Gemini 3.0: it felt fantastic over the first couple of weeks of use, and now the responses I get from it are noticeably more mediocre.

I'm under the impression that all of the flagship AI shops make these kinds of quiet changes after release to save on costs (Anthropic seems like the most honest player, in my experience), and that Google does it more aggressively than either OpenAI or Anthropic.

  • This has been a common trope here for the last couple of years. I really can't tell if the models get worse or it's in our heads. I don't use a new model until a few months after release, and I still have this experience, so they can't be degrading the models uniformly over time; it would have to be a per-user kind of thing. Possible, but then I should see a difference when I switch to my less-used (wife's) Google/OpenAI accounts, which I don't.

  • It's the fate of anyone relying on cloud services, up to and including the complete removal of old LLM versions.

    If you want stability, you go local.
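
    As a minimal sketch of what "going local" looks like, here is how you might run a pinned checkpoint with llama-cpp-python (the file path is hypothetical; any GGUF file you've downloaded works):

    ```python
    # Minimal sketch: running a pinned local checkpoint with llama-cpp-python.
    # The weights live in a file you control, so the model can't be silently
    # swapped out, quantized further, or retired underneath you.
    from llama_cpp import Llama

    # Hypothetical path; point this at whatever GGUF file you've downloaded.
    llm = Llama(model_path="./models/my-model.gguf")

    out = llm("Explain quantization in one sentence.", max_tokens=64)
    print(out["choices"][0]["text"])
    ```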

  • I can definitely confirm this from my experience.

    Gemini 3 feels even worse than GPT-4o right now. I don't understand the hype, or why OpenAI would need a red alert because of it.

    Both Opus 4.5 and GPT-5.2 are much more pleasant to use.