Comment by QuadrupleA

8 hours ago

Been unhappy with the GPT-5 series after daily-driving 4.x for ages (I chat with them through the API) - very pedantic, goes off on too many side topics, and stops following system instructions after a few turns (e.g. "you respond in 1-3 sentences" becomes long bulleted lists and multiple paragraphs very quickly).
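
For reference, a minimal sketch of the kind of API loop I mean, assuming the current openai Python SDK; the model name and instruction wording are just placeholders. The system message here is exactly the sort of instruction that stops being followed as the chat grows:

    # Hypothetical sketch, not my exact setup; requires openai>=1.0.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # The system instruction that tends to get ignored after a few turns.
    history = [{"role": "system",
                "content": "You respond in 1-3 sentences. No bullet lists."}]

    def ask(user_msg):
        history.append({"role": "user", "content": user_msg})
        resp = client.chat.completions.create(model="gpt-4.1",
                                              messages=history)
        reply = resp.choices[0].message.content
        history.append({"role": "assistant", "content": reply})
        return reply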

Much better feel with the Claude 4.5 series, for both chat and coding.

> "you respond in 1-3 sentences" becomes long bulleted lists and multiple paragraphs very quickly

This is why my heart sank this morning. I have spent over a year training 4.0 to just about be helpful enough to get me an extra 1-2 hours a day of productivity. From experimentation, I can see no hope of reproducing that with 5.x, and even 5.x admits as much to me, when I discussed it with them today:

> Prolixity is a side effect of optimization goals, not billing strategy. Newer models are trained to maximize helpfulness, coverage, and safety, which biases toward explanation, hedging, and context expansion. GPT-4 was less aggressively optimized in those directions, so it felt terser by default.

Share and enjoy!

  • > This is why my heart sank this morning. I have spent over a year training 4.0 to just about be helpful enough to get me an extra 1-2 hours a day of productivity.

    Maybe you should consider basing your workflows on open-weight models instead? Unlike proprietary API-only models, no one can take those away from you.

4.1 is great for our stuff at work. It's quite stable (it doesn't change personality every month, and a one-word difference doesn't change the behaviour). It doesn't think, so it's still reasonably fast.

Is there anything as good in the 5 series? Likely, but doing the full QA testing again for no added business value, just because the model disappears, is a hard sell. And the ones we tested were just slower, or tried to have more personality, which is useless for automation projects.

  • Yeah - agreed, the initial latency is annoying too, even with thinking allegedly turned off. Feels like AI companies are stapling on more and more weird routing, summarization, and safety layers that degrade the overall feel of things.

I can never understand why it is so eager to generate walls of text. I have instructions to always keep the response precise and to the point. It almost seems like it wants to overwhelm you so that you give up and do your own research.

I often use ChatGPT without an account, and ChatGPT 5 mini (which you get while logged out) might as well be Mistral 7B + web search. It's that mediocre. Even the original 3.5 was way ahead.

  • Really? I’ve found it useful for random little things.

    • It is useful for quick information lookup when you're lacking the precise search terms (which is what I often do). But the conversations I had with the original ChatGPT were better.

I also found this disturbing, as I used to use GPT for small worked-out theoretical problems. In 5.2, the long runs of repeated bulleted lists and fortune-cookie filler were a negative for my use case. I replaced some of that use with Claude and am experimenting with LM Studio and gpt-oss (sketch at the end of this comment). It seemed like an obvious regression to me, but maybe people weren't using it that way.

For instance, something simple like: "If I put 10 kW of solar on my roof, when is the payback, given xyz price / incentive / usage pattern?"

It used to give a kind of short technical report; now it's a long list of bullets and a very paternalistic "this will never work" kind of negativity. I'm assuming this is the anti-sycophancy training at work, but when you're working a problem you have to stay optimistic until you get your answer.
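
For what it's worth, the arithmetic behind that prompt is just simple payback: net installed cost divided by annual bill savings. A sketch in Python, where every figure is an invented assumption rather than a real quote or tariff:

    # Back-of-the-envelope payback; all numbers below are made up.
    system_kw = 10.0
    cost_per_kw = 2500.0       # $/kW installed (assumed)
    incentive = 0.30           # 30% rebate/tax credit (assumed)
    kwh_per_kw_year = 1400.0   # annual production per kW (assumed)
    price_per_kwh = 0.15       # $/kWh of grid power offset (assumed)

    net_cost = system_kw * cost_per_kw * (1 - incentive)
    annual_savings = system_kw * kwh_per_kw_year * price_per_kwh
    print(f"net cost ${net_cost:,.0f}, savings ${annual_savings:,.0f}/yr, "
          f"payback {net_cost / annual_savings:.1f} years")

With those made-up numbers it prints a payback of about 8.3 years, which is the shape of the short technical report the old model gave.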

For me this usage was a few times a day, for ideas or working through small problems. For code I've been on Claude for at least a year; it just works.
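
The LM Studio experiment mentioned above is easy to try because it serves an OpenAI-compatible endpoint locally (http://localhost:1234/v1 by default), so the same client code works pointed at it. A sketch, where the model id is a placeholder for whatever LM Studio lists for your gpt-oss download:

    # Hypothetical sketch; LM Studio ignores the API key, any string works.
    from openai import OpenAI

    local = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
    resp = local.chat.completions.create(
        model="gpt-oss-20b",  # placeholder id; copy the one LM Studio shows
        messages=[{"role": "user",
                   "content": "Payback on 10 kW of rooftop solar, briefly?"}],
    )
    print(resp.choices[0].message.content)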