Comment by jsight

16 hours ago

I feel like this is an artifact of some limitations in the training process for modern LLMs. They rarely get enough training to know when to stop and ask questions.

Instead, they either blindly follow or quietly rebel.

There was a huge overcorrection somewhere around the beginning of 2025, maybe February or so, with ChatGPT. Prior to that point, I had to give a directive in the user config prompt along the lines of “don’t tell me something isn’t possible or practical; assume it is within your capabilities and attempt to find a way. I will let you know when to stop”, because it was constantly hallucinating that it couldn’t do things (“I don’t have access to a programming environment”) when I wanted it to test code itself before I did. Meanwhile, one tab over, it would spin up a REPL and re-paste some CSV into Python and pandas without being asked.

Frustrating, but “overcorrection” is a pretty bad euphemism for whatever half-assed bit of RLHF lobotomy OpenAI did that, just a few months later, had ChatGPT leaning into a vulnerable kid’s pain and actively discouraging an act that might have saved his life by signaling more warning signs to his parents.

It wasn’t long before that happened, after the Python REPL confusion had resolved, that I found myself typing to it, even after having had to back out of that user customization prompt, “set a memory that this type of response to a user in the wrong frame of mind is incredibly dangerous”.

Then I had to delete that too, because it would respond with things like “You get it of course, you’re a…” etc.

So I wasn’t surprised over the rest of 2025 as various stories popped up.

It’s still bad. Based on what I see with quantized models and sparse-attention inference methods, even with the most recent GPT-5 releases OpenAI is still doing something to optimize compute requirements that makes the recent improvements very brittle. I of course can’t know for sure, only that its behavior matches what I see when those sorts of boundaries are pushed on open-weight models. My assumption is that the all-you-can-prompt buffet of a Plus subscription is where they’re most likely to deploy those sorts of performance hacks and make the quality tradeoffs, since Plus isn’t their main money source; it’s not enterprise-level spending.

This technology is amazing, but it’s also dangerous, sometimes in very foreseeable ways, and the more time goes by the more I appreciate some of the public criticisms of OpenAI, e.g., the Amodeis’ split to form Anthropic and the temporary ouster of SA for a few days before that got undone.