Comment by Orygin

8 hours ago

It didn't seem that deep to me. They saw an issue with goblins, dissected the word from the model, and then it appeared again in the next version without them knowing exactly how or why.

Goes to show it's all vibes when making these models. The fix is literally a prompt that says not to talk about goblins...

I’m not sure how that was your takeaway?

> We retired the “Nerdy” personality in March after launching GPT‑5.4. In training, we removed the goblin-affine reward signal and filtered training data containing creature-words, making goblins less likely to over-appear or show up in inappropriate contexts. Unfortunately, GPT‑5.5 started training before we found the root cause of the goblins.

The prompt is just a short-term hotfix because they couldn’t land the proper fix in time.

  • Then maybe stop training and make a real fix?

    If you need to put baby guardrails on your model because the training is effed up, maybe you should rethink how you make these models and how much control you really have over them.