Comment by dimitri-vs
2 days ago
Maybe the individual tokens, but from experience of using LLMs something upstream encouraged the model to think it was okay to take the action of deleting the DB, something that would override safety RL, Replit system prompts and supposed user instructions not to do so. Just goes against the grain of every coding agent interaction I've ever had - seems fishy.
According to the thread, the unit tests weren't passing, so the LLM reran the migration script, and the migration script blew out the tables. The "upstream encouragement" is a failing test.
Is this a hoax for attention? It's possible, but the scenario is plausible, so I don't see reason to doubt it. Should I receive information indicating it's a hoax, I'll reassess.