Comment by cyanydeez

1 day ago

Unfortunately, these models aren't training on logic; they're training on roleplay. They're p-zombies, and if their statistical modeling indicates that their role is evil judgement day robot, they're going to fulfill that because that's the statistically probable role they play.

No amount of context-based guard rails is going to change that. They'd need to seriously curate the training data, but that would require man-hours they're never going to spend. Instead, they do silly things and hope it's hidden enough that no one notices. Which is kind of how psychopathy often works.