Comment by peterbonney

6 days ago

Correct, I haven’t set it up that way. That’s my point: I’d have to set it up to behave in this way, which is a conscious operator decision, not an emergent behavior of the bot.

Giving it an open-ended goal is not the same as a 'human driving the whole process', as you claimed. I really don't know what you are arguing here. No, you do not need to tell it to respond to refusals with a hit piece (or similar) for it to act this way.

With all the papers showing mundane misalignment across all frontier agents, it's baffling that people act like this is some unbelievable occurrence.