
Comment by famouswaffles

6 days ago

>But observing my own Openclaw bot’s interactions with GitHub, it is very clear to me that it would never take an action like this unless I told it to do so.

I doubt you've set up an Openclaw bot designed to just do whatever it wants on GitHub, have you? The fewer or more open-ended the instructions you give, the greater the chance of divergence.

And all the system cards, plus various papers, tell us this is behavior that still happens with these agents.

>Correct, I haven’t set it up that way. That’s my point: I’d have to set it up to behave in this way, which is a conscious operator decision, not an emergent behavior of the bot.

Giving it an open-ended goal is not the same as a 'human driving the whole process', as you claimed. I really don't know what you're arguing here. No, you do not need to tell it to respond to refusals with a hit piece (or something similar) for it to act this way.

Given all the papers showing mundane misalignment across frontier agents, people acting like this is some unbelievable occurrence is baffling.