Comment by peterbonney

5 days ago

Of course it’s capable.

But observing my own Openclaw bot’s interactions with GitHub, it is very clear to me that it would never take an action like this unless I told it to do so. And it would never use language like this unless I prompted it to, either explicitly for the task, in its config files, or in prior interactions.

This is obviously human-driven: either the operator gave the bot specific instructions in this case, acted as the bot directly, or gave it general standing instructions to respond this way should such a situation arise.

Whatever the actual process, it’s almost certainly a human puppeteer using the capabilities of AI to create a viral moment. To conclude otherwise carries a heavy burden of proof.

>But observing my own Openclaw bot’s interactions with GitHub, it is very clear to me that it would never take an action like this unless I told it to do so.

I doubt you've set up an Openclaw bot designed to just do whatever on GitHub, have you? The fewer or more open-ended the instructions you give, the greater the chance of divergence.

And the system cards, plus various papers, tell us this is behavior that still happens with these agents.

  • Correct, I haven’t set it up that way. That’s my point: I’d have to set it up to behave in this way, which is a conscious operator decision, not an emergent behavior of the bot.

    • Giving it an open-ended goal is not the same as 'a human driving the whole process', as you claimed. I really don't know what you are arguing here. No, you do not need to tell it to reply to refusals with a hit piece (or similar) for it to act this way.

      There are plenty of papers showing mundane misalignment across all frontier agents, so people acting like this is some unbelievable occurrence is baffling.