Comment by ranguna

2 months ago

Impossible?

You just have to add a human in the loop for destructive calls. Add an extra TOTP parameter to every destructive call. The code is generated by the agent UI only when a human clicks a button; it is then sent to the model, which has to include it in the call.
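
A minimal sketch of that idea in Python, standard library only. The names `ApprovalGate`, `human_approves`, and `delete_file` are hypothetical and not taken from any agent framework; the 30-second step and 6 digits are assumed defaults. The `totp` function follows RFC 6238 with HMAC-SHA1. The secret stays on the server side, so the model only ever sees the short-lived code a human released.

```python
import hashlib
import hmac
import struct
import time

STEP_SECONDS = 30  # assumed lifetime of one code
DIGITS = 6         # assumed code length


def totp(secret: bytes, at: float, step: int = STEP_SECONDS, digits: int = DIGITS) -> str:
    """RFC 6238 TOTP (HMAC-SHA1) for the time step containing `at`."""
    counter = int(at // step)
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


class ApprovalGate:
    """Hypothetical server-side gate; the secret is never shown to the model."""

    def __init__(self, secret: bytes):
        self._secret = secret
        self._used = set()  # (time step, code) pairs that were already spent

    def human_approves(self, now=None) -> str:
        """Called when a human clicks the button in the agent UI."""
        return totp(self._secret, time.time() if now is None else now)

    def verify(self, code: str, now=None) -> bool:
        now = time.time() if now is None else now
        counter = int(now // STEP_SECONDS)
        expected = totp(self._secret, now)
        if (counter, code) in self._used:
            return False  # each approval is good for one call only
        if not hmac.compare_digest(code.encode(), expected.encode()):
            return False
        self._used.add((counter, code))
        return True


def delete_file(path: str, approval_code: str, gate: ApprovalGate, now=None) -> str:
    """Hypothetical destructive tool: refuses to run without a valid code."""
    if not gate.verify(approval_code, now):
        raise PermissionError("destructive call needs a fresh human approval code")
    return f"deleted {path}"  # placeholder, does not touch the filesystem
```

In this sketch a code is not bound to one specific call, so a human who clicks the button approves whatever destructive call the model makes next within the time step. Binding the code to the call arguments would be a natural extension.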

Why do you think this is impossible?


postexitus 2 months ago

Impossible without a human in the loop.

Having said that - even the categorisation of calls into destructive and non-destructive is inherently unsafe, unless you have a very strict OS-level / VM-like setup (everything read-only, with all access to the outside world going through MCPs, so that the MCP, not the LLM, decides which calls are destructive).
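
A rough sketch of that setup in Python, under the assumption that every tool goes through one server-side registry. `ToolServer`, `Tool`, and the `approve` callback are hypothetical names, not part of the MCP specification or any MCP SDK. The point it illustrates: the destructive flag is set by whoever registers the tool, unknown tools are denied by default, and the model only supplies a tool name and an argument.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Tool:
    handler: Callable[[str], str]
    destructive: bool  # declared by the server operator, never by the model


class ToolServer:
    """Hypothetical stand-in for an MCP-style server that owns the classification."""

    def __init__(self, approve: Callable[[str, str], bool]):
        self._tools: Dict[str, Tool] = {}
        self._approve = approve  # asks a human; assumed to be supplied by the UI

    def register(self, name: str, handler: Callable[[str], str], destructive: bool) -> None:
        self._tools[name] = Tool(handler, destructive)

    def call(self, name: str, arg: str) -> str:
        tool = self._tools.get(name)
        if tool is None:
            # Default deny: anything not registered is not reachable at all.
            raise PermissionError(f"unknown tool: {name}")
        if tool.destructive and not self._approve(name, arg):
            raise PermissionError(f"{name} denied by human reviewer")
        return tool.handler(arg)
```

This only helps if the registry is the sole path to the outside world, which is what the read-only OS / VM part of the setup is for.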