Comment by ranguna
2 months ago
Impossible?
You just have to add a human in the loop for destructive calls. Add an extra TOTP parameter to every destructive call. The code comes from the agent UI, which only generates it when a human clicks a button; it is then sent to the model and passed along in the call.
Why do you think this is impossible?
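Here is a rough sketch of that approval-code idea, in case it helps make it concrete. Everything in it is hypothetical (`ApprovalGate`, `human_approves`, `delete_file` are names made up for illustration), and it uses a random single-use code bound to one specific call rather than a true RFC 6238 TOTP, which keeps the example short. The important assumption is that `human_approves` is reachable only from the UI button handler, never from the model.

```python
import hashlib
import hmac
import secrets
import time


class ApprovalGate:
    """Hypothetical human-approval gate; all names here are illustrative."""

    def __init__(self, ttl_seconds=60.0):
        self._key = secrets.token_bytes(32)  # server-side secret, never shown to the model
        self._ttl = ttl_seconds
        self._pending = {}  # code -> (call fingerprint, expiry timestamp)

    def _fingerprint(self, tool, args):
        # Bind the code to one exact call so it cannot be reused for another.
        payload = repr((tool, sorted(args.items()))).encode()
        return hmac.new(self._key, payload, hashlib.sha256).hexdigest()

    def human_approves(self, tool, args, now=None):
        """Assumed to be called only by the UI button handler."""
        now = time.time() if now is None else now
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending[code] = (self._fingerprint(tool, args), now + self._ttl)
        return code

    def verify(self, code, tool, args, now=None):
        now = time.time() if now is None else now
        entry = self._pending.pop(code, None)  # single use: consumed on first attempt
        if entry is None:
            return False
        fingerprint, expiry = entry
        same_call = hmac.compare_digest(fingerprint, self._fingerprint(tool, args))
        return now <= expiry and same_call


def delete_file(gate, path, approval_code):
    """Stand-in destructive tool: refuses to run without a valid code."""
    if not gate.verify(approval_code, "delete_file", {"path": path}):
        return "refused: missing or invalid approval code"
    return f"deleted {path}"  # a real tool would perform the deletion here
```

The model can ask for a deletion as often as it likes, but the call only goes through with a code that a human issued for that exact tool and arguments, and each code works once.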
Impossible without a human in the loop.
Having said that, even categorising calls as destructive or non-destructive is inherently unsafe unless you have a very strict OS-level or VM-like setup: everything is read-only and all access to the outside world goes through MCPs, so it is the MCP, not the LLM, that decides which calls are destructive.
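To illustrate the point about the tool layer owning the classification, here is a minimal sketch. `ToolServer` and its methods are hypothetical names, not a real MCP API, and the sandboxing itself (read-only filesystem, VM) is assumed to exist outside this code. The model only supplies a tool name and arguments; whether the call counts as destructive is looked up server-side, and anything unclassified defaults to destructive.

```python
class ToolServer:
    """Hypothetical tool layer that decides what counts as destructive."""

    def __init__(self):
        self._tools = {}  # name -> (handler, destructive flag)

    def register(self, name, handler, destructive=True):
        # Default to destructive so an unclassified tool never runs unguarded.
        self._tools[name] = (handler, destructive)

    def call(self, name, args, human_approved=False):
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        handler, destructive = self._tools[name]
        # The flag comes from the registry, not from anything the model sent.
        if destructive and not human_approved:
            raise PermissionError(f"{name} needs human approval")
        return handler(**args)


server = ToolServer()
server.register("read_note", lambda title: f"contents of {title}", destructive=False)
server.register("drop_table", lambda table: f"dropped {table}")  # destructive by default
```

In this setup `human_approved` would be set by the approval step, not by the model, so a misclassification can only come from whoever registered the tool.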