Comment by kxbnb

1 month ago

The middleware proxy approach unop mentioned is the right pattern - you need an enforcement point the agent can't bypass.

At keypost.ai we're building exactly this for MCP pipelines. The proxy evaluates every tool call against deterministic policy before it reaches the actual tool. No LLM in the decision path, so no reasoning around the rules.

Re: chrisjj's point about "fix the program" - the challenge is that agents are non-deterministic by design. You can't unit test every possible action sequence. So you need runtime enforcement as a safety net, similar to how we use IAM even for well-tested code.

The human-in-the-loop suggestion works but doesn't scale. What we're seeing teams want is conditional human approval - only trigger review when the action crosses a risk threshold (first time deleting in prod, spend over $X, etc.), not for every call.

The audit trail gap is real. Most teams log tool calls but not the policy decision. When something goes wrong, you want to know: was it allowed by policy, or did policy not cover this case?

0 comments

kxbnb

No comments yet

Contribute on Hacker News ↗