Ask HN: Is there any tool that can stop LLM calls at runtime (not just monitor)?

13 hours ago

I’ve been running into cases where LLM/agent systems make unexpected or repeated calls and costs spike quickly.

Most tools I’ve found focus on observability (logs, traces, dashboards), but not actual enforcement.

Is there anything that can:

- stop or cut off a call mid-execution (based on budget, token count, or other conditions)?

- enforce limits at runtime instead of just alerting after the fact?

Curious whether people here have solved this with a dedicated tool, or are just handling it at the application level.
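
For concreteness, here is a rough sketch of what I mean by handling it at the application level: a wrapper that consumes a streamed response and aborts once a token budget is exceeded. All the names here (`enforce_token_budget`, `BudgetExceeded`, `fake_llm_stream`, `run_with_budget`) are made up for illustration, the stream is a stand-in rather than a real client, and counting each chunk as one token is a simplifying assumption.

```python
from typing import Iterable, Iterator, List, Tuple


class BudgetExceeded(RuntimeError):
    """Raised when a stream is cut off because it went over the token budget."""


def enforce_token_budget(stream: Iterable[str], max_tokens: int) -> Iterator[str]:
    """Yield chunks from `stream`, aborting once more than `max_tokens` arrive.

    Assumption: each chunk counts as exactly one token.
    """
    used = 0
    iterator = iter(stream)
    try:
        for chunk in iterator:
            used += 1
            if used > max_tokens:
                raise BudgetExceeded(f"token budget of {max_tokens} exceeded")
            yield chunk
    finally:
        # Close the underlying generator (if it supports it) so the
        # producer stops doing work after the cut-off.
        close = getattr(iterator, "close", None)
        if close is not None:
            close()


def fake_llm_stream(n: int) -> Iterator[str]:
    """Hypothetical stand-in for a streaming LLM response of n tokens."""
    for i in range(n):
        yield f"tok{i}"


def run_with_budget(n_tokens: int, max_tokens: int) -> Tuple[List[str], bool]:
    """Consume a fake stream under a budget; return (chunks, was_cut_off)."""
    received: List[str] = []
    try:
        for chunk in enforce_token_budget(fake_llm_stream(n_tokens), max_tokens):
            received.append(chunk)
    except BudgetExceeded:
        return received, True
    return received, False


if __name__ == "__main__":
    chunks, cut_off = run_with_budget(n_tokens=10, max_tokens=3)
    print(chunks, cut_off)
```

This only stops reading on the client side. Whether the provider also stops generating (and billing) when the client disconnects is something I have not verified, which is part of why I am asking about enforcement outside the application.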