Comment by Nizoss

16 hours ago

Exactly! I don’t babysit TDD anymore. I have another agent that does that for me and honestly sometimes catches things I would have missed if I was the babysitting.

Hooks do wonders here. The payload contains a lot of information about the pending action the agent wants to make. Combine that with the most recent n events from the agent’s session history and you have a rich enough context to pass to another agent to validate the action through the SDK.

This way the validation uses the same subscription you’re logged in to, whether you’re using Claude Code, Codex, or Copilot. The validation agent responds with a json format that you can easily parse and return, allowing you to let the action through or block it with direction and guidance. I’m genuinely impressed by how well this works considering how simple it is.

You can find my approach here: https://github.com/nizos/probity