Comment by DeathArrow

5 days ago

First thought: why do we need the external statewright.ai API? Why can't we do everything locally?

Second thought: enforcing tools is useful and I built myself a Pi extension to deny access to particular tools in some workflows.

But we need to somehow force agents to obey the rules.

For example, I have rules when using Pi that ask the main agent to dispatch implementer agents in parallel using git worktrees. Sometimes it uses git worktrees, sometimes not.

The thoughts are like this: "the user asked me to use git worktrees so let me start using git worktrees. But wait, the task is simple so maybe I don't need git worktrees..."

If I ask why it didn't follow the rules, it says something like: "The user is right, I should have followed the rules..."
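A rule like "implementers must run in a git worktree" can also be checked deterministically instead of being left to the prompt. A minimal sketch, assuming the check is fed the text output of `git worktree list --porcelain` (a real git command, which lists the main working tree first). The function names here are hypothetical and are not part of Pi or statewright:

```python
import os


def linked_worktrees(porcelain: str) -> list[str]:
    """Return linked worktree paths from `git worktree list --porcelain` output.

    Each entry starts with a `worktree <path>` line. Git lists the main
    working tree first, so that first entry is skipped.
    """
    paths = [
        line[len("worktree "):]
        for line in porcelain.splitlines()
        if line.startswith("worktree ")
    ]
    return paths[1:]


def is_in_linked_worktree(cwd: str, porcelain: str) -> bool:
    """True if cwd is a linked worktree root or a directory inside one."""
    cwd = os.path.normpath(cwd)
    for path in linked_worktrees(porcelain):
        root = os.path.normpath(path)
        # Compare on a path-separator boundary so task-10 does not match task-1.
        if cwd == root or cwd.startswith(root + os.sep):
            return True
    return False
```

In a real extension the `porcelain` string would come from something like `subprocess.run(["git", "worktree", "list", "--porcelain"], capture_output=True, text=True).stdout`, and a dispatch would be refused when `is_in_linked_worktree` returns False for the implementer's working directory.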

you're hitting the nail on the head... rules in prompts are suggestions the model can rationalize away.

"the task is so simple that maybe I don't need worktrees" is the model overriding your intent with its own judgement, and that's a pattern I'm seeing more and more as these models mature. statewright provides the guardrails: strong suggestions up front, via injection, on what the model can do in state X. if it still tries to outsmart that, the call gets caught in the post hook and the model gets the message ("oh, you're right, I shouldn't do it that way"). instead of you course correcting, it's the state machine.

to your first question: the engine is Apache 2.0 and runs locally. the managed service adds the visual editor, run history and plugin install. the enforcement itself doesn't require the cloud, I run the exact same engine on the backend

the MCP server is just the way to get statewright in the hands of a wide array of existing use cases, claude code included. not all agentic clients are created equal and Pi is actually the experience I want to hone next (also the most extensible)
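To make the injection-plus-post-hook idea above concrete, here is a minimal sketch of a per-state tool gate. It is an illustration only, not statewright's actual API: the class names, state names, tool names, and the exact shape of the hook are all assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class State:
    name: str
    allowed_tools: set[str]
    # Maps an event name to the name of the next state.
    transitions: dict[str, str] = field(default_factory=dict)


class ToolGate:
    """Hypothetical gate: tells the model what is allowed, then enforces it."""

    def __init__(self, states: list[State], start: str):
        self.states = {s.name: s for s in states}
        self.current = start

    def injection(self) -> str:
        """Text injected into the prompt up front (the strong suggestion)."""
        s = self.states[self.current]
        return f"State '{s.name}': you may only call {sorted(s.allowed_tools)}."

    def post_hook(self, tool: str) -> tuple[bool, str]:
        """Check a tool call the model chose; reject it if the state forbids it."""
        s = self.states[self.current]
        if tool in s.allowed_tools:
            return True, "ok"
        # The rejection message goes back to the model as feedback.
        return False, f"'{tool}' is not allowed in state '{s.name}'. {self.injection()}"

    def fire(self, event: str) -> None:
        """Move to the next state; raises KeyError on an undefined event."""
        self.current = self.states[self.current].transitions[event]


# Example workflow: a worktree must exist before any implementation tool runs.
gate = ToolGate(
    [
        State("setup", {"git_worktree_add"}, {"worktree_ready": "implement"}),
        State("implement", {"read_file", "edit_file", "run_tests"}),
    ],
    start="setup",
)
```

With this setup, `gate.post_hook("edit_file")` is rejected while the gate is in `setup`, and the same call is accepted after `gate.fire("worktree_ready")`. The point of the design is that the rule lives in code the model cannot argue with, rather than in prompt text it can rationalize away.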

> For example I have rules when using Pi to ask main agent to dispatch implementer agents in parallel using git worktrees. Sometimes it uses git worktrees, sometimes not.

I've taken the approach that whenever this happens, it's my fault. The instructions were not clear enough, not direct enough, or more often, there's just too many of them.

I'm now at the point where my Pi system prompt + agents + skills + tools starts out at just 7k context. It's all very clear and concise. I almost never have ambiguous responses like this, at least not near the start of a session.

Combined with instructions to keep the main session as a coordinator and use subagents for all non-trivial work, I can get a lot of work done before hitting 100k context and basically never go over 150k.

It's a stark contrast with Claude Code, where I was starting at about 35k context even after trimming my stuff down. It's hardly surprising if an agent doesn't know what to do when you dump 30k+ of context on it, with all kinds of rules and workflows, most of them unrelated to the current task, before you even do anything.

  • you're not wrong and trimming context is legitimately the first thing that everyone should do. even with context trimming and a tight prompt the model still makes judgement calls about which tools to use and when to stop.

    that's fine 90% of the time... the state machine is for the other 10%, where the model's judgement call costs you an hour of debugging later (confidently fixed wrong, or overzealously) or stops a mostly automated thing because it got stuck on the wrong path.