Comment by azurewraith

5 days ago

Hey it's me again. Some things that didn't fit in the README or the original post -- less about features, more about where this goes.

The plan/implement/test workflow is very basic and represents the most common agentic use case. But the state machine pattern applies to any multi-step work where agents are useful but susceptible to death spirals, hallucinations, or other non-deterministic quirkiness. This also enables Claude Desktop and other non-coding agents to perform useful constrained work.

I've been building a content pipeline for tabletop publishing and tested it a bit earlier yesterday. A research phase gathers lore and game details from a compendium, a drafting phase generates structured content including schema-specific JSON validation (so my Lua+LaTeX templates work without iterating). A review gate has me editing content directly (tmux+neovim dialog is great for this). The agent shapes the content, makes sure it conforms to JSON validation and content requirements, then I write it. Before I adapted the state machine to it, the agent tried to do everything all at once — calling multiple agents is sometimes effective but details get lost and you definitely lose visibility in the summarization. The state machine runs everyone serially (for now) but chaining and parallelization are on the roadmap.

While working with statewright on a different workflow over the weekend and Claude (as Claude does) attempted to write an intricate bash script to work around a guardrail... and statewright blocked it! I think that was when I knew there was some real power behind what's been built here. Enforcement has to be structural, not advisory.

Also, being generally useful for things besides coding you can start to think about things like SOC 2 change management. Every change needs a plan, a human review gate, audited implementation, pull request, review, human approval, and then finally a human to approve a production deployment. Today teams enforce this with checklists and hope. An agent constrained by a workflow that won't let it deploy without all the prerequisite pieces is enterprise delivery with an auditable paper trail and humans injected for approvals where they need to be - not managing each change's lifecycle.

The piece I'm most excited about is agent-generated workflows. You solve a problem once and maintain your context, then point the agent at the JSON schema and it creates and uploads a new workflow to statewright automatically that you can use immediately. No fine-tuning, no exhaustive prompt engineering, no dozens of agents... best-fit lightweight guardrails that agents help build themselves, compiling your intent into structure the models can't weasel their way out of. This is a fundamentally different reality than what the current state of the art is practicing. I think that's a big deal.

0 comments

azurewraith

No comments yet

Contribute on Hacker News ↗