Comment by Agent_Builder

7 hours ago

This mirrors what we ran into pretty quickly.

The agent wasn’t failing because it couldn’t write code. It was failing because “code-only” still leaves a lot of implicit authority. Once it’s allowed to reason freely across steps, it starts acting on assumptions that were never explicitly approved.

What helped us was forcing the workflow to be boring. Each step declares what it can touch, what tools it can use, and what kind of output is allowed. When the step ends, that authority disappears.
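
A rough sketch of that idea in Python, purely for illustration. This is not GTWY's API; `StepScope`, `Gate`, `ScopeError`, and the `check_*` methods are hypothetical names, and the glob-based path check is just one assumed way to express "what it can touch". The point it shows is that authority lives only inside the `with` block and is gone once the step ends.

```python
from __future__ import annotations

from contextlib import contextmanager
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class StepScope:
    """Declared authority for one step (all fields are illustrative)."""
    name: str
    writable_paths: tuple[str, ...]  # glob patterns the step may edit
    tools: frozenset[str]            # tool names the step may call
    output_kind: str                 # the only output type allowed, e.g. "diff"


class ScopeError(PermissionError):
    """Raised when an action falls outside the active step's scope."""


class Gate:
    """Holds at most one active scope; nothing is allowed outside a step."""

    def __init__(self) -> None:
        self._active: StepScope | None = None

    @contextmanager
    def step(self, scope: StepScope):
        if self._active is not None:
            raise ScopeError("steps may not nest")
        self._active = scope
        try:
            yield self
        finally:
            # Authority disappears when the step ends, even on error.
            self._active = None

    def _require_scope(self) -> StepScope:
        if self._active is None:
            raise ScopeError("no active step, so nothing is permitted")
        return self._active

    def check_tool(self, tool: str) -> None:
        scope = self._require_scope()
        if tool not in scope.tools:
            raise ScopeError(f"{scope.name}: tool {tool!r} not declared")

    def check_write(self, path: str) -> None:
        scope = self._require_scope()
        if not any(fnmatchcase(path, pat) for pat in scope.writable_paths):
            raise ScopeError(f"{scope.name}: {path!r} is outside scope")

    def check_output(self, kind: str) -> None:
        scope = self._require_scope()
        if kind != scope.output_kind:
            raise ScopeError(f"{scope.name}: output {kind!r} not allowed")
```

In use, the agent runtime would call the matching check before every tool call, file write, or emitted result, for example `with gate.step(scope): gate.check_write("src/parser/lexer.py")`. Anything undeclared raises `ScopeError` instead of quietly going through, which is the "boring" part.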

The agent becomes less clever, but way more predictable. Fewer surprising edits, fewer cascading mistakes.

We ended up using GTWY for this style of step-gated agent work, and it made long-running agents feel manageable instead of fragile.