Comment by azurewraith

6 days ago

Exactly right, and the next step is making it enforceable rather than aspirational. Restrict the plan phase to read-only tools so the agent literally can't edit during planning. Restrict the impl phase to the edit tools the plan identified. Even this basic formalization (same model, same task) yielded a dramatic capability improvement on sub-20B models in my testing.
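
To make the idea concrete, here is a minimal sketch of a phase gate in Python. Everything in it is an assumption for illustration, not something from my actual setup: the tool names (`read_file`, `apply_patch`, etc.), the `PhaseGate` class, and the choice to keep read-only tools available during the impl phase are all hypothetical. The point is only that the allowlist is checked in code before a tool call is dispatched, rather than requested in the prompt.

```python
from dataclasses import dataclass, field

# Hypothetical tool names; substitute whatever your agent framework exposes.
READ_ONLY_TOOLS = frozenset({"read_file", "list_dir", "grep"})
EDIT_TOOLS = frozenset({"write_file", "apply_patch", "delete_file"})


class ToolNotAllowed(Exception):
    """Raised when the agent requests a tool outside the current phase's allowlist."""


@dataclass
class PhaseGate:
    phase: str = "plan"
    planned_edit_tools: frozenset = field(default_factory=frozenset)

    def allowed_tools(self) -> frozenset:
        if self.phase == "plan":
            # Planning: read-only tools, so edits are impossible by construction.
            return READ_ONLY_TOOLS
        # Impl (assumption): read-only tools plus only the edit tools the plan named.
        return READ_ONLY_TOOLS | self.planned_edit_tools

    def start_impl(self, tools_named_in_plan) -> None:
        """Switch to the impl phase, unlocking only the edit tools the plan listed."""
        requested = frozenset(tools_named_in_plan)
        unknown = requested - EDIT_TOOLS
        if unknown:
            raise ValueError(f"plan named unknown edit tools: {sorted(unknown)}")
        self.planned_edit_tools = requested
        self.phase = "impl"

    def check(self, tool_name: str) -> None:
        """Call this before dispatching any tool call the model emits."""
        if tool_name not in self.allowed_tools():
            raise ToolNotAllowed(
                f"{tool_name!r} is not allowed in the {self.phase!r} phase"
            )
```

Usage would look like calling `gate.check(name)` on every tool call the model emits, and calling `gate.start_impl([...])` once the plan is accepted. A rejected call can be fed back to the model as an error message instead of being executed.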