Comment by DavyJone
21 days ago
I think I'm missing something. Don't all harnesses (opencode, pi, etc.) already do stuff like "retry"? As far as I can see, when a tool call fails in either one, the model gets the error back so it can correct itself.
21 days ago
Yes and no.
Harnesses do have retry mechanisms. In opencode in particular, I think they return the error as-is to the model in the next turn. But that's slightly different. Harness retries come mostly in two flavors:
1) provider-layer: HTTP requests to the cloud provider get retried, with or without exponential backoff. That covers you for transient network hiccups or rate limits, and a big Opus model really doesn't need more than that.
2) sort of a hope-and-pray retry: the tool ran, returned an error string of some kind, that string gets fed into the model as-is, and the model is expected to read the error message and self-correct with no guidance. This is fine for frontier models, and even some of the large OSS models. They have the context-following capabilities needed. For smaller models, this won't be enough, not reliably over many turns.
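To make the two flavors concrete, here's a rough Python sketch. It is not opencode's or pi's actual code; `TransientError`, `call_with_backoff`, and `run_tool_passthrough` are names I made up for illustration, and the message shape is just an assumption.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a timeout, rate limit, or 5xx from the provider."""


def call_with_backoff(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Flavor 1: retry the request itself. The model never sees the failure."""
    for attempt in range(max_attempts):
        try:
            return send()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller deal with it
            # Exponential backoff plus a little jitter.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))


def run_tool_passthrough(tool, args, messages):
    """Flavor 2: run the tool and hand any error string to the model unchanged."""
    try:
        result = tool(**args)
    except Exception as exc:
        result = f"error: {exc}"
    # No guidance added: the model has to work out the fix on its own.
    messages.append({"role": "tool", "content": str(result)})
    return messages
```

Note that flavor 2 adds nothing to the error text, which is exactly the part a small model struggles with.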
A few ways it breaks down for small models:
- If the model outputs malformed JSON, the provider will reject it before it even reaches the tool, so the error loop is broken. A rescue parser handles that; it can be ~5-15% of calls on a small model sometimes.
- The model calls the wrong tool with a well-formed call, then proceeds confidently with context that won't help it. Step enforcement can help here.
- The model terminates prematurely, thinking it's done. Prerequisite enforcement can help here (say, forcing the model to call pytest before declaring the feature built).
- Plain error messages don't tell the model what to do, only that it was wrong. Escalating nudge messages that spell out the next step help here: "tool X does not exist, call one of the available tools: A, B, C" is more helpful to a small model than "error: X not found".
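Here's a minimal Python sketch of three of those guardrails. It is not Forge's actual implementation; `rescue_parse`, `nudge`, and `may_finish` are hypothetical names, and the repair rules are deliberately naive.

```python
import json
import re


def rescue_parse(raw):
    """Try to recover a JSON value from slightly malformed model output."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    # Keep only the outermost {...} span, dropping code fences or chatter.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    candidate = raw[start:end + 1]
    # Naive repair: drop trailing commas before a closing brace or bracket.
    # (This could also touch a ",}" inside a string value; fine for a sketch.)
    candidate = re.sub(r",\s*([}\]])", r"\1", candidate)
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None


def nudge(called, available, attempt):
    """Build a corrective message that gets more explicit on repeated failures."""
    tools = ", ".join(sorted(available))
    if attempt <= 1:
        return f"Tool '{called}' does not exist. Call one of the available tools: {tools}."
    return (
        f"Tool '{called}' still does not exist (attempt {attempt}). "
        f"Reply with exactly one tool call, and the tool name must be one of: {tools}."
    )


def may_finish(calls_so_far, required=("pytest",)):
    """Prerequisite check: only allow 'done' once the required tools have been called."""
    missing = [name for name in required if name not in calls_so_far]
    return (not missing), missing
```

For example, `rescue_parse` recovers a fenced tool call with a trailing comma, and `may_finish(["read_file"])` reports that `pytest` is still missing, so the harness can push back instead of accepting "done".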
So, in short: yes, retries exist in harnesses, but they rely on a top-tier model interpreting the error messages. When working with top models, there's likely no real difference, or a minor one (see Opus bare vs Opus reforged). But Forge provides a more hardened suite of guardrails that are effectively necessary for small models.