← Back to context

Comment by zambelli

5 hours ago

Forge does have tiered compaction, and it's configurable! Defaults are currently probably a bit on the high side for catching effective attention, but that might be a part of the code that interests you the most.

src/forge/context/ - specifically TieredCompact in strategies.py. That's the furthest I took it. The tool-call collapse in particular has been useful in agentic coding, but I haven't formalized/generalized it yet. I think within forge it'll be a callable tool that will rely on the model knowing when to trigger it (as you said - "I'm done with the task, can collapse"). That's the part I need to abstract out of my bespoke implementation.

At the moment TieredCompact is naive. It uses context thresholds the consumer determines and fires when those thresholds are hit. It just does different things at different threshold levels.

Your idea of using task shape to dynamically set those thresholds (or even move to model-triggered) I think is the key but is a trickier implementation. That's what I haven't gotten around to yet.

Definitely on my todo list but happy to check out a PR if you have something in mind.

Some additional info on my current public hack is also at: https://github.com/antoinezambelli/forge/blob/main/docs/USER...

  • Honestly probably not a PR from me right now — I'm in the middle of shipping something else — but the design idea I keep returning to is splitting the trigger into two signals:

    1. Runtime-computed "context pressure" — tokens-since-last-compaction, depth of tool-call nesting, response/call ratio in recent turns. The runtime computes this; the model never sees it.

    2. Model-emitted "natural breakpoint" — a tool call the model fires when it perceives it's done with a thread (file closed, task complete, branch abandoned).

    Compaction fires on the AND of both. Keeps the model from compacting mid-reasoning-chain, and keeps the runtime from waiting until 90% context for the model to notice on its own.

The "model triggers it" pattern is exactly the right shape, but there's a subtle failure mode in it: models are notoriously bad at perceiving their own context pressure. Asking "are you done with that thread?" lands well; asking "would compacting now help you?" doesn't, because the model lacks a reliable internal signal for "I'm starting to skim." You almost have to tie the compaction trigger to task-shape signals (file closed, test passed, agent reports a milestone hit) rather than self-assessment.

Going to actually go read TieredCompact tonight — curious whether you've ended up tying triggers to task signals or kept them on model self-report.

  • That's a very insightful observation. How could you explain that using the analogy of a pancake breakfast?