
Comment by tontinton

20 hours ago

Is it similar to rtk, where the output of tool calls is compressed? Or does it actively compress your history once in a while?

If it's the latter, then users will pay uncached rates for the entire token history after the point of the change: https://platform.claude.com/docs/en/build-with-claude/prompt...

How is this better?
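The caching concern above can be sketched as follows. Prompt caching matches on a shared prefix, so any in-place rewrite of earlier messages invalidates everything after the edit point (a simplified illustration with hypothetical message lists; the exact billing rules are in the linked docs):

```python
def cached_prefix_len(previous_prompt, new_prompt):
    # Prompt caching reuses the longest shared prefix: everything
    # after the first differing element is billed at the uncached rate.
    n = 0
    for a, b in zip(previous_prompt, new_prompt):
        if a != b:
            break
        n += 1
    return n

history = ["sys", "msg1", "msg2", "msg3"]
# Compaction rewrites history in place, replacing old messages:
compacted = ["sys", "summary-of-msg1-3"]

assert cached_prefix_len(history, compacted) == 1  # only "sys" stays cached
```

Appending new messages keeps the whole existing history as a cache hit; rewriting it means re-paying for everything past the edit.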

We do both:

We compress tool outputs at each step, so the cache isn't broken during the run. Once we hit 85% of the context window, we preemptively trigger a summarization step and load the summary when the context window actually fills up.
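A minimal sketch of that two-stage approach, with hypothetical names, thresholds, and a character-count token estimate standing in for a real tokenizer and summarization call:

```python
MAX_TOKENS = 200_000
SUMMARIZE_AT = 0.85  # preemptive summarization threshold

def count_tokens(messages):
    # Placeholder: a real agent would use the model's tokenizer.
    return sum(len(m["content"]) // 4 for m in messages)

def compress_tool_output(output, limit=2_000):
    # Stage 1: compress each tool result before appending it, so
    # earlier messages are untouched and the prompt cache stays valid.
    return output if len(output) <= limit else output[:limit] + "\n[truncated]"

def summarize(messages):
    # Placeholder for an LLM call that condenses the history.
    return "[summary of prior steps]"

class Agent:
    def __init__(self):
        self.messages = []
        self.pending_summary = None

    def add_tool_result(self, output):
        self.messages.append({"role": "tool",
                              "content": compress_tool_output(output)})
        used = count_tokens(self.messages)
        if self.pending_summary is None and used >= SUMMARIZE_AT * MAX_TOKENS:
            # Stage 2: summarize early, but keep running on the full
            # history (and its cache) until the window actually fills up.
            self.pending_summary = summarize(self.messages)
        if used >= MAX_TOKENS and self.pending_summary is not None:
            # Window is full: swap in the precomputed summary.
            self.messages = [{"role": "user", "content": self.pending_summary}]
            self.pending_summary = None
```

The point of computing the summary at 85% rather than at 100% is that the run never stalls waiting for compaction, and the cache-breaking swap happens only once, at the last possible moment.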

  • > we preemptively trigger a summarization step and load that when the context-window fills up.

    How does this differ from auto compact? Also, how do you prove that yours is better than using auto compact?