Comment by kuboble

21 hours ago

I wonder what is the business model.

It seems like the tool to solve the problem that won't last longer than couple of months and is something that e.g. claude code can and probably will tackle themselves soon.

Claude code still has /compact taking ages - and it is a relatively easy fix. Doing proactive compression the right way is much tougher. For now, they seem to bet on subagents solving that, which is essentially summarization with Haiku. We don't think it is the way to go, because summarization is lossy + additional generation steps add latency

Don't tools like Claude Code sometimes do something like this already? I've seen it start sub-agents for reading files that just return a summarized answer to a question the main agent asked.

  • There is a nice JetBrains paper showing that summarization "works" as well as observation masking: https://arxiv.org/pdf/2508.21433. In other words, summarization doesn't work well. On top of that, they summarize with the cheapest model (Haiku). Compression is different from summarization in that it doesn't alter preserved pieces of context + it is conditioned on the tool call intent