Comment by rmonvfer

1 month ago

Looks cool but as others have said, it’s really hard to just try all similar projects because all of them promise the same thing but I haven’t seen any of them provide any benchmarks.

Claude Code keeps all the conversation logs stored on-disk right? Why not parse them asynchronously and then use hooks to enrich the context as the conversation goes? (I mean in the most broad and generic way, I guess we’d have to embed them, do some RAG… the whole thing)

1 comment

rmonvfer

christinetyip 1 month ago

Yep, parsing logs + async RAG works fine if you’re staying inside a single tool.

The issue we ran into when building agent systems was portability. Once you want multiple agents or models to share the same evolving context, each tool reconstructing its own memory from transcripts stops scaling.

We’re less focused on “making agents smarter” and more on avoiding fragmentation when context needs to move across agents, tools, or people — for example, using context created in Claude from Codex, or sharing specific parts of that context with a friend or a team.

That’s also why benchmarks are tricky here. The gains tend to show up as less duplication and less state drift rather than a single accuracy metric. What would constitute convincing proof in this space for you?