Comment by westoncb

1 month ago

Interesting that compaction is done using an encrypted message that "preserves the model's latent understanding of the original conversation":

> Since then, the Responses API has evolved to support a special /responses/compact endpoint (opens in a new window) that performs compaction more efficiently. It returns a list of items (opens in a new window) that can be used in place of the previous input to continue the conversation while freeing up the context window. This list includes a special type=compaction item with an opaque encrypted_content item that preserves the model’s latent understanding of the original conversation. Now, Codex automatically uses this endpoint to compact the conversation when the auto_compact_limit (opens in a new window) is exceeded.

19 comments

westoncb

icelancer 1 month ago

Their compaction endpoint is far and away the best in the industry. Claude's has to be dead last.

nubg 1 month ago
Help me understand, how is a compaction endpoint not just a Prompt + json_dump of the message history? I would understand if the prompt was the secret sauce, but you make it sound like there is more to a compaction system than just a clever prompt?
- FuckButtons 1 month ago
  
  They could be operating in latent space entirely maybe? It seems plausible to me that you can just operate on the embedding of the conversation and treat it as an optimization / compression problem.
  
  8 replies →
- Art9681 1 month ago
  
  Their models are specifically trained for their tools. For example the `apply_patch` tool. You would think it's just another file editing tool, but its unique diff format is trained into their models. It also works better than the generic file editing tools implemented in other clients. I can also confirm their compaction is best in class. I've imlemented my own client using their API and gpt-5.2 can work for hours and process millions of input tokens very effectively.
- EnPissant 1 month ago
  
  Maybe it's a model fine tuned for compaction?
kordlessagain 1 month ago

Yes, agree completely.

swalsh 1 month ago

Is it possible to use the compactor endpoint independently? I have my own agent loop i maintain for my domain specific use case. We built a compaction system, but I imagine this is better performance.

__jl__ 1 month ago

Yes you can and I really like it as a feature. But it ties you to OpenAI…
westoncb 1 month ago

I would guess you can if you're using their Responses api for inference within your agent.

jswny 1 month ago

How does this work for other models that aren’t OpenAI models

westoncb 1 month ago

It wouldn’t work for other models if it’s encoded in a latent representation of their own models.