Comment by radicality
1 day ago
Huh, I don’t know what “long context performance” means exactly in these tests. So, completely anecdotally: comparing gpt5.4 via Codex CLI against Claude Code with Opus, gpt5.4 seems to do significantly better in long contexts, I think partly due to some special context compaction stored in encrypted blobs. On long conversations, Opus in Claude Code will lose memory of what we were working on earlier, whereas one of my Codex chats is already at >1B tokens and is still very coherent and remembers things I asked of it at the beginning of the convo.
This isn’t talking about compaction. It refers to performance when the model is loaded with 500k to 1M tokens of actual context.
Ah, thanks, that makes sense. I’ll read more about this.
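For concreteness, here’s a minimal Python sketch of the distinction being drawn above. The names (`compact`, `summarize`, `threshold`) and the token-counting heuristic are hypothetical, not how Codex or Claude Code actually implement this: compaction rewrites old turns into a summary once the context grows too large, whereas a long-context benchmark keeps the full 500k–1M tokens resident in the prompt.

```python
def count_tokens(messages):
    # Rough stand-in for a real tokenizer: ~4 characters per token.
    return sum(len(m["content"]) for m in messages) // 4

def compact(messages, summarize, threshold=100_000):
    """Replace older turns with a summary once the context exceeds a budget.

    `summarize` is a caller-supplied function (e.g. another model call)
    that condenses a list of messages into a short string.
    """
    if count_tokens(messages) <= threshold:
        # Long-context regime: everything stays in the prompt, and the
        # benchmark measures how well the model uses all of it.
        return messages
    # Compaction regime: keep only the 10 most recent turns verbatim
    # and fold everything older into a single summary message.
    head, tail = messages[:-10], messages[-10:]
    summary = {"role": "system", "content": summarize(head)}
    return [summary] + tail
```

So a chat can stay coherent past 1B cumulative tokens via compaction while the model itself never sees more than `threshold` tokens at once; the long-context numbers in these tests measure the uncompacted case instead.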