
Comment by ashirviskas

2 months ago

In practice it's closer to <30k tokens before performance degrades too much for 3.5/3.7; the advertised 200k/64k windows are meaningless in this context.

Is there a benchmark to measure real effective context length?

Sure, gpt-4o has a context window of 128k, but it loses a lot of information from the beginning and middle of the prompt.
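One common way to probe this is a needle-in-a-haystack test: bury a fact at varying depths in filler text of varying length, then check whether the model can still retrieve it. A minimal sketch (all names here are hypothetical; `call_model` stands in for a real API client):

```python
# Needle-in-a-haystack probe for effective context length (sketch).
# `call_model(prompt) -> str` is a stub; swap in a real model API call.

def build_haystack(needle: str, filler: str, total_words: int, depth: float) -> str:
    """Embed `needle` at relative position `depth` (0.0 = start, 1.0 = end)
    inside roughly `total_words` words of repeated filler text."""
    base = filler.split()
    words = (base * (total_words // max(len(base), 1) + 1))[:total_words]
    pos = int(depth * len(words))
    return " ".join(words[:pos] + [needle] + words[pos:])

def probe(call_model, needle: str, answer: str, question: str, lengths, depths):
    """Return (length, depth, hit) tuples: did the model's reply
    contain the expected answer at each haystack size and needle depth?"""
    filler = "the quick brown fox jumps over the lazy dog"
    results = []
    for n in lengths:
        for d in depths:
            prompt = build_haystack(needle, filler, n, d) + "\n\n" + question
            reply = call_model(prompt)
            results.append((n, d, answer in reply))
    return results
```

Plotting the hit rate over (length, depth) gives the familiar heatmap where retrieval from the beginning/middle falls off well before the nominal window size.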