Comment by techsystems
5 months ago
How does context-length scaling at 256K tokens compare to Llama's 1M in terms of performance? How are the contexts handled differently?