Comment by techsystems
16 hours ago
How does the context length scaling at 256K tokens compare to Llama's 1M in terms of performance? How are the contexts treated differently?