Comment by techsystems
2 days ago
How does the context length scaling at 256K tokens compare to Llama's 1M in terms of performance? How are the contexts treated differently?