Comment by Der_Einzige
2 years ago
I hope the long-context models start getting better. Claude 1 and GPT-4-128K both struggle hard once you get past about 32K tokens.
Most of the needle-in-a-haystack papers use tasks that are too simple. They need harder tasks to test whether these long-context models are truly retaining information or not.
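For reference, here is a minimal sketch of the kind of needle-in-a-haystack test being criticized: bury one fact at varying depths in filler text and check retrieval as the context grows. The `query_model` wrapper and the specific needle/filler strings are placeholders, not from any particular paper or API.

```python
# Minimal needle-in-a-haystack sketch. The needle is a fact buried at a
# chosen depth in repeated filler; the comment's claim is that retrieval
# degrades once the context passes roughly 32K tokens.

FILLER = "The sky was grey and the grass was wet. "  # padding sentence
NEEDLE = "The secret passcode is 7-tangerine-42."
QUESTION = "What is the secret passcode? Answer with the passcode only."


def query_model(prompt: str) -> str:
    # Hypothetical stand-in: replace with a real call to whatever
    # long-context model you want to test.
    raise NotImplementedError("plug in your model API here")


def build_prompt(context_chars: int, depth: float) -> str:
    """Bury NEEDLE at a relative depth (0.0 = start, 1.0 = end)
    of a filler document roughly context_chars characters long."""
    n_repeats = context_chars // len(FILLER) + 1
    haystack = (FILLER * n_repeats)[:context_chars]
    insert_at = int(context_chars * depth)
    doc = haystack[:insert_at] + " " + NEEDLE + " " + haystack[insert_at:]
    return doc + "\n\n" + QUESTION


def run_sweep() -> None:
    # Rough estimate: ~4 characters per token.
    for tokens in (8_000, 32_000, 64_000, 128_000):
        for depth in (0.1, 0.5, 0.9):
            answer = query_model(build_prompt(tokens * 4, depth))
            hit = "7-tangerine-42" in answer
            print(f"{tokens:>7} tokens, depth {depth:.1f}: "
                  f"{'PASS' if hit else 'FAIL'}")
```

As the comment notes, this is a pure retrieval task: a single exact-match lookup is much easier than tasks requiring reasoning over or aggregating information spread across the context, which is why passing it doesn't demonstrate that a model genuinely uses its full window.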