Comment by gmueckl
11 hours ago
This comparison is only meaningful between models with comparable parameter counts and context window sizes. Even then, it would mainly test the efficiency and accuracy of the information encoding, which I would argue is the main improvement across all model generations.