Comment by gmueckl
4 hours ago
This comparison is only meaningful between models with comparable parameter counts and context window sizes. Even then, it would mainly test how efficiently and accurately each model encodes information. I would argue that this has been the main improvement across model generations.