Comment by riku_iki
3 days ago
> the real question is why other LLMs didn't do as well in this benchmark.
They do. There is a cycle for each major model:
- a new model (Gemini/ChatGPT/Grok N) is released, which beats all current benchmarks
- some new benchmarks are created
- a new model (Gemini/ChatGPT/Grok N+1) is released, which beats the benchmarks from the previous step