Comment by snemvalts
2 hours ago
What about other benchmarks? Benchmarks where the contents are freely available have become useless for evaluating models.