Comment by chvid
5 months ago
As far as I can tell, the only way to compare two models that cannot be easily gamed is to have both in open-weights form and then run them against a benchmark that was created after both models were.