Comment by tedsanders
4 hours ago
Are you referring to FrontierMath?
We had access to the eval data (since we funded it), but we didn't train on the data or otherwise cheat. We didn't even look at the eval results until after the model had been trained and selected.