Comment by aithrowawaycomm
6 months ago
> The confusion I see here is that people are equating a hold out set to a blind set. That's a set of data to test against that the model developers (and model) cannot see.
What on earth? This is from Tamay Besiroglu at Epoch:
> Regarding training usage: We acknowledge that OpenAI does have access to a large fraction of FrontierMath problems and solutions, with the exception of a unseen-by-OpenAI hold-out set that enables us to independently verify model capabilities. However, we have a verbal agreement that these materials will not be used in model training.
So this "confusion" is because Epoch AI specifically told people it was a blind set! Despite the condescending tone, your comment is just plain wrong.
Your quote literally says hold-out set.
It also literally says "unseen-by-OpenAI".
Curious why you would be responding on an old post. Are you an alt account?
Your comment doesn't contradict what I said.