Comment by aithrowawaycomm

1 year ago

> The confusion I see here is that people are equating a hold out set to a blind set. That's a set of data to test against that the model developers (and model) cannot see.

What on earth? This is from Tamay Besiroglu at Epoch:

> Regarding training usage: We acknowledge that OpenAI does have access to a large fraction of FrontierMath problems and solutions, with the exception of a unseen-by-OpenAI hold-out set that enables us to independently verify model capabilities. However, we have a verbal agreement that these materials will not be used in model training.

So this "confusion" is because Epoch AI specifically told people it was a blind set! Despite the condescending tone, your comment is just plain wrong.

3 comments

EagnaIonat 1 year ago

Your quote literally says hold-out set.

Kbelicius 1 year ago

It also literally says "unseen-by-OpenAI".
- EagnaIonat 1 year ago
  
  Curious why you would be responding on an old post. Are you an alt account?
  Your comment doesn't contradict what I said.