Comment by Racing0461
2 years ago
How do we know the model wasn't pretrained on the evaluations to get higher scores? In general, but especially for profit-seeking corporations, this measure might become a target and stop being meaningful.
Most engineers and researchers at big tech companies wouldn't intentionally do that. The bigger problem is that public evals leak into the training data. You can try to cleanse your training data, but at some point it's inevitable.
Yeah, I'm not saying it was intentional (misleading shareholders would be the worse crime here). The issue is having these things in the training data without knowing, because of how vast the dataset is.
> We filter our evaluation sets from our training corpus.
Page 5 of the report (they mention it again a little later)
https://storage.googleapis.com/deepmind-media/gemini/gemini_...
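For what it's worth, the usual way to do this kind of filtering is n-gram-overlap decontamination: index the n-grams of every eval example and drop any training document that shares one. A minimal sketch in Python, assuming word-level 13-grams (a common choice; the report doesn't specify Gemini's exact method, so the function names and threshold here are illustrative):

```python
def ngrams(text: str, n: int = 13) -> set[tuple[str, ...]]:
    """Return the set of word-level n-grams in a text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def build_eval_index(eval_examples: list[str], n: int = 13) -> set[tuple[str, ...]]:
    """Collect every n-gram appearing in any evaluation example."""
    index: set[tuple[str, ...]] = set()
    for example in eval_examples:
        index |= ngrams(example, n)
    return index


def is_contaminated(document: str, eval_index: set[tuple[str, ...]], n: int = 13) -> bool:
    """A training document counts as contaminated if it shares any n-gram with an eval example."""
    return not ngrams(document, n).isdisjoint(eval_index)


def filter_corpus(corpus: list[str], eval_examples: list[str], n: int = 13) -> list[str]:
    """Drop training documents that overlap with the evaluation set."""
    eval_index = build_eval_index(eval_examples, n)
    return [doc for doc in corpus if not is_contaminated(doc, eval_index, n)]
```

Even with something like this, near-duplicates (paraphrases, translations, reformatted copies) slip through, which is why "at some point it's inevitable" is a fair summary.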