← Back to context

Comment by Racing0461

2 years ago

How do we know the model wans't pretrained on the evaluations to get higher scores? In general but especially for profit seeking corporations, this measure might become a target and become artificial.

Most engineers and researchers at big tech companies wouldn't intentionally do that. The bigger problem is that public evals leak into the training data. You can try to cleanse your training data, but at some point it's inevitable.