← Back to context

Comment by somenameforme

8 hours ago

I'd add another thing here as well. Many take this sort of conspiratorial view like companies training on benchmarks would be some sort of underhanded intent at cheating. In reality, benchmarks also provide a way for companies to easily compare themselves to competitors and work to iteratively improve their own models, so there's a completely non-nefarious motivation to maximize scores on benchmarks.

In the end all it does is affirm what you're saying though. Benchmarks are essentially obsolete the moment they become recognized. I suppose it's just another iteration of Goodhart's Law.