Comment by padolsey

6 months ago

Many of these evals are quite easy to game. The actual evaluation step of benchmarking is often left to a good-faith actor, which was usually reasonable in academic settings less polluted by capital. AI labs, however, are disincentivized from doing a thorough or impartial job, so IMO we should never take their word for it. To verify, we need to be able to run these evals ourselves, which is only sometimes possible: even if the datasets are public, the exact evaluation mechanisms often are not. In the long run, to be fully resilient to gaming via training, we probably need to follow the lead of other fields and have third-party non-profit accredited (!!) evaluators whose entire premise is to evaluate, red-team, and generally keep AI safe and competent.