Comment by kromem

2 years ago

Even if they aren't, there's a separate concern: we're past the inflection point of Goodhart's Law, and this blind focus on a handful of tests evaluating a narrow scope of capabilities is going to lead to model regression in areas that aren't being evaluated or measured as a target.

We're starting off with very broadly capable pretrained models, and then putting them through extensive fine-tuning with only a handful of measurement targets in sight.

The question keeping me up at night over the past six months has been: what aren't we measuring that we might care about down the road, especially as synthetic data starts being used to train future iterations, which means compounding unmeasured capability losses?

I'm starting to suspect the most generally capable models in the future will not be singular fine-tuned models, but pretrained models layered between fine-tuned interfaces that are adept at evaluating and transforming queries and output from chat formats into completion queries for the more generally capable pretrained layer.
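To make the layering concrete, here's a minimal sketch of what that pipeline might look like. All three layers are stub functions I've invented for illustration; in a real system each would be a separate model call (the two `interface_*` layers being small fine-tuned models, `pretrained_complete` being the raw base model):

```python
# Hypothetical three-layer pipeline: fine-tuned interface models on the
# outside, a broadly capable pretrained completion model in the middle.
# Every "model" here is a stub standing in for an LLM invocation.

def interface_to_completion(chat_messages):
    """Fine-tuned front layer: rewrite a chat exchange as a plain
    completion-style prompt suitable for the pretrained model."""
    context = "\n".join(f"{m['role']}: {m['content']}" for m in chat_messages)
    return context + "\nassistant:"

def pretrained_complete(prompt):
    """Pretrained middle layer: raw next-token completion (stubbed --
    a real base model would continue the text here)."""
    return " [continuation of: " + prompt.splitlines()[-1].strip() + "]"

def interface_to_chat(raw_completion):
    """Fine-tuned back layer: clean the raw completion up into a
    chat-formatted reply for the user."""
    return {"role": "assistant", "content": raw_completion.strip()}

def answer(chat_messages):
    """Compose the three layers: chat -> completion prompt -> raw
    completion -> chat reply."""
    prompt = interface_to_completion(chat_messages)
    raw = pretrained_complete(prompt)
    return interface_to_chat(raw)
```

The design point is that the broadly capable pretrained layer is never itself fine-tuned (so it keeps its unmeasured capabilities), while the cheap interface layers absorb the chat-formatting burden that fine-tuning normally handles.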