
Comment by kazinator

6 hours ago

Even a tiny model for, say, classifying hand-written digits will correctly classify digits that didn't appear in its training data. (Otherwise it wouldn't be very useful.) That classification is interpolative; the hand-written digit lands in the space of the training data.
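
A minimal sketch of that point (my own illustration, not from the comment, assuming scikit-learn's bundled digits dataset and a plain logistic-regression classifier): train a small model on one split of the images and score it on held-out images it never saw.

```python
# Sketch: a tiny model trained on some hand-written digit images still
# classifies held-out images it never saw, because those images fall
# inside the space spanned by the training data.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits = load_digits()  # 8x8 grayscale digit images, labels 0-9
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.5, random_state=0
)

clf = LogisticRegression(max_iter=2000)  # deliberately small model
clf.fit(X_train, y_train)

# The test images were never part of training, yet accuracy is high:
# the classification is interpolative within the training distribution.
print("held-out accuracy:", clf.score(X_test, y_test))
```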

Every result is explainable as having come from training data. That's the null hypothesis.

The alternative hypothesis is that it's not explainable as having come from training data. That's a hard-to-believe, hard-to-prove negative.

You don't get anything out of any computational process that you didn't put in.

You actually do not classify digits that didn't appear; you classify different pictures of digits that DID appear.

Similarly, LLMs do not invent a new way of reasoning about problems or language. They do, however, apply existing ways of reasoning to unseen problems.

LLMs are one level of abstraction up, but it's a very interesting level of abstraction.

  • >you classify different pictures of digits that DID appear.

    Are you implying models that classify hand-written digits don’t generalize and only work on training data?