Comment by zwaps

1 day ago

You actually do not classify digits that didn't appear, you classify different pictures of digits that DID appear.

Similarly, LLMs do not invent a new way of reasoning about problems or language. They do, however, apply those existing ways of reasoning to unseen problems.

LLMs are one level of abstraction up, but it's a very interesting level of abstraction.

>you classify different pictures of digits that DID appear.

Are you implying models that classify hand-written digits don’t generalize and only work on training data?

No, that is false; a neural net trained on a decent set of handwritten digits will recognize a newly handwritten digit.

I'm saying that this is a strawman version of "not in the training data". The newly handwritten digit is squarely the same sort of stuff that is in the training data: an interpolation.

We are not surprised when we fit a curve to a bunch of points and then find that the curve passes through points that are not exactly any of those points, but lie among them.

Go too far outside of the cluster of points though and the curve is a hallucination.

This is the intuition behind interpolate vs extrapolate.
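That intuition is easy to demonstrate. A minimal sketch (my own hypothetical example, not from the comment above): fit a polynomial to points sampled from a known function, then compare the error at a point inside the training range with the error at a point far outside it.

```python
import numpy as np

# Sample 50 training points from sin(x) on [0, 2*pi].
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 2 * np.pi, 50)
y_train = np.sin(x_train)

# Fit a degree-5 polynomial (least squares) to the points.
coeffs = np.polyfit(x_train, y_train, deg=5)
poly = np.poly1d(coeffs)

# Interpolation: a point located among the training points.
x_in = np.pi / 3
err_in = abs(poly(x_in) - np.sin(x_in))

# Extrapolation: a point far outside the cluster of points.
x_out = 4 * np.pi
err_out = abs(poly(x_out) - np.sin(x_out))

print(f"interpolation error: {err_in:.4f}")
print(f"extrapolation error: {err_out:.4f}")
```

Inside the training range the fit tracks sin(x) closely; at 4π the polynomial's leading term dominates and the "curve" bears no relation to the function it was fit to.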