Comment by ricardobeat

7 hours ago

That’s what has been seen in practice though. SOTA LLMs have been shown again and again to solve problems unseen in their data set; and despite their shortcomings they have become extremely useful for a wide variety of tasks.


kazinator 3 hours ago

Even a tiny model for, say, classifying hand-written digits will correctly classify digits that didn't appear in its training data. (Otherwise it wouldn't be very useful.) That classification is interpolative; the hand-written digit lands in the space of the training data.
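To make the interpolation point concrete, here is a rough toy sketch, not a real digit model: the 5x3 bitmaps, the helper names, and the use of a 1-nearest-neighbour lookup in place of a trained network are all assumptions made for illustration. The held-out images never appear verbatim in the training set, yet each one sits close to training examples of its own class, so it is classified correctly.

```python
# Hypothetical 5x3 bitmaps (row-major, 15 pixels) standing in for digit images.
TEMPLATES = {
    0: "111101101101111",
    1: "010110010010111",
    7: "111001010100100",
}

def flip(bits, i):
    """Return a copy of the bitmap with pixel i inverted (a 'noisy' variant)."""
    flipped = "1" if bits[i] == "0" else "0"
    return bits[:i] + flipped + bits[i + 1:]

def hamming(a, b):
    """Number of pixels that differ between two bitmaps."""
    return sum(x != y for x, y in zip(a, b))

def make_split():
    """Variants with an even flipped pixel go to train, odd ones to test."""
    train, test = [], []
    for label, bits in TEMPLATES.items():
        for i in range(len(bits)):
            target = train if i % 2 == 0 else test
            target.append((flip(bits, i), label))
    return train, test

def classify(train, image):
    """1-nearest-neighbour: return the label of the closest training image."""
    nearest = min(train, key=lambda example: hamming(example[0], image))
    return nearest[1]

train, test = make_split()
train_images = {image for image, _ in train}

# No test image is present in the training set...
unseen = all(image not in train_images for image, _ in test)
# ...but every test image is within 2 pixels of a same-class training image.
accuracy = sum(classify(train, image) == label for image, label in test) / len(test)

print(unseen, accuracy)
```

This only shows that "not in the training set" and "inside the space the training set covers" are different things; it says nothing either way about what LLMs do.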

Every result is explainable as having come from training data. That's the null hypothesis.

The alternative hypothesis is that it's not explainable as having come from training data. That's a hard-to-believe, hard-to-prove negative.

You don't get anything out of any computational process that you didn't put in.

zwaps 1 hour ago

You actually do not classify digits that didn't appear, you classify different pictures of digits that DID appear.
Similarly, LLMs do not invent a new way of reasoning about problems or language. They do, however, apply these to unseen problems.
LLMs are one level of abstraction up, but it's a very interesting level of abstraction.

loosetypes 6 hours ago

Mind linking any examples (or categories) of problems that are definitively not in pre training data but can still be solved by LLMs? Preferably something factual rather than creative, genuinely curious.

Dumb question but anything like this that’s written about on the internet will ultimately end up as training fodder, no?

dcre 2 hours ago
How about the International Math Olympiad?
https://arstechnica.com/ai/2025/07/google-deepmind-earns-gol...
- mvieira38 2 hours ago
  
  You're saying they don't use math textbooks and math forums to train LLMs, then?
  

boxed 2 hours ago

> SOTA LLMs have been shown again and again to solve problems unseen in their data set

We have no idea what the training data is though, so you can't say that.

> and despite their shortcomings they have become extremely useful for a wide variety of tasks.

That seems like a separate question.

zwaps 1 hour ago

I have applied O3 pro to abandoned research of mine that was never published and lives in an intersection that is as entirely novel as it is uninteresting.
O3 pro (but not O3) was successfully able to apply reasoning and math to this domain in interesting ways, much like an expert researcher in these areas would.
Again, the field and the problem are, with 100% certainty, OOD relative to the training data.
However, the techniques and reasoning methods are of course learned from data. But that's the point, right?