Comment by belter
9 days ago
But have we established that LLMs don't just interpolate, and that they can actually create?
Are we able to prove it with output that is
1) algorithmically novel (not just a recombination),
2) coherent, and
3) not explainable by training data coverage?
No handwaving with scale...
Why is that the bar, though? Imagine an LLM as a kid with a box of Lego containing a hundred million blocks, who can assemble those blocks into any configuration possible. The kid doesn't have access to ABS plastic pellets and a molding machine, so they can't make new pieces; does that really mean the kid just interpolates and can't create?
Actually, yes... If the kid spends their whole life in the box and never invents a new block, that's just combinatorics. We don't call a chess engine 'creative' for finding novel moves, because we understand the rules. LLMs have rules too; they're called weights.
I want LLMs to create, but so far every creative output I've seen is just a clever remix of training data. The most advanced models still fail a simple test: restrict the domain, for example, "invent a cookie recipe with no flour, sugar, or eggs" or "name a company without using real words". Suddenly their creativity collapses into either nonsense (violating the constraints) or trivial recombination: ChocoNutBake instead of NutellaCookie.
If LLMs could actually create, we’d see emergent novelty, outputs that couldn’t exist in the training data. Instead, we get constrained interpolation.
Happy to be proven wrong. Would like to see examples where an LLM output is impossible to map back to its training data.
The combinatorics of choosing 500 pieces (words) out of a bag of 1.8 billion pieces (roughly the parameters per layer for GPT-3: ~175B across 96 layers), with replacement and where order matters, works out to something like 10^4600. Maybe you can't call that creativity, but you've got to admit that's a pretty big number.
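For what it's worth, here's a quick back-of-the-envelope check of that figure (a sketch in Python, assuming the ~1.8-billion-pieces reading, i.e. 175B GPT-3 parameters spread over 96 layers):

    from math import log10

    bag_size = 1.8e9   # assumed: ~175B GPT-3 parameters / 96 layers
    picks = 500        # pieces (words) drawn, with replacement, order matters

    # Ordered selections with replacement: bag_size ** picks.
    # Work in log10 space, since the number itself won't fit in a float.
    exponent = picks * log10(bag_size)
    print(f"about 10^{exponent:.0f} possible sequences")  # -> about 10^4628

That lands right around the 10^4600 quoted above; with 1.8 million instead of 1.8 billion you'd only get about 10^3128, which is why the bag size reads as a typo.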