Comment by K0balt

12 hours ago

It is text prediction. But to predict text, other things have to be calculated along the way. If you can step back for just a minute, I can offer a very simple, adjacent idea that might help you intuit the complexity of “text prediction”.

I have a list of symbols: the digits 0 to 9 and the + and = operators. I will train my model on this dataset, except the model won’t get the list itself; it will get a bunch of addition problems. A lot of them. But not every possible addition problem inside that space will be represented, not by a long shot, and neither will every number. Still, the model will be able to solve any addition problem you can form with those symbols.

It’s just predicting symbols, but to do so it had to internalize the concepts.
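
To make the thought experiment concrete, here is a minimal sketch. It is not a language model: it assumes a two-weight linear model over numeric operands, which bakes in a lot of structure that a token predictor would have to discover on its own. The training split (only problems with two even operands) and the names `fit_linear`, `predict`, and `lookup` are hypothetical choices for illustration. It only shows the difference between memorizing examples and fitting a rule that carries over to problems never seen in training.

```python
from itertools import product

# All 100 single-digit addition problems "a+b=".
all_problems = list(product(range(10), repeat=2))

# Hypothetical training split: only problems whose operands are both even,
# so odd digits never appear as operands during training.
train = [(a, b) for a, b in all_problems if a % 2 == 0 and b % 2 == 0]
held_out = [p for p in all_problems if p not in train]

# Baseline: pure memorization of the training examples.
lookup = {(a, b): a + b for a, b in train}


def fit_linear(examples):
    """Least-squares fit of y = w1*a + w2*b via the 2x2 normal equations."""
    saa = sum(a * a for a, b in examples)
    sab = sum(a * b for a, b in examples)
    sbb = sum(b * b for a, b in examples)
    say = sum(a * (a + b) for a, b in examples)
    sby = sum(b * (a + b) for a, b in examples)
    det = saa * sbb - sab * sab
    # Cramer's rule for the two weights.
    w1 = (say * sbb - sab * sby) / det
    w2 = (saa * sby - sab * say) / det
    return w1, w2


w1, w2 = fit_linear(train)


def predict(a, b):
    """Apply the fitted rule and round to the nearest integer answer."""
    return round(w1 * a + w2 * b)


# Score both approaches on problems that were never in the training set.
memorized_hits = sum((a, b) in lookup for a, b in held_out)
learned_hits = sum(predict(a, b) == a + b for a, b in held_out)

print(f"held-out problems: {len(held_out)}")
print(f"lookup table solves: {memorized_hits}")
print(f"fitted rule solves: {learned_hits}")
```

Under these assumptions the lookup table solves none of the 75 held-out problems, while the fitted rule solves all of them. Whether a real LLM generalizes this cleanly is a separate empirical question that this toy does not answer.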

2 comments

qsera 8 hours ago

>internalize the concepts.

This gives the impression that it is doing something more than pattern matching. I think this kind of communication, where a human attribute is used to name a concept in the LLM domain, is causing a lot of damage, and ends up inadvertently inflating the AI marketing hype...

K0balt 1 hour ago

Except I actually do mean that it infers the concept of addition from examples. LLMs are amply capable of applying concepts to data matching patterns that were never expressed in the training data. It’s called inference for a reason.
Anthropomorphic descriptions are the most expressive because LLMs trained on human cultural output intrinsically mimic human behaviours. Other terminology is not nearly as expressive when describing LLM output.
“Pattern matching” says the same thing as “text prediction”. While technically truthy, it fails to convey the external effect. Anthropomorphic terms, while less truthy overall, do manage to convey the external effect. They do unfortunately imply an internal cause that does not follow, but the externalities are what matter in most non-philosophical contexts.