Comment by hackinthebochs

LLMs aren't merely language models; they're a general-purpose computing paradigm. LLMs are circuit builders: the converged parameters define pathways through the architecture that pick out specific programs. Or, as Karpathy puts it, an LLM is a differentiable computer[1]. Training an LLM discovers programs that reproduce the input sequences well. Roughly the same architecture can generate passable images, music, or even video.
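
To make the "discovers programs that reproduce the input sequences" point concrete, here's a minimal sketch of the next-token prediction objective in PyTorch. Everything here (the TinyLM name, the GRU standing in for a transformer block, the toy hyperparameters) is an illustrative assumption, not anything from Karpathy's post; the point is just that gradient descent tunes the weights so that, given a prefix, the network assigns high probability to the actual next token:

    # Illustrative sketch: next-token prediction as "program discovery".
    # TinyLM and all hyperparameters are hypothetical; a GRU stands in
    # for a transformer block to keep the example short.
    import torch
    import torch.nn as nn

    class TinyLM(nn.Module):
        def __init__(self, vocab_size=256, d_model=64):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            self.rnn = nn.GRU(d_model, d_model, batch_first=True)
            self.head = nn.Linear(d_model, vocab_size)

        def forward(self, tokens):
            h, _ = self.rnn(self.embed(tokens))
            return self.head(h)  # logits for the next token at each position

    model = TinyLM()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    seq = torch.randint(0, 256, (1, 32))  # a toy "training text"

    for _ in range(100):
        logits = model(seq[:, :-1])  # predict token t+1 from tokens up to t
        loss = nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), seq[:, 1:].reshape(-1)
        )
        opt.zero_grad()
        loss.backward()  # gradients flow through the whole "differentiable computer"
        opt.step()

The trained weights are the "program": the same generic architecture, with different converged parameters, computes a different function over sequences.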

The claim isn't that language generation is all there is to AGI, but that to sufficiently model text about the wide range of human experiences, the model must capture those experiences themselves. LLMs model the world to varying degrees, and perhaps, in the limit of unbounded training data, they could model a human's perspective within it as well.

[1] https://x.com/karpathy/status/1582807367988654081