Comment by coldtea

2 days ago

>Except in cases where the training data is more wrong than correct (e.g. niche expertise where the vox pop is wrong)

The same goes for human knowledge, though. Learn from society/school/etc. that X is Y, and you'll repeat that X is Y, even if it isn't.

>However, an LLM no more deals in Q&A than in facts. It only typically replies to a question with an answer because that itself is statistically most likely, and the words of the answer are just selected one at a time in normal LLM fashion.

And how is that different from how we build up an answer? Do we have a "correct facts" repository with fixed answers to every possible question, or do we just assemble answers from a weighted-graph (or holographic) store of factoids and memories built up from our training data, with our answers also being non-deterministic?
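
To make the "one word at a time" point concrete, here's a minimal sketch of that auto-regressive sampling loop. The model is a toy hand-written bigram table, not a real LLM, and every name in it is made up; the point is just the shape of the loop, and that sampling (rather than always taking the top word) is what makes answers non-deterministic:

    # Toy auto-regressive loop: sample one word, append it, repeat.
    import random

    # Hypothetical next-word distribution: P(next word | previous word).
    NEXT_WORD = {
        "<q>":    [("the", 0.6), ("an", 0.4)],
        "the":    [("answer", 1.0)],
        "an":     [("answer", 1.0)],
        "answer": [("is", 1.0)],
        "is":     [("42", 0.7), ("blue", 0.3)],
        "42":     [("<end>", 1.0)],
        "blue":   [("<end>", 1.0)],
    }

    def sample_next(word: str) -> str:
        """Pick a next word with probability proportional to its weight."""
        words, weights = zip(*NEXT_WORD[word])
        return random.choices(words, weights=weights, k=1)[0]

    def generate(max_len: int = 10) -> list[str]:
        out, word = [], "<q>"
        while len(out) < max_len:
            word = sample_next(word)
            if word == "<end>":
                break
            out.append(word)
        return out

    # Two runs of the same "question" can disagree:
    print(generate())  # e.g. ['the', 'answer', 'is', '42']
    print(generate())  # e.g. ['an', 'answer', 'is', 'blue']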

We likely learn/generate language in an auto-regressive way that is at least conceptually similar to an LLM, but in humans this isn't just self-contained auto-regressive generation...

Humans use language to express something (facts, thoughts, etc.), so you can consider the thoughts being expressed as a bias on the language-generation process, similar perhaps to an image being used as a bias on the captioning part of an image-captioning model, or language as a bias on an image-generation model.
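
As a loose illustration of "thoughts as a bias" (all names invented; the additive bias below is a deliberate oversimplification of the cross-attention a real captioning model would use to inject the image):

    # Sketch: an external conditioning vector (image embedding, or a
    # "thought") additively steering a toy next-token scorer.
    import numpy as np

    rng = np.random.default_rng(0)
    VOCAB, DIM = 50, 8          # toy vocabulary size, conditioning size

    W_tokens = rng.normal(size=VOCAB)        # context-only preferences
    W_cond = rng.normal(size=(VOCAB, DIM))   # how the condition shifts them

    def next_token_logits(condition=None):
        """Unconditioned logits, plus an additive bias from the condition."""
        logits = W_tokens.copy()
        if condition is not None:
            logits += W_cond @ condition     # external signal steers the distribution
        return logits

    thought = rng.normal(size=DIM)             # stand-in for "what I want to say"
    print(next_token_logits().argmax())        # free-running pick
    print(next_token_logits(thought).argmax()) # steered pick; typically different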

  • >Humans use language to express something (facts, thoughts, etc.), so you can consider the thoughts being expressed as a bias on the language-generation process

    My point, however, is more that the "thoughts being expressed" are themselves generated by a similar process (and that it's either that or a God-given soul).

    • Similar in the sense of being mechanical (no homunculus or soul!) and predictive, but different in terms of what's being predicted (auto-regressive vs external).

      So, with the LLM all you have is the auto-regressive language prediction loop.

      With animals you primarily have the external "what happens next" prediction loop, with these external-world, fact-based predictions presumably also being the basis of their thoughts (planning/reasoning) as well as their behavior.

      If it's a human animal who has learned language, then you additionally have an LLM-like auto-regressive language prediction loop, but one that, unlike the LLM's, is biased (controlled) by these fact-based thoughts (as well as by language-based thoughts).
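
      A toy sketch of the two loops being contrasted (everything here is a placeholder; only the difference in interfaces is the point):

          # Pure LLM loop vs. the same loop biased by a "world model".
          import random

          def lm_next_token(tokens, bias=None):
              """Stand-in language model: picks a word, nudged by an optional bias."""
              vocab = ["sun", "rain", "umbrella", "later", "maybe"]
              return bias if bias in vocab else random.choice(vocab)

          def world_next(state):
              """Stand-in world model: predicts what happens next externally."""
              return {"sun": "rain", "rain": "sun"}.get(state, "sun")

          # LLM: self-contained, tokens conditioned only on prior tokens.
          tokens = ["sun"]
          for _ in range(4):
              tokens.append(lm_next_token(tokens))

          # Human-like: the same language loop, steered by world predictions.
          state, utterance = "sun", ["sun"]
          for _ in range(4):
              state = world_next(state)   # external "what happens next"
              utterance.append(lm_next_token(utterance, bias=state))

          print(tokens)     # free-running babble
          print(utterance)  # ['sun', 'rain', 'sun', 'rain', 'sun']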