Comment by dilap
10 hours ago
The fundamental nature of the model is that it consumes tokens as input and produces token probabilities as output, but there's nothing inherently "predictive" about it -- that's just perspective hangover from the historical development of how LLMs were trained. It is, fundamentally, I think, a general-purpose thinking machine, operating over the inputs and outputs of tokens.
(With this perspective, I can feel my own brain subtly oferring up a panoply of possible responses in a similar way. I can even turn up the temperature on my own brain, making it more likely to decide to say the less-obvious words in response, by having a drink or two.)
(Similarly, mimicry is in humans too a very good learning technique to get started -- kids learning to speak are little parrots, artists just starting out will often copy existing works, etc. Before going on to develop further into their own style.)
No comments yet
Contribute on Hacker News ↗