Comment by astrange
10 days ago
The difference is that an LLM has exponentially more states than an n-gram model. It's really not the same thing at all. An LLM can perform nearly arbitrary computation inside its fixed-size memory.
https://arxiv.org/abs/2106.06981
(An LLM with tool use isn't a Markov process at all, of course.)
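
To put rough numbers on the state-count gap, here is a back-of-the-envelope sketch. It assumes the "state" of each model is the token context it conditions on, and it counts only full-length contexts. The vocabulary size, n-gram order, and context length are illustrative assumptions, not figures from the linked paper.

```python
import math


def log10_context_count(vocab_size: int, context_tokens: int) -> float:
    """Return log10 of vocab_size ** context_tokens.

    This is the number of distinct token sequences of exactly
    `context_tokens` tokens, i.e. the number of states if the model is
    viewed as a Markov chain over its conditioning context.
    """
    return context_tokens * math.log10(vocab_size)


# All three values below are assumptions chosen for illustration.
VOCAB_SIZE = 50_000    # hypothetical tokenizer vocabulary
NGRAM_ORDER = 5        # a 5-gram model conditions on the previous 4 tokens
CONTEXT_LENGTH = 8_192 # hypothetical LLM context window, in tokens

ngram_log10 = log10_context_count(VOCAB_SIZE, NGRAM_ORDER - 1)
llm_log10 = log10_context_count(VOCAB_SIZE, CONTEXT_LENGTH)

print(f"5-gram states:      about 10^{ngram_log10:.0f}")
print(f"LLM context states: about 10^{llm_log10:.0f}")
```

The exponent grows linearly with context length, so the state count itself grows exponentially. This only counts states; it says nothing about what computation happens within them, which is the subject of the linked paper.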