Comment by PheonixPharts
2 years ago
LLMs are Markov chains in latent space. It's the latent representation that gives them their power, but ultimately there's not as much difference as one would suspect.
They're different because Markov models are stateless whereas LLMs are stateful.
https://en.wikipedia.org/wiki/Markov_property
Current LLMs are stateless as far as we know: when computing a new token, their only state is the preceding text tokens. They don't store any metadata or save state from previous calculations.
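To make that concrete, here is a minimal sketch, assuming a toy stand-in for the model. The names `next_token_distribution`, `step`, `generate`, `VOCAB` and `CONTEXT_WINDOW` are all made up for illustration and don't come from any real library. The only point is that the next-token distribution is a pure function of the tokens currently in the window, so the window itself is the Markov state.

```python
import random

VOCAB = ["a", "b", "c"]   # hypothetical toy vocabulary
CONTEXT_WINDOW = 3        # hypothetical context length


def next_token_distribution(context):
    """Toy stand-in for a forward pass: depends only on the context tokens."""
    total = sum((i + 1) * ord(tok) for i, tok in enumerate(context))
    weights = [1 + (total + 7 * j) % 5 for j in range(len(VOCAB))]
    norm = sum(weights)
    return [w / norm for w in weights]


def step(state, rng):
    """One Markov transition: sample a token, then slide the window."""
    probs = next_token_distribution(state)
    token = rng.choices(VOCAB, weights=probs, k=1)[0]
    return (state + (token,))[-CONTEXT_WINDOW:]


def generate(prompt, n_steps, seed=0):
    """Generate tokens; nothing is carried between steps except the window."""
    rng = random.Random(seed)
    state = tuple(prompt)[-CONTEXT_WINDOW:]
    out = list(prompt)
    for _ in range(n_steps):
        state = step(state, rng)
        out.append(state[-1])
    return out


# Two different histories that end in the same window give the same distribution.
long_history = ("c", "a", "b", "c")
short_history = ("a", "b", "c")
same = (next_token_distribution(long_history[-CONTEXT_WINDOW:])
        == next_token_distribution(short_history))
print(same)
print(generate("ab", 5))
```

In this toy, anything that fell out of the window has no influence on what comes next, which is the Markov property from the Wikipedia link above, just with a state made of several tokens instead of one.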