Comment by kromem

2 years ago

It would be more accurate to say that a Markov chain is an example of a method that would perform relatively well at the same training task as an LLM.

So too might a human trying to predict the next tokens.

But a human and a Markov chain do not use the same underlying process to achieve next-token prediction, and neither uses the same underlying process as an LLM.

LLMs are Markov chains; a Markov chain is a general concept, not just a text-modeling technique. You may be thinking of the very simple Markov chain models we had before, where you predicted the next word by looking up sentences with the same preceding words and picking one of the following words at random. That is also a Markov chain, just like an LLM, only a much simpler one. You're right that LLMs aren't like that, but they are still Markov chains, with the same kinds of inputs and outputs as the old ones.
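For the sake of concreteness, here is a minimal sketch of the "very simple" kind of Markov chain text model described above: record which words followed each word in a training text, then generate by repeatedly picking a random observed successor. The corpus and the order-1 (single-word) context are illustrative choices, not anything specific from the thread.

```python
import random
from collections import defaultdict

def train_bigram_chain(text):
    """Map each word to the list of words observed to follow it."""
    words = text.split()
    chain = defaultdict(list)
    for prev, nxt in zip(words, words[1:]):
        chain[prev].append(nxt)
    return chain

def generate(chain, start, length, seed=0):
    """Walk the chain: the next word depends only on the current word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        candidates = chain.get(out[-1])
        if not candidates:
            break  # dead end: current word never appeared mid-text
        out.append(rng.choice(candidates))
    return " ".join(out)

corpus = "the cat sat on the mat and the dog sat on the rug"
chain = train_bigram_chain(corpus)
print(generate(chain, "the", 6))
```

An LLM's "state" is the whole context window rather than one word, and its transition probabilities come from a learned network rather than a lookup table, but the input/output shape is the same: current state in, distribution over next tokens out.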