Comment by _Algernon_

3 days ago

LLMs aren't Markov chains unless they have a context window of 1.

>In probability theory and statistics, a Markov chain or Markov process is a stochastic process describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event

The tricky thing is that you get to define the state. So if the "state" is the current word _and_ the previous 10 words, it is still "memoryless". So an LLM's context window is the state. It doesn't matter whether _we_ call parts of that state "history"; the Markov chain doesn't care (they are all just features of the state).
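
To make that concrete, here's a toy sketch (my own illustration, not from the thread) of an order-k Markov chain where the state is defined as the tuple of the last k tokens. The next-token distribution depends only on that tuple, so it's memoryless over states even though each state encodes a chunk of history; an LLM is the analogous thing with k equal to the context length and a learned transition function instead of counts:

```python
import random
from collections import defaultdict

def build_chain(tokens, k=2):
    """Count next-token frequencies conditioned on the last k tokens."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(tokens) - k):
        state = tuple(tokens[i:i + k])   # the "state" is the last k tokens
        nxt = tokens[i + k]
        counts[state][nxt] += 1
    return counts

def sample_next(counts, state):
    """Sample the next token; it depends only on `state`, nothing earlier."""
    toks, weights = zip(*counts[state].items())
    return random.choices(toks, weights=weights)[0]

tokens = "the cat sat on the mat and the cat slept on the mat".split()
chain = build_chain(tokens, k=2)
print(sample_next(chain, ("the", "cat")))  # "sat" or "slept", chosen from counts
```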

Edit: I could be missing important nuance that other people are pointing out in this thread!