Comment by lelanthran

3 days ago

> Markov models usually only predict the next token given the two preceding tokens (trigram model) because the data gets so exceptionally sparse beyond that

Of course. That's because a Markov chain is a probability along a single dimension, with the chain length running along that one dimension, while LLMs and NNs use multiple dimensions (they are meshed, not chained).
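
To make the sparsity point from the quote concrete, here is a rough toy sketch, not anyone's actual implementation. It builds a plain chain model of a given order (order 2 is the trigram case: two preceding tokens predict the next) and measures what fraction of all possible contexts actually show up in the data. The helper names `build_ngram_model` and `context_coverage` and the tiny corpus are made up for illustration.

```python
from collections import Counter, defaultdict

def build_ngram_model(tokens, order):
    """Map each context of `order` preceding tokens to a Counter of next tokens."""
    model = defaultdict(Counter)
    for i in range(len(tokens) - order):
        context = tuple(tokens[i:i + order])
        model[context][tokens[i + order]] += 1
    return model

def context_coverage(tokens, order):
    """Fraction of all possible contexts (vocab_size ** order) seen at least once."""
    vocab_size = len(set(tokens))
    model = build_ngram_model(tokens, order)
    return len(model) / vocab_size ** order

# Hypothetical toy corpus; a real corpus has a far larger vocabulary,
# so the number of possible contexts grows much faster than shown here.
toy_corpus = "the cat sat on the mat and the dog sat on the log".split()

for order in (1, 2, 3):
    # order=2 corresponds to the trigram model mentioned in the quote
    print(order, context_coverage(toy_corpus, order))
```

Coverage falls off quickly as the order goes up, because the number of possible contexts is `vocab_size ** order` while the number of observed contexts is capped by the corpus length. That is the sparsity wall the quote is describing for a single chain.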

I really want to know what the result would look like with a few more dimensions, giving a Markov-mesh-type structure rather than a chain structure.