Comment by thaumasiotes
11 days ago
>> The fact that they only generate sequences that existed in the source
> I am quite confused right now. Could you please help me with this?
This is pretty straightforward. Sohcahtoa82 doesn't know what he's saying.
I'm fully open to being corrected. Just telling me I'm wrong without elaborating does absolutely nothing to foster understanding and learning.
If you still think there's something left to explain, I recommend you read your other responses. Being restricted to the training data is not a property of Markov output. You'd have to be very, very badly confused to think that it was. (And it should be noted that a Markov chain itself doesn't contain any training data, as is also true of an LLM.)
More generally, since an LLM is a Markov chain, it doesn't make sense to try to answer the question "what's the difference between an LLM and a Markov chain?" Here, the question is "what's the difference between a tiny LLM and a Markov chain?", and assuming "tiny" refers to window size, and the Markov chain has a similarly tiny window size, they are the same thing.
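To make this concrete, here's a minimal sketch of an order-k word-level Markov chain (the toy corpus, window size, and seed are just illustrative assumptions). Note that the text it generates need not appear anywhere in the training text:

    import random
    from collections import defaultdict

    corpus = ("the cat sat on the mat . the dog sat on the rug . "
              "the cat ran to the rug .").split()
    k = 2  # "window size": the next word depends only on the previous k words

    # Record every (k-gram -> next word) transition seen in the corpus.
    table = defaultdict(list)
    for i in range(len(corpus) - k):
        table[tuple(corpus[i:i + k])].append(corpus[i + k])

    rng = random.Random(0)
    state = tuple(corpus[:k])   # start from "the cat"
    out = list(state)
    for _ in range(10):
        nexts = table.get(state)
        if not nexts:
            break
        out.append(rng.choice(nexts))
        state = tuple(out[-k:])

    generated = " ".join(out)
    print(generated)
    # Every (k+1)-gram in the output occurred somewhere in the corpus, but the
    # output as a whole need not have: e.g. "the cat sat on the rug ." is
    # reachable here even though that sentence never appears in the training text.
    print("appears verbatim in corpus?", generated in " ".join(corpus))

An LLM with a context window of k tokens defines the same kind of object, a conditional distribution over the next token given the last k tokens; only the representation of that distribution differs.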
An LLM is not a Markov chain of the input tokens, because it has internal computational state (the KV cache and residuals).
An LLM is a Markov process if you include its entire state, but that's a pretty degenerate definition.
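A toy illustration of the distinction being drawn here (the update rules are made up; they just stand in for internal state that the visible tokens don't expose):

    def step(state, token):
        # hypothetical state-update rule, purely for illustration
        return (2 * state + token + 1) % 7

    def next_token(state):
        # the emitted token depends on the hidden state, not directly on prior tokens
        return state % 3

    state, tokens = 0, []
    for _ in range(8):
        t = next_token(state)
        tokens.append(t)
        state = step(state, t)

    print(tokens)  # [0, 1, 1, 0, 0, 1, 1, 0]
    # Looking only at the visible tokens, a 0 is sometimes followed by 0 and
    # sometimes by 1, depending on the hidden state, so the token sequence by
    # itself is not a first-order Markov chain.  Over the pair (token, state),
    # each step depends only on the current pair, so the process is Markov in
    # the degenerate sense described above.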
2 replies →
He said LLMs are creative, yet people have been telling me that LLMs cannot solve problems that are not in their training data. I want this to be clarified or elaborated on.
9 replies →
1) Being restricted to exact matches in the input is the definition of Markov Chains
2) LLMs are not Markov Chains
13 replies →