Comment by causalmodels

2 years ago

> An LLM is a Markov chain in the same sense that a cat is a tiger: technically true, but it misses the qualia.

It's not. There are fundamental architectural differences that couldn't be bigger.

A better comparison is a windup toy versus a group of humans moving an entire civilization. Both cover distance, but the list of systems the human group has and the windup toy lacks is too long to fit on a page.

  • > It's not. There are fundamental architectural differences that couldn't be bigger.

    LLM architecture is a Markov chain to the core. It isn't a lookup table like the old Markov chains, but it is still a Markov chain: next-word prediction based on the previous words.
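
    To make that concrete, here is a minimal toy sketch (hypothetical code, not any real model's API; lookup_table_dist, llm_dist, and generate are illustrative names, and llm_dist is a hard-coded stand-in for a forward pass). The generation loop is identical for both kinds of model; the only thing that changes is the function mapping a bounded window of tokens to a next-token distribution.

      import random

      def lookup_table_dist(window):
          # "Dissociated Press"-style order-1 chain: a lookup table keyed
          # on the last token, with probabilities counted from some corpus.
          table = {"the": {"cat": 0.5, "dog": 0.5}, "cat": {"sat": 1.0}}
          return table.get(window[-1], {"the": 1.0})

      def llm_dist(window):
          # Stand-in for a transformer forward pass. A real LLM computes
          # this with attention over the window, but the signature is the
          # same: a next-token distribution as a function of the window alone.
          return {"the": 0.6, "cat": 0.2, "dog": 0.2}

      def generate(dist_fn, prompt, steps, k=8):
          out = list(prompt)
          for _ in range(steps):
              state = tuple(out[-k:])          # state = last k tokens
              dist = dist_fn(state)
              tok = random.choices(list(dist), weights=list(dist.values()))[0]
              out.append(tok)                  # slide the window forward
          return " ".join(out)

      print(generate(lookup_table_dist, ["the"], 5))
      print(generate(llm_dist, ["the"], 5))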

    • Thanks for repeating this.

      Seems like most people fail to understand that LLMs (as they are implemented these days) are Markov chains by definition, regardless of how much "better" they are than "Dissociated Press"-style Markov chains based on lookup tables.

      > A Markov chain or Markov process is a stochastic model describing a sequence of possible events in which *the probability of each event depends only on the state attained in the previous event*.

      Is the process calculating a probability distribution over "next token", based on a bounded-size context of "previous tokens"? Yes? Then it is a Markov chain, by definition.
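
      To spell that out: take the state to be the window S_t = (w_{t-k+1}, ..., w_t) of the last k tokens. Nothing outside the window can influence the output, so P(w_{t+1} | w_1, ..., w_t) = P(w_{t+1} | S_t), which is exactly the Markov property, with window-states in place of single tokens.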

      It's like saying that a "human" is not an "animal" because it is so much "better" and more "capable" than other animals. The more you argue along those lines, the more I'll be convinced that you don't know the definition of a "human", or the definition of an "animal", or both.
