Comment by k__

2 years ago

Fair, but maybe it's more of a computer-science kind of comparison?

We say systems can perform the same types of computations if they're both Turing complete. Yet, we wouldn't implement everything in every "language" that is Turing complete.

Perhaps every LLM could be represented as a Markov chain, and for some it even makes sense (e.g., easier to train, easier to reason about), but in most cases it's a bad idea (e.g., expensive, poor performance).
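
To make the "could be represented" part concrete, here is a rough sketch of how I'd picture it, assuming a model with a fixed, finite context window: treat each possible context as one Markov state, and each sampled token as a transition to the shifted context. Everything in it is made up for illustration: `VOCAB`, `CONTEXT`, and `next_token_probs` are hypothetical stand-ins, not any real model or library API.

```python
from itertools import product

VOCAB = ["a", "b", "c"]  # toy vocabulary (assumption, for illustration only)
CONTEXT = 2              # toy context window length (assumption)


def next_token_probs(context):
    """Hypothetical stand-in for an LLM: context tuple -> next-token distribution.

    This toy rule simply favours repeating the most recent token.
    """
    last = context[-1]
    return {tok: (0.5 if tok == last else 0.25) for tok in VOCAB}


def build_transition_table():
    """Enumerate every possible context as a Markov state and tabulate transitions."""
    table = {}
    for state in product(VOCAB, repeat=CONTEXT):
        row = {}
        for tok, prob in next_token_probs(state).items():
            # Emitting `tok` slides the window: drop the oldest token, append the new one.
            next_state = state[1:] + (tok,)
            row[next_state] = prob
        table[state] = row
    return table


table = build_transition_table()
print(len(table))  # number of states is len(VOCAB) ** CONTEXT
```

The table has `len(VOCAB) ** CONTEXT` rows, so it is fine for a toy like this and blows up quickly for realistic vocabulary and context sizes. That's the "expensive" part: the representation exists, you just wouldn't want to write it down.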