Comment by arketyp
2 years ago
I've seen Markov models mentioned a lot in this context, and my generous take has always been that something like stacked Markov models is meant. At each abstraction layer, the state is conditioned only on the previous abstract concept. At the lowest level the states would be the sequence of tokens; higher up, they're concepts like turns of events in a plot. I don't think this often-proposed idea of hierarchy is sufficient to describe LLMs or human cognition, but it strikes at some essence about parsimony, efficient representation, and local computation.
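A minimal sketch of what such a two-layer stack might look like, assuming a hidden-Markov-style reading of the idea (all states, tokens, and probabilities below are made up for illustration): the top layer is a Markov chain over abstract "plot" states, and the bottom layer emits tokens conditioned only on the current abstract state.

    import random

    def sample(dist):
        # Draw one outcome from a {outcome: probability} dict.
        r, acc = random.random(), 0.0
        for outcome, p in dist.items():
            acc += p
            if r <= acc:
                return outcome
        return outcome

    # Top layer: Markov chain over abstract plot-level states (toy values).
    concept_chain = {
        "setup":    {"setup": 0.5, "conflict": 0.5},
        "conflict": {"conflict": 0.6, "ending": 0.4},
        "ending":   {"ending": 1.0},
    }

    # Bottom layer: token distribution conditioned only on the current concept.
    token_emissions = {
        "setup":    {"once": 0.4, "there": 0.3, "was": 0.3},
        "conflict": {"but": 0.4, "then": 0.3, "suddenly": 0.3},
        "ending":   {"finally": 0.5, "end": 0.5},
    }

    concept, text = "setup", []
    for _ in range(12):
        text.append(sample(token_emissions[concept]))
        concept = sample(concept_chain[concept])
    print(" ".join(text))

The point of the sketch is only the factorization: each layer's next state depends on that layer's current state, and lower layers see the upper layers only through the current abstract state, which is what makes the representation local and cheap.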