Comment by MarkusQ
13 days ago
LLMs include mechanisms (notably attention) that capture longer-range correlations than a similarly sized Markov chain could. If you squint hard enough, though, they are Markov chains with "one weird trick" that makes them much more effective for their size.
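To make the squinting concrete, here is a minimal sketch, assuming a fixed context window of k tokens. Under that assumption the "state" is just the last k tokens, and the next-token distribution depends only on that state, which is the Markov property. The names `fit_markov`, `step`, and `max_states` are made up for illustration and are not from any library. The table version below stores one row per state it has seen; an LLM would compute the row on demand (via attention and the rest of the network) instead of storing it.

```python
from collections import Counter, defaultdict


def fit_markov(tokens, k):
    """Order-k Markov chain: count which token follows each k-token state."""
    table = defaultdict(Counter)
    for i in range(len(tokens) - k):
        state = tuple(tokens[i:i + k])
        table[state][tokens[i + k]] += 1
    return table


def step(state, next_token, k):
    """Markov transition: the new state depends only on the old state
    and the token that was just emitted."""
    return (state + (next_token,))[-k:]


def max_states(vocab_size, k):
    """Upper bound on the number of distinct states an explicit table
    could need: every possible k-token window."""
    return vocab_size ** k


tokens = "the cat sat on the mat the cat ran".split()
table = fit_markov(tokens, k=2)
followers = table[("the", "cat")]          # tokens seen after "the cat"
new_state = step(("the", "cat"), "sat", k=2)
bound = max_states(vocab_size=50_000, k=2)  # hypothetical vocabulary size
```

The bound grows as vocab_size ** k, so an explicit table is hopeless at realistic window lengths. That is roughly where the "one weird trick" comes in: the transition is computed by a shared function rather than looked up row by row.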