Comment by mr_wiglaf
5 days ago
The tricky thing is you get to define the state. So if the "state" is the current word _and_ the previous 10, it is still "memoryless". So an LLM's context window is the state. It doesn't matter whether _we_ see parts of the state as history; the Markov chain doesn't care (they are all just different features).
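For example, here is a rough sketch (with made-up words and transitions) of how a "last two words" state is still an ordinary first-order chain: the next step depends only on the current tuple, nothing else.

```python
import random

# Hypothetical transition table: (prev_word, current_word) -> possible next words.
TRANSITIONS = {
    ("the", "cat"): ["sat", "ran"],
    ("cat", "sat"): ["on"],
    ("cat", "ran"): ["away"],
    ("sat", "on"): ["the"],
    ("on", "the"): ["mat"],
}

def step(state):
    """One Markov step: the next-state distribution depends only on `state`."""
    prev, cur = state
    nxt = random.choice(TRANSITIONS.get((prev, cur), ["<end>"]))
    return (cur, nxt)  # the new state is again just the last two words

state = ("the", "cat")
words = list(state)
while words[-1] != "<end>" and len(words) < 10:
    state = step(state)
    words.append(state[1])
print(" ".join(words))
```

The state happens to contain what we would call "history", but the chain itself is memoryless over tuple-valued states.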
Edit: I could be missing important nuance that other people are pointing out in this thread!