Comment by acjohnson55
2 years ago
The author chose to depart from literally implementing a Markov model. That's a practical choice, but it isn't strictly necessary if you're not worried about practicality. I think you're getting hung up on the engineering decisions made to actually implement a language model, rather than on the theory.
In the same section, the author says:
> The difference between this second-order-with-skips and a full umpteenth-order model is that we discard most of the word order information and combinations of preceding words. What remains is still pretty powerful.
This implies that sure, you could hypothetically do an umpteenth-order model, but dropping down to something that approximates it "is still pretty powerful".
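To make that concrete, here's a rough sketch of how I read the distinction. It is not the author's code: the function names and the toy sentences are made up for illustration, and I'm assuming "second-order-with-skips" means pairing each earlier word with the most recent word, regardless of the gap between them.

```python
def full_order_context(words):
    """Full umpteenth-order context: the entire ordered sequence of preceding words."""
    return tuple(words)


def skip_pair_features(words):
    """Second-order-with-skips (as I read it): pair every earlier word with the
    most recent word, ignoring how far apart they are and in what order the
    earlier words appeared."""
    last = words[-1]
    return {(w, last) for w in words[:-1]}


# Hypothetical toy contexts that differ in a single word.
a = ["check", "the", "program", "log", "whether", "it", "ran"]
b = ["check", "the", "battery", "log", "whether", "it", "ran"]
# Same words as `a`, with the first two swapped.
c = ["the", "check", "program", "log", "whether", "it", "ran"]

# A full-order model treats all three as unrelated contexts.
print(full_order_context(a) == full_order_context(b))  # False
print(full_order_context(a) == full_order_context(c))  # False

# The skip-pair view shares most features between a and b...
print(skip_pair_features(a) ^ skip_pair_features(b))
# ...and cannot tell a and c apart, because word order among earlier words is discarded.
print(skip_pair_features(a) == skip_pair_features(c))  # True
```

The full-order version needs a separate table entry for every distinct sequence, which is what makes it impractical. The skip-pair version only ever stores word pairs, and the one pair that differs between `a` and `b` is exactly the one that carries the useful signal.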
Thanks for the link, by the way! I'm definitely going to read through the whole thing. I'm desperately trying to understand this technology.