Comment by thesz
10 days ago
> A markov chain model will literally have a matrix entry for every possible combination of inputs.
Less frequent prefixes are usually pruned away, and the model adds a penalty score (a backoff weight) whenever it falls back to a shorter prefix. As a result, every word gets some probability in the model's prediction, and a typical SRILM n-gram model is able to generate "the pig with dragon head," albeit with small probability.
Even if you think of the Markov chain's information as a tensor (not a matrix), computing a probability is not a single lookup but a series of folds.
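To make the backoff idea concrete, here is a minimal sketch in Python of an ARPA-style lookup. The tables, the numbers in them, and the names `LOGPROB`, `BACKOFF`, `UNK_LOGPROB`, and `backoff_logprob` are all made up for illustration; this is not SRILM's actual code, just the general shape of a backoff lookup with log10 scores.

```python
# Toy ARPA-style tables holding log10 probabilities and backoff weights.
# Every number here is invented for illustration.
LOGPROB = {
    ("the",): -1.0,
    ("pig",): -3.0,
    ("dragon",): -3.5,
    ("the", "pig"): -1.5,
}
BACKOFF = {
    ("the",): -0.5,
    ("pig",): -0.25,
}
UNK_LOGPROB = -5.0  # hypothetical floor for words with no unigram entry


def backoff_logprob(context, word):
    """Return log10 P(word | context), shortening the context as needed."""
    penalty = 0.0
    ctx = tuple(context)
    while True:
        ngram = ctx + (word,)
        if ngram in LOGPROB:
            # Found an explicit entry: add the penalties collected so far.
            return penalty + LOGPROB[ngram]
        if not ctx:
            # Nothing left to drop and no unigram entry either.
            return penalty + UNK_LOGPROB
        # Pay the backoff weight of the current context (0.0 if it was
        # pruned or never stored), then drop its oldest word and retry.
        penalty += BACKOFF.get(ctx, 0.0)
        ctx = ctx[1:]


print(backoff_logprob(("the",), "pig"))           # direct bigram hit
print(backoff_logprob(("the", "pig"), "dragon"))  # backs off twice
```

The second call never finds a trigram or bigram entry, so it accumulates penalties on the way down to the unigram for "dragon". That is the "series of folds" rather than a single table lookup, and it is also why an unseen word sequence still ends up with a small nonzero probability.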