Comment by srean
4 days ago
That's different.
You can certainly feed k-grams one at a time, estimate the probability distribution over the next token, use that to simulate a Markov chain, and reinitialize the LLM (drop the context) after each step. In this process the LLM is just a look-up table used to simulate your MC.
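A minimal sketch of that procedure, assuming the Hugging Face `transformers` library and an example checkpoint ("gpt2" is just a placeholder, not anything from this thread): at each step only the current k-gram is fed in a fresh context, so the walk is a genuine order-k Markov chain with the LLM serving as the transition table.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # example model, an assumption
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def markov_step(state_ids, temperature=1.0):
    """Feed only the current k-gram (a fresh context) and sample one token."""
    with torch.no_grad():
        logits = model(torch.tensor([state_ids])).logits[0, -1] / temperature
    probs = torch.softmax(logits, dim=-1)  # P(next token | k-gram) and nothing else
    return torch.multinomial(probs, num_samples=1).item()

k = 4
state = tokenizer.encode("The quick brown fox")[:k]  # initial k-gram state
out = list(state)
for _ in range(20):
    nxt = markov_step(state)
    out.append(nxt)
    state = (state + [nxt])[-k:]  # slide the window: older context is dropped
print(tokenizer.decode(out))
```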
But an LLM on its own doesn't drop context as it generates; its transition probabilities change depending on all the tokens seen so far.