Comment by quantum_state
4 days ago
As the bottom of LLM inference, it is sampling for the next token based on the probability distribution conditioned on the tokens currently in the context window. If the distribution exhibits degeneracy in probability for more than token, outcome of the sampling will naturally, as it should, be nondeterministic. It should be left alone.
No comments yet
Contribute on Hacker News ↗