Comment by nosuchthing
9 days ago
LLMs can't get at anything in their training data that is less likely than the statistically most common token, so they add random jitter when picking the next token.
That randomness brings statistically irrelevant results with it.
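
A rough sketch of the mechanism, assuming "random jitter" here means temperature sampling over next-token scores. The scores in `logits` and the helper names `greedy_pick` and `sample_pick` are hypothetical, made up for illustration; they are not taken from any particular model or library. Greedy picking always returns the top-scoring token, while sampling sometimes returns the less likely ones.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits):
    """Always choose the index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])

def sample_pick(logits, temperature=1.0, rng=random):
    """Choose a token index at random, weighted by its softmax probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical toy scores for three candidate tokens (an assumption).
logits = [2.0, 1.0, 0.1]

rng = random.Random(0)  # seeded so repeated runs give the same draws
counts = [0, 0, 0]
for _ in range(1000):
    counts[sample_pick(logits, temperature=1.0, rng=rng)] += 1

print("greedy choice:", greedy_pick(logits))
print("sampled counts per token:", counts)
```

In this toy setup the greedy choice never changes, while the sampled counts spread across all three tokens roughly in proportion to their probabilities.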