Comment by spindump8930
1 day ago
Other replies have already covered the many sources of stochastic/non-deterministic behavior, but I wanted to point out this paper: https://arxiv.org/abs/2506.09501, which analyzes the issues around GPU non-determinism once sampling- and batching-related effects are removed.
One important takeaway is that these issues are more likely to show up in longer generations, so reasoning models can suffer more.
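To make that concrete, here is a rough Python sketch (mine, not from the paper). It shows one commonly cited ingredient of GPU non-determinism, namely that floating-point addition is not associative, so a different reduction order can change the low bits of a result. It then uses a toy model for why length matters. The per-token flip probability `p` and the independence assumption are hypothetical simplifications for illustration, not numbers or claims taken from the paper.

```python
# Floating-point addition is not associative: grouping changes the result.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c    # one reduction order
right = a + (b + c)   # another reduction order
print(left == right)  # False: the two orders differ in the last bit


def divergence_probability(p: float, n_tokens: int) -> float:
    """Toy model: chance that at least one token differs between two runs.

    Assumes (hypothetically) that each token independently has probability p
    of being flipped by a tiny numerical difference.
    """
    return 1.0 - (1.0 - p) ** n_tokens


# Hypothetical p; only the trend with length matters here.
for n in (100, 1_000, 10_000):
    print(n, divergence_probability(1e-4, n))
```

Under that toy assumption the chance of at least one divergence grows steadily with the number of generated tokens, and once a single token differs the rest of the generation can drift, which fits the point about long reasoning traces.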