Comment by Retr0id
4 hours ago
> [Claude] performed all ingredient classification under deterministic decoding (temperature 0–0.1)
Not that it matters much in this context, but low-temperature is not the same thing as deterministic.
Yep. Zero temperature is neither necessary nor sufficient for deterministic inference.
Why?
You can seed the randomness while still having nonzero temperature.
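To make the seeding point concrete, here is a minimal sketch using only the Python standard library. The logits, the seed, and the helper names `sample_token` and `generate` are made up for illustration and are not taken from any real inference stack.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Sample an index from softmax(logits / temperature) using the given RNG."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

def generate(seed, steps=20, temperature=0.8):
    """Draw `steps` tokens from fixed, hypothetical logits with a seeded RNG."""
    rng = random.Random(seed)
    logits = [2.0, 1.5, 0.3, -1.0]  # made-up values, not from a real model
    return [sample_token(logits, temperature, rng) for _ in range(steps)]

run_a = generate(seed=1234)
run_b = generate(seed=1234)
print(run_a == run_b)  # True: same seed, same samples, temperature 0.8
```

Sampling is still random in the sense that a different seed gives a different sequence, but any single seed is fully reproducible. That is the "not necessary" half of the claim.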
Numerical instability can introduce randomness, especially on GPU-like hardware, unless you're very careful about how you write your algorithms.
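One common source of this, as far as I understand it, is that floating-point addition is not associative, so a parallel reduction that sums in a different order can give a slightly different result. A small sketch in plain Python (the `greedy_pick` helper and the near-tied scores are contrived for illustration):

```python
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # one summation order
right = a + (b + c)  # same numbers, different order

print(left == right)  # False

def greedy_pick(score_a, score_b):
    """Temperature-0 style argmax over two candidate tokens.

    Ties go to "B" because it is listed first and max() keeps the first maximum.
    """
    return max([("B", score_b), ("A", score_a)], key=lambda p: p[1])[0]

# Token B has a fixed score; token A's score depends on summation order.
print(greedy_pick(left, 0.6))   # A
print(greedy_pick(right, 0.6))  # B
```

The difference is in the last bit, but with greedy decoding a last-bit difference between two near-tied logits is enough to change the chosen token, and everything generated after it. That is the "not sufficient" half.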
For one thing, in any batched inference environment that uses a mixture-of-experts model, expert routing may vary depending on what else is in the batch.
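A toy illustration of how that can happen, assuming a router with a per-expert capacity limit where overflow tokens fall back to their next choice. This is a deliberately simplified, hypothetical scheme (the `route` function and token names are invented), not a description of any particular model's router.

```python
def route(batch, capacity=1):
    """Assign each token to its most-preferred expert that still has room.

    batch: list of (token_name, [expert ids, best first]).
    Hypothetical capacity-limited routing, processed in batch order.
    """
    load = {}
    assignment = {}
    for name, prefs in batch:
        for expert in prefs:
            if load.get(expert, 0) < capacity:
                load[expert] = load.get(expert, 0) + 1
                assignment[name] = expert
                break
    return assignment

mine = ("my_token", [0, 1])  # prefers expert 0, then expert 1

alone = route([mine])
crowded = route([("other_token", [0, 1]), mine])

print(alone["my_token"], crowded["my_token"])  # 0 1
```

The same token with the same preferences ends up on a different expert purely because of its batch neighbours, so its output can differ between requests even with identical input and temperature 0.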