Comment by Retr0id

4 hours ago

> [Claude] performed all ingredient classification under deterministic decoding (temperature 0–0.1)

Not that it matters much in this context, but low-temperature is not the same thing as deterministic.

Yep. Zero temperature is neither necessary nor sufficient for deterministic inference.

  • Why?

    • You can seed the randomness while still having nonzero temperature.

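      Numerical instability can also introduce nondeterminism, especially on GPU-like hardware: floating-point addition is not associative, so the order in which a parallel reduction sums its terms can change the result unless you're very careful about how you write your algorithms.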

    • In any batch inference environment that includes experts, expert routing may vary depending on what else is in the batch. For one thing.
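
To make the first two points concrete, here is a toy sketch in plain Python. It is not how any real inference stack works: the logits, the seed, and the helper names (`sample_token`, `generate`, `greedy`) are all made up for illustration. It shows that seeded sampling at temperature 1.0 is repeatable, and that greedy (temperature 0) selection can flip when the same sum is accumulated in a different order. It does not try to model the batch-dependent expert routing case.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Sample an index from softmax(logits / temperature) using the given RNG."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(logits, temperature, seed, n=20):
    """Draw n tokens from a fixed distribution with a freshly seeded RNG."""
    rng = random.Random(seed)
    return [sample_token(logits, temperature, rng) for _ in range(n)]


def greedy(logits):
    """Temperature-0 decoding: pick the largest logit (first index wins ties)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Hypothetical logits, chosen only for the demonstration.
logits = [2.0, 1.5, 0.3, -1.0]

# 1) Nonzero temperature, same seed: the two runs are identical.
run_a = generate(logits, temperature=1.0, seed=1234)
run_b = generate(logits, temperature=1.0, seed=1234)
same_seed_matches = run_a == run_b

# 2) Floating-point addition is not associative, so summation order matters.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
order_changes_sum = left != right

# 3) With nearly tied logits, that last-bit difference flips the greedy choice.
pick_left = greedy([0.6, left])
pick_right = greedy([0.6, right])
greedy_flips = pick_left != pick_right

print(same_seed_matches, order_changes_sum, greedy_flips)
```

On real hardware the reordering comes from things like parallel reductions and kernel selection rather than hand-placed parentheses, but the mechanism is the same: a last-bit difference in a logit is enough to change an argmax when two candidates are nearly tied.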