
Comment by aembleton

3 months ago

> > why shouldn't LLMs
>
> Because they're non-deterministic.
>
> It is one thing that you are getting results that are samples from the distribution (and you can always set the temperature to zero and get the mode of the distribution), but completely another when the distribution changes from day to day.

What? No they aren't.

You get different results each time because of variation in seed values + non-zero 'temperatures', i.e. configured randomness.
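
To make that concrete, here's a minimal sketch of how seed and temperature interact in sampling. The logits are toy values and the RNG is Python's stdlib, not any particular model's API:

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from raw logits.

    temperature == 0 degenerates to greedy argmax (deterministic);
    temperature > 0 samples from the softmax distribution, so the
    result depends on the RNG state, i.e. the seed.
    """
    if temperature == 0:
        # Greedy decoding: always returns the mode of the distribution.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Temperature-scaled softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.5]  # toy logits for a 3-token vocabulary

# Temperature 0 -> same token every run, regardless of seed.
print(sample_token(logits, 0.0, random.Random(42)))  # always 0

# Non-zero temperature -> output varies with the seed.
print(sample_token(logits, 1.0, random.Random(1)))
print(sample_token(logits, 1.0, random.Random(2)))
```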

Pedantic point: different virtualized implementations can produce different results because of differences in floating-point behavior, but fundamentally these models are just big chains of multiplications.
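
That caveat is easy to demonstrate in plain Python: floating-point addition isn't associative, so a different evaluation order (which different hardware or kernels may use) can change the low bits of a result:

```python
# Floating-point addition is not associative, so reduction order matters.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)
print(a)       # 0.6000000000000001
print(b)       # 0.6
print(a == b)  # False
```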

  • On the other hand, responses can be kind of chaotic: adding a token somewhere can sometimes flip the output unpredictably.

  • But experience shows that you do need non-zero temperature for them to be useful in most cases.