Comment by jstanley

12 hours ago

> "GPUs don't do deterministic matrix multiplications" is the biggest source of randomness in LLMs.

But this isn't a fundamental property of LLMs; it's just an implementation detail. It's pretty obvious that if you evaluate the matrix multiplications correctly and sample deterministically, e.g. by always taking the highest-probability output, you will have a deterministic LLM.
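A minimal sketch of the claim above, assuming the forward pass is a pure function of the context. `toy_model` is a made-up stand-in for a real LLM (not any library's API), and `greedy_pick` and `generate` are illustrative names. With a reproducible forward pass and argmax selection, the same prompt always yields the same tokens.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits):
    # Always take the highest-probability token. max() returns the first
    # maximal index, so ties are also broken the same way every time.
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)

def toy_model(tokens):
    # Hypothetical stand-in for an LLM forward pass: logits over a
    # 5-token vocabulary, computed purely from the context so far.
    vocab_size = 5
    return [((sum(tokens) + 3 * i) % 7) / 2.0 for i in range(vocab_size)]

def generate(prompt, steps):
    tokens = list(prompt)
    for _ in range(steps):
        tokens.append(greedy_pick(toy_model(tokens)))
    return tokens

first = generate([1, 2], 8)
second = generate([1, 2], 8)
print(first == second)
```

The only place nondeterminism could enter here is inside the forward pass itself, which is the part the quoted claim is about.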

3 comments

vbarrielle 12 hours ago

It may be an implementation detail, but in practice, if the only way to get a deterministic output is to run on the CPU, then it's not going to be usable.

317070 11 hours ago

Actually, Google's TPUs are also deterministic!

Dylan16807 10 hours ago

You can tell GPUs what order to do math instructions in.
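A small sketch of why the order matters, in plain Python rather than GPU code. Floating-point addition is not associative, so a reduction that accumulates in a different order can give a different result, while a fixed order gives the same bits every time. `sum_in_order` is an illustrative helper; an explicit loop is used because Python's built-in `sum` may apply compensated summation to floats in recent versions.

```python
def sum_in_order(values):
    """Add floats strictly left to right, rounding once per step."""
    total = 0.0
    for v in values:
        total += v
    return total

values = [1e16, 1.0, -1e16]
reordered = [1e16, -1e16, 1.0]  # same numbers, different order

a = sum_in_order(values)     # the 1.0 is rounded away before the big terms cancel
b = sum_in_order(reordered)  # the big terms cancel first, so the 1.0 survives
print(a, b)  # 0.0 1.0

# With the order pinned down, repeating the computation gives identical results.
print(sum_in_order(values) == sum_in_order(values))
```

On a GPU the analogous situation is a parallel reduction whose accumulation order can vary between runs. Pinning that order removes this source of variation, usually at some cost in speed.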