Comment by msgodel
2 days ago
I run my local LLMs with a seed of one. If I re-run my "ai" command (which starts a conversation with its parameters as a prompt) I get exactly the same output every single time.
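A rough sketch of that kind of seeded local setup, assuming llama-cpp-python (the model path, prompt, and sampling values are placeholders, not the commenter's actual wrapper):

```python
# Hypothetical "ai"-style wrapper: fixed seed plus fixed sampling parameters.
# Assumes llama-cpp-python is installed; model path and prompt are placeholders.
from llama_cpp import Llama

llm = Llama(model_path="model.gguf", seed=1, verbose=False)

def ai(prompt: str) -> str:
    # Temperature > 0 still samples, but the seeded RNG makes the run repeatable
    # on the same build and hardware.
    out = llm(prompt, max_tokens=256, temperature=0.8)
    return out["choices"][0]["text"]

print(ai("Summarize what a sampling seed does."))
# Re-running this script reproduces the same completion, because the seed is
# reset to 1 on every start.
```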
In my (poor) understanding, this can depend on hardware details. What are you running your models on? I haven't paid close attention to this with LLMs, but I've tried very hard to eliminate non-deterministic behavior from my training runs for other kinds of transformer models and was never able to on my 2080, 4090, or an A100. PyTorch docs have a note saying that in general it's impossible: https://docs.pytorch.org/docs/stable/notes/randomness.html
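For reference, these are the knobs that linked note covers (a sketch of the documented settings, not a guarantee; the docs say some CUDA ops simply have no deterministic implementation):

```python
import os
import torch

# Must be set before CUDA work begins for some cuBLAS ops (per the linked note).
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

torch.manual_seed(0)                      # seeds the CPU and CUDA RNGs
torch.use_deterministic_algorithms(True)  # raise on ops with no deterministic kernel
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False    # autotuning can pick different kernels run to run
```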
Inference on a generic LLM may not be subject to these non-determinisms even on a GPU though, idk
Ah. I've typically avoided CUDA except for a couple of really big jobs so I haven't noticed this.
Yes. This is what I was trying to say. Saying "It’s worth noting that LLMs are non-deterministic" is wrong and should be changed in the blog post.
> Saying "It’s worth noting that LLMs are non-deterministic" is wrong and should be changed in the blog post.
Every person in this thread understood that Simon meant "Grok, ChatGPT, and other common LLM interfaces run with a temperature>0 by default, and thus non-deterministically produce different outputs for the same query".
Sure, he wrote a shorter version of that, and because of that y'all can split hairs on the details ("yes it's correct for how most people interact with LLMs and for grok, but _technically_ it's not correct").
The point of English blog posts is not to be a long wall of logical propositions; it's to convey ideas and information. The current wording seems fine to me.
The point of what he was saying was to caution readers "you might not get this if you try to repro it", and that is 100% correct.
Still, the statement that LLMs are non-deterministic is incorrect and could mislead some people who simply aren't familiar with how they work.
Better phrasing would be something like "It's worth noting that LLM products are typically operated in a manner that produces non-deterministic output for the user"
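A sketch of what that phrasing points at, assuming the OpenAI Python SDK (the model name is a placeholder): two identical requests at the product's default settings routinely come back worded differently.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        # no temperature or seed set: the product's defaults apply
    )
    return resp.choices[0].message.content

a = ask("Describe a sampling seed in one sentence.")
b = ask("Describe a sampling seed in one sentence.")
print(a == b)  # frequently False for a hosted product at default settings
```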
My temperature is set higher than zero as well. That doesn't make them nondeterministic.
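A tiny illustration of that point: temperature > 0 means sampling, but sampling from a seeded RNG is still a deterministic procedure.

```python
import torch

logits = torch.tensor([2.0, 1.0, 0.5, -1.0])
temperature = 0.8
probs = torch.softmax(logits / temperature, dim=-1)

def sample_token(seed: int) -> int:
    # Temperature-scaled sampling with an explicitly seeded generator.
    gen = torch.Generator().manual_seed(seed)
    return int(torch.multinomial(probs, num_samples=1, generator=gen))

print(sample_token(1) == sample_token(1))  # True: same seed, same token, every run
```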
You’re correct at batch size 1 (local inference is batch size 1), but not in the production use case where multiple requests get batched together (and that’s how all the providers run things).
With batching, the matrix shapes and a request's position within the batch aren't deterministic, and that leads to non-deterministic results regardless of sampling temperature or seed.
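A quick sketch of the mechanism behind this: floating-point addition isn't associative, so a different batch layout changes the reduction order, and sampling can amplify those low-order-bit differences into different tokens.

```python
import torch

# Same numbers, different order of operations, different result:
print((0.1 + 1e20) - 1e20)  # 0.0
print(0.1 + (1e20 - 1e20))  # 0.1

# Same tensor, different reduction layout -- results are often not bitwise equal:
torch.manual_seed(0)
x = torch.randn(10_000)
s1 = x.sum()                            # flat reduction
s2 = x.view(100, 100).sum(dim=1).sum()  # chunked reduction, as a batched kernel might do
print(torch.equal(s1, s2))
```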
Isn't that true only if the batches are different? If you run exactly the same batch, you're back to a deterministic result.
If I had a black-box API, just because you don't know how the output is calculated doesn't mean it's non-deterministic. It's the underlying algorithm that determines that, and an LLM is deterministic.
"Non-deterministic" in the sense that a dice roll is when you don't know every parameter with ultimate precision. On one hand I find insistence on the wrongness on the phrase a bit too OCD, on the other I must agree that a very simple re-phrasing like "appears {non-deterministic|random|unpredictable} to an outside observer" would've maybe even added value even for less technically-inclined folks, so yeah.