Comment by closetkantian
7 months ago
So, another poster cleared up my first question. It's probably because the seed is the same. I think it would have been a better demo if it hadn't been, though.
You got it: same seed in practice, but the demo also just uses temperature = 0. A few things I considered adding for the fun of it were 1) a way to specify a seed in the input text, 2) a way to use a symbol to say "I didn't like that token, try to generate another one", so you could use, say, "!" to generate tokens and "?" to replace the last generated token. You would end up typing things like
"Once upon a time!!!!!!!!!!!!!!!!!!!!!!!!!!!!!SEED42!!!!!??!!!??!"
and 3) actually just allowing you to override the suggestions by typing letters of your own, which would then be used in future inferences. At that point it'd be a fairly generic auto-complete kind of thing.
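For what it's worth, here is a rough sketch of how ideas 1) and 2) could be parsed. It is a toy, not the demo's code: `VOCAB` and `run_commands` are made-up names, and a random pick from a fixed word list stands in for the actual model.

```python
import random
import re

# Hypothetical stand-in for a language model: a fixed vocabulary sampled at random.
VOCAB = ["there", "was", "a", "small", "dragon", "who", "loved", "tea"]


def run_commands(text, default_seed=0):
    """Interpret '!' (generate a token), '?' (replace the last generated
    token) and SEED<n> (reseed the sampler) in the tail of the input."""
    rng = random.Random(default_seed)
    # Everything before the first command character is treated as the prompt.
    first = re.search(r"[!?]|SEED\d+", text)
    if first is None:
        return text, []
    prompt, commands = text[:first.start()], text[first.start():]
    generated = []
    for cmd in re.finditer(r"SEED(\d+)|[!?]", commands):
        if cmd.group(1) is not None:
            # SEED<n>: restart the random stream from the given seed.
            rng = random.Random(int(cmd.group(1)))
        elif cmd.group(0) == "!":
            generated.append(rng.choice(VOCAB))
        elif generated:
            # '?': resample, excluding the token that was just rejected.
            rejected = generated[-1]
            generated[-1] = rng.choice([t for t in VOCAB if t != rejected])
    return prompt, generated


prompt, tokens = run_commands("Once upon a time!!!SEED42!!??!")
print(prompt, " ".join(tokens))
```

Since the seed is part of the text, the same string always gives the same continuation, and a "?" with nothing generated yet is simply ignored.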
Using the input characters to affect the token selection would increase the ‘magic’ a little.
As it is, if you go back into a string of !!!!!!!!!! that has been turned into ‘upon a time’ and try to delete the ‘a’, you’ll just be deleting an ! and the string will turn into ‘once upon a tim’.
If you could just keyboard mash to pass entropy to the token sampler, deleting a specific character would alter the generation from that point onwards.
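One way that could work, as a toy sketch: hash everything typed up to each position and use the digest to pick that position's token. `VOCAB` and `tokens_from_keys` are hypothetical names, and a real version would use the hash to sample from the model's distribution rather than index into a fixed word list.

```python
import hashlib

# Hypothetical stand-in vocabulary; a real model would supply a distribution.
VOCAB = ["once", "upon", "a", "time", "there", "was", "one", "dragon"]


def tokens_from_keys(keys):
    """Pick one token per typed character, driven by a hash of all the
    characters typed up to and including that position."""
    out = []
    for i in range(1, len(keys) + 1):
        digest = hashlib.sha256(keys[:i].encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(VOCAB)
        out.append(VOCAB[index])
    return out


print(tokens_from_keys("asdfjkl"))
print(tokens_from_keys("asdjkl"))  # 'f' deleted: first three tokens unchanged
```

Tokens before the deleted character stay the same because their prefixes are untouched. Everything after it gets rehashed, so it will usually change (with a vocabulary this small, the odd token can come out the same by chance).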
But having the same "seed" doesn't guarantee the same response from an LLM, hence the question above.
I fail to understand how an LLM could produce two different responses from the same seed. Same seed implies all random numbers generated will be the same. So where is the source of nondeterminism?
I believe people are confused because ChatGPT's API exposes a seed parameter which is not guaranteed to be deterministic.
But that's due to the possibility of model configuration changes on the service end, and it's not relevant here.
Barring subtle incompatibilities in underlying implementations on different environments, it does, assuming all other generation settings (temperature, etc.) are held constant.
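A toy illustration of that point, with made-up logits and a hypothetical `sample_tokens` helper (real models recompute the logits for every token, but the argument about the random stream is the same):

```python
import math
import random

# Made-up next-token scores, held fixed for the illustration.
LOGITS = {"time": 2.0, "day": 1.0, "night": 0.5, "dragon": 0.1}


def sample_tokens(logits, n, seed, temperature=1.0):
    """Draw n tokens from softmax(logits / temperature) with a seeded RNG."""
    vocab = list(logits)
    if temperature == 0:
        # Greedy decoding: always the highest-scoring token, no randomness.
        best = max(vocab, key=lambda t: logits[t])
        return [best] * n
    rng = random.Random(seed)
    top = max(logits.values())
    # Subtract the max before exponentiating for numerical stability.
    weights = [math.exp((logits[t] - top) / temperature) for t in vocab]
    return [rng.choices(vocab, weights=weights)[0] for _ in range(n)]


# Same seed, same settings: identical draws.
print(sample_tokens(LOGITS, 10, seed=42) == sample_tokens(LOGITS, 10, seed=42))
# Temperature 0: the seed no longer matters.
print(sample_tokens(LOGITS, 5, seed=1, temperature=0))
```

With the seed and temperature fixed, the only things left to vary are the scores themselves, which is where implementation or environment differences would have to come in.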