Comment by ngrislain
4 days ago
Thank you! The number is the sum of the logprobs of the tokens that make up each individual value, so it does represent the likelihood of seeing that value. So yes, OpenAI is super-biased as a random number generator. We sampled other values from OpenAI and got other die roll values, but with much lower probabilities (5 has an 8% chance).
More precisely, it represents the likelihood of seeing this value conditional on the tokens before it.
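To make the "sum of the logprobs" point concrete, here is a minimal sketch of how per-token logprobs turn into a probability for the whole value. The function name and the numbers are made up for illustration; they are not the values from the experiment and not any particular API's response format.

```python
import math

def sequence_probability(token_logprobs):
    """Probability of a multi-token value, given each token's logprob
    conditional on everything before it (chain rule: sum logs, then exp)."""
    return math.exp(sum(token_logprobs))

# Hypothetical logprobs for a value split into two tokens (illustrative only).
example_logprobs = [math.log(0.5), math.log(0.2)]

# P(value | context) = P(tok1 | context) * P(tok2 | context, tok1)
p = sequence_probability(example_logprobs)
print(f"{p:.2f}")
```

For a value that is a single token, the list has one entry and the result is just that token's probability.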
Even without other tokens before it, the LLM is probably showing the probability of dice rolls based on its training data. I’d guess humans tend to prefer “3” or “4” as it’s nearer the avg/median and feels fairer.
AFAICT, the LLMs aren’t creating new mental mappings of “dice are symmetric and should give equal probability to land on any side,” followed by using that info to infer they should use an RNG.
And I guess it includes possibilities other than numbers, like 'f', which could lead to four or five. There's probably a separate probability for 'fi' and 'fo' too.
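A rough sketch of that idea: the probability of a prefix like 'f' is the total probability of every completion that starts with it. The distribution below is invented for illustration, and it treats outputs as plain character strings. A real tokenizer may well emit "four" as a single token, in which case there would be no separate 'fo' step at all.

```python
def prefix_probability(completions, prefix):
    """Sum the probability of every full completion that starts with `prefix`."""
    return sum(p for text, p in completions.items() if text.startswith(prefix))

# Hypothetical distribution over complete answers (made-up numbers).
completions = {"3": 0.40, "4": 0.30, "three": 0.15, "four": 0.10, "five": 0.05}

p_f = prefix_probability(completions, "f")    # "four" + "five"
p_fo = prefix_probability(completions, "fo")  # "four" only
p_fi = prefix_probability(completions, "fi")  # "five" only
print(p_f, p_fo, p_fi)
```

So 'f' carries the combined mass of four and five, and it splits between 'fo' and 'fi' at the next step.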