Comment by lyu07282

3 days ago

I was under the impression that log probabilities don't work like that / they aren't really useful to be interpreted as probabilities?

https://news.ycombinator.com/item?id=42684629

> the logits aren't telling you anything like 'what is the probability in a random sample of Internet text of the next token', but are closer to a Bellman value function, expressing the model's belief as to what would be the net reward from picking each possible BPE as an 'action' and then continuing to pick the optimal BPE after that (ie. following its policy until the episode terminates). Because there is usually 1 best action, it tries to put the largest value on that action, and assign very small values to the rest (no matter how plausible each of them might be if you were looking at random Internet text)

Yes, it is true that the model has undergone SFT, RLHF, and other alignment procedures, so the logprobs no longer reflect next-token probabilities as in the pre-training corpus. Nevertheless, in concrete applications, such as our main internal use case of structured data extraction from PDF documents, they proved very valuable: when a value was clearly present and well extracted, the logprob was high; when the information was very hard to find or missing entirely, the model would still output, or hallucinate, some value, but with a much lower logprob.
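As a minimal sketch of that use case (the function name, numbers, and threshold idea are illustrative, not our actual pipeline), the per-token logprobs returned alongside an extracted field can be averaged and exponentiated into a rough confidence score:

```python
import math

def extraction_confidence(token_logprobs):
    """Geometric-mean token probability: exp of the average logprob.
    Near 1.0 when every token was a confident prediction,
    much lower when the model was guessing throughout."""
    return math.exp(sum(token_logprobs) / len(token_logprobs))

# Illustrative numbers: a cleanly extracted value vs. a likely hallucination.
clean = [-0.01, -0.03, -0.02]   # model was nearly certain of each token
shaky = [-1.4, -2.1, -0.8]      # model was unsure on every token

print(extraction_confidence(clean))  # close to 1.0
print(extraction_confidence(shaky))  # well below 0.5
```

A threshold on this score (say, flagging anything below 0.8 for human review) is one simple way to surface the low-logprob hallucinations described above.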

Perplexity, a metric often used to evaluate LLMs, is the exponential of the negative average logprob over the tokens in a test set. Lower perplexity means the model assigns higher probabilities to the observed tokens, reflecting better language modeling.
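That definition fits in a few lines (a sketch with made-up logprobs, not real model output): perplexity is just the exponentiated negative mean logprob, so a model that is more confident on every token scores lower, with a perfect model scoring exactly 1.0:

```python
import math

def perplexity(token_logprobs):
    """exp(-mean logprob): roughly the effective branching factor of the
    model's predictions. Logprob 0 on every token gives perplexity 1.0."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Made-up logprobs for the same text under two hypothetical models:
confident_model = [-0.1, -0.2, -0.1, -0.3]
uncertain_model = [-1.5, -2.0, -1.8, -2.2]

print(perplexity(confident_model))  # low: close to 1
print(perplexity(uncertain_model))  # high: many plausible choices per token
```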