Comment by ngrislain

3 days ago

Yes, it is true that the model has undergone SFT, RLHF, and other alignment procedures, so the logprobs no longer reflect the probability of the next token under the pre-training corpus. Nevertheless, in concrete applications, such as our main internal use case (structured data extraction from PDF documents), they proved very valuable. When a value was clearly extracted correctly, the logprob was high; when the information was very hard to find or simply absent, the model would still output (or hallucinate) a value, but with a much lower logprob.
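
To make this concrete, here is a minimal sketch of the kind of confidence signal we rely on, assuming the OpenAI Python client's `logprobs` option; the model name, prompt, and threshold interpretation are illustrative, not a definitive recipe:

```python
import math
from openai import OpenAI

client = OpenAI()

# Hypothetical extracted PDF text; in practice this comes from a PDF parser.
document_text = "Invoice #1042 ... Total due: $1,234.56"

# Ask the model to extract a field and return per-token logprobs.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system",
         "content": "Extract the invoice total from the document. Answer with the value only."},
        {"role": "user", "content": document_text},
    ],
    logprobs=True,
)

choice = response.choices[0]
token_logprobs = [t.logprob for t in choice.logprobs.content]

# Average token logprob as a rough confidence signal: close to 0 means the
# model was confident; strongly negative suggests the value was hard to find
# or may be hallucinated.
mean_logprob = sum(token_logprobs) / len(token_logprobs)
confidence = math.exp(mean_logprob)  # geometric-mean token probability

print(choice.message.content, f"(confidence ~ {confidence:.2f})")
```

In practice you would calibrate the threshold on labeled extractions rather than read the number as a true probability, given the alignment caveat above.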