
Comment by pizza

1 month ago

Well, language models don't measure the state of the world. They turn your input text into a state of text dynamics, and then basically hit 'play' on a best guess of what the rest of the text from that state would contain. Part of why you get 'lies' is that you're asking questions whose answers couldn't really be said to be contained anywhere inside the envelope/hull of some mixture of thousands of existing texts.
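A toy sketch of that 'hit play' picture, in miniature: a bigram table stands in for the learned text dynamics, and generation is just repeatedly sampling a plausible next word from whatever the corpus happened to contain. The corpus and all the words here are made up for illustration; real models are vastly bigger, but the shape is the same.

```python
import random
from collections import defaultdict

# Tiny made-up corpus standing in for "thousands of existing texts".
corpus = "the cat sat on the mat the cat ate the rat".split()

# "Text dynamics": for each word, the words that followed it somewhere.
nxt = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    nxt[a].append(b)

def play(state, steps, rng):
    # "Hit play": from the current state, repeatedly sample a next word
    # from the distribution the corpus induces. No world-checking anywhere.
    out = [state]
    for _ in range(steps):
        choices = nxt.get(out[-1])
        if not choices:
            break
        out.append(rng.choice(choices))
    return " ".join(out)

print(play("the", 5, random.Random(0)))
```

Every word it emits is drawn from what followed that word in the corpus, whether or not the resulting sentence is true of anything.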

Like, suppose for a thought experiment that you got ten thousand random GitHub users, collected every documented instance of them referring to a line number of a file in any repo, and then tried to use those answers to come up with a mean prediction for the contents of a wholly different repo. Odds are you would get something like the LLM answer.
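That thought experiment can be simulated in a few lines. Everything here is made up for illustration (the file lengths, the number of users, the 'true' answer for your file); the point is only that a statistic pooled over unrelated answers carries no information about the specific file you asked about.

```python
import random

rng = random.Random(0)

# Hypothetical: 10,000 users each once referenced some line number in
# some file *they* were looking at. File lengths are invented.
observed = [rng.randint(1, rng.randint(10, 2000)) for _ in range(10_000)]

# The "wisdom of the crowd" prediction for a line number in a wholly
# different repo is just a statistic over those unrelated answers...
mean_guess = sum(observed) / len(observed)

# ...which has no connection to the actual contents of your file.
actual_line = 137  # invented "true" answer for the file you asked about
print(round(mean_guess), actual_line)
```

The mean is a perfectly reasonable summary of the crowd and a useless answer to your question, which is roughly the failure mode you're seeing.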

My opinion is that it's worth getting a sense, through trial and error (checking answers), of when a question you have may or may not fall in a blind spot of the wisdom of the crowd.