← Back to context

Comment by jason_oster

11 days ago

I mentioned this in another thread, but this genuinely demonstrates a known issue with ambiguous prompts.

You might be inclined to say, "a human would always interpret the question as having the car near the speaker, 50m away from the carwash." But this is objectively untrue. There are people in this comments section and on the Mastodon thread who found the question somewhat confusing.

In other words, the premise that "understand[ing] a prompt like a human" is all that's needed is wrong, because not every human interprets ambiguities the same way. The human phenomenon is well researched in psychology. The LLM equivalent is also well researched, and several proposals have been put forth over the years to address it. This is a pretty good research paper on the subject, and it links to other relevant studies: https://arxiv.org/abs/2511.10453v2 (Although I disagree with their method. I think asking clarifying questions is a better approach than trying to one-shot every possible interpretation.)
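To make the contrast between the two strategies concrete, here's a minimal sketch. The `enumerate_interpretations` function is purely a stand-in (in a real system the candidate readings would come from the model itself); everything here is hypothetical illustration, not anyone's actual method:

```python
def enumerate_interpretations(prompt: str) -> list[str]:
    # Stub: a real system would derive candidate readings from the
    # model. Hard-coded here just to illustrate the carwash example.
    if "50m" in prompt and "car" in prompt:
        return [
            "the car is 50m from the carwash",
            "the speaker is 50m from the carwash",
        ]
    return [prompt]

def one_shot(prompt: str) -> str:
    # Strategy from the linked paper: answer every plausible
    # reading in a single response.
    readings = enumerate_interpretations(prompt)
    if len(readings) == 1:
        return f"Answer assuming: {readings[0]}"
    return "; ".join(f"If {r}: <answer>" for r in readings)

def clarify_first(prompt: str) -> str:
    # Alternative strategy: ask the user before answering.
    readings = enumerate_interpretations(prompt)
    if len(readings) == 1:
        return f"Answer assuming: {readings[0]}"
    return "Did you mean " + " or ".join(readings) + "?"
```

The trade-off is visible even in the stub: one-shotting grows the answer with every extra reading, while clarifying costs an extra conversational turn but commits to a single interpretation.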

So yes, there is a ton of research on the problem. Some datasets include ambiguous questions and instructions for this reason. A couple of examples are provided in the linked paper.

It's not that humans can't misread the question too; it's that, overall, LLMs seem to have far less ability to correctly understand a prompt than the average human. And the "intelligence" shown in their understanding of the prompt seems to be far less than the "intelligence" in their answers.

So it feels like a major limitation, and a big bottleneck to getting a good answer.

  • I think we're also miscommunicating, so this isn't really a surprise.

    It is not clear to me why you've put "intelligence" in quotes, or why you treat understanding and answering as having independent intelligences.

    But yes, I agree there are limitations. Much like those above that are being researched.