Comment by hmokiguess

1 month ago

To me the only acceptable answer would be “what do you mean?” or “can you clarify?” if we were to take the question seriously to begin with. People don’t intentionally communicate with riddles and subliminal messages unless they have some hidden agenda.

Sure, if an open-ended response were allowed; but if it were a multiple-choice question, you'd have to use your common sense and pick one.

However, the important issue here really isn't the ability of humans or LLMs to recognize logic puzzles. If you were asking an LLM for real-world advice, trying to be as straightforward as possible, you might still get a response just as bad as "walk" without being able to recognize that it was bad, and the reason for the failure would be exactly the same as here: a failure to plan and reason through consequences.

It's toy problems like this that should make you step back once in a while and remind yourself of how LLMs are built and how they are therefore going to fail.

How is that a "subliminal message"? It's just a simple example of common sense, which LLMs fail because they can't reason, not because they are "overthinking". If somebody asks, "What's 2+2?", they might be insulting you, but that doesn't mean the answer is anything other than 4.

  • 2+2 might well not equal 4, since you haven’t specified the base of the numbers or the modulus of the addition.

    And what if it’s a full service car wash and you’ve parked nearby because it’s full so you walk over and give them the keys?

    Assumptions make asses of us all…

    • So you're saying it would be useful for an "AI assistant" to ask you for the base each time you give it a math problem? Do you also want it to ask you if you're using the conventional definitions of "2" and "+"? For the car wash, would you like it to ask if you're on Earth or on Mars? Do you have air in your tires? Is the car actually a toy car?

      Some assumptions are always necessary and reasonable, that's why I'm saying the "AI" lacks common sense.


  • It’s common sense to ask a question in riddle format? What’s the goal of the person asking the question? To challenge the other person? In what way? See if they get the obvious? Asking for clarification isn’t valid?

    • It's common sense to know that you need to have your car with you to wash it. Yes, asking the question is a test of the obvious. If you asked an AI "what's 2+2" and it said 3, would you argue that the question was a trick question?

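As an aside, the base/modulus quibble above can be made concrete. A minimal Python sketch (the `to_base` helper is hypothetical, not something from the thread): the *value* of 2 + 2 is always four, but its written form depends on the base, and modular addition changes the result itself.

```python
def to_base(n: int, base: int) -> str:
    """Render a non-negative integer in the given base (2..10)."""
    if n == 0:
        return "0"
    digits = []
    while n:
        digits.append(str(n % base))
        n //= base
    return "".join(reversed(digits))

print(to_base(2 + 2, 10))  # "4"  -- the usual answer
print(to_base(2 + 2, 3))   # "11" -- same value, written in base 3
print((2 + 2) % 3)         # 1    -- addition modulo 3
```

Which is the grandparent's point in miniature: without the conventional assumptions (base 10, ordinary integer addition), even "2+2" is underspecified.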

Thing is, it's not a riddle or a subliminal message. Everything needed to answer the question is contained therein.

  • That's precisely what makes it a "trick question" or a "riddle". It's weird precisely because all the information is there. Most people with functioning brains and complete information don't ask pointless questions (they would, obviously, just drive their car to the car wash). There's no functional or practical reason for the communication, which is what gives it the status of a puzzle: the phrasing, and the exploitation of our tendency to assume questions are asked because information is incomplete, tricks us into bringing outside considerations to bear that don't matter.

  • I don't think it is, though. Where is the car? Do you want to wash your car at the car wash? Both of those are rather important pieces of information. Everyone is relying on assumptions to answer the question, which is fine, but in my opinion not a great reasoning test.

  • If you want to argue that, then you could also argue that everything needed to challenge the question's motives and its validity is also contained therein.

    This reminds me of people who answer with “Yes” when presented with options where both can be true but the expected outcome is to pick one. For example, the infamous: “Will you be paying with cash or credit sir?” then the humorous “Yes.”

If you were forced to answer either-or, which one would you pick? I think that's where the interesting dynamic comes from. Most humans would pick "drive", as also seen in the human control, even if that share is lower than I thought it'd be.

  • Sure, though then we’re in la la land. What’s a real life example of being forced to answer an absurd question other than riddles, games, etc? No longer a valid question through normal discourse at that point, and if context isn’t provided then I think the expected outcome still is to ask for clarification.

I would love to see LLMs start to ask clarifying questions. That feels like it would be a step up, similar to reasoning.

  • Claude Code has an entire tool for the LLM to ask clarifying questions: it'll give you three pre-written responses, or you can respond with your own text.