Comment by MPSimmons

19 hours ago

Oh, I'm interested - do you have any docs with human responses to that?

“Car Wash” test with 53 models

https://news.ycombinator.com/item?id=47031580

  • "Correct" is pushing it; the question is too vague if approached as a genuine question and not a gotcha. I've actually had literal experiences where I wanted to wash my car and walked to a car wash in the past. That was me collecting the car, and there is an argument that "walk" would be a valid answer.

    If we require logical rigour there isn't enough context in the question. If we allow for informal language then there are absolutely situations where cars get washed and people walk 50 meters to the car wash. It is a reasonable guess that the car is already at the wash and you have a 2nd car, given the question is being asked. It's a slight leap, but it is an inference that makes the question meaningful and so it is one that could be made.

    I'd assume the LLMs are just failing at spatial reasoning, because AFAIK they're terrible at it. But both answers are justifiable because we don't know where the car is and have to make assumptions.

  • this reminds me, I grew up in an area of the US where the pinnacle of existence was spending the whole weekend doing chores such as very publicly washing your own car in your driveway

    if you were an able-bodied man there was no other duty. the same went for shoveling snow, mowing the lawn, or cleaning up inside the house

    these are all things I've rejected and exempt myself from

    but I'm beginning to remember large swaths of society live under that regime, so driving to a car wash wouldn't be an option at all. you wash your car and have a separate desire to walk to the car wash for some other reason

    I could see people thinking it's a trick question, or just scoffing at the idea that people wash their cars at the car wash, and polluting the data for AIs in annotation work.

    • Sometimes I miss washing my car on the driveway. I guess I’m far less emotionally attached to my car now than I was in the 1980s.