Comment by xlii

6 months ago

Tried the same, results made me laugh.

Completely clueless. I've seen passing prompts 8 about how it's not in the city I am and yet it tries again and again. My favourite moment was when it started analysing a piece of blurry asphalt.

After 6 minutes, o3 was confidently wrong: https://imgur.com/a/jYr1fz1

IMO not-in-US is actually a great test of whether something was in the LLM's training data and the whole search is just for show.

5 comments

SamPatt 6 months ago

I'm surprised to hear that. I keep running tests and the results are incredible, not only in the US.

For example, here's a screenshot from a random location I found in Google Street View in Jordan:

https://cdn.jsdelivr.net/gh/sampatt/media@main/posts/2025-04...

And here's o3 nailing it:

https://cdn.jsdelivr.net/gh/sampatt/media@main/posts/2025-04...

Maybe using Google Street View images, zoomed out, tends to give more useful information? I'm unsure why there's such variance.

lolinder 6 months ago

Perhaps Google Street View is in the training set? These companies have basically scraped everything they can, I don't see any reason to believe they'd draw the line at scraping each other, and GSV is a treasure trove of labeled data.

chatmasta 6 months ago

I’ve had nearly 100% success with vacation photos in Europe, some simple landscapes and some obscured angles of landmarks. And that’s using the free ChatGPT with no CoT.

xlii 6 months ago

IMO the "thought" process is completely fake.
I wanted o3 to succeed, so I gave it more and more details. Every attempt took approx. 8 minutes, and it took 1h in total.
The extra input I provided (in order):
- a 40km-wide belt containing the location (searches and results still fell outside that range)
- explicitly stated cities to omit (instruction ignored)
- construction date (wasn't used in searches)
- OSM amenity (townhall) and street number (it insisted this was incorrect and kept giving other results); at that point there were only 6 results from Overpass
- another photo with the actual partial name of the city
- 8 minutes later it reported that it had found it by correlating the flag colors in front of the building
As others stated, this "thought" process is completely hallucinated. IMO you either fall into a bucket or good luck finding it.
On the other hand, I decided to try out Gemini for a personal project and found its responses much better than GPT's. Not in terms of correctness, but in "attitude".
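
For anyone who wants to reproduce the Overpass narrowing step mentioned above, here is a minimal sketch of how such a query might be built. It is not the query actually used: the bounding box stands in for the 40km belt, and the coordinates, house number, and function names are all hypothetical. It assumes the public Overpass endpoint at overpass-api.de.

```python
import json
import urllib.parse
import urllib.request

# Public Overpass instance (assumption; any Overpass endpoint should work).
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_townhall_query(south, west, north, east, housenumber=None):
    """Build an Overpass QL query for amenity=townhall inside a bounding box.

    The bounding box is a rough stand-in for the 40km belt; the optional
    house number filter matches the addr:housenumber tag.
    """
    tag_filter = '["amenity"="townhall"]'
    if housenumber is not None:
        tag_filter += f'["addr:housenumber"="{housenumber}"]'
    # Overpass expects bounding boxes as (south, west, north, east).
    bbox = f"({south},{west},{north},{east})"
    return f"[out:json][timeout:25];\nnwr{tag_filter}{bbox};\nout center;"


def fetch_candidates(query):
    """POST the query to Overpass and return the matching OSM elements."""
    data = urllib.parse.urlencode({"data": query}).encode()
    with urllib.request.urlopen(OVERPASS_URL, data=data, timeout=30) as resp:
        return json.load(resp)["elements"]


if __name__ == "__main__":
    # Hypothetical coordinates and house number, for illustration only.
    print(build_townhall_query(50.0, 19.0, 50.4, 20.0, housenumber="1"))
```

With only a handful of candidates coming back, checking each one by hand against the photo is quick, which is what makes the model ignoring those constraints so striking.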

SamBam 6 months ago

Huh, I've been very impressed. I've given it photos I took in a Nairobi slum, a random non-iconic street in Bath, a closeup of a road in Tuscany, and a small playground in Jakarta, and it got them all perfectly.