Comment by onlyrealcuzzo

2 years ago

Is this something we really expect AI to get right with high accuracy with an image like that?

For one, there's a huge dark line, and it isn't clear to me what it is or what it means for street crossings.

I am definitely not confident I could answer that question correctly.

The answer Bard gave is not even very coherent. I got very similar results with GPT-4V as well. This makes me very curious about how exactly these models "see". Are they intelligently following the route from its starting point all the way along, or are they just scanning it top-to-bottom, left-to-right? Seemingly, the latter is the case.

I expected that the AI would be able to understand that, say, taking a right turn from a straight road onto a sub-road definitely involves a crossing (since I specified that one is running on the left of the road), and that it would try answering along those lines.

  • Maybe a heavily fine-tuned image AI would get this right.

    I don't see a world in which a general model like GPT or Gemini gets stuff like this correct with high accuracy any time soon.