Comment by roywiggins
11 hours ago
The models absolutely do know what the standard orientation is for a scan. They respond extensively about what they're looking for and what the correct orientation would be, more or less accurately. They are aware.
They then give the wrong answer, hallucinating anatomical details in the wrong place, etc. I didn't bother with extensive prompting because the models don't evince any confusion about the criteria; they just seem not to understand spatial orientation very well, so more prompting seemed unlikely to help.
The thing is that it's very, very simple: an axial slice of a brain is basically egg-shaped. You can work out whether it's pointing vertically (ie, nose pointing towards the top of the image) or horizontally just by looking at it. LLMs will insist it's pointing vertically when it isn't. It's an easy task for someone with eyes.
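To illustrate how simple the task is, here's a minimal sketch (my own, not anything from the comment) that decides the question with second-order image moments: for an egg-shaped binary mask, the long axis is whichever direction the foreground pixels spread out more. The synthetic ellipse "slice" is an assumption standing in for a real scan.

```python
# Decide whether an egg-shaped mask's long axis is vertical or
# horizontal by comparing the variance of foreground pixel
# coordinates along each image axis.
import numpy as np

def long_axis(mask: np.ndarray) -> str:
    """Return 'vertical' or 'horizontal' for a 2D binary mask."""
    ys, xs = np.nonzero(mask)
    # More spread along rows (y) than columns (x) => long axis is vertical.
    return "vertical" if ys.var() > xs.var() else "horizontal"

# Synthetic "axial slice": an ellipse taller than it is wide.
h, w = 128, 128
yy, xx = np.mgrid[:h, :w]
vertical_egg = ((yy - 64) / 50) ** 2 + ((xx - 64) / 30) ** 2 <= 1

print(long_axis(vertical_egg))    # vertical
print(long_axis(vertical_egg.T))  # horizontal
```

A few lines of arithmetic settle what the models get wrong, which is what makes the failure so striking.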
Essentially all images of brains an LLM will have seen will be in the standard orientation, which is either a help or a hindrance, and I think in this case a hindrance: it's not that it's seen lots of brains and doesn't know which are correct, it's that it has only ever seen them in the standard orientation and it can't see the trees for the forest, so to speak.