Comment by bandrami
5 days ago
If you say the image models don't "see" you also have to say the text models don't "read": there's a meaningful case to be made for either claim but then you're left saying "they behave as if they see" or "they behave as if they read".
No comments yet
Contribute on Hacker News ↗