Comment by bogdanoff_2
2 days ago
I didn't even notice the text in the image at first...
This isn't even about resizing, it's just about text in images becoming part of the prompt and a lack of visibility about what instruction the agent is following.
While I also did not see the hidden message in the image, the concept of gerrymandering the color at higher resolutions nearest neighbor to actually render different content at different resolutions is a more sophisticated attack than simply hiding barely text in the image.
There's two levels of attack going on here. The model obeying text stored into an image is bad enough, but they found a way to hide the text so it's not visible to the user. As a result even if you're savvy and know your VLM/LLM is going to obey text in an image, you would look at this image and go 'seems safe to send to my agent'.