Comment by bogdanoff_2

6 months ago

I didn't even notice the text in the image at first...

This isn't even about resizing, it's just about text in images becoming part of the prompt and a lack of visibility about what instruction the agent is following.

2 comments

bogdanoff_2

bradly 6 months ago

While I also did not see the hidden message in the image, the concept of gerrymandering the color at higher resolutions nearest neighbor to actually render different content at different resolutions is a more sophisticated attack than simply hiding barely text in the image.

kg 6 months ago

There's two levels of attack going on here. The model obeying text stored into an image is bad enough, but they found a way to hide the text so it's not visible to the user. As a result even if you're savvy and know your VLM/LLM is going to obey text in an image, you would look at this image and go 'seems safe to send to my agent'.