Comment by themanmaran
1 year ago
We've tested basic prompt injections within images, but have not been able to reliably trigger any adverse effects.
However there are two big bugs we've found with VLMs:
1. Correcting the document. Say you have an income statement where the line items add up to $1,001, but the total says $1,000. The model will frequently "correct" the final output to match. Which would be terrible if you were trying to build an "identify mistakes in these documents" type tool.
2. Infinite loops. Sometimes the model gets hung up on a particular token and repeats it until the request times out. This gets triggered a lot by markdown table separators: |---|---|----------------->
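For the first bug, one mitigation is to verify the extracted figures in code rather than trusting the model's transcription to preserve the document's inconsistencies. A minimal sketch (the `check_total` function and the sample figures are hypothetical, not from any real pipeline):

```python
# Hypothetical check: compare extracted line items against the stated total,
# so a mismatch in the source document is flagged instead of silently "fixed".
def check_total(line_items: list[float], stated_total: float, tol: float = 0.005) -> bool:
    """Return True only if the line items actually sum to the stated total."""
    return abs(sum(line_items) - stated_total) <= tol

# Line items add up to $1,001 but the document's total says $1,000:
items = [400.00, 350.00, 251.00]
print(check_total(items, 1000.00))  # → False: the document contains a mistake
```

The point is that the ground truth for "is this document internally consistent" should come from arithmetic over the extracted values, not from whatever total the model emits.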
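For the second bug, a cheap guard is to watch the model's output for degenerate repetition and cut the request off early instead of waiting for the timeout. A sketch under assumed parameters (the pattern length and repeat count are arbitrary choices, and `is_degenerate` is a hypothetical helper):

```python
# Hypothetical guard: detect when generated text has collapsed into a short
# pattern repeated many times (e.g. "|---|---|---..." in a markdown table).
def is_degenerate(text: str, max_pattern: int = 8, min_repeats: int = 20) -> bool:
    """True if the text ends with a short pattern repeated min_repeats times."""
    for size in range(1, max_pattern + 1):
        pattern = text[-size:]
        tail = text[-size * min_repeats:]
        if len(tail) == size * min_repeats and tail == pattern * min_repeats:
            return True
    return False

print(is_degenerate("| Revenue | $1,001 |" + "|---" * 50))  # → True
print(is_degenerate("| Revenue | $1,001 |\n|---|---|"))     # → False
```

In a streaming setup you'd run a check like this every N tokens and abort the generation once it fires, which bounds the cost of the loop instead of eating the full timeout.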