Comment by themanmaran
1 year ago
We've tested basic prompt injections within images, but have not been able to reliably trigger any adverse effects.
However there are two big bugs we've found with VLMs:
1. Correcting the document. Say you have an income statement where the line items add up to $1,001, but the total says $1,000. The model will frequently "correct" the final output to match. Which would be terrible if you were trying to build an "identify mistakes in these documents" type tool.
2. Infinite loops. Sometimes the model gets hung up on a particular token and repeats it until the request times out. This gets triggered a lot by markdown table separators: |---|---|----------------->
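For the first bug, one mitigation is to verify the extracted figures in code rather than trusting the model's transcription to preserve the document's inconsistencies. A minimal sketch (the `check_total` function and the sample figures are hypothetical, not from any real pipeline):

```python
# Hypothetical check: compare extracted line items against the stated total,
# so a mismatch in the source document is flagged instead of silently "fixed".
def check_total(line_items: list[float], stated_total: float, tol: float = 0.005) -> bool:
    """Return True only if the line items actually sum to the stated total."""
    return abs(sum(line_items) - stated_total) <= tol

# Line items add up to $1,001 but the document's total says $1,000:
items = [400.00, 350.00, 251.00]
print(check_total(items, 1000.00))  # → False: the document contains a mistake
```

The point is that the ground truth for "is this document internally consistent" should come from arithmetic over the extracted values, not from whatever total the model emits.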
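For the second bug, a cheap guard is to watch the model's output for degenerate repetition and cut the request off early instead of waiting for the timeout. A sketch under assumed parameters (the pattern length and repeat count are arbitrary choices, and `is_degenerate` is a hypothetical helper):

```python
# Hypothetical guard: detect when generated text has collapsed into a short
# pattern repeated many times (e.g. "|---|---|---..." in a markdown table).
def is_degenerate(text: str, max_pattern: int = 8, min_repeats: int = 20) -> bool:
    """True if the text ends with a short pattern repeated min_repeats times."""
    for size in range(1, max_pattern + 1):
        pattern = text[-size:]
        tail = text[-size * min_repeats:]
        if len(tail) == size * min_repeats and tail == pattern * min_repeats:
            return True
    return False

print(is_degenerate("| Revenue | $1,001 |" + "|---" * 50))  # → True
print(is_degenerate("| Revenue | $1,001 |\n|---|---|"))     # → False
```

In a streaming setup you'd run a check like this every N tokens and abort the generation once it fires, which bounds the cost of the loop instead of eating the full timeout.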