Comment by kyriakos

15 days ago

I find that LLMs can read text off product label photos I can't even read myself.

If you don't know what the text says, do you have access to some other form of ground truth? Because otherwise you don't know if they're reading illegible labels correctly!

  • I know what the text says because I have the actual product on hand :) But you're right: if the LLM can't read the label, it will probably fill in the gaps with hallucinations.

Yes, they usually can! We delved into the mathematics behind this a bit in the blog, but tl;dr: the LLMs are making educated guesses based on embedding similarities, which can be detrimental for OCR systems.