Comment by xnx

17 days ago

Your OCR vendor would be smart to replace their own system with Gemini.

They will, and they'll still have a solid product to sell, because their value proposition isn't accurate OCR per se, but putting an SLA on it.

Reaching reliability with LLM OCR might involve some combination of multiple LLMs (and keeping track of how they change), perhaps mixed with old-school algorithms, and random sample reviews by humans. They can tune this pipeline however they need at their leisure to eke out extra accuracy, and then put written guarantees on top, and still be cheaper for you long-term.
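To make that concrete, here is a rough sketch of what such a pipeline could look like. It is only an illustration under assumptions: `ensemble_ocr`, `pick_for_audit`, and the `OcrEngine` callables are hypothetical names, not any vendor's actual system. Each engine stands in for an LLM or an old-school OCR algorithm, and exact-match voting is the simplest possible way to compare their outputs.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Hypothetical engine interface: takes image bytes, returns the recognized text.
# In practice each one would wrap an LLM call or a classic OCR library.
OcrEngine = Callable[[bytes], str]


def ensemble_ocr(image: bytes, engines: Sequence[OcrEngine]) -> Tuple[str, float, bool]:
    """Run every engine and majority-vote on the exact output string.

    Returns (text, agreement, needs_review), where agreement is the share of
    engines that produced the winning text. Anything without a strict
    majority is flagged for a human.
    """
    results = [engine(image) for engine in engines]
    text, votes = Counter(results).most_common(1)[0]
    agreement = votes / len(results)
    return text, agreement, agreement <= 0.5


def pick_for_audit(doc_ids: Sequence[str], rate: float, seed: int = 0) -> List[str]:
    """Draw a random sample of documents for human spot checks."""
    if not doc_ids:
        return []
    rng = random.Random(seed)
    k = max(1, round(len(doc_ids) * rate))
    return rng.sample(list(doc_ids), k)
```

Flagged documents plus the random audit sample go to human reviewers, and the error rate measured on that sample is what would back a written guarantee. A real system would compare outputs more leniently than exact string equality (per field, or by edit distance), but the shape stays the same.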

With “Next generation, extremely sophisticated AI”, to be precise, I'd say. ;)

Marketing joke aside, a hybrid approach could serve the vendor well: the best of both worlds, if it pays off. They could also look at Hugging Face for more specialized models, which may do better on this task.