Comment by minimalengineer

17 days ago

Two years ago, I worked for a company that had its own proprietary AI system for processing PDFs. While the system handled document ingestion, its real value was in extracting and analyzing data to provide various insights. However, one key requirement was rendering documents in HTML with as close to a 1:1 likeness as possible.

At the time, I evaluated multiple SDKs for both OCR and non-OCR PDF conversion, but none matched the accuracy of Adobe Acrobat’s built-in solution. In fact, at one point (don’t laugh), the company resorted to running Adobe Acrobat on a Windows machine with automation tools to handle the conversion. Using Adobe’s cloud service for conversion was not an option due to the proprietary nature of the PDFs. Additionally, its results were inconsistent and often worse than those of the desktop version of Adobe Acrobat!

Given that experience, I see this primarily as an HTML/text conversion challenge. If Gemini 2.0 truly improves upon existing solutions, it would be interesting to see a direct accuracy comparison against popular proprietary tools.