
Comment by cdrini

9 days ago

What differentiates this from other tools, e.g. Tesseract or EasyOCR?

I was going to say, isn't Tesseract /already/ OCR for everyone?

  • I played with Tesseract a long time ago. Have its accuracy and success rate improved much in the last decade or so?

    • I maintain a searchable archive of historical documents for a nonprofit, OCR'd with Tesseract over several years. Tesseract 4 was a big improvement over previous versions, but since then its accuracy has not improved at the same rate as other free solutions.

      These days, just uploading a PDF of scanned documents (typeset ones, not handwriting) to Google Drive and opening it with Google Docs produces a text document with impressively high-quality OCR.

      But this is not scriptable, and it doesn't provide access to position information, which we need so we can highlight search results as color overlays on the original PDF. Tesseract's hOCR mode was great for that.

      For the next version, we're planning to use one of the command-line wrappers for Apple's Vision framework, which is included free in macOS. A nice one that provides position information is at https://github.com/bytefer/macos-vision-ocr


    • Tesseract has had a near 100% success rate since the first time I used it in 2008 _when you read the manual_.

      Black letters on a white background, an x-height between 10 and 30 px, TIFF format, single-column layout, etc., etc., etc.

      People get terrible results because they treat it like a phone app, drop in a barely legible color JPG of a bent page, and wonder why the output is garbage.


    • Yes, anecdotally, it's a bit better now. Still nowhere near actually usable OCR software though, unless your use case is scanning clear hi-res screenshots in conventional fonts and popular languages, without tables or complicated formatting.
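For anyone wondering what the hOCR position information mentioned above looks like in practice, here is a rough sketch in Python. It assumes the hOCR was produced by something like `tesseract page.tif out hocr`, and that each word is a `span` with class `ocrx_word` whose `title` attribute starts with `bbox x0 y0 x1 y1` (pixel coordinates, origin at the top left). `HocrWordParser`, `find_hits`, and `SAMPLE_HOCR` are made-up names for illustration, and the sample markup is hand-written rather than real Tesseract output. It also ignores character-level boxes nested inside words, so treat it as a starting point only.

```python
import re
from html.parser import HTMLParser

# Matches the "bbox x0 y0 x1 y1" property inside an hOCR title attribute.
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrWordParser(HTMLParser):
    """Collects (text, (x0, y0, x1, y1)) for each ocrx_word element."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None  # bounding box of the word being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            match = BBOX_RE.search(attrs.get("title") or "")
            if match:
                self._bbox = tuple(int(g) for g in match.groups())
                self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside a word element.
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._bbox is not None:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._bbox = None


def find_hits(hocr, query):
    """Return (word, bbox) pairs whose text contains the query, ignoring case."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    needle = query.casefold()
    return [(text, bbox) for text, bbox in parser.words if needle in text.casefold()]


# Hand-written sample in the shape described above (not real OCR output).
SAMPLE_HOCR = """
<div class='ocr_page' title='bbox 0 0 800 600'>
 <span class='ocr_line' title='bbox 36 92 500 120'>
  <span class='ocrx_word' title='bbox 36 92 96 116; x_wconf 95'>Annual</span>
  <span class='ocrx_word' title='bbox 104 92 180 116; x_wconf 91'>Report</span>
 </span>
</div>
"""

if __name__ == "__main__":
    for word, (x0, y0, x1, y1) in find_hits(SAMPLE_HOCR, "report"):
        print(f"{word}: left={x0} top={y0} right={x1} bottom={y1}")
```

Each returned box can then be scaled from image pixels to PDF page units and drawn as a translucent rectangle over the scan; that drawing step depends on whichever PDF library you use, so it is left out here.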

This tool says it includes a workflow GUI and refinement tools, such as creating work-specific text recognition models. Maybe the others do too? Tesseract isn’t packaged with a GUI, but many tools wrap it.

This project seems focused on making these tools more accessible and helping the user be more efficient and organized.