Comment by zihotki

5 days ago

Scribe is Tesseract. It uses tesseract.js, which is a WebAssembly port of Tesseract, so in theory the two should produce equal results. In practice, custom settings or older versions could make a difference.

What's the motivation for doing this in the browser? It seems like intentionally choosing a more difficult path to create an inferior result.

A native macOS or Windows application could use the operating system's OCR facilities and, in my experience, both produce results that are far better than Tesseract's.

  • To generate the OCR on the fly, in the browser, when you do not have proper OCR data. As someone who works on public web libraries, I see it as useful (but wasteful).