
Comment by jaffa2

9 days ago

OCR is well and good, but I thought it was mostly solved with Tesseract, so what does this bring? What I’m actually looking for is a reasonable library or usable implementation of MRC (Mixed Raster Content) compression for the resulting PDFs. Nothing I have tried comes anywhere near the commercial offerings, which cost $$$$. It seems to be a tricky problem to solve: detecting and separating the layers of the image, compressing them separately, and then binding them back together into a compatible PDF.
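To make the layer-separation step concrete, here is a minimal sketch of the idea in pure Python, working on a grayscale page held as a list of rows. It is not how any commercial product works, and the names `split_layers` and `recombine` are made up for illustration. It assumes a fixed global threshold decides what counts as text, and it stops short of encoding the layers or writing a PDF.

```python
from typing import List, Tuple

Gray = List[List[int]]  # rows of 0..255 grayscale values


def split_layers(img: Gray, threshold: int = 128, block: int = 2) -> Tuple[Gray, Gray, Gray]:
    """Split a page into (mask, foreground, background).

    mask:       full resolution, 1 where the pixel is darker than threshold.
    foreground: low resolution, mean value of the masked (text) pixels per block.
    background: low resolution, mean value of the unmasked pixels per block.
    """
    h, w = len(img), len(img[0])
    mask = [[1 if v < threshold else 0 for v in row] for row in img]

    def block_mean(select: int, fallback: int) -> Gray:
        layer = []
        for by in range(0, h, block):
            row = []
            for bx in range(0, w, block):
                vals = [img[y][x]
                        for y in range(by, min(by + block, h))
                        for x in range(bx, min(bx + block, w))
                        if mask[y][x] == select]
                # Blocks with no matching pixels get a constant fill.
                # (Assumption: a real encoder would fill these holes with
                # nearby values so the layer compresses better.)
                row.append(round(sum(vals) / len(vals)) if vals else fallback)
            layer.append(row)
        return layer

    return mask, block_mean(1, 0), block_mean(0, 255)


def recombine(mask: Gray, fg: Gray, bg: Gray, block: int = 2) -> Gray:
    """Rebuild a page: the mask picks the foreground or background value per pixel."""
    return [[fg[y // block][x // block] if m else bg[y // block][x // block]
             for x, m in enumerate(row)]
            for y, row in enumerate(mask)]
```

The point of the split is that the mask keeps sharp text edges at full resolution while the two color layers can be stored at low resolution with lossy compression. The hard parts mentioned above (robust segmentation and packing the three layers into a PDF that viewers accept) are exactly what this sketch leaves out.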

Cheap network-locked iPhone SE 2s on eBay seem to be a cost-effective way to get good accuracy: https://findthatmeme.com/blog/2023/01/08/image-stacks-and-ip...

  • Very interesting article. I'd be interested to know whether an M-series Mac Mini (the article is from early 2023, so M1 and M2 models should have been available) would have filled this role just as well.

    > My preliminary speed tests were fairly slow on my MacBook. However, once I deployed the app to an actual iPhone the speed of OCR was extremely promising (possibly due to the Vision framework using the GPU).

    I don't know a lot about the specifics of where (hardware-wise) this gets run, but I'd assume any semi-modern Mac would also have accelerated compute for this kind of thing. Running it on a Mac Mini would ease my worries about battery and heat issues. I would've guessed that Mac Minis would scale better as well, but I have no idea if that's actually the case. Also, you'd be able to run the server as a service for automatic restarts and such.

    All that said, a rack of iPhones is pretty fun.

> Ocr is well and good, i thought it was mostly solved with tesseract what does this bring?

Tesseract is nice, but it is not so good that there is no room for another, better solution.

> Ocr is well and good, i thought it was mostly solved with tesseract what does this bring?

This is specifically for historic documents that Tesseract handles poorly. It also provides a good interface for retraining models on a specific document set, which helps with documents that differ from the training set.

Run Tesseract on a screenshot and you'll be underwhelmed.

  • With proper image pre-processing, Tesseract can recognize even tiny text (5-7 px high).
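The comment above does not say which pre-processing steps it means. A common combination for tiny text is to upscale the image and then binarize it, so here is a minimal pure-Python sketch of those two steps on a grayscale image held as a list of rows. The function names and the 4x factor are assumptions for illustration; in practice you would do this with an image library and pass the result to Tesseract.

```python
from typing import List

Gray = List[List[int]]  # rows of 0..255 grayscale values


def upscale_bilinear(img: Gray, factor: int) -> Gray:
    """Enlarge img by an integer factor using bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h * factor):
        # Map the output pixel centre back into source coordinates.
        sy = min(max((y + 0.5) / factor - 0.5, 0.0), h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(w * factor):
            sx = min(max((x + 0.5) / factor - 0.5, 0.0), w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(round(top * (1 - fy) + bottom * fy))
        out.append(row)
    return out


def otsu_threshold(img: Gray) -> int:
    """Return the threshold t that maximizes between-class variance (Otsu's method).

    Pixels with value <= t are treated as the dark class.
    """
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no dark pixels yet
        w1 = total - w0
        if w1 == 0:
            break  # no light pixels left
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def preprocess(img: Gray, factor: int = 4) -> Gray:
    """Upscale, then binarize to 0 (dark, i.e. text) or 255 (light)."""
    big = upscale_bilinear(img, factor)
    t = otsu_threshold(big)
    return [[0 if v <= t else 255 for v in row] for row in big]
```

Upscaling before thresholding matters for small glyphs: interpolation gives thin strokes more pixels to survive binarization. A global Otsu threshold is only a reasonable choice for evenly lit images such as screenshots; unevenly lit scans would likely need a local (adaptive) threshold instead.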