Show HN: Local-first fast CPU image to text for screenshots, PDFs, webpages

3 days ago (github.com)

What's the performance like compared to Tesseract? I don't see Tesseract mentioned anywhere in the README, which is surprising considering it's the number one tool most people reach for when doing image-to-text OCR.

  • No rigorous eval, and I love Tesseract. Here's the example that motivated me to build textsnap (it's the one in the GitHub README), parsed with Tesseract:

    https://imgur.com/a/i2eQra8

    • Very noticeable difference, and the exact issue I run into repeatedly with Tesseract! Definitely going to try dropping textsnap into my scripts now. Thanks!!

This is awesome! Been needing something like this for some research paper diagrams I've been indexing.

How well do you think this will work with code? I mean taking screenshots of code and converting them into actual code for VS Code.

  • Just ran

      textsnap "https://i.ytimg.com/vi/LBNDfxjEYlA/maxresdefault.jpg"
    

    and got this

      $('.count').each(function () {
      $('this').prop('Counter', 0).animate({
        Counter: $('this').text()
      }, {
          duration: 4000,
          easing: 'swing',
          step: 'function (now) {
              $('this").text(Math.ceil(now));
          }
        }); 
      });
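
For anyone wanting to call this from a script rather than the shell, here is a minimal Python sketch built on the one invocation shown above (`textsnap "<path or URL>"`). It assumes textsnap prints the recognized text to stdout and exits non-zero on failure; neither is confirmed in this thread. `build_command` and `image_to_text` are hypothetical helper names, not part of textsnap.

```python
import subprocess
from typing import Callable


def build_command(source: str, executable: str = "textsnap") -> list[str]:
    """Build the argument list for one textsnap call on a file path or URL."""
    # Passing a list (not a shell string) avoids quoting problems with URLs.
    return [executable, source]


def image_to_text(
    source: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> str:
    """Run textsnap on `source` and return whatever it wrote to stdout."""
    # Assumption: the recognized text goes to stdout. check=True raises
    # CalledProcessError if the command exits with a non-zero status.
    result = run(
        build_command(source),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With textsnap on the PATH, `print(image_to_text("https://i.ytimg.com/vi/LBNDfxjEYlA/maxresdefault.jpg"))` would be the scripted equivalent of the command above. The `run` parameter exists only so the wrapper can be exercised without the tool installed.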

What was the reason for adopting PaddleOCR? Can other OCR models be used as well?

  • No reason other than their Q4 model working reasonably well and fast on my CPU laptop. It should work with any ONNX VLM.