Comment by kevin_thibedeau
8 hours ago
pdftoppm and Ghostscript (invoked via ImageMagick) re-rasterize full pages to generate their output, which is why it was slow. It's even worse with a Q16 build of ImageMagick, since that works at 16 bits per channel. Better to extract the scanned page images directly with pdfimages or mutool, which copy out the embedded images instead of rendering the page again.
Follow-up: pdfimages is 13x faster than pdftoppm.
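
For anyone who wants to script the direct extraction, here is a rough sketch in Python that shells out to poppler's pdfimages. The helper names (`pdfimages_command`, `extract_scans`) and the `page` output prefix are made up for illustration and are not from the comment above. It assumes pdfimages is on the PATH; `-all` asks it to write each embedded image in its native format rather than converting it.

```python
import shutil
import subprocess
from pathlib import Path


def pdfimages_command(pdf_path, out_prefix):
    """Build the argv for poppler's pdfimages (hypothetical helper).

    -all writes every embedded image in its stored format,
    so nothing is re-rasterized or re-encoded.
    """
    return ["pdfimages", "-all", str(pdf_path), str(out_prefix)]


def extract_scans(pdf_path, out_dir):
    """Extract embedded images from a scanned PDF into out_dir.

    Assumes pdfimages is installed; raises if it is missing.
    Returns the extracted files in sorted order.
    """
    if shutil.which("pdfimages") is None:
        raise RuntimeError("pdfimages (poppler-utils) not found on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Output files are named <prefix>-NNN.<ext>, e.g. page-000.jpg
    cmd = pdfimages_command(pdf_path, out_dir / "page")
    subprocess.run(cmd, check=True)
    return sorted(out_dir.glob("page-*"))
```

Calling `extract_scans("scan.pdf", "out")` should leave one file per embedded image in `out/`. With MuPDF installed, `mutool extract scan.pdf` does a similar job, writing images (and fonts) into the current directory.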