Comment by layer8

10 hours ago

For low-level work, qpdf can be quite useful: https://github.com/qpdf/qpdf

Came here to say this. Qpdf is my go-to for manipulating PDF files on the command line: encrypting, decrypting, extracting and merging pages.

It's Apache-licensed and written in C++.

  • How do you use qpdf for extraction when its README states “qpdf does not render PDFs or perform text extraction, and it does not contain higher-level interfaces for working with page contents”?

    • Not the person you're replying to, but when they said "extraction" I believe they're talking about extracting pages from a PDF (like "splitting" the PDF apart, page-wise), not text. At least that's a thing I've used qpdf for in the past.
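For anyone wondering what that kind of page extraction looks like in practice, here's a rough sketch that builds the qpdf command line from Python. It assumes the `qpdf` binary is on your PATH; the helper names (`qpdf_extract_cmd`, `extract_pages`) and the file names are made up for illustration and are not part of qpdf itself.

```python
import shutil
import subprocess


def qpdf_extract_cmd(src, page_range, dst):
    """Build a qpdf command that copies `page_range` of `src` into `dst`."""
    # --empty starts from a blank document instead of a primary input file;
    # --pages ... -- selects which pages of `src` go into the output.
    return ["qpdf", "--empty", "--pages", src, page_range, "--", dst]


def extract_pages(src, page_range, dst):
    """Run the command; requires the qpdf binary on PATH."""
    if shutil.which("qpdf") is None:
        raise RuntimeError("qpdf not found on PATH")
    subprocess.run(qpdf_extract_cmd(src, page_range, dst), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(qpdf_extract_cmd("in.pdf", "2-4", "out.pdf")))
```

This only moves whole pages around, so it's consistent with the README quote: nothing here reads or extracts the text on those pages.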
