Comment by brudgers 1 year ago
https://linux.die.net/man/1/pdftotext is the simplest thing that might work. It is free and mature.

jbaiter 1 year ago
That will not work for scanned PDFs without a text layer, and even if there is one, it's not guaranteed to work.

brudgers 1 year ago
"Might work" comes with neither express nor implied warranty. OCR is another thing that might work which is also simpler than an LLM.
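For anyone who wants to try the approach discussed above, here is a minimal sketch in Python, not a definitive implementation. It assumes the `pdftotext` binary is installed and on PATH. The helper names (`build_command`, `needs_ocr`, `extract_text`) and the 20-character threshold are made up for illustration; the threshold is only a rough guess at "no usable text layer", which is the scanned-PDF case raised in the replies.

```python
import shutil
import subprocess


def build_command(pdf_path: str) -> list[str]:
    # Passing "-" as the output file makes pdftotext write to stdout.
    # -layout asks it to preserve the physical layout as well as it can.
    return ["pdftotext", "-layout", pdf_path, "-"]


def needs_ocr(text: str, min_chars: int = 20) -> bool:
    # Hypothetical heuristic: if almost nothing but whitespace and page
    # breaks came back, the PDF probably has no usable text layer.
    visible = "".join(text.split())
    return len(visible) < min_chars


def extract_text(pdf_path: str) -> str:
    # Fail early with a readable message if the tool is missing.
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(
        build_command(pdf_path),
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=True,
    )
    return result.stdout
```

Usage would be something like `text = extract_text("paper.pdf")`, then checking `needs_ocr(text)` to decide whether to hand the file to an OCR tool instead. As noted in the thread, a text layer existing does not guarantee it is correct, so this check only catches the empty case.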