Comment by vr46
3 days ago
I’ll have to test this against my local Python pipeline, which does all this without an LLM in attendance. There are a ton of existing Python libraries that have been doing this for a long time, so let’s take a look.
3 days ago
Care to share the best ones for some use cases? Thanks
MinerU
PDFQuery
PyMuPDF (having more success with older versions right now)