Comment by sergiotapia
2 days ago
Use PyMuPDF to extract the PDF text. Hell, run that nasty business through an LLM as a second step to get a beautiful, clean Markdown version of the text. Lord knows the PDF format is horribly complex!
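
Roughly what that two-step idea could look like, as a minimal sketch rather than a tested pipeline. The extraction uses PyMuPDF's `open` and `Page.get_text`. The function names `extract_pages` and `build_cleanup_prompt` and the prompt wording are made up for illustration, and the actual LLM call is left out since it depends on whichever provider you use.

```python
from typing import List


def extract_pages(pdf_path: str) -> List[str]:
    """Step 1: return the plain text of each page, using PyMuPDF."""
    # Imported lazily so the prompt helper below works without PyMuPDF installed.
    try:
        import pymupdf  # module name in recent PyMuPDF releases
    except ImportError:
        import fitz as pymupdf  # older releases expose the same API as "fitz"

    with pymupdf.open(pdf_path) as doc:
        return [page.get_text() for page in doc]


def build_cleanup_prompt(pages: List[str]) -> str:
    """Step 2 (hypothetical): wrap the raw text in a prompt for an LLM of your choice."""
    # Drop empty pages and join the rest with blank lines between them.
    body = "\n\n".join(p.strip() for p in pages if p.strip())
    return (
        "Rewrite the following text extracted from a PDF as clean Markdown. "
        "Keep the wording; fix only layout, headings, and broken line wraps.\n\n"
        + body
    )
```

You would call `build_cleanup_prompt(extract_pages("paper.pdf"))` and send the result to your model. For long documents it is probably safer to send a few pages at a time so nothing falls outside the context window.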