Comment by elliotto
1 day ago
We OCR them with an LLM into markdown. It's super expensive and slow, but way more reliable than trying to decode the insanely structured PDFs that users upload, which often include pages that are just images of text, or diagrams and figures that need to be read.
Really depends on your scale and speed requirements.
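For anyone curious what that looks like, here is a rough sketch of the page-by-page loop, not our actual code. It assumes the PDF has already been rendered to one image per page, and that `ocr_page` is a hypothetical callable wrapping whatever vision-capable LLM you use. Both names are made up for illustration.

```python
from typing import Callable, Iterable

# Hypothetical signature: takes one rendered page image (e.g. PNG bytes)
# and returns that page transcribed as markdown. In practice this would
# wrap a request to a vision-capable LLM; that part is not shown here.
PageOcr = Callable[[bytes], str]


def pdf_pages_to_markdown(page_images: Iterable[bytes], ocr_page: PageOcr) -> str:
    """Transcribe each page image and join the results into one document."""
    sections = []
    for number, image in enumerate(page_images, start=1):
        markdown = ocr_page(image).strip()
        if markdown:  # skip pages that came back empty
            # Keep a page marker so output can be traced back to the source PDF.
            sections.append(f"<!-- page {number} -->\n{markdown}")
    return "\n\n".join(sections)
```

One LLM call per page is where the cost and latency come from, so this is the loop you would batch, parallelise, or cache if scale becomes a problem.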