Comment by snyy
6 months ago
Pretty much what you described: convert the PDF to Markdown, join the content across pages so that it's all one string, then chunk it. Our evals show this approach works best.
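Roughly, the join-then-chunk part could look like the sketch below. This is only an illustration, not our actual pipeline: it assumes the PDF has already been converted to one Markdown string per page, and `join_pages`, `chunk_text`, the newline separator, and the `max_chars`/`overlap` defaults are all made-up names and values for the example.

```python
def join_pages(pages: list[str], sep: str = "\n") -> str:
    """Join per-page Markdown into one string, skipping empty pages.

    A single newline is assumed as the separator so that a sentence
    split by a page break stays in the same Markdown paragraph.
    """
    return sep.join(p.strip() for p in pages if p.strip())


def chunk_text(text: str, max_chars: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Fixed-size windows are a placeholder strategy for illustration;
    any chunker can be run on the joined string instead.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be in [0, max_chars)")

    step = max_chars - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        # Stop once the current window reaches the end of the text.
        if start + max_chars >= len(text):
            break
    return chunks


if __name__ == "__main__":
    pages = ["# Report\n\nThis sentence starts on page one", "and ends on page two.\n"]
    document = join_pages(pages)
    for chunk in chunk_text(document, max_chars=40, overlap=10):
        print(repr(chunk))
```

The point of joining first is that the chunker never sees page boundaries, so a paragraph that spans two pages can land in a single chunk.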