cpursley 1 day ago How are you prepping the PDF data before shoving it into Qwen?
Alifatisk 1 day ago I just compress the file as small as possible without losing quality; I didn't even know there were more ways to prep it. I do sometimes chop up the PDF into smaller PDFs, one per chapter.
amelius 1 day ago On Linux you can also use pdftotext if you are only concerned with the text.
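For anyone wanting to try the two suggestions above together, here is a minimal sketch that calls pdftotext once per chapter using its `-f` (first page) and `-l` (last page) options. It assumes poppler-utils is installed and that you already know the page range of each chapter; the function names and the `chapters` mapping are hypothetical, not something either commenter described.

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_cmd(pdf, txt, first=None, last=None, layout=False):
    """Build the argument list for one pdftotext call (pages are 1-based, inclusive)."""
    cmd = ["pdftotext"]
    if layout:
        cmd.append("-layout")  # keep the physical layout of the page
    if first is not None:
        cmd += ["-f", str(first)]  # first page to convert
    if last is not None:
        cmd += ["-l", str(last)]  # last page to convert
    return cmd + [str(pdf), str(txt)]


def extract_chapters(pdf, chapters, out_dir="."):
    """Write one text file per chapter.

    chapters is a hypothetical mapping: name -> (first_page, last_page).
    """
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; it ships with poppler-utils")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name, (first, last) in chapters.items():
        txt = out / f"{name}.txt"
        subprocess.run(pdftotext_cmd(pdf, txt, first, last), check=True)
        written.append(txt)
    return written
```

Usage would look like `extract_chapters("book.pdf", {"ch01": (1, 20), "ch02": (21, 45)}, "text")`. Note this produces text files rather than smaller PDFs, so it only fits if the layout and images do not matter for your prompt.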
navbaker 1 day ago Not OP, but we use the docling library to extract text and put it in markdown before storing for use with an LLM.
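The comment above does not show the actual pipeline, so the following is only a rough sketch of a PDF-to-Markdown step based on docling's basic `DocumentConverter` usage. It assumes docling is installed (`pip install docling`); `markdown_path` and `pdf_to_markdown` are hypothetical helper names, and where the Markdown gets stored afterwards is left out.

```python
from pathlib import Path


def markdown_path(pdf_path, out_dir="."):
    """Return the .md path that mirrors the PDF's file name inside out_dir."""
    return Path(out_dir) / (Path(pdf_path).stem + ".md")


def pdf_to_markdown(pdf_path, out_dir="."):
    """Convert one PDF to Markdown with docling and write it next to out_dir."""
    # Imported lazily so markdown_path() works even without docling installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(str(pdf_path))
    markdown = result.document.export_to_markdown()

    target = markdown_path(pdf_path, out_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(markdown, encoding="utf-8")
    return target
```

Calling `pdf_to_markdown("report.pdf", "md")` would write `md/report.md`, which can then be chunked or stored however your LLM setup expects.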