Comment by daft_pink
1 month ago
Could we make a method to sanitize PDFs that preserves the metadata?
It would be better to strip active content such as JavaScript and actions without flattening the PDF and losing all the text data. Keeping the original text is better than sending it through OCR again.
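
To make the idea concrete, here is a rough sketch of what "strip active content, keep everything else" could look like. It is a toy model, not a real sanitizer: plain Python dicts and lists stand in for an already-parsed PDF object tree, and the function name `strip_active` and the two key sets are my own assumptions. The key names (`/OpenAction`, `/AA`, `/A`, `/JS`, `/JavaScript`) are standard PDF dictionary keys, but the list is not exhaustive, and a real tool would need a proper PDF library to parse and rewrite the file.

```python
# Toy model: nested dicts/lists stand in for parsed PDF dictionaries and arrays.
# Keys that only exist to trigger or hold scripts/actions (assumed list, not exhaustive).
ALWAYS_DROP = {"/OpenAction", "/AA", "/JavaScript", "/JS"}
# Action types treated as "active"; /URI and /GoTo links are deliberately kept.
ACTIVE_ACTION_TYPES = {"/JavaScript", "/Launch", "/SubmitForm", "/ImportData"}


def strip_active(obj):
    """Return a copy of obj with active-content entries removed.

    Text content and metadata entries are copied through unchanged.
    """
    if isinstance(obj, dict):
        cleaned = {}
        for key, value in obj.items():
            if key in ALWAYS_DROP:
                continue  # conservative: drop even a harmless /OpenAction
            if (
                key == "/A"
                and isinstance(value, dict)
                and value.get("/S") in ACTIVE_ACTION_TYPES
            ):
                continue  # drop the action, keep the annotation itself
            cleaned[key] = strip_active(value)
        return cleaned
    if isinstance(obj, list):
        return [strip_active(item) for item in obj]
    return obj


# Hypothetical document tree used only for illustration.
doc = {
    "/Type": "/Catalog",
    "/OpenAction": {"/S": "/JavaScript", "/JS": "app.alert('hi')"},
    "/Names": {"/JavaScript": {"/Names": ["x", {"/S": "/JavaScript", "/JS": "1"}]}},
    "/Info": {"/Title": "Report", "/Author": "someone"},
    "/Pages": {
        "/Kids": [
            {
                "/Contents": "BT (Hello) Tj ET",
                "/Annots": [
                    {"/Subtype": "/Link", "/A": {"/S": "/URI", "/URI": "https://example.com"}},
                    {"/Subtype": "/Widget", "/A": {"/S": "/Launch", "/F": "cmd.exe"}},
                ],
            }
        ]
    },
}

clean = strip_active(doc)
```

The point of the design is that nothing is rendered or rasterized: the page content stream and the `/Info` metadata pass through untouched, so the text stays selectable and no OCR pass is needed. Only the entries that can run something are removed.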