Comment by pavel_lishin
19 hours ago
> Yet Duff Johnson, head of the PDF Association, protector of the format, argues that the fault lies not in the file type but in ourselves. He contends that there is no reason developers cannot build bots that are able to use PDFs. The AI assistant embedded in Acrobat, Adobe’s PDF reader, is designed to do precisely that, notes Leonard Rosenthol, the software firm’s PDF guru.
Designed to, but does it do it well without the problems noted earlier in the article?
Strictly anecdotally, I've had no trouble feeding PDFs to OpenAI's bot.
The searchable PDFs get searched, and the just-pictures-of-words ones get fed through their (quite good, IMHO) OCR.
I use it all the time. It's remarkably good for locating the details I need in the poorly organized ~1,200 page factory manual for my Honda.
(Well, it's not necessarily organized poorly. It's just designed with the clear intent that it is mostly to serve as a set of repair instructions, and sometimes I don't want repair instructions. Sometimes I want to know how a thing works for my own cognitive benefit instead of how to diagnose and R&R it as a series of steps.)
I'm using paperless-ngx for personal document management, and Claude Desktop was able to read and OCR all the PDFs there just fine (through an MCP connector).
It also was able to parse my tax forms in 3 languages.