Comment by PaulHoule
16 days ago
Where I work we've had great success using LLMs to OCR paper documents that look like
https://static.foxnews.com/foxnews.com/content/uploads/2023/...
but were often written on typewriters long ago, and to get nice structured tabular output from them. This deals with text being split across lines and across pages just fine.