Comment by anotherpaulg
11 hours ago
I regularly use LLM-as-OCR and find it really helpful to:
1. Minimize the number of PDF pages per context/call. Don't dump a giant document set into one request. Break them into the smallest coherent chunks.
2. In a clean context, re-send the page and the extracted target content and ask the model to proofread/double-check the extracted data.
3. Repeat the extraction and/or the proofreading steps with a different model and compare the results.
4. Iterate until the proofreading passes without altering the data, or flag persistent proofreading failures for stronger models or human review.
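The extract → proofread → iterate loop above can be sketched roughly as follows. This is a minimal illustration, not the commenter's actual code: `extract` and `proofread` are hypothetical stand-ins for real LLM API calls (one model extracting, a different model proofreading in a clean context), and the stubs here just echo their input so the loop is runnable.

```python
def extract(page: str, model: str) -> str:
    # Stand-in: in practice, send the page (image or text) to `model`
    # and ask it to return the target fields.
    return page.strip().lower()

def proofread(page: str, extracted: str, model: str) -> str:
    # Stand-in: in a fresh context, re-send the page plus the extracted
    # data and ask the model to correct any mistakes it finds.
    return extracted  # a real model may return an amended version

def extract_with_checks(page, models=("model-a", "model-b"), max_rounds=3):
    # Step 1-2: extract with one model, proofread with a different one.
    data = extract(page, models[0])
    for _ in range(max_rounds):
        checked = proofread(page, data, models[1])
        if checked == data:
            return data, True   # proofreading passed without altering the data
        data = checked          # accept the correction and re-check
    return data, False          # flag for a stronger model or human review
```

Calling `extract_with_checks` per small page chunk (step 1) keeps each context minimal; the boolean return distinguishes clean passes from cases needing escalation (step 4).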
What's the typical cost per run for you?