
Comment by ritvikpandey21

5 months ago

Yup, this is a pretty common occurrence when using LLMs for data extraction. For personal use (loading a receipt), it's great that the LLM filled in the missing info. For production systems that need high-quality, near-100% extraction accuracy, inferring results is a failure. Think medical record parsing, financial data, etc. These hallucinations occur quite frequently, and we haven't found a way to minimize them through prompt engineering.
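
To make that failure mode concrete, here's a minimal sketch of the kind of post-hoc check a production pipeline might run (the field names and the `flag_ungrounded_fields` helper are hypothetical, not anything from the original pipeline): any extracted value that can't be grounded in the OCR'd text gets flagged as likely inferred rather than read.

    import re

    def flag_ungrounded_fields(extracted: dict[str, str], source_text: str) -> list[str]:
        """Return field names whose values never appear in the source text.

        Hypothetical check: a value that is nowhere in the OCR'd document is a
        strong signal the model inferred (hallucinated) it rather than extracted it.
        """
        normalized_source = re.sub(r"\s+", " ", source_text).lower()
        ungrounded = []
        for field, value in extracted.items():
            normalized_value = re.sub(r"\s+", " ", str(value)).strip().lower()
            if normalized_value and normalized_value not in normalized_source:
                ungrounded.append(field)
        return ungrounded

    # Example: the total was never printed on the receipt, so the model made it up.
    fields = {"merchant": "Acme Corp", "total": "42.17"}
    print(flag_ungrounded_fields(fields, "ACME CORP ... items: 3 ..."))  # ['total']

A check like this only catches the easy cases (verbatim values); it doesn't stop the model from hallucinating in the first place, which is the point above.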

It's not possible with current gen models.

To even have a chance at doing it, you'd need to start training from scratch with _huge_ penalties for filling in missing information and a _much_ larger vision component in the model.
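
For what that kind of objective might look like, here's a rough sketch (the function name, the masking scheme, and the penalty value are assumptions on my part, not how any current model is trained): weight the loss heavily on positions where the gold answer is "this field isn't in the document", so confabulating a value there costs far more than an ordinary token error.

    import torch
    import torch.nn.functional as F

    def abstention_weighted_loss(logits, targets, missing_mask, penalty=10.0):
        """Token-level cross-entropy that punishes "filling in" absent fields.

        Illustrative sketch only. Assumes the training targets use a null /
        "not present" token for fields absent from the source document, and
        that missing_mask is 1 at those positions, 0 elsewhere. Mistakes there,
        i.e. emitting a value instead of abstaining, are weighted `penalty`
        times more heavily than ordinary token errors.
        """
        # logits: (batch, seq_len, vocab); targets, missing_mask: (batch, seq_len)
        per_token = F.cross_entropy(logits.transpose(1, 2), targets, reduction="none")
        weights = 1.0 + (penalty - 1.0) * missing_mask.float()
        return (per_token * weights).sum() / weights.sum()

    # Tiny usage example with random data.
    B, T, V = 2, 5, 32
    logits = torch.randn(B, T, V)
    targets = torch.randint(0, V, (B, T))
    missing = torch.zeros(B, T)
    missing[:, -1] = 1  # pretend the last slot is a field absent from the document
    print(abstention_weighted_loss(logits, targets, missing).item())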

See an older post I made on what you need to get above-SOTA OCR that works today: https://news.ycombinator.com/item?id=42952605#42955414