Comment by nnurmanov

1 year ago

OCR is not to blame. With garbage in, you should not expect high-quality output, especially with handwriting, tables, and multiple languages. Even human beings fail to understand some documents (see doctors' prescriptions).

2 comments

devmor 1 year ago

If OCR is a solution designed to recognize documents and it does not recognize all documents, then it is an imperfect solution.

That is not to say there is a perfect solution, but it is still the fault of the solution.

nnurmanov 1 year ago

For example, a lowercase l and a capital I often look the same, which can be an issue for OCR. The ideal case is a PDF document with the data embedded as XML, but unfortunately that is not the case.
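
To make the l versus I point concrete, here is a minimal Python sketch that flags characters an OCR engine might mix up. The `CONFUSABLE` set and both function names are assumptions made up for illustration; they are not taken from any OCR library, and a real system would need a much larger, font-aware table.

```python
# Hypothetical set of glyphs that look alike in many fonts (illustrative only).
CONFUSABLE = {"l", "I", "1", "|"}


def ambiguous_positions(text):
    """Return the indexes of characters that belong to the confusable set."""
    return [i for i, ch in enumerate(text) if ch in CONFUSABLE]


def same_shape(a, b):
    """Return True if two strings differ only where both characters are confusable."""
    if len(a) != len(b):
        return False
    return all(
        x == y or (x in CONFUSABLE and y in CONFUSABLE)
        for x, y in zip(a, b)
    )


if __name__ == "__main__":
    print(ambiguous_positions("Illinois"))
    print(same_shape("Illinois", "lllinois"))
```

A check like this cannot say which reading is right; it only marks where the image alone is not enough and context (a dictionary, a field type, or a human) has to decide.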