Comment by nnurmanov
17 days ago
OCR is not to blame: when you put garbage in, you should not expect high-quality output, especially with handwriting, tables, and multiple languages. Even human beings fail to understand some documents (see doctors' prescriptions).
If OCR is a solution designed to recognize documents and it does not recognize all documents, then it is an imperfect solution.
That is not to say there is a perfect solution, but it is still the fault of the solution.
For example, a lowercase l and a capital I often look identical, which can be an issue for OCR. The ideal case is a PDF document with the data embedded as XML, but unfortunately that is not the case.
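To make the l / I problem concrete, here is a minimal sketch of one common workaround: try swapping look-alike characters and keep the result only if it lands on a known word. The confusion table, the function names, and the vocabulary are all made up for illustration; they are not taken from any particular OCR engine.

```python
from itertools import product

# Hypothetical confusion sets (assumption); real engines use larger,
# font-specific tables.
CONFUSABLE = {
    "l": "lI1",
    "I": "Il1",
    "1": "1lI",
    "O": "O0",
    "0": "0O",
}

def candidates(token):
    """Yield every spelling of token reachable by swapping look-alike characters."""
    options = [CONFUSABLE.get(ch, ch) for ch in token]
    for combo in product(*options):
        yield "".join(combo)

def correct(token, vocabulary):
    """Return token if it is known; otherwise return the single in-vocabulary
    candidate if exactly one exists; otherwise return token unchanged."""
    if token in vocabulary:
        return token
    matches = {c for c in candidates(token) if c in vocabulary}
    return matches.pop() if len(matches) == 1 else token

vocab = {"line", "Invoice", "Ill", "I11"}
print(correct("Iine", vocab))     # misread capital I fixed via the vocabulary
print(correct("lnvoice", vocab))  # misread lowercase l fixed via the vocabulary
print(correct("I1l", vocab))      # two plausible readings, so left unchanged
```

The number of candidates grows exponentially with the count of ambiguous characters in a token, so this only makes sense for short tokens, and it does nothing when the ambiguous string is not a dictionary word (IDs, serial numbers), which is exactly where the garbage-in problem bites.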