Comment by liotier

8 days ago

> We need to do end to end text recognition. Not "character recognition", it's not the characters we care about.

Arbitrary nonsensical text requires character recognition. Sure, even a license plate carries some semantics that bound expectations of what text it contains, but text that has no coherence might remain an application domain for character recognition rather than text recognition.

> Arbitrary nonsensical text requires character recognition.

Are you sure? I mean, if it's printed text in a non-connected script, where characters repeat themselves (nearly) identically, then ok, but if you're looking at handwriting - couldn't one argue that it's _words_ that get recognized? And that's ignoring the question of textual context, i.e. recognizing based on what you know the rest of the sentence to be.

  • Not really. I have an HTR use case where the data is highly specialized codes. All the OCR software I use is tripped up by trying to fit the content into the category of English words.

    LLMs can help, but I’ve also had issues where the repetitive nature of the content reliably results in terrible hallucinations.