Comment by prng2021
9 hours ago
Determining whether the latest off-the-shelf LLMs are good enough should be straightforward because of this:
“Some participants have dedicated years of their lives to the program—like Alex Smith, a retiree from Pennsylvania. Over nine years, he transcribed more than 100,000 documents”
Have different LLMs transcribe those same documents and compare to see whether the human or the machine is more accurate, and by how much.
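A rough version of that comparison is easy to sketch. Something like the following (pure Python; the file layout, and treating the volunteer transcriptions as the reference, are my assumptions) would give a word error rate per document:

    # Sketch: score a machine transcription against a human reference transcription
    # using word error rate (WER). File paths and directory layout are hypothetical.

    def levenshtein(a, b):
        """Edit distance between two token sequences."""
        prev = list(range(len(b) + 1))
        for i, x in enumerate(a, 1):
            curr = [i]
            for j, y in enumerate(b, 1):
                curr.append(min(prev[j] + 1,              # deletion
                                curr[j - 1] + 1,          # insertion
                                prev[j - 1] + (x != y)))  # substitution
            prev = curr
        return prev[-1]

    def word_error_rate(reference, hypothesis):
        ref, hyp = reference.split(), hypothesis.split()
        return levenshtein(ref, hyp) / max(len(ref), 1)

    human = open("human/doc_0001.txt").read()    # volunteer transcription (reference)
    machine = open("llm/doc_0001.txt").read()    # LLM transcription of the same scan
    print(f"WER vs. human reference: {word_error_rate(human, machine):.3f}")

The obvious caveat is that the volunteer transcriptions aren't ground truth either, so a low score measures agreement rather than correctness; the discrepancies would still need to be spot-checked by hand.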
Agree. Sounds like not wanting to let go of a legacy
This is not an LLM problem. It was solved years ago via OCR. Worldwide, postal services long ago deployed OCR to read handwritten addresses. And there was an entire industry of OCR-based data entry services, much of it translating the chicken scratch of doctors' handwriting on medical forms, long before LLMs were a thing.
It was never “solved” unless you can point me to OCR software that is 100% accurate. You can take 5 seconds to google “ocr with llm” and find tons of articles explaining how LLMs can enhance OCR. Here’s an example:
https://trustdecision.com/resources/blog/revolutionizing-ocr...
By that standard, no problem has ever been solved by anyone. I prefer to believe that a great many everyday tech issues were in fact tackled and solved in the past by people who had never even heard of LLMs. So, too, many things were done in finance long before blockchains solved everything for us.
LLMs improve significantly on state-of-the-art OCR. LLMs can do contextual analysis. If I were transcribing these by hand, I would probably feed them through OCR + an LLM, then ask an LLM to compare my transcription to its transcription and comment on any discrepancies. I wouldn't be surprised if I offered minimal improvement over just having the LLM do it, though.
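Roughly what I have in mind, as a sketch rather than a recipe (the model name, prompts, and file paths are placeholders, and pytesseract stands in for whatever OCR engine you prefer):

    # Sketch of the workflow above: OCR a scan, have an LLM clean up the OCR output,
    # then ask the LLM to flag where my manual transcription disagrees with it.
    import pytesseract
    from PIL import Image
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def llm(prompt):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    raw_ocr = pytesseract.image_to_string(Image.open("scan_0001.png"))

    cleaned = llm(
        "This is raw OCR of a handwritten historical document. "
        "Correct obvious recognition errors without inventing content:\n\n" + raw_ocr
    )

    my_transcription = open("my_transcription_0001.txt").read()

    report = llm(
        "List every discrepancy between these two transcriptions of the same document.\n\n"
        "Machine:\n" + cleaned + "\n\nHuman:\n" + my_transcription
    )
    print(report)

The last step is the part a plain OCR pipeline doesn't give you: a second pass that points at specific words worth re-checking rather than a bare confidence score.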
Are you guessing, or are there results somewhere that demonstrate how LLMs improve OCR in practical applications?
Why assume that OCR does not involve context? OCR systems regularly use context. It doesn't require an LLM for a machine reading medical forms to generate and use a list of the hundred most common drugs appearing in a particular place on a specific form. And an OCR system reading envelopes can be directed to prefer numbers or letters depending on what it expects.
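That kind of vocabulary-constrained correction is old, unglamorous technology. A toy sketch (the drug list and the garbled OCR reading below are made up):

    # Toy sketch: snap a noisy OCR reading of a form field to the nearest entry
    # in a list of values expected for that field (e.g. the most common drugs).
    from difflib import get_close_matches

    COMMON_DRUGS = ["amoxicillin", "atorvastatin", "lisinopril", "metformin", "omeprazole"]

    def snap_to_vocabulary(ocr_text, vocabulary):
        """Return the closest expected value, or the raw OCR text if nothing is close."""
        matches = get_close_matches(ocr_text.lower(), vocabulary, n=1, cutoff=0.6)
        return matches[0] if matches else ocr_text

    print(snap_to_vocabulary("rnetfornnin", COMMON_DRUGS))  # -> "metformin"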
Even if LLMs can push 99.9% accuracy to 99.99%, at least an OCR-based system can be audited. Ask an OCR vendor why the machine confused "Vancouver WA" with "Vancouver CA" and you can get a solid answer grounded in repeated testing. Ask an LLM vendor why and, at best, you'll get a shrug and a line about how much better the model did in all the other situations.
For the addresses it might be a bit easier because they are a lot more structured and, in theory, the vocabulary is a lot more limited. I'm less sure about medical notes, although I'd suspect there are fairly common things they are likely to say.
Looking at the (admittedly single) example from the National Archives, it seems a bit more open-ended than perhaps the other two examples. It's not impossible that LLMs could help with this.
Yes, but there was usually a fall-back mechanism where an unrecognized address would be shown on a screen to an employee who would type it so that it could then be inkjetted with a barcode.
Fun fact: convolutional neural networks developed by Yann LeCun were instrumental in that rollout!