Comment by ellen364
15 hours ago
> Like if this is a crowdsourcing project, why not do a first pass with an LLM and present users with both the image and the best-effort LLM pass?
Possibly for the reason that came up in your other post: you mentioned that you spot-checked the result.
Back when I was in historical research, and occasionally involved in transcription projects, the standard was 2-3 independent transcriptions per document.
Maybe the National Archive will pass documents to an LLM and use the output as 1 of their 2-3 transcriptions. That could reduce how many duplicate transcriptions humans have to do. But I'll be surprised if they jump to accepting spot-checked LLM output anytime soon.
You get that I'm not saying they should just commit LLM outputs as transcriptions, right?