Comment by anaisbetts

10 hours ago

Did you actually check it? Sonnet 3.5 generates text that seems legitimate and generally correct, but misreads important details. LLMs are particularly deceptive because they are internally consistent: they'll reuse the same incorrect name in both places and hallucinate information that looks legitimate but is in fact made up.

Just keep everything under version control, and run randomized spot checks with experts so you have a known error rate.
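A minimal sketch of what such a randomized spot check could look like (the `expert_review` callback and all names here are hypothetical, not from any actual pipeline):

```python
import math
import random

def spot_check_error_rate(documents, expert_review, sample_size, seed=0):
    """Estimate the transcription error rate from an expert-reviewed random sample.

    `expert_review(doc)` is a hypothetical callback that returns True when
    the expert finds an error in the transcription of `doc`.
    Returns (estimated error rate, 95% margin of error).
    """
    rng = random.Random(seed)
    sample = rng.sample(documents, min(sample_size, len(documents)))
    errors = sum(1 for doc in sample if expert_review(doc))
    n = len(sample)
    p = errors / n
    # Normal-approximation 95% confidence interval half-width
    margin = 1.96 * math.sqrt(p * (1 - p) / n)
    return p, margin
```

The point is that a fixed sampling procedure gives you a quantified, defensible error rate, rather than a vague sense that the output "looks right".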

You don't use an LLM, but other transformer-based OCR models like TrOCR, which have very low CER and WER rates.
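For reference, CER and WER are both edit-distance-based metrics; a minimal self-contained implementation (standard Levenshtein dynamic programming, not tied to any particular OCR model) looks like this:

```python
def levenshtein(ref, hyp):
    """Edit distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            # Deletion, insertion, or substitution (free when symbols match)
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (r != h)))
        prev = cur
    return prev[-1]

def cer(reference, hypothesis):
    """Character error rate: character edits / reference length."""
    return levenshtein(reference, hypothesis) / len(reference)

def wer(reference, hypothesis):
    """Word error rate: word-level edits / reference word count."""
    ref_words = reference.split()
    return levenshtein(ref_words, hypothesis.split()) / len(ref_words)
```

For example, `cer("kitten", "sitting")` is 3/6 = 0.5, and a one-word substitution in a three-word reference gives a WER of 1/3.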