Comment by spdustin
5 days ago
My answer to this in my own pet project is to mask terms found by the NER pipeline from being corrected, replacing them with their entity type as a special token (e.g. [male person] or [commercial entity]). That alone dramatically improved grammar/spelling correction, especially because the grammatical "gist" of those masked words is preserved in the text presented to the LLM for "correction".
No comments yet
Contribute on Hacker News ↗