Comment by dextorious

14 years ago

You keep using this word "spelling". I don't think it means what you think it means.

This is GRAMMAR checking, or at least grammar-assisted spell checking.

Very few, if any, shipping mainstream spelling correctors do that.

People want their documents (or queries) free of spelling errors. That is their pain and that is the challenge.

The example sentence has misspelled words, hence is within the domain of spell checking. Misspellings of this type are called "homonyms", and they are a very common spelling problem. The academic terminology is uninteresting to most users, however.

Or if you mean to posit that "spell checking really means looking up if a word exists in a static dictionary of English", then yes, that's easy and solved, no argument there.
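
To make the "static dictionary" reading concrete, here is a minimal sketch of such a checker. The word list and the `unknown_words` helper are made up for illustration and do not come from any real spell checker; the first input is the example sentence from the Wikipedia article on spell checking.

```python
import re

# Tiny hand-picked word list standing in for a real dictionary (an assumption).
DICTIONARY = {
    "their", "they're", "coming", "too", "to", "sea", "see",
    "if", "its", "it's", "reel", "real",
}

def unknown_words(text, dictionary=DICTIONARY):
    """Return the words of `text` that do not appear in `dictionary`."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in dictionary]

# Every word exists in the dictionary, so nothing is flagged.
print(unknown_words("Their coming too sea if its reel."))   # []

# Only non-words are caught.
print(unknown_words("Thier comming too sea if its reel."))  # ['thier', 'comming']
```

A lookup like this is the easy, solved part: it only asks whether each string is a word, never whether it is the word the writer meant.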

  • > The example sentence has misspelled words, hence is within the domain of spell checking.

    No, it doesn't. It is grammatically incorrect, but all the words are spelled correctly. You're definitely talking about a grammar checker, not a spelling checker.

    • If you were a teacher marking a student's paper, you would label those as spelling errors, not grammar errors. The reason is that a different spelling of the words would create the intended sentence. No grammatical variation will do that (reordering, conjugating differently, etc.).

    • I understand this is a mismatch in terminology.

      In your view, "spelling check" is applied to individual words, to see if they appear in a fixed dictionary (see my other comment about difficulties with choosing this "correct" dictionary in reality, though). I can imagine this view is inviting for programmers, because it's easy to implement, but I doubt anyone else finds a definition useful when it says there are no misspellings in "Their coming too sea if its reel."

      In my view, "spell check" applies to utterances and roughly means "all words are spelled as per the norm of the language; I can send this document to my boss/customer and they won't laugh at my spelling." It is a more user-centric view, and more complex too, because it covers intent and norms, as opposed to the comforting lookup table for a few hand-picked strings. Modern spell checkers make heavy use of statistical analysis of large text corpora to reasonably approximate the context needed to model such intent.

      Once we agree on the terminology, I believe we are in agreement, so let's not split hairs. The "correctly spelled" sentence under question comes from the Wikipedia article on spell checking, by the way.

    • The sentence contains misspelled words. The errors in those words happen to make them collide with other existing words. That doesn't change the category of the error.

  • Hey, here's a chance to invent some new terminology!

    I would say that Norvig's corrector is a first-order spelling corrector, since it works within the context of a single word.

    A second-order corrector would take into account the word before or after it to choose the spelling that is more likely to make sense. ("Their coming" would suggest a correction to "They're coming")

    Third-, fourth-, (and so on) order expands the distance of words considered.

      > ("Their coming" would suggest a correction to "They're coming")

      How about: "Her relatives would visit us for Christmas. Their coming filled us with dread!"
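
The "second-order" idea from this subthread can be sketched roughly as follows: for each word, look at the word after it and pick whichever confusable spelling is most frequent in that position. Everything here is hypothetical: the confusion sets, the bigram counts (invented numbers standing in for statistics from a large corpus), and the `correct_second_order` name.

```python
# Hypothetical confusion sets: words that are easily typed for one another.
CONFUSION_SETS = [
    {"their", "they're", "there"},
    {"too", "to", "two"},
]

# Invented bigram counts; a real corrector would estimate these from a corpus.
BIGRAM_COUNTS = {
    ("they're", "coming"): 50,
    ("their", "coming"): 5,
    ("there", "coming"): 1,
    ("their", "house"): 40,
}

def candidates(word):
    """Return the confusion set containing `word`, or just the word itself."""
    for group in CONFUSION_SETS:
        if word in group:
            return group
    return {word}

def correct_second_order(words):
    """For each word, choose the confusable variant most frequent before the next word."""
    result = []
    for i, word in enumerate(words):
        nxt = words[i + 1] if i + 1 < len(words) else None
        # On equal counts (including all zeros), keep the word as typed.
        best = max(
            sorted(candidates(word)),
            key=lambda c: (BIGRAM_COUNTS.get((c, nxt), 0), c == word),
        )
        result.append(best)
    return result

print(correct_second_order(["their", "coming"]))  # ["they're", 'coming']
print(correct_second_order(["their", "house"]))   # ['their', 'house']
```

With only one neighbouring word to go on, this sketch rewrites every "their coming", including the legitimate one in "Their coming filled us with dread!". That is the case a wider (third- or fourth-order) window would have to handle.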

No, it's spell-checking. Use of the word "reel" in this sentence, for instance, is definitely a spelling error. There's no grammatically valid form of the word "real" spelled with two 'e's. The fact that "reel" happens to be a valid word doesn't mean that its presence in the sentence is due to a grammatical error - it's just an accidental collision.

That no spell checker might be able to catch this specific class of error doesn't change the type of error it is.