
Comment by tincholio

14 years ago

> The example sentence has misspelled words, hence is within the domain of spell checking.

No, it doesn't. It is grammatically incorrect, but all the words are spelled correctly. You're definitely talking about a grammar checker, not a spelling checker.

If you were a teacher marking a student's paper, you would label those as spelling errors, not grammar errors. The reason is that a different spelling of the words would create the intended sentence. No grammatical variation (reordering, conjugating differently, etc.) will do that.

  • Really? Most teachers I had would put WW (wrong word) or WC (word choice) to imply that the word is incorrect, which is considered grammatical.

I understand this is a mismatch in terminology.

In your view, "spelling check" is applied to individual words, to see if they appear in a fixed dictionary (see my other comment about difficulties with choosing this "correct" dictionary in reality, though). I can imagine this view is inviting for programmers, because it's easy to implement, but I doubt anyone else finds useful a definition that says there are no misspellings in "Their coming too sea if its reel."

In my view, "spell check" applies to utterances and roughly means "all words are spelled as per the norm of the language; I can send this document to my boss/customer and they won't laugh at my spelling." It is a more user-centric view, and more complex too, because it covers intent and norms, as opposed to the comforting lookup table for a few hand-picked strings. Modern spell checkers make heavy use of statistical analysis of large text corpora to reasonably approximate the context needed to model such intent.
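To make the statistical idea concrete, here is a minimal sketch of context-sensitive checking. It is not how any particular spell checker works: the confusion sets are hand-picked, the bigram counts are invented for illustration (a real system would estimate them from a large corpus), and the names `CONFUSION_SETS`, `BIGRAM_COUNTS`, and `correct` are hypothetical.

```python
from collections import Counter

# Hypothetical confusion sets: words that are easily typed in place of one another.
CONFUSION_SETS = [
    {"their", "there", "they're"},
    {"to", "too", "two"},
    {"sea", "see"},
    {"its", "it's"},
    {"real", "reel"},
]

# Toy bigram counts, invented for this example (not measured from any corpus).
BIGRAM_COUNTS = Counter({
    ("they're", "coming"): 50,
    ("their", "coming"): 2,
    ("coming", "to"): 80,
    ("coming", "too"): 5,
    ("to", "see"): 90,
    ("to", "sea"): 1,
    ("see", "if"): 30,
    ("if", "it's"): 40,
    ("if", "its"): 10,
    ("it's", "real"): 20,
    ("the", "sea"): 30,
})


def alternatives(word):
    """Return the confusion set containing `word`, or just the word itself."""
    for group in CONFUSION_SETS:
        if word in group:
            return sorted(group)
    return [word]


def score(prev_word, word, next_word):
    """Sum the bigram counts of `word` with its left and right neighbours."""
    return BIGRAM_COUNTS[(prev_word, word)] + BIGRAM_COUNTS[(word, next_word)]


def correct(sentence):
    """Greedy left-to-right pass: swap a word only if an alternative scores strictly higher."""
    tokens = sentence.lower().split()
    fixed = []
    for i, word in enumerate(tokens):
        prev_word = fixed[-1] if fixed else "<s>"  # already-corrected left neighbour
        next_word = tokens[i + 1] if i + 1 < len(tokens) else "</s>"  # raw right neighbour
        best, best_score = word, score(prev_word, word, next_word)
        for candidate in alternatives(word):
            candidate_score = score(prev_word, candidate, next_word)
            if candidate_score > best_score:
                best, best_score = candidate, candidate_score
        fixed.append(best)
    return " ".join(fixed)


print(correct("Their coming too sea if its reel"))
```

With these made-up counts, every word in the example sentence passes a dictionary lookup, yet the context scores still prefer a different member of each confusion set. The sketch splits on whitespace only, so punctuation handling is left out.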

Once we agree on the terminology, I believe we are in agreement, so let's not split hairs. The "correctly spelled" sentence in question comes from the Wikipedia article on spell checking, by the way.

  • > I can imagine this view is inviting for programmers, because it's easy to implement, but I doubt anyone else finds useful a definition that says there are no misspellings in "Their coming too sea if its reel."

    I realize this is just a debate over a definition, so it's not very meaningful, but the fact is, you're on the wrong side of the common definition here. Every spell checker I just tested (Chrome, Firefox, MS Word, TextMate, TextEdit; some of these may rely on the same underlying engine, I'm not sure) accepts that sentence as having no spelling errors, so clearly there's some use for such a definition.

    The grammar checkers, on the other hand, don't like it, but after changing "their" to "they're", they all accept it, even though it's still garbage. So don't overestimate how good "modern" spell checkers are. There may be techniques that do a better job, but they're not in common use, at least in the most common spell-checking contexts (which, let's be honest, pretty much means MS Word).

    • There is no question whatever: the sentence contains misspelled words. "They're" is misspelled as "Their", "to" as "too", "see" as "sea", etc.

      There are also no non-words. It happens that today's spelling checkers are generally just non-word detectors. This doesn't mean that anyone defines "misspelled" to mean "misspelled in such a way that the result is not a word at all".

      And yes, a non-word detector is still very useful, and it's much, much easier to make than something that also reliably determines when words are misspelled in ways that produce other words, so there's lots of software out there (perhaps essentially all of it) that contains only a non-word detector.

      I think it's perfectly reasonable to define "spelling checker" to include mere non-word detectors. Or for that matter non-common-word detectors. (I expect most spelling checkers will reject "hight", and very sensibly because if someone types that they probably meant "high" or "height" or something -- but it's a perfectly good word, albeit a rare and archaic one.) But there's no way it's correct to say that "sea" isn't misspelled in that sentence merely because the mistake happens to have produced something that's an English word.
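For comparison, the "non-word detector" described in this comment can be sketched in a few lines. This is only an illustration under stated assumptions: `WORD_LIST` is a deliberately tiny, hypothetical dictionary and `find_non_words` is a made-up helper, not the API of any real spell checker.

```python
import re

# Hypothetical, deliberately tiny word list; a real checker would load a full dictionary.
WORD_LIST = {
    "their", "they're", "coming", "to", "too", "sea", "see",
    "if", "its", "it's", "reel", "real",
}


def find_non_words(text, word_list=WORD_LIST):
    """Return every token that is not in the word list, in order of appearance."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in word_list]


# Every word is in the list, so nothing is flagged.
print(find_non_words("Their coming too sea if its reel."))
# A transposition that produces a non-word is caught.
print(find_non_words("Thier coming too sea if its reel."))
```

A rare word such as "hight" would also be flagged here, simply because it is missing from the list, which matches the "non-common-word detector" behaviour mentioned above.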


The sentence contains misspelled words. The errors in those words happen to make them collide with other existing words. That doesn't change the category of the error.

  • """The sentence contains misspelled words."""

    No, it contains correctly spelled words used in place of other desired words.

    "I sea your eyes", "I she your eyes"

    If you want to correct these kinds of errors, you must know a lot about natural language. Also, the above are trivial cases. There are tons of edge cases and far more difficult distinctions. Here's an amusing one that can lead to "Microsoft paperclip"-like interactions:

    "I gave him the new pink dress as a present" => "I gave her the new pink dress as a present"

    No, idiotic spell-checker, I do mean him. My friend is a cross-dresser, shut up and let me type.

    Such a spellchecker would also be useless for poetry. And if you find poetry obscure, so it doesn't really matter, then such a spellchecker would also be useless for irony. Suddenly, you lose all the hipsters from your potential users (except if they start using it ironically).

    Anyway, no spell checker in widespread use attempts this, and it's probably a very hard nut to crack, probably uncrackable in the general case.

    • > No, it contains correctly spelled words used in place of other desired words.

      If you intend to write a word meaning "actually existing as a thing" and spell it "reel", it is monstrously unlikely that you genuinely thought a word meaning "a cylinder on which flexible materials can be wound" was a suitable substitute. If you did, that would be the use of a correctly spelled (but incorrectly selected) word in place of a desired word: a grammatical error, in other words. However, if you just pick the wrong letters to construct a phoneme, as is almost certainly the case here... yep, that's a spelling error.

      To put it another way, imagine that the word "reel" didn't actually exist, and I make precisely the same error, substituting an 'e' for an 'a'. All of a sudden, by your argument, what was a grammatical error is now a spelling error. But the mistake I made hasn't changed, so that makes no sense.

      > If you want to correct these kinds of errors, you must know a lot about natural language.
      > ...
      > Anyway, no spell checker in widespread use attempts this

      So? The category of error doesn't change with how difficult it is to fix.
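The thought experiment about "reel" above can be written out as a small sketch. It assumes same-length words and uses hypothetical helpers (`describe_substitutions`, `classify`); the point is only that the letter-level edit is identical whether or not the typed string happens to be in the dictionary.

```python
def describe_substitutions(intended, typed):
    """List (position, intended_letter, typed_letter) for each differing position."""
    if len(intended) != len(typed):
        raise ValueError("this sketch only handles same-length words")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(intended, typed))
        if a != b
    ]


def classify(intended, typed, dictionary):
    """Report the letter edits alongside whether the typed result is a dictionary word."""
    return {
        "edits": describe_substitutions(intended, typed),
        "real_word": typed in dictionary,
    }


# Same typing mistake, two hypothetical dictionaries: one with "reel", one without.
with_reel = classify("real", "reel", {"real", "reel"})
without_reel = classify("real", "reel", {"real"})
print(with_reel)
print(without_reel)
```

Only the `real_word` flag differs between the two results; the `edits` entry, which describes the mistake itself, stays the same.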