Comment by dangoldin

14 years ago

I know Bloom Filters came in handy for this sort of work but I'd love to see some other data structures and algorithms that were developed in the 80s to deal with limited memory.

See the 1988 paper "The world's fastest Scrabble program" by Andrew W. Appel (Princeton) and Guy J. Jacobson (Carnegie-Mellon).

They used a data structure they dubbed a DAWG (directed acyclic word graph) that was a trie for common start patterns with words and also collapsed together common endings (every "tion" ending word ended in same portion f the graph. As I recall they used 5 bits to store ASCII uppercase characters and either 11bits or 19bits to address nodes in the graph leading to a data structure that compressed words as well as any lzw type compression with the advantage that it was directly readable with respect to word extraction.

While the paper was published in 1988, the program was written and being used much earlier in '82 or '83.

http://wiki.cs.pdx.edu/cs542-spring2011/papers/appel-scrabbl...

In the 80's I got the gist of spellchecker algorithms.

An Apple II spellchecker that I used a few times seemed to spell-check using brute force means with the dictionary filling an separate 143K floppy disk. It was a somewhat slow batch-like one document-at-time operation, but it was good enough to be useful.

As a programmer in the 80's, I saw a spellchecker that used a small hash table to tell whether a word was misspelled or not. The hash table could be made even smaller if we limited the dictionary to about 16000 of the more common words. If I remember correctly, the smallest hash table was generated as an array of 16 bit integers in the source code before compiling and it's small footprint kept in memory of the built program. Once a word was identified as a misspelling, using a very fast hash look-up, suggestions of words could be pulled from the actual dictionary on disk. Using the hash, there was the possibility of false spellings. Maybe this was a Bloom Filter.

Morris, Robert & Cherry, Lorinda L, 'Computer detection of typographical errors', IEEE Trans Professional Communication, vol. PC-18, no.1, pp54-64, March 1975.

They do away with a dictionary.