Comment by joefkelley

5 days ago

This reminded me of one of my high school computer science assignments- simply to find all words in a single boggle board. And try to optimize your solution a bit. The point was to teach about recursion/backtracking and data structures. The intended solution was roughly: start at a square, check if your current prefix is a valid prefix, move to a neighbor recursively, and emit any words you find. Trying to optimize naturally motivates a trie data structure.

I found it to be at least an order of magnitude faster, though, to invert the solution: loop through each word in the dictionary and check whether it exists in the grid! The dictionary is small compared to the number of grid paths, and checking whether a word exists in the grid is very very fast, requiring not much backtracking, and lends itself well to heuristic filtering.

Sorry, but this doesn’t pass the smell test. The article mentions 200,000 random 4x4 boards/second on a single core on an M2. That’s a ~4GHz chip. So ~20,000 ops/board. There are 200,000 words in the dictionary. You can’t possibly do something for every word in the dictionary, it would be too slow.

It sounds like your Trie implementation had a bug or inefficiency.

  • I think GP mentioned it was on a _single_ boggle board.

    • Your best bet in that case is to store the dictionary in a Trie or DAWG structure that can be mmapped directly from disk.