Comment by PaulHoule

4 days ago

The smart way to write a Suduoku solver these days is to formulate the problem for an SMT solver, which is specifically intended to solve that kind of problem. The SMT solver already has tests, so you don't need to write tests.

When I wrote my first Sudoku solver I started out with a solver that solved easy problems where, at each step, there was some square with only one possible solution. That didn't work for harder problems where you have to guess. Eventually I realized that you could just pick any square, try all the numbers, and recursively solve the remaining problem. This works no matter which square you pick but it is faster if you start with the square that has the minimum number of choices available.

If you write a solver haphazardly like that you're very likely to wind up with two or more copies of the board because you're not sure what kind of queries you'll need to do. If you understand the problem, you won't.

The role of the tests is complex here. In principle they could help you refactor your data structures, in practice the sheer bulk of them could reify the data structures you have and add to the burden of refactoring them. A code-golfed version of the solver could be smaller than the tests.

Writing a chess program I found testing was a challenge. Move generators are easy to test, evaluation functions are easy to test. A decent alpha-beta search with transposition tables, killer heuristic and such is not so easy to test in terms of unit tests. Before I'd have the program play anyone I'd run a set of integration tests based on

https://www.chessprogramming.org/Bratko-Kopec_Test

I think of a system I inherited that had a part that was prone to race conditions, part of vanquishing the race conditions was writing a "super hammer" test that would run a few 1000 threads for about 40 seconds. Maybe it still has race conditions, but they won't be triggered often.

Long-running tests like that aren't really unit tests but for some problems they're the right tool. In general though long-running tests are a big problem because slow builds are a problem.

> but it is faster if you start with the square that has the minimum number of choices available

Alternatively, find the digit that, in its row/column/square/whatever, has the minimum number of choices available, and try each of them.

Ideally, combine the two and use Knuth’s dancing links algorithm (https://en.wikipedia.org/wiki/Dancing_Links). It, at every step, picks the one of those two approaches that has the lower number of possibilities, typically getting a lower branching factor, without spending much more time at each step.

  • In Sudoku I usually find it's easier to just keep track of guesses on the stack, by `solve` recursively when you make a guess. That leads to nice memory re-use characteristics and without the overhead of link pairs that might end up blowing out your level 1 data cache.

    But you certainly want to be guessing on the most constrained spot you can.

    Also, bitfields are a nice compact way of representing the possible values of a space and your cpu can and or or all the possible values in parallel, plus determine the number of possibilities with a modern CPU's popcount instruction.

I have only once written a Suduoku solver. It brute-forced the whole problem space in a second or 2.

This works because it's very easy to prune invalid branches according to the rules of the game.

It just tried putting "1" in the top-left most empty square and checking if the configuration is valid. If not, try "2". recurse until you find a branch that completes.

Is this the smart way? IDK, but I was quite pleased with it. I found it more elegant than e.g. "find some square with only one possible solution"

  • I also wrote a brute force, kinda dumb, depth-first Suduoku solver. I think it took me a few hours in C or C++. At every node in the search tree, I just checked to make sure no constraints were violated. If I remember correctly, it would run in less than 30 seconds and it would always find a solution if a solution existed. (It might have been a lot less than 30 seconds. I only remember thinking, "That was quick".)

    • I wrote a sudoku solver, that will create 9x9 array of arrays filled with numbers 1-9 for "blank" fields and only the number in defined field. Then for each square with only one number, I removed the number in that square from all arrays in row, column and subsquare. That typically left me with a solved array, but will contain all possible results for "guessing" fields. It was written in php (the thing I leaned at the time) and ran too fast for me to bother measuring, less than a second. Since I've "solved all sudokus" they stopped being fun.

    • Yep, pretty much the same approach and experience, only mine was in c# as I was learning the language then. "brute force" by definition will find a solution, if it exists.