← Back to context

Comment by LegionMammal978

14 hours ago

At least personally, I've been very underwhelmed by their performance when I've tried using them. Usually past a few dozen variables or so is when I start hitting unacceptable exponential runtimes, especially for problem instances that are unsatisfiable or barely-satisfiable. Maybe their optimizations are well-suited for knapsack problems and other classic OR stuff, but if your problem doesn't fit the mold, then it's very hit-or-miss.

I'm surprised to hear this. Modern SAT solvers can easily handle many problems with hundreds of thousands of variables and clauses. Of course, there are adversarial problems where CDCL solvers fail, but I would be fascinated if you can find industrial (e.g. human written for a specific purpose) formulas with "dozens of variables" that a solver can't solve fairly quickly.

  • One thing that I spent a particularly long time trying to get working was learning near-minimum-size exact languages from positive and negative samples. DFAMiner [0] has a relatively concise formulation for this in terms of variables and clauses, though I have no way to know if some other reformulation would be better suited for SAT solvers (it uses CaDiCaL by default).

    It usually starts taking a few seconds around the ~150-variable mark, and hits the absolute limit of practicality by 350–800 variables; the number of clauses is only an order of magnitude higher. Perhaps something about the many dependencies in a DFA graph puts this problem near the worst case.

    The annoying thing is, there do seem to be heuristics people have written for this stuff (e.g., in FlexFringe [1]), but they're all geared toward probabilistic automata for anomaly detection and similar fuzzy ML stuff, and I could never figure out how to get them to work for ordinary automata.

    In any case, I eventually figured out that I could get a rough lower bound on the minimum solution size, by constructing a graph of indistinguishable strings, generating a bunch of random maximal independent sets, and taking the best of those. That gave me an easy way to filter out the totally hopeless instances, which turned out to be most of them.

    [0] https://github.com/liyong31/DFAMiner

    [1] https://github.com/tudelft-cda-lab/FlexFringe

I've worked on a model with thousands of variables and hundreds of thousands of parameters with a hundred constraints. There are pitfalls you need to avoid, like reification, but it's definitely doable.

Of course, NP hard problems become complex at an exponential rate but that doesn't change if you use another exact solving technique.

Using local-search are very useful for scaling but at the cost of proven optimality

I think this hits the nail on the head: performance is the obstacle, and you can't get good performance without some modeling expertise, which most people don't have.

I wish I knew better how to use them for these coding problems, because I agree with GP they're underutilized.

But I think if you have constraint problem, that has an efficient algorithm, but chokes a general constraint solver, that should be treated as a bug in the solver. It means that the solver uses bad heuristics, somewhere.

  • I'm pretty sure that due to Rice's theorem, etc., any finite set of heuristics will always miss some constraint problems that have an efficient solution. There's very rarely a silver bullet when it comes to generic algorithms.

    • Rice's theorem is about decidability, not difficulty. But you are right that assuming P != NP there is no algorithm for efficient SAT (and other constraint) solving.

    • I think they're saying that the types of counter-examples are so pathological in most cases that if you're doing any kind of auto-generation of constraints - for example, a DSL backed by a solver - should have good enough heuristics.

      Like it might even be the case that certain types of pretty powerful DSLs just never generate "bad structures". I don't know, I've not done research on circuits, but this kind of analysis shows up all the time in other adjacent fields.

      3 replies →

Well, they aren’t magic. You have to use them correctly and apply them to problems that match how they work. Proving something is unsat is worst case NP. These solvers don’t change that.

  • Of course they aren't magic, but people keep talking about them as if they're perfectly robust and ready-to-use for any problem within their domain. In reality, unless you have lots of experience in how to "use them correctly" (which is not something I think can be taught by rote), you'd be better off restricting their use to precisely the OR/verification problems they're already popular for.

    • Hence my statement about education. All tools must be used correctly in their proper domain, that is true. Don’t try to drive screws with a hammer. But I'm curious what problems you tried them on and found them wanting and what your alternative was? I actually find that custom solutions work better for simple problems and that solvers do a lot better when the problem complexity grows. You’re better off solving the Zebra puzzle and its ilk with brute force code, not a solver, for instance.