Most guides on satisfiability (SAT) modeling are aimed at students or academics. In this post, I provide information that I hope will be useful to developers considering whether a SAT approach is right for their problem.
What is SAT?
SAT means encoding your problem as a boolean formula which is handed off to a solver, that either returns an assignment to the variables that makes the formula true (a satisfying assignment), or the solver returns "unsat", meaning that there is no satisfying assignment.
Boolean formula
What is a boolean formula? It is an expression consisting of boolean variables and AND, OR, and NOT statements. Example:
Formula 1: x1 AND x2 (satisfied when both x1 and x2 are true)
Formula 2: (NOT x1) OR x2 (satisfied except if x1 true and x2 is false)
Formula 3: x1 => x2 (shorthand for the above!)
CNF
By far the most dominant format of the formula is the CNF (conjunctive normal form). It means "AND of ORs":
(x1 OR x2 OR x3) AND (x4 OR x5 OR x6) AND (x7 OR x8 OR (NOT x9))
Why is this a good format? Real world problems typically have a set of requirements for their solution. All requirements must be fulfilled for an acceptable solution. The CNF reflects this reality: all OR-clauses must be fulfilled for the formula to be satisfied.
Representation
How is this represented in the computer? Here I have good news: the only data structure needed is a list of lists of integers. Each variable gets a signed integer. Minus means negation. The above formula:
[[1, 2, 3], [4, 5, 6], [7, 8, -9]]
Order between lists does not matter. Order between integers within a list does not matter. Adding new constraints just means appending to the CNF. Getting a new variable just means keeping track of the smallest unused integer. Most solvers accept the
DIMACS CNF format. The above as a DIMACS CNF:
p cnf 9 3
1 2 3 0
4 5 6 0
7 8 -9 0
The first number after "p cnf" is the number of variables, the second is the number of clauses. Note that 0 is not allowed as a variable (because of the negation).
The Work: Problem -> CNF
Now we see the work we have cut out for us: convert our problem to a list of list of integers so that we can hand it off to a solver, and then interpret the results. The last step if usually no problem in absence of model bugs. The big work lies in crafting a correct and efficient model.