Bounded and Unbounded Model Checking
with SAT and SMT
Philipp Rümmer Uppsala University
Philipp.Ruemmer@it.uu.se
June, 2015
UPMARC Summer School
Motivation
● SAT/SMT solvers are among the most important tools used in program analysis and related areas
● Also: SAT is one of the best studied search problems
● Ideas from SAT tend to turn up in various places
Outline
● This lecture:
● SAT solvers, algorithms and tools
● Bounded model checking
● Lecture on Wednesday:
● SMT solvers, fixed-point engines
● Full/unbounded model checking
Model checking?
● Verify that a program/system satisfies desired properties; or find defects ...
● User-specified assertions
● Mutual exclusion properties
● Runtime errors
● Arithmetic exceptions, invalid pointer dereferencing, etc.
● Incorrect memory management
● Here: mainly partial correctness/safety
Model checking?
● Bounded model checking (BMC)
● Geared towards bug finding
● Full model checking
● Primarily used to show absence of bugs
● (Deductive verification: mainly to show more complex functional correctness properties)
[Figure: SAT (propositional), SMT (first-order), and fixed-point engines (second-order), arranged by increasing expressiveness and complexity, alongside the verification techniques BMC, full MC, and deductive verification]
Disclaimer
Web interfaces used here
● Z3
http://rise4fun.com/Z3
● CBMC
http://logicrunch.it.uu.se:4096/~wv/cbmc/
● Eldarica
http://logicrunch.it.uu.se:4096/~wv/eldarica/
Propositional
SATisfiability
Propositional logic
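● Defined by grammar:
$\phi ::= p \mid \neg\phi \mid \phi \land \phi \mid \phi \lor \phi$
where $p$ ranges over Boolean variables
● Further operators can be defined, e.g.:
$\phi \rightarrow \psi \;\equiv\; \neg\phi \lor \psi$
Negation, often written $\bar{p}$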
Some important notions
Definition (Satisfiability)
A formula C in CNF over n variables is satisfiable if there is an assignment of the variables that makes C evaluate to true;
C is unsatisfiable if it evaluates to false for every assignment;
C is valid (a tautology) if it evaluates to true for every assignment.
[Figure: diagram relating the sets of satisfiable (Sat) and valid (Valid) formulas]
Clauses,
Conjunctive normal form (CNF)
● Literal: variable, or negation of var.
● Clause: disjunction of literals
● CNF Formula: conjunction of clauses (often written as a set of clauses)
Observation
Every propositional formula can be converted to an equisatisfiable formula in CNF, with only a linear increase in size.
The SAT problem
Definition
Input: A formula C in CNF over n variables
Output: Either an assignment of the n variables that satisfies all clauses (sat), or result unsat.
Often with more information: proof, or
unsatisfiable core
● Procedures are called SAT solvers
● Canonical NP-complete problem
[Cook, 1971]
Use of SAT solvers
● Most companies that work with verification use SAT solvers
(→ in particular hardware)
● Other application areas: artificial
intelligence, bioinformatics, security, …
● Suse 10.1 dependency manager
● Eclipse provisioning system uses Sat4j
● More detailed list: [Le Berre, 2014]
CAV Award 2009
● GRASP SAT solver: João P. Marques Silva, Karem Sakallah
● CHAFF SAT solver: Conor F. Madigan, Sharad Malik, Matthew Moskewicz, Lintao Zhang, Ying Zhao
Some SAT solvers
DIMACS CNF format
c simple_v3_c2.cnf
c
p cnf 3 2
1 -3 0
2 3 -1 0
● Lines starting with c: comments
● p cnf 3 2: number of variables and clauses
● -3: negated variable
● 0: termination symbol of a clause
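The format above can be read with a few lines of C. The following is only a sketch under simplifying assumptions: the `Cnf` container, the fixed size limits, and the function name `parse_dimacs` are illustrative and not taken from any particular solver.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CLAUSES 64
#define MAX_LITS    16

/* Hypothetical container for a small CNF formula (illustrative only). */
typedef struct {
    int num_vars, num_clauses;
    int lits[MAX_CLAUSES][MAX_LITS]; /* literals of each clause, without the 0 */
    int len[MAX_CLAUSES];            /* number of literals per clause */
} Cnf;

/* Parse DIMACS CNF text; returns 0 on success, -1 on malformed input. */
int parse_dimacs(const char *text, Cnf *out)
{
    int seen_header = 0, clause = 0, expected = 0;
    memset(out, 0, sizeof *out);
    const char *p = text;
    while (*p) {
        /* isolate the current line */
        const char *eol = strchr(p, '\n');
        size_t n = eol ? (size_t)(eol - p) : strlen(p);
        char line[256];
        if (n >= sizeof line) return -1;
        memcpy(line, p, n);
        line[n] = '\0';
        p += n + (eol ? 1 : 0);

        if (line[0] == 'c') continue;   /* comment line */
        if (line[0] == 'p') {           /* problem line: p cnf <vars> <clauses> */
            if (sscanf(line, "p cnf %d %d", &out->num_vars, &expected) != 2) return -1;
            if (expected > MAX_CLAUSES) return -1;
            seen_header = 1;
            continue;
        }
        /* clause data: integers, each clause terminated by 0 */
        char *s = line, *end;
        for (;;) {
            long v = strtol(s, &end, 10);
            if (end == s) break;        /* no more numbers on this line */
            s = end;
            if (!seen_header || clause >= expected) return -1;
            if (v == 0) { clause++; continue; }
            if (labs(v) > out->num_vars || out->len[clause] >= MAX_LITS) return -1;
            out->lits[clause][out->len[clause]++] = (int)v;
        }
    }
    out->num_clauses = clause;
    return (seen_header && clause == expected) ? 0 : -1;
}
```

Calling `parse_dimacs` on the five-line example yields two clauses, with literals 1, -3 and 2, 3, -1.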
Davis-Putnam-Logemann-Loveland (DPLL)
[Davis, Putnam, 1960], [Davis, Logemann, Loveland, 1962]
[Figure: DPLL search tree; three branches end in a conflict, one branch ends in SAT]
Davis-Putnam-Logemann-Loveland (DPLL)
● Decisions: assign true/false to some unassigned variable
● Unit propagation: if all literals but one in a clause are false, the value of the remaining literal is determined
● Backtracking: when a conflict is reached
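The three ingredients (decisions, unit propagation, backtracking) fit into a short recursive procedure. This is a minimal sketch, not an efficient solver: it assumes clauses are stored DIMACS-style in a flat, 0-terminated array, and the names `dpll`, `propagate`, and `lit_value` are illustrative.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_VARS 32

/* Value of a literal under a partial assignment:
   1 = true, 0 = false, -1 = unassigned (assign[v] uses the same encoding). */
static int lit_value(int lit, const int *assign)
{
    int v = assign[abs(lit)];
    if (v < 0) return -1;
    return lit > 0 ? v : !v;
}

/* Unit propagation; returns 0 if some clause has only false literals. */
static int propagate(const int *cnf, int nclauses, int *assign)
{
    int changed = 1;
    while (changed) {
        changed = 0;
        const int *p = cnf;
        for (int c = 0; c < nclauses; c++) {
            int unassigned = 0, last = 0, satisfied = 0;
            for (; *p != 0; p++) {
                int val = lit_value(*p, assign);
                if (val == 1) satisfied = 1;
                else if (val < 0) { unassigned++; last = *p; }
            }
            p++;                                /* skip the terminating 0 */
            if (satisfied) continue;
            if (unassigned == 0) return 0;      /* conflict */
            if (unassigned == 1) {              /* unit clause: value is forced */
                assign[abs(last)] = last > 0;
                changed = 1;
            }
        }
    }
    return 1;
}

/* Returns 1 and leaves a model in assign[1..nvars] if satisfiable;
   returns 0 and restores assign otherwise. */
int dpll(const int *cnf, int nclauses, int nvars, int *assign)
{
    int saved[MAX_VARS + 1];
    assert(nvars <= MAX_VARS);
    for (int v = 1; v <= nvars; v++) saved[v] = assign[v];

    if (propagate(cnf, nclauses, assign)) {
        int var = 0;
        for (int v = 1; v <= nvars && !var; v++)
            if (assign[v] < 0) var = v;
        if (var == 0) return 1;                 /* everything assigned, no conflict */
        for (int val = 1; val >= 0; val--) {    /* decision: try true, then false */
            assign[var] = val;
            if (dpll(cnf, nclauses, nvars, assign)) return 1;
        }
    }
    for (int v = 1; v <= nvars; v++) assign[v] = saved[v];  /* backtrack */
    return 0;
}
```

The caller initialises `assign[1..nvars]` to -1. Plain backtracking of this kind redoes a lot of work, which is the motivation for clause learning.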
Backtracking?
● Plain backtracking tends to be inefficient, much work is repeated
● But the contradiction that was found can be independent of some of the decisions made on the branch!
[Figure: search tree with two branches ending in a conflict]
Conflict-driven clause learning (CDCL)
[Marques-Silva, Sakallah, 1999] (→ GRASP)
[Figure: CDCL run on the example; after each conflict a learnt clause has to hold, and the search ends with → UNSAT]
Conflict clauses
● In DPLL, the violated clause is always a conflict clause
● More general clauses can be derived through resolution
Definition (Conflict clause)
Given a DPLL branch for the CNF formula $C$, with assignments $A$, a conflict clause is a clause $D$ such that
(i) $D$ is implied by $C$
(ii) the negation of each literal of $D$ occurs in $A$
In the example: [Figure: the conflict reached on the branch and the conflict clauses derived from it]
Conflict clauses
● Derived via resolution, by replaying unit propagation steps
● Conflict clauses can be learnt (added to the problem that is being solved), to avoid similar conflicts in the future
● Learn one clause per conflict? Or many?
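A single resolution step can be sketched as follows. The bitmask representation of clauses and the names `Clause`, `resolve`, and `is_empty` are assumptions made for this illustration; it also assumes at most 31 variables and no tautological input clauses.

```c
#include <assert.h>
#include <stdint.h>

/* Bit v of pos/neg is set if variable v occurs positively/negatively. */
typedef struct { uint32_t pos, neg; } Clause;

/* Resolve a and b on variable v, which must occur with opposite
   polarity in the two clauses; the resolvent drops v and keeps the rest. */
Clause resolve(Clause a, Clause b, int v)
{
    uint32_t bit = UINT32_C(1) << v;
    assert(((a.pos & bit) && (b.neg & bit)) || ((a.neg & bit) && (b.pos & bit)));
    Clause r = { (a.pos | b.pos) & ~bit, (a.neg | b.neg) & ~bit };
    return r;
}

/* The empty clause signals unsatisfiability. */
int is_empty(Clause c)
{
    return c.pos == 0 && c.neg == 0;
}
```

For instance, if (¬x1 ∨ ¬x2) is violated and x2 was propagated from (¬x1 ∨ x2), resolving the two clauses on x2 yields the clause (¬x1), which can be learnt.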
Termination of search
● In DPLL, search is finished when all branches have been explored
● In DPLL + CDCL, search is over when an empty conflict clause is derived
● Side-effect: backtracking is much more flexible
Summary
● Decisions
● Variable State Independent Decaying Sum (VSIDS)
● Propagation
● Watched literals
● Learning
● Unique implication points
→ Refinements to GRASP, introduced in CHAFF
In addition in newer solvers
● Preprocessing
● E.g., Blocked clause elimination
● Restarts
● Ultra rapid
● And highly efficient implementation
Towards analysing programs ...
● Just Boolean operations are insufficient, we also need datatypes/-structures
● E.g., integers, heap datastructures, floating-point arithmetic
● Wide range of (finite-domain) datatypes can be encoded in SAT ...
Machine arithmetic
● Bounded integer arithmetic e.g., 8bit, 32bit, 64bit
● Numbers are represented by vector of Boolean variables
● Encoding of overflow/rounding behaviour derived from hardware implementation
Machine arithmetic (2)
● E.g., simple 4bit ripple-carry adder
→ implements addition of 4bit numbers
[Figure: 4-bit ripple-carry adder; four full adders (FA) with inputs A3..A0, B3..B0 and sum outputs S3..S0, chained through the carries C0..C4]
4bit adder
[Figure: full adder (FA) with inputs A, B, Cin and outputs S, Cout, built from XOR and AND gates and a carry block; four FAs are chained through the carries C0 to C4]
4bit adder (2)
→ Number of clauses/variables is linear in bit-width!
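The circuit can be simulated gate by gate, which also shows why the encoding is linear: each bit position contributes one full adder with a constant number of gates. This is a sketch; the function names `full_adder` and `add4` are illustrative, and the carry logic is one common way of building a full adder from XOR, AND, and OR.

```c
#include <assert.h>

/* One full adder (FA): sum and carry-out from two input bits and carry-in. */
static void full_adder(int a, int b, int cin, int *s, int *cout)
{
    int t = a ^ b;                  /* XOR of the two inputs */
    *s = t ^ cin;                   /* sum bit */
    *cout = (a & b) | (t & cin);    /* carry is generated or propagated */
}

/* 4-bit ripple-carry adder: chains four FAs through the carries C0..C4.
   Only the low 4 bits of a and b are used. */
unsigned add4(unsigned a, unsigned b, int c0, int *c4)
{
    unsigned sum = 0;
    int carry = c0;
    for (int i = 0; i < 4; i++) {
        int s;
        full_adder((int)((a >> i) & 1u), (int)((b >> i) & 1u), carry, &s, &carry);
        sum |= (unsigned)s << i;
    }
    *c4 = carry;
    return sum;
}
```

In a SAT encoding, every intermediate bit (`t`, `s`, `carry`) becomes a Boolean variable and every gate a constant number of clauses.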
“Bit-blasting”
● Similar encoding can be used for all bit-vector operations
● Some operations are harder, e.g., multiplication
● quadratically many clauses needed
Simple, but laborious ...
● Define a library of common operations, reuse it everywhere
● Resulting notion:
Satisfiability modulo T → SMT
Definition (theory)
A (first-order) theory is specified by a signature $\Sigma$ of operations (sorts, functions, predicates), and a class of intended interpretations of the symbols in $\Sigma$.
The theory of
(fixed-width) bit-vectors
● Sorts: bit-vectors of any size
● Wide range of operations:
Arithmetic, logical (bit-wise), extract/concat
● Many available solvers; standard approach:
1) Aggressive simplification of constraints
2) Encoding as propositional formula → SAT
SMT-LIB
● Standardised interface for SMT solvers, supported by most tools
● Rich set of features, many theories
● Comes with a large library of
benchmarks; yearly competition SMT-COMP
● http://www.smtlib.org
Tutorial ...
● Every number x that is a power of 2 has the property that
(x & (x - 1)) == 0 (and vice versa, provided x != 0)
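For small bit-widths the claim can be checked exhaustively instead of with a solver. The sketch below does this for 8-bit values; the function names are illustrative. It compares the bit trick with a direct definition of "power of 2" and counts the disagreements.

```c
#include <assert.h>
#include <stdint.h>

/* Reference definition: x equals 2^i for some i in 0..7. */
int is_pow2_ref(uint8_t x)
{
    for (int i = 0; i < 8; i++)
        if (x == (1u << i)) return 1;
    return 0;
}

/* The bit trick, with explicit parentheses around the & expression. */
int is_pow2_trick(uint8_t x)
{
    return (x & (x - 1)) == 0;
}

/* Counts the 8-bit values on which the two functions disagree and
   stores the first such value in *first. */
int count_mismatches(int *first)
{
    int count = 0;
    for (int x = 0; x < 256; x++) {
        if (is_pow2_ref((uint8_t)x) != is_pow2_trick((uint8_t)x)) {
            if (count == 0) *first = x;
            count++;
        }
    }
    return count;
}
```

The only disagreement is x = 0, which satisfies the equation without being a power of 2. Note also that in C, == binds tighter than &, so the parentheses around `x & (x - 1)` are required.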
Important SMT-LIB commands
● (set-logic QF_BV) (set-option …)
● (declare-const b (_ BitVec 8))
(declare-fun f ((_ BitVec 2)) Bool)
● (assert (= b #b10100011))
● (check-sat)
● (get-value (b)), (get-model)
● (get-unsat-core)
● (push 1), (pop 1) (reset), (exit)
Z3, and many other solvers, don't care ...
The assertion stack
● Holds both assertions and declarations, but no options
● Important for incremental use of solver
● (push n) → add n new frames to the stack
● (pop n) → pop n frames from the stack
General SMT-LIB constructors
● (and …), (or …), (not …), (=> …)
● (= b c)
● (ite (= b c) #b101 #b011)
● (let ((a #b001) (b #b010)) (= a b))
● (exists ((x (_ BitVec 3))) (= #b101 x)) (forall …)
● (! (= b c) :named X)
Main SMT-LIB Bit-vector ops.
http://smtlib.cs.uiowa.edu/logics-all.shtml#QF_BV
● (_ BitVec 8)
● #b1010, #xff2a, (_ bv42 32)
● (= (concat #b1010 #b0011) #b10100011)
● (= ((_ extract 3 1) #b10100011) #b001)
● Unary: bvnot, bvneg
● Binary: bvand, bvor, bvadd, bvmul, bvudiv, bvurem, bvshl, bvlshr
● (bvult #b0100 #b0110)
● And many more derived operators ...
Bounded Model Checking
(BMC)
BMC
● Idea: search for bugs in programs/systems up to some depth; but otherwise reason with full precision
● One of the most successful techniques for hardware analysis
● Enabled by advances in SAT solving
BMC Problem
Decide whether a runtime exception can occur within the first k execution steps of a program/system.
Monolithic BMC
● Transition system
● Finite state space
● Initial states
● Transition relation
→ Can all be encoded using bit-vectors!
● Property
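The standard way to combine these ingredients into a single query can be written as one formula. The symbols are assumptions for this sketch, since the notation is not preserved in these notes: $s_0, \ldots, s_k$ are copies of the state variables, $I$ describes the initial states, $T$ the transition relation, and $P$ the property.

```latex
\[
  I(s_0) \;\land\; \bigwedge_{i=0}^{k-1} T(s_i, s_{i+1})
  \;\land\; \bigvee_{i=0}^{k} \neg P(s_i)
\]
```

The formula is satisfiable exactly if some state violating $P$ is reachable within $k$ steps; a satisfying assignment describes the offending execution.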
Monolithic BMC (2)
● Mainly works well for hardware
● For software, formula contains a lot of redundancy
● Better to unwind program guided by control structure
BMC: straight-line programs
int x, y;
x = x * x;
y = x + 1;
assert(y > 0);
(set-option :pp.bv-literals false) ; Z3-specific
(declare-const x0 (_ BitVec 32))
(declare-const y0 (_ BitVec 32))
(declare-const x1 (_ BitVec 32))
(declare-const y1 (_ BitVec 32))
(assert (= x1 (bvmul x0 x0)))
(assert (= y1 (bvadd x1 (_ bv1 32))))
(assert (not (bvsgt y1 (_ bv0 32)))) ; signed comparison
(check-sat)
(get-model)
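A model returned by the solver can be replayed on the constraints. The sketch below evaluates the encoding on a concrete input using 32-bit wrap-around arithmetic, which is the meaning of bvmul and bvadd; the function name `violates_assertion` is illustrative, and the code assumes a platform where `int` is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Evaluates the encoding for a concrete x0.
   Returns 1 if the negated assertion holds, i.e. x0 is a counterexample. */
int violates_assertion(uint32_t x0)
{
    uint32_t x1 = x0 * x0;      /* x1 = (bvmul x0 x0), wraps modulo 2^32 */
    uint32_t y1 = x1 + 1u;      /* y1 = (bvadd x1 1) */
    /* (not (bvsgt y1 0)): y1 is zero, or its sign bit is set */
    return y1 == 0 || (y1 & UINT32_C(0x80000000)) != 0;
}
```

For example, x0 = 46341 is a counterexample: its square is 2147488281, which exceeds 2^31 - 1 and is therefore negative when read as a signed 32-bit number.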
BMC: conditional branching
int x, y;
if (x > 0)
  y = x;
else
  y = -x;
assert(y >= 0);
(set-option :pp.bv-literals false)
(declare-const x0 (_ BitVec 32))
(declare-const y0 (_ BitVec 32))
(declare-const y1a (_ BitVec 32))
(declare-const y1b (_ BitVec 32))
(declare-const y2 (_ BitVec 32))
(declare-const b Bool)
(assert (= b (bvsgt x0 (_ bv0 32))))
(assert (=> b (= y1a x0)))
(assert (=> (not b) (= y1b (bvneg x0))))
(assert (= y2 (ite b y1a y1b)))
(assert (not (bvsge y2 (_ bv0 32))))
(check-sat)
(get-model)
Alternative method:
path-wise exploration
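[Figure: control-flow tree of the program above: after int x, y the branch x > 0 leads to y = x, the branch !(x > 0) leads to y = -x, and both paths end in assert(...)]
● Each query is smaller, but there are possibly exponentially many paths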
● Learning similar to CDCL can be used to avoid analysing all paths
BMC: arrays and heap
int x, y = 0;
int *p = x > 0 ? &y : &x;
*p = *p + 3;
assert(x > 0);
BMC: arrays and heap (2)
● “Manual” encoding:
● Use pointer analysis to approximate the (finite) set of locations that each
pointer can point to
● Replace accesses by finite conditional terms:
*p → (p == &x ? x : (p == &y ? y : ...))
● Or: use SMT theory of arrays (later)
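The manual encoding can be tried out directly in C. In this sketch the pointer is replaced by a tag ranging over the locations it may point to; the enum `loc` and the function names are hypothetical and only serve the illustration. Reads become conditional terms, writes become one conditional update per location.

```c
#include <assert.h>

/* Hypothetical tags standing for the addresses &x and &y. */
enum loc { LOC_X, LOC_Y };

/* Pointer-free version of the example; returns the final value of x. */
int run_encoded(int x)
{
    int y = 0;
    enum loc p = x > 0 ? LOC_Y : LOC_X;     /* int *p = x > 0 ? &y : &x; */
    int deref = (p == LOC_X) ? x : y;       /* read of *p */
    int result = deref + 3;
    x = (p == LOC_X) ? result : x;          /* write to *p, case p == &x */
    y = (p == LOC_Y) ? result : y;          /* write to *p, case p == &y */
    (void)y;
    return x;
}

/* The original program with a real pointer, for comparison. */
int run_original(int x)
{
    int y = 0;
    int *p = x > 0 ? &y : &x;
    *p = *p + 3;
    return x;
}
```

Both versions agree on all inputs; for x = -3 the final value of x is 0, so the assertion x > 0 can fail.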
CBMC
● Bounded model checker developed/maintained by University of Oxford; used for demonstration purposes here
http://www.cprover.org/cbmc/
● Processes C code, and subset of C++
● Web interface:
http://logicrunch.it.uu.se:4096/~wv/cbmc
BMC: loops and back-jumps
k-fold unwinding
Original loop:
i = 0;
while (i < N) {
  i = i + 1;
  assert(i >= 0);
}
Unwound three times:
i = 0;
if (i < N) {
  i = i + 1;
  assert(i >= 0);
  if (i < N) {
    i = i + 1;
    assert(i >= 0);
    if (i < N) {
      i = i + 1;
      assert(i >= 0);
      assume(!(i < N)); // stop analysis here if loop has not terminated yet
    }
  }
}
Unwinding assertions
i = 0;
if (i < N) {
  i = i + 1;
  assert(i >= 0);
  if (i < N) {
    i = i + 1;
    assert(i >= 0);
    if (i < N) {
      i = i + 1;
      assert(i >= 0);
      assert(!(i < N));
    }
  }
}
● Used to verify that unwinding depth is big enough
● Default in CBMC, can be switched off with --no-unwinding-assertions
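The effect of the unwinding assertion can be illustrated for concrete values of N. The sketch below is not how a bounded model checker works internally (it reasons about all values of N symbolically); it only replays the k-fold unwinding for one N. The enum and function names are illustrative.

```c
#include <assert.h>

enum bmc_result { BMC_OK, BMC_ASSERTION_VIOLATED, BMC_UNWINDING_TOO_SHALLOW };

/* Replays  i = 0; while (i < n) { i = i + 1; assert(i >= 0); }
   with at most k copies of the loop body, followed by the
   unwinding assertion !(i < n). */
enum bmc_result check_unwound(int n, int k)
{
    int i = 0;
    for (int depth = 0; depth < k; depth++) {
        if (!(i < n)) return BMC_OK;                    /* loop exited within the bound */
        i = i + 1;
        if (!(i >= 0)) return BMC_ASSERTION_VIOLATED;   /* user assertion */
    }
    /* unwinding assertion: after k copies the loop must have terminated */
    return (i < n) ? BMC_UNWINDING_TOO_SHALLOW : BMC_OK;
}
```

With k = 3, the check succeeds for N up to 3 and reports an insufficient unwinding depth for N = 4.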
Concurrency in BMC
int x = 0;
Thread 1:        Thread 2:
int t = x;       int s = x;
x = t + 1;       x = s + 1;
assert(x == 2);
Data variables:
Clock variables:
[Alglave, Kroening, Tautschnig, 2013]
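For a program this small, all interleavings under sequential consistency can simply be enumerated. The sketch below does so explicitly; it is not the partial-order encoding of the cited paper, and the names `run_schedule` and `final_values` are illustrative.

```c
#include <assert.h>

/* Runs one interleaving of the two threads. Bit j of schedule selects
   the thread (0 or 1) that executes step j. Returns the final value of x,
   or -1 if the schedule does not give each thread exactly two steps. */
int run_schedule(unsigned schedule)
{
    int x = 0, local[2] = {0, 0}, pc[2] = {0, 0};
    for (int step = 0; step < 4; step++) {
        int t = (int)((schedule >> step) & 1u);
        if (pc[t] == 2) return -1;          /* thread t has already finished */
        if (pc[t] == 0) local[t] = x;       /* int t = x;   or  int s = x;  */
        else            x = local[t] + 1;   /* x = t + 1;   or  x = s + 1;  */
        pc[t]++;
    }
    return x;
}

/* Computes the smallest and largest final value of x over all interleavings. */
void final_values(int *min, int *max)
{
    *min = 100;
    *max = -1;
    for (unsigned s = 0; s < 16; s++) {
        int x = run_schedule(s);
        if (x < 0) continue;                /* not a valid interleaving */
        if (x < *min) *min = x;
        if (x > *max) *max = x;
    }
}
```

The final value of x is 1 when both reads happen before either write and 2 otherwise, so assert(x == 2) can fail.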
Concurrency in BMC (2)
Local semantics Preserved program order (ppo)
Read-from (rf)
Clock constraint:
Read-from:
Write serialisation (ws)
From-read (fr)
Weaker memory models
● Used orders:
● Preserved program order
● Read-from
● Write serialisation
● From-read
● Models weaker than SC can be encoded by relaxing the relations
→ More executions become possible
In CBMC
● Parts of pthread API supported
● By default, SC memory model; can be changed via option
SV-COMP 2014
Assignment: BMC
There are 100 prisoners in solitary cells. There's a central living room with one light bulb; this bulb is initially off. No prisoner can see the light bulb from his or her own cell. Every day, the warden picks a prisoner uniformly at random, and that prisoner visits the living room. While there, the prisoner can toggle the bulb if he or she wishes. Also, the prisoner has the option of asserting that all 100 prisoners have been to the living room by now. If this assertion is true, all prisoners are set free; otherwise, the punishment will be severe. Thus, the assertion should only be made if the prisoner is 100% certain of its validity. The prisoners are allowed to get together one night in the courtyard, to discuss a plan.
● What plan should they agree on, so that eventually, someone will make a correct assertion?
● Use BMC to verify your plan, for a small number of prisoners.
Hints
● Use either pthread mutexes or atomic sections to implement the protocol.
Atomic sections tend to be faster for verification.
__CPROVER_atomic_begin();
__CPROVER_atomic_end();