-
Notifications
You must be signed in to change notification settings - Fork 3
Map Unrolling
Perhaps the largest issue we have encountered in working with NV is the problem of encoding maps (i.e. dictionary data structures) into SMT. This page covers (briefly) what we've tried, and what we're currently doing.
Originally, maps were encoding into Z3 using Z3's Array sort. This was a pretty direct translation. However, encoding a mapIf expression required the use of quantifiers. For example,
(* for keys less than 3, map their value to false(i.e., drop them from the set) *)
mapIf (fun k -> k < 3) (fun v -> false) set
would be translated into SMT as
set’ : Array Integer Boolean
forall (k : Integer) (if k < 3 then set’[k] == false else set’[k] == set[k])
In general, quantifiers are not complete. This encoding worked fine so long as our keys had the Integer sort, but when we tried another encoding for integers (bitvectors), Z3 started returning Unknown. In order to preserve completeness, we developed another technique: map unrolling.
The idea of map unrolling is to transforms maps (which are arbitrarily large) into tuples (which have fixed, finite size). To see how this works, imagine we know that the only keys used into a map m are the values k1, ..., kn. We could then replace the map with an n-tuple, as follows:
- createDict v ==> (v, v, ..., v) (* n times *)
- m[k1] ==> match m with (v, _, ..., _) -> v
- m[k2] ==> match m with (_, v, ..., _) -> v (* etc... *)
- m[k1 := v] ==> match m with (_, v2, ..., vn) -> (v, v2, ..., vn)
- m[k2 := v] ==> match m with (v1, _, ..., vn) -> (v1, v, ..., vn) (* etc... *)
- map f m ==> match m with (v1, v2, ..., vn) -> (f v1, f v2, ..., f vn) and so forth.
This transformation is made even more potent by tuple unboxing, which separates elements of tuples from each other (treating them as independent variables). This allows Z3 to, for example, focus only on relevant entries in the tuple.
However, this transformation relies on our ability to determine which key are used in advance. To ensure we could do this, we added the restriction that all keys in an NV program must be literals. For example, instead of writing
m[node], one would have to write match node with | 0 -> m[0] | 1 -> m[1] | .... This is both cumbersome on the programmer, and hinders expressivity since we cannot use, for example, a symbolic variable as a key into a map.
We extended the map unrolling transformation to handle symbolic keys by noticing that in a given run of a program, a symbolic variable only ever has one value. This means that if we see m[symb] in two locations of the program, we know that it's retrieving the value at the same index, even though we don't know which index ahead of time.
As such, we allow symbolic keys by adding one extra "slot" in the tuple for each such key. So if we have 5 constants used as keys, and two symbolic variables, the tuple generated by map unrolling will have 7 slots. These extra slots hold the values associated with the symbolic variables. However, there is a chance that a symbolic variable will have the same value as one of the constant keys, or another symbolic variable. In this case, they are the same key, and should share the same slot in the tuple. Accordingly, the unrolling for a symbolic key is more complicated.
Imagine we have three symbolic variables that are used as keys at some point: symb1, symb2, symb3, and 5 constant keys k1, ..., k5. Then we would unroll m[e] to
match e with
| k1 -> (* get index 1 *)
...
| k5 -> (* get index 5 *)
| _ ->
if e = symb1 (* value comparison! *) then (* get index 6 *)
else if e = symb2 (* value comparison! *) then (* get index 7 *)
else (* get index 8 *)
Hence if, for example, symb1 and symb2 happened to have the same symbolic value, they would share index 6 (unless their value also happened to be a constant key). In this case index 7 is unused.