SAT-based minimization of deterministic ω-automata

The spot.sat_minimize() Python function is the main entry point for minimizing any deterministic ω-automaton. This notebook demonstrates how to use that function.

Warning: while the automata used in this notebook are quite small, working with large automata can require a lot of RAM and take a huge amount of time.

In its most straightforward variant, sat_minimize() takes an input automaton (called the reference) and then loops, asking a SAT solver at each iteration for an equivalent automaton (called the candidate) with one fewer state. If the reference has size $(n_i, s_i)$, i.e., $n_i$ states and $s_i$ acceptance sets, and the candidate has size $(n_o, s_o)$, the SAT encoding uses $\mathrm{O}(n_i^2\times n_o^2\times 2^{s_i+s_o})$ variables and $\mathrm{O}(n_i^2 \times n_o^3\times 2^{s_i+2s_o}\times |\Sigma|)$ clauses, where $|\Sigma|$ is the number of letters in the alphabet. Reducing the number of acceptance sets is therefore the most important way to simplify a problem.
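To get a feel for these bounds, the two expressions can be evaluated directly. The sketch below is not part of Spot: encoding_size_estimate is a hypothetical helper that ignores the constants hidden by the O-notation, so only the ratio between two results is meaningful. The sizes used (a 6-state reference with 4 acceptance sets, a 4-state candidate, 3 atomic propositions) are assumptions chosen for illustration.

```python
def encoding_size_estimate(n_i, s_i, n_o, s_o, alphabet_size):
    """Evaluate the asymptotic bounds quoted above, ignoring hidden constants.

    Hypothetical helper for illustration only, not a Spot function.
    """
    variables = n_i**2 * n_o**2 * 2**(s_i + s_o)
    clauses = n_i**2 * n_o**3 * 2**(s_i + 2 * s_o) * alphabet_size
    return variables, clauses

# 3 atomic propositions give an alphabet of 2**3 = 8 letters.
two_sets = encoding_size_estimate(6, 4, 4, 2, 8)  # candidate with 2 acceptance sets
one_set = encoding_size_estimate(6, 4, 4, 1, 8)   # candidate with 1 acceptance set

print(two_sets, one_set)
print("clause ratio:", two_sets[1] / one_set[1])
```

Under these assumptions, dropping one acceptance set from the candidate halves the variable bound and divides the clause bound by four.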

In [1]:
import spot
spot.setup(show_default='.b')
from IPython.display import display

Minimizing DBA

Let's take a simple formula and translate it into a DBA:

In [2]:
f = spot.formula('GF(a <-> XXb)'); f
Out[2]:
$\mathsf{G} \mathsf{F} (a \leftrightarrow \mathsf{X} \mathsf{X} b)$
In [3]:
aut = f.translate('det', 'Buchi', 'SBAcc'); aut
Out[3]:
Inf( ) [Büchi] 6 6 I->6 0 0 6->0 a 1 1 6->1 !a 4 4 0->4 a 5 5 0->5 !a 4->6 b 4->4 a & !b 4->5 !a & !b 5->6 b 2 2 5->2 a & !b 3 3 5->3 !a & !b 1->2 a 1->3 !a 2->6 !b 2->4 a & b 2->5 !a & b 3->6 !b 3->2 a & b 3->3 !a & b

The above automaton is not minimal and is easily reduced by sat_minimize():

In [4]:
spot.sat_minimize(aut)
Out[4]:
Inf( ) [Büchi] 0 0 I->0 1 1 0->1 !a & !b 2 2 0->2 !a & b 3 3 0->3 a & b 0->3 a & !b 1->0 a & b 1->0 a & !b 1->1 !a & b 1->3 !a & !b 2->0 a & !b 2->1 !a & !b 2->1 b 3->0 a & b 3->1 !a & b 3->2 !a & !b 3->3 a & !b

Note that by default sat_minimize() produces a transition-based automaton with the same acceptance condition as its input. State-based acceptance can be requested with the state_based option:

In [5]:
spot.sat_minimize(aut, state_based=True)
Out[5]:
Inf( ) [Büchi] 0 0 I->0 0->0 !a & b 1 1 0->1 a | !b 1->1 a & b 2 2 1->2 a & !b 5 5 1->5 !a 2->0 !a & b 2->1 a & b 2->2 a & !b 2->5 !a & !b 5->0 !a & b 5->1 a & b 3 3 5->3 a & !b 4 4 5->4 !a & !b 3->0 a & !b 3->1 !a & !b 3->2 a & b 3->5 !a & b 4->0 !a & !b 4->1 a & !b 4->3 a & b 4->4 !a & b

Minimizing deterministic ω-automata with arbitrary acceptance condition

Now let's look at examples with more complicated acceptance conditions.
The following Rabin automaton was produced using ltl2dstar 0.5.4 and spot 2.5.2 with

ltlfilt --lbt -f '(FGa | Fb) & FGc'  | ltl2dstar -H --ltl2nba=spin:ltl2tgba@-Ds - -

However, we hardcode it so that the notebook can be used even without ltl2dstar installed.

In [6]:
large = spot.automaton('''
HOA: v1 States: 6 properties: implicit-labels trans-labels no-univ-branch
deterministic complete stutter-invariant tool: "ltl2dstar" "0.5.4"
name: "& | F G a F b F G c" comment: "Safra[NBA=4]" acc-name: Rabin 2
Acceptance: 4 (Fin(0)&Inf(1))|(Fin(2)&Inf(3)) Start: 5 AP: 3 "a" "b"
"c" --BODY-- State: 0 {1 3} 4 4 4 4 1 0 1 0 State: 1 {0 3} 4 4 4 4 1 1
1 1 State: 2 {1 2} 4 4 4 4 2 2 2 2 State: 3 {1 2} 5 5 4 4 5 3 4 0 State:
4 {0 2} 4 4 4 4 2 2 2 2 State: 5 {0 2} 5 5 4 4 5 3 2 2 --END--''')
large.merge_edges()
large
Out[6]:
& | F G a F b F G c (Fin( ) & Inf( )) | (Fin( ) & Inf( )) [Rabin 2] 5 5 I->5 5->5 (!a & !b) | (!b & !c) 4 4 5->4 b & !c 2 2 5->2 b & c 3 3 5->3 a & !b & c 0 0 0->0 a & c 1 1 0->1 !a & c 0->4 !c 1->1 c 1->4 !c 4->4 !c 4->2 c 2->4 !c 2->2 c 3->5 (!a & !b) | (!b & !c) 3->0 a & b & c 3->4 (!a & b) | (b & !c) 3->3 a & !b & c

It can be minimized as a 2-state transition-based Rabin automaton:

In [7]:
small = spot.sat_minimize(large); small
Out[7]:
(Fin( ) & Inf( )) | (Fin( ) & Inf( )) [Rabin 2] 0 0 I->0 0->0 !a & !b & c 0->0 !b & !c 0->0 a & !b & c 1 1 0->1 b & !c 0->1 b & c 1->1 !c 1->1 (!a & b & c) | (a & !b & c) 1->1 (!a & !b & c) | (a & b & c)

Or as a 4-state state-based Rabin automaton:

In [8]:
spot.sat_minimize(large, state_based=True)
Out[8]:
(Fin( ) & Inf( )) | (Fin( ) & Inf( )) [Rabin 2] 0 0 I->0 0->0 (!a & !b) | (!b & !c) 1 1 0->1 b & c 2 2 0->2 b & !c 3 3 0->3 a & !b & c 1->1 !a & b & !c 1->2 a | !b | c 2->1 !c 2->2 c 3->0 (!a & !b) | (!b & !c) 3->1 b & !c 3->2 b & c 3->3 a & !b & c

But do we really need 2 Rabin pairs? Let's ask if we can get an equivalent automaton with only one pair. (Note that reducing the number of pairs might require more states, but the sat_minimize() function will never attempt to add states unless explicitly instructed to do so. In this case we are therefore looking for a state-based Rabin-1 automaton with at most 4 states.)

In [9]:
spot.sat_minimize(large, state_based=True, acc='Rabin 1')
Out[9]:
Fin( ) & Inf( ) [Rabin 1] 0 0 I->0 1 1 0->1 !b 2 2 0->2 !a & b & !c 3 3 0->3 (a & b) | (b & c) 1->0 (!a & !b) | (!b & !c) 1->1 a & !b & c 1->2 b & !c 1->3 b & c 2->2 !a & !b & !c 2->3 a | b | c 3->2 !c 3->3 c

Using the display_log option, we can get a hint of what is going on under the hood. Each line in the table shows one call to the SAT solver. The column labeled target.states gives the size of the equivalent automaton we ask the SAT solver to produce, but some of these states may actually be unreachable in the result. The variables and clauses columns give an indication of the size of the SAT problem. The enc.* and sat.* columns give the user and system time taken to encode and solve the SAT problem (the unit is "ticks", which is usually 1/100 of a second).

Below we see that the minimization procedure first tried to squeeze the 6-state input into a 3-state automaton, which failed, and then into a 5-state automaton, which was successful. This 5-state automaton was then used as input to produce a smaller 4-state automaton. Essentially, this procedure is doing a binary search towards the minimal size.

(In this case it does not matter, but be aware that the numbers of states displayed in the log table are those of complete automata, while the output of sat_minimize() is trimmed by default.)

In [10]:
spot.sat_minimize(large, state_based=True, acc='Rabin 1', display_log=True)
input.states target.states reachable.states edges transitions variables clauses enc.user enc.sys sat.user sat.sys
0 6 3 NaN NaN NaN 996 48806 1 0 1 0
1 6 5 5 16 40 2760 224707 5 1 5 0
2 5 4 4 11 32 2008 155020 4 0 2 0
Out[10]:
Fin( ) & Inf( ) [Rabin 1] 0 0 I->0 1 1 0->1 !b 2 2 0->2 !a & b & !c 3 3 0->3 (a & b) | (b & c) 1->0 (!a & !b) | (!b & !c) 1->1 a & !b & c 1->2 b & !c 1->3 b & c 2->2 !a & !b & !c 2->3 a | b | c 3->2 !c 3->3 c
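The sequence of targets in the log above (3, then 5, then 4) is what a plain binary search over the number of states would produce. The sketch below is inferred from the logs in this notebook and is not taken from Spot's source. Here solve is a hypothetical oracle standing in for one SAT call, and the toy oracle assumes that the true minimum is 4 states.

```python
def binary_search_min(solve, min_states, max_states):
    """Binary search for the smallest feasible number of states.

    `solve(target)` is a hypothetical oracle standing in for one SAT call:
    it returns the number of reachable states of the automaton found, or
    None if no automaton with `target` states exists.
    """
    probes = []  # targets tried, in order (the target.states column)
    best = None  # reachable-state count of the smallest solution so far
    while min_states <= max_states:
        target = (min_states + max_states) // 2
        probes.append(target)
        reachable = solve(target)
        if reachable is None:
            min_states = target + 1     # too small: raise the lower bound
        else:
            best = reachable
            max_states = reachable - 1  # success: look for something smaller
    return best, probes

# Toy oracle: assume the minimum is 4 states and every requested state is reachable.
best, probes = binary_search_min(lambda n: n if n >= 4 else None, 1, 6)
print(best, probes)  # 4 [3, 5, 4]
```

Using the reachable-state count rather than the target to tighten the upper bound lets the search skip sizes whenever the solver returns an automaton with unreachable states.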

Note that we already had a smaller transition-based automaton for this language (in the small variable), and that it is actually more efficient to work from that one, as seen in the problem sizes displayed in the following log.

In [11]:
spot.sat_minimize(small, state_based=True, acc='Rabin 1', display_log=True)
input.states target.states reachable.states edges transitions variables clauses enc.user enc.sys sat.user sat.sys
0 2 3 NaN NaN NaN 348 15974 1 0 0 0
1 2 5 5 17 40 960 73187 2 0 0 0
2 2 4 4 11 32 616 37620 1 0 0 0
Out[11]:
Fin( ) & Inf( ) [Rabin 1] 0 0 I->0 0->0 !b & !c 1 1 0->1 !a & b & !c 2 2 0->2 !b & c 3 3 0->3 (a & b) | (b & c) 1->1 !a & !b & !c 1->3 a | b | c 2->0 (!a & !b) | (!b & !c) 2->1 b 2->2 a & !b & c 3->1 !c 3->3 c
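To quantify the difference, the clause counts of the two logs can be compared target by target. The numbers below are copied from the logs of cells 10 and 11. The dictionary names are only illustrative.

```python
# Clause counts copied from the two logs above, keyed by target.states.
clauses_from_large = {3: 48806, 5: 224707, 4: 155020}
clauses_from_small = {3: 15974, 5: 73187, 4: 37620}

# How many times smaller each SAT problem is when starting from `small`.
ratios = {
    target: clauses_from_large[target] / clauses_from_small[target]
    for target in clauses_from_large
}
for target, ratio in sorted(ratios.items()):
    print(f"target {target}: {ratio:.1f}x fewer clauses when starting from small")
```

For every target, the problem built from the 2-state input has roughly three to four times fewer clauses than the one built from the 6-state input.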

Why did the procedure look for a complete automaton of size 5 when the input had only 2 states? It's because the input uses transition-based acceptance: to estimate an upper bound on the size of the state-based output, the sat_minimize() procedure converted its transition-based input to state-based acceptance (using the spot.sbacc() function) and counted the number of states in the result.

Such an estimate is not necessarily correct if we request a different acceptance condition. In that case we can actually change the upper-bound using max_states. Below we additionally demonstrate the use of the colored option, to request all transitions to belong to exactly one set, as customary in parity automata.

In [12]:
spot.sat_minimize(small, max_states=9, acc='parity min odd 3', colored=True, display_log=True)
input.states target.states reachable.states edges transitions variables clauses enc.user enc.sys sat.user sat.sys
0 2 5 5 19 40 2300 288887 7 0 8 0
1 2 2 2 6 16 368 18569 1 0 0 0
2 2 1 NaN NaN NaN 92 2337 0 0 0 0
Out[12]:
Fin( ) & (Inf( ) | Fin( )) [parity min odd 3] 0 0 I->0 0->0 (!a & !b) | (!b & !c) 0->0 a & !b & c 1 1 0->1 b & !c 0->1 b & c 1->1 !c 1->1 c

There are a couple of ways in which we can influence the search for the minimum automaton. We can disable the binary search with sat_naive. In this case, the procedure will try to remove one state at a time. This is not necessarily slower than the default binary search, because satisfiable problems are often solved more quickly than unsatisfiable ones.

In [13]:
spot.sat_minimize(large, acc='co-Buchi', sat_naive=True, state_based=True, display_log=True)
input.states target.states reachable.states edges transitions variables clauses enc.user enc.sys sat.user sat.sys
0 6 6 5 13 40 2742 173183 3 0 2 0
1 5 4 4 11 32 964 45412 1 0 1 0
2 4 3 NaN NaN NaN 363 10496 0 0 0 0
Out[13]:
Fin( ) [co-Büchi] 0 0 I->0 0->0 (!a & !b) | (!b & !c) 1 1 0->1 a & !b & c 2 2 0->2 (!a & b) | (b & !c) 3 3 0->3 a & b & c 1->0 (!a & !b) | (!b & !c) 1->1 a & !b & c 1->2 b & c 1->3 b & !c 2->3 1 3->2 !c 3->3 c
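The naive strategy seen in this log can be sketched as a simple descending loop. As before, this is an illustration inferred from the log, not Spot's implementation. Here solve is a hypothetical oracle for one SAT call, and the answers table hardcodes reachable-state counts that mimic the log above.

```python
def naive_search_min(solve, max_states):
    """Shrink the automaton step by step until a SAT call fails.

    `solve(target)` is a hypothetical oracle for one SAT call, returning the
    number of reachable states of the solution, or None on failure.
    """
    probes = []  # targets tried, in order
    best = None  # reachable-state count of the last solution found
    target = max_states
    while target >= 1:
        probes.append(target)
        reachable = solve(target)
        if reachable is None:
            break                 # first unsatisfiable problem: stop
        best = reachable
        target = reachable - 1    # continue just below the size actually reached
    return best, probes

# Toy oracle mimicking the log above: asking for 6 states yields 5 reachable ones.
answers = {6: 5, 5: 5, 4: 4}
best, probes = naive_search_min(answers.get, 6)
print(best, probes)  # 4 [6, 4, 3]
```

Only the last call is unsatisfiable, which is why this strategy can compete with binary search when unsatisfiable instances are the expensive ones.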

Variant for incremental SAT solving

Using sat_incr=1, we encode the problem of finding an equivalent automaton with $n$ states, and add 6 additional variables and some additional constraints to the problem:

| variable | implied constraints |
|----------|---------------------|
| $v_1$ | transitions to state $(n-1)$ must not be used |
| $v_2$ | $v_1\land{}$ transitions to state $(n-2)$ must not be used |
| ... | ... |
| $v_6$ | $v_5\land{}$ transitions to state $(n-6)$ must not be used |

Solving under the assumption $v_i$ then amounts to testing whether the problem has a solution with $n-i$ states, but we do not have to re-encode the problem for each test, and the solver can (probably) reuse some of the knowledge it gathered during a previous attempt. We do a binary search on these 6 assumptions, to find some $i$ such that the problem is satisfiable with assumption $v_i$ but not with $v_{i+1}$. If such a case exists, we have found the minimal automaton. If assumption $v_6$ is satisfiable, we re-encode the problem with $n-7$ states and start over. Watch how the numbers of variables and clauses do not change in the following log.

The number of assumption variables to use in one encoding can be set with the sat_incr_steps argument. Its default value of 6 was chosen empirically by benchmarking different values.

In [14]:
spot.sat_minimize(large, acc='co-Buchi', sat_incr=1, state_based=True, display_log=True)
input.states target.states reachable.states edges transitions variables clauses enc.user enc.sys sat.user sat.sys
0 6 1 NaN NaN NaN 2747 173427 3 0 2 0
1 6 3 NaN NaN NaN 2747 173427 0 0 0 0
2 6 4 4 12 32 2747 173427 0 0 0 0
Out[14]:
Fin( ) [co-Büchi] 0 0 I->0 0->0 (!a & !b) | (!b & !c) 1 1 0->1 a & !b & c 2 2 0->2 !a & b & !c 3 3 0->3 (a & b) | (b & c) 1->0 (!a & !b) | (!b & !c) 1->1 a & !b & c 1->2 (a & b) | (b & c) 1->3 !a & b & !c 2->2 c 2->3 !c 3->2 a | !b | c 3->3 !a & b & !c
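The search over assumptions can be sketched without a real SAT solver. In the sketch below, is_sat is a hypothetical oracle for one solver call made under assumption $v_i$, and both function names are invented for this illustration. The order in which Spot actually probes the assumptions differs (see the target.states column above), so this only illustrates the principle.

```python
def forbidden_states(n, i):
    """States disabled by assumption v_i in an n-state encoding: n-1 down to n-i."""
    return list(range(n - 1, n - i - 1, -1))

def largest_satisfiable_assumption(is_sat, steps):
    """Largest i in 1..steps such that the problem is satisfiable under v_i, or 0.

    `is_sat(i)` is a hypothetical oracle for one solver call with v_i assumed.
    Since v_{i+1} implies v_i, satisfiability is monotone in i, so a binary
    search applies. Returning 0 means that no assumption could be satisfied.
    """
    lo, hi = 0, steps
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if is_sat(mid):
            lo = mid        # v_mid holds: try to disable more states
        else:
            hi = mid - 1    # v_mid fails: every larger index fails too
    return lo

# Toy oracle: a 6-state encoding whose minimal equivalent automaton has 4 states.
n, minimum = 6, 4
i = largest_satisfiable_assumption(lambda k: n - k >= minimum, steps=5)
print(i, n - i, forbidden_states(n, i))  # 2 4 [5, 4]
```

All calls share a single encoding, which is why the variables and clauses columns stay constant in the log.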

Another incremental variant is the equivalent of forcing $v_1$, $v_2$, ... in order. But to do that we do not need to use any assumption: we just add the constraints that transitions going to state $n-i$ are forbidden. This variant is enabled by the option sat_incr=2. As in the previous case, we do a few of those incremental steps (2 by default, but that can be changed with the sat_incr_steps parameter) and then we re-encode the problem to reduce its size.

In the log below, line 0 corresponds to the search for an equivalent automaton with the same size, but with the simpler co-Büchi acceptance. It works, and most of the time was spent encoding the problem. Since the result has only 5 reachable states, the next two lines look for automata of size 4 and 3 without re-encoding the problem, simply adding a few constraints to disable the relevant transitions.

In [15]:
spot.sat_minimize(large, acc='co-Buchi', sat_incr=2, state_based=True, display_log=True)
input.states target.states reachable.states edges transitions variables clauses enc.user enc.sys sat.user sat.sys
0 6 6 5 13 40 2742 173183 4 0 1 0
1 5 4 4 12 32 2742 173279 0 0 1 0
2 4 3 NaN NaN NaN 2742 173327 0 0 0 0
Out[15]:
Fin( ) [co-Büchi] 0 0 I->0 0->0 (!a & !b) | (!b & !c) 1 1 0->1 !a & b & !c 2 2 0->2 (a & b) | (b & c) 3 3 0->3 a & !b & c 1->1 c 1->2 !c 2->1 c 2->2 !c 3->0 (!a & !b) | (!b & !c) 3->1 a & b & c 3->2 (!a & b) | (b & !c) 3->3 a & !b & c
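The cost of each incremental step can be read off the log. The short computation below uses only the target.states, variables, and clauses values copied from the log above.

```python
# (target.states, variables, clauses) copied from the log above.
rows = [(6, 2742, 173183), (4, 2742, 173279), (3, 2742, 173327)]

# Extra clauses relative to the initial encoding on the first row.
added = [clauses - rows[0][2] for _, _, clauses in rows]
# The set of variables never changes, since the problem is not re-encoded.
same_variables = len({variables for _, variables, _ in rows}) == 1

print(added, same_variables)  # [0, 96, 144] True
```

Fewer than 150 clauses are added on top of an encoding of more than 170,000, which is why the enc.user column drops to 0 after the first line.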

Miscellaneous options

return_log

The return_log option can be used to obtain the log table as an object. In that case, sat_minimize() returns a pair (aut, log), where aut can be None if the minimization failed. Also, the log table contains an extra column that is hidden by display_log: it contains the corresponding automaton in HOA format.

In [16]:
aut, log = spot.sat_minimize(large, acc='co-Buchi', sat_incr=2, state_based=True, return_log=True)
display(aut)
display(log)
Fin( ) [co-Büchi] 0 0 I->0 0->0 (!a & !b) | (!b & !c) 1 1 0->1 !a & b & !c 2 2 0->2 (a & b) | (b & c) 3 3 0->3 a & !b & c 1->1 c 1->2 !c 2->1 c 2->2 !c 3->0 (!a & !b) | (!b & !c) 3->1 a & b & c 3->2 (!a & b) | (b & !c) 3->3 a & !b & c
input.states target.states reachable.states edges transitions variables clauses enc.user enc.sys sat.user sat.sys automaton
0 6 6 5 13 40 2742 173183 3 0 2 0 HOA: v1 States: 5 Start: 0 AP: 3 "a" "c" "b" a...
1 5 4 4 12 32 2742 173279 0 0 1 0 HOA: v1 States: 4 Start: 0 AP: 3 "a" "c" "b" a...
2 4 3 NaN NaN NaN 2742 173327 0 0 0 0 NaN

Here is how we can extract the automata from that log:

In [17]:
for line, data in log.iterrows():
    if type(data.automaton) is str:
        print(f"automaton from line {line}:")
        display(spot.automaton(data.automaton + "\n"))
automaton from line 0:
Fin( ) [co-Büchi] 0 0 I->0 0->0 (!a & !b) | (!b & !c) 1 1 0->1 !a & b & !c 2 2 0->2 a & !b & c 3 3 0->3 b & c 4 4 0->4 a & b & !c 1->4 1 2->0 (!a & !b) | (!b & !c) 2->2 a & !b & c 2->3 a & b & c 2->4 (!a & b) | (b & !c) 3->1 !c 3->3 c 4->3 1
automaton from line 1:
Fin( ) [co-Büchi] 0 0 I->0 0->0 (!a & !b) | (!b & !c) 1 1 0->1 !a & b & !c 2 2 0->2 (a & b) | (b & c) 3 3 0->3 a & !b & c 1->1 c 1->2 !c 2->1 c 2->2 !c 3->0 (!a & !b) | (!b & !c) 3->1 a & b & c 3->2 (!a & b) | (b & !c) 3->3 a & !b & c

sat_langmap

When using the default binary search approach, the sat_langmap=True option can help refine the lower bound by first testing the language-equivalence of all states in the automaton. This allows us to form equivalence classes of states, and clearly the minimal automaton needs at least as many states as there are equivalence classes.

For instance, in the large automaton we use as an example, the 6 states correspond to only two different languages. This can be seen with the highlight_languages() function, which colors states with identical languages. This information can be used by the minimization function to search for a minimal automaton between 2 and 6 states.

In [18]:
spot.highlight_languages(large); large
Out[18]:
& | F G a F b F G c (Fin( ) & Inf( )) | (Fin( ) & Inf( )) [Rabin 2] 5 5 I->5 5->5 (!a & !b) | (!b & !c) 4 4 5->4 b & !c 2 2 5->2 b & c 3 3 5->3 a & !b & c 0 0 0->0 a & c 1 1 0->1 !a & c 0->4 !c 1->1 c 1->4 !c 4->4 !c 4->2 c 2->4 !c 2->2 c 3->5 (!a & !b) | (!b & !c) 3->0 a & b & c 3->4 (!a & b) | (b & !c) 3->3 a & !b & c
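The lower bound derived from such a language map is just the number of distinct classes. The partition below is an assumption read off the transition structure of large (states 3 and 5 are left as soon as b is seen, after which only FGc remains to be checked). It is not the output of any Spot function.

```python
# Assumed language classes for the six states of `large` (illustration only):
# class "A" still recognizes (FGa | Fb) & FGc, class "B" recognizes FGc.
language_of = {5: "A", 3: "A", 0: "B", 1: "B", 2: "B", 4: "B"}

lower_bound = len(set(language_of.values()))  # at least one state per language
upper_bound = len(language_of)                # never more states than the input

print(lower_bound, upper_bound)  # 2 6
```

With these bounds, a binary search starts at 4 states instead of 3.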

Compare the next two logs, with and without sat_langmap.

In [19]:
# Binary search between 1 and 6
spot.sat_minimize(large, acc='co-Buchi', state_based=True, display_log=True)
input.states target.states reachable.states edges transitions variables clauses enc.user enc.sys sat.user sat.sys
0 6 3 NaN NaN NaN 687 21896 0 0 0 0
1 6 5 4 12 32 1905 100457 3 0 1 0
Out[19]:
Fin( ) [co-Büchi] 0 0 I->0 0->0 !a & !b & !c 1 1 0->1 (!a & b) | (b & !c) 2 2 0->2 (a & !b) | (!b & c) 3 3 0->3 a & b & c 1->1 c 1->3 !c 2->0 (!a & !b) | (!b & !c) 2->1 (!a & b) | (b & !c) 2->2 a & !b & c 2->3 a & b & c 3->1 c | (!a & !b) | (a & b) 3->3 (!a & b & !c) | (a & !b & !c)
In [20]:
# Binary search between 2 and 6 thanks to sat_langmap
spot.sat_minimize(large, acc='co-Buchi', sat_langmap=True, state_based=True, display_log=True)
input.states target.states reachable.states edges transitions variables clauses enc.user enc.sys sat.user sat.sys
0 6 4 4 12 32 1220 51612 1 0 1 0
1 4 2 NaN NaN NaN 162 3129 0 0 0 0
2 4 3 NaN NaN NaN 363 10496 0 0 1 0
Out[20]:
Fin( ) [co-Büchi] 0 0 I->0 0->0 a & !b & c 1 1 0->1 (!a & !b) | (!b & !c) 2 2 0->2 b & !c 3 3 0->3 b & c 1->0 (!a & !b) | (!b & c) 1->1 a & !b & !c 1->2 b & !c 1->3 b & c 2->2 b & !c 2->3 !b | c 3->2 !c 3->3 c

states

Sometimes we do not want a minimization loop; we just want to generate an equivalent automaton with a given number of states. In that case, we use the states option. However, there is no constraint that all states should be reachable, so you could end up with an automaton that has fewer states than requested.

In [21]:
spot.sat_minimize(small, acc="co-Buchi", states=7, state_based=True, display_log=True)
input.states target.states reachable.states edges transitions variables clauses enc.user enc.sys sat.user sat.sys
0 2 7 7 23 56 1379 89168 2 0 1 0
Out[21]:
Fin( ) [co-Büchi] 0 0 I->0 0->0 !a & !b & !c 1 1 0->1 !a & b & !c 2 2 0->2 (!a & !b & c) | (a & !b & !c) 4 4 0->4 (a & b) | (b & c) 5 5 0->5 a & !b & c 1->4 !b | c 3 3 1->3 b & !c 2->0 a & !b & c 2->2 !a & !b & c 2->4 b 2->5 a & !b & !c 6 6 2->6 !a & !b & !c 4->4 c 4->3 !c 5->2 (!a & !b) | (!b & !c) 5->4 b 5->5 a & !b & c 3->1 !c 3->3 c 6->2 !a & !b & c 6->4 b 6->5 !b & !c 6->6 a & !b & c