Don't Care Words with an Application to the Automata-based Approach for Real Addition

Jochen Eisinger and Felix Klaedtke

Automata are a useful tool in infinite state model checking, since they can represent infinite sets of integers and reals. However, analogous to BDDs to represent finite sets, an obstacle of an automata-based set representation is the sizes of the automata. In this paper, we generalize the notion of "don't cares" for BDDs to word languages as a means to reduce the automata sizes. A general result in this paper shows that the minimal weak deterministic Büchi automaton (WDBA) with respect to a given don't care set with certain restrictions is uniquely determined and can be efficiently constructed. We apply don't cares to mechanize a more effective decision procedure for the first-order logic over the mixed linear arithmetic over the integers and the reals based on WDBAs.

Technical report

Download

Sources:

Please note that the sources of mixlinarith are licensed under the terms of the GNU General Public License Version 2.

Binaries:

Benchmarks:

Benchmarks

We have carried out tests on two different types of classes: (1) randomly generated formulas and (2) the iterative computation of the reachable states of infinite state systems. In the later case, we mainly focus on the sizes of the automata, as our prototype is not intended to compete with optimized tools for solving the reachability problem.

The benchmarks were run on a dual AMD64 (Opteron 252 2.6GHz) machine with 16GB RAM running Debian GNU/Linux etch. The program was compiled with the Intel C++ Compiler 9.1.038. Note that most benchmarks (especially when using don't care sets) will not use more than 4GB RAM.

All graphs shown here link to a high-res version.

Random formulas:

We have applied our prototype implementation to 100 randomly generated formulas with 4 variables, each involving about 10 disjunctions and conjunctions. The graphs in this section compare the performance of our prototype implementation against an implementation based on LASH. The LASH based implementation uses the (faster) serialized automata constructions, however, in order to compare automata sizes, we first unserialized the automata. Note that we did not include the time required to unserialize the final automata in our statistics.

While carrying out the tests for LASH we observed that the automata construction for handling the logical OR was unexpected slow. We changed its implementation by using the auto_product_union instead of the auto_product function in the file auto-operations.c. This change improved the LASH running times. For the following benchmarks, the improved version was used.

The following graphs show the sizes of the automata encountered during the construction of quantifier free formulas, the sizes of the final automata (representing the sets defined by each formula), and the construction times for mixlinarith and the LASH based linarith.

quantifier free formulas (peak sizes) quantifier free formulas (after minimization)
runtime for quantifier free formulas

Fig. 1. Automata sizes (peak and final) encountered during the construction of the automata for 100 randomly generated quantifier free formulas. The red lines correspond to our prototype implementation, while the green lines correspond to the LASH based implementation.

Next, we existentially quantify one variable of each formula, i.e. for a quantifier free formula φ we now use ∃ x1 φ. As expected, the new construction for the quantification takes longer (which we partly attribute to its prototypical implementation) and generates potentially larger automata (which become smaller again after normalization and minimization). Note that our prototype outperforms the LASH based implementation.

formulas with quantifiers (peak sizes) formulas with quantifiers (after minimization)
runtime for formulas with quantifiers

Fig. 2. Automata sizes (peak and final) encountered during the construction of the automata for 100 randomly generated formulas with quantification. The red lines correspond to our prototype implementation, while the green lines correspond to the LASH based implementation. Note that there is no data point available for the cases #78 and #86 due to exceeding the available memory (our implementation) or the time limit of 5h (LASH based).

Last, we benchmarked our implementation using the --presb option which restricts the automata to the integer domain using the Z predicate. Here the gap between the automata with and without a don't care set becomes large, which can be explained by the fact that all integers can be encoded by a don't care word, and thus there is more potential for savings. Note that when executing these benchmarks, you will have to preceed each formula with the statement ORDER x1,x2,x3,x4 :.

formulas with Z predicate (peak sizes) formulas with Z predicate (after minimization)

Fig. 3. Automata sizes (peak and final) encountered when restricting all variables to the integer domain. Here, both graphs are generated from our implementation, once with the optimized construction (red lines), and once with the straight-forward construction (green lines).

Linear Hybrid Automata

Infinite state systems, like systems with unbounded integers or linear hybrid automata can be analyzed symbolically in the first-order logic over ℜ. We have analyzed the Bakery protocol, Fischer's protocol, and the railroad crossing example. Using don't care words, the automata constructed during the iterative computation of the reachable states become smaller by up to several orders of magnitude. This saving can be explained by the two observations. First, the formulas that describe the transitions of a system contain many variables (the formulas for Fischer's protocol with 5 processes have 34 variables). Note that the don't care sets contain more words if the formula contains many free variables. Second, the construction of the reachable state set requires a large number of automata constructions. Although the saving at a single automata construction might be small, the overall saving grows with the number of automata constructions.

The following table summarizes the results of those benchmarks. The positive effect of our optimized encoding is clearly visible. Note that the automata constructed with and without don't care sets represent the same sets. This can easily be verified by turning the automaton constructed using the straight-forward construction in D-normal form. For example, load the automaton describing the reachable state set of Fischer 5 (after unpacking the file) into mixlinarith and apply the function RESTRICT to it:

RESTRICT ( "fischer5.dot" [x0, x1, x2, x3, ..., x32, x33] )

The resulting automaton will have the same size as the automaton constructed when using don't care sets.

steps with don't cares without
peak final runtime peak final runtime
Fischer 2 9 238 53 74.8s 2318 182 83.9s
Fischer 3 15 44631 405 270.8s 90422 2045 315.0s
Fischer 4 21 51676 4377 3766.5s 417649 27548 6214.1s
Fischer 5 27 145629 55885 21030.0s 1625141 430727 54166.6s
Railroad 8 152826 7735 1594.3s 365004 9411 1080.2s
Bakery 2 30 107 - 52.4s 557 - 63.6s
Bakery 3 30 314 - 107.7s 2010 - 121.1s
Bakery 4 30 909 - 201.4s 8883 - 272.7s

Fig. 4. Number of iterations required to reach the fixpoint (except for Bakery) of the reachable state set for several infinite state systems, including automata construction times, and peak and final automata sizes. Note that the fixpoint for Bakery cannot be reached using simple naive iteration.

The following figures show the automata sizes encountered during the computation of the reachable state sets. The peak sizes show the maximal sizes of the automata encountered in each step, the sizes after minimization correspond to the automata at the end of each (intermediate) step.

We compute the reachable state set as follows. Starting from the initial state, we iteratively add states reachable from the current state set by the flows and jumps of the linear hybrid automata, until the current state set has reached a fixpoint. In the case of Fischer's protocol with 5 processes, we used an alternative approach and only considered the newly reached states for the next iteration. Note that for the Bakery protocol, it is not possible to reach the fixpoint using this simple naive algorithm. Therefore, we aborted the computation after 30 steps.

Fischer 2 (peak sizes) Fischer 2 (after minimization)

Fig. 5. Automata sizes (peak and final) encountered during the computation for Fischer's protocol with 2 processes. The red (green) lines correspond to the optimized (straight-forward) encoding. The intermediate construction steps correspond to the flows and jumps of the processes in Fischer's protocol.

Fischer 3 (peak sizes) Fischer 3 (after minimization)

Fig. 6. Automata sizes (peak and final) encountered during the computation for Fischer's protocol with 3 processes.

Fischer 4 (peak sizes) Fischer 4 (after minimization)

Fig. 7. Automata sizes (peak and final) encountered during the computation for Fischer's protocol with 4 processes.

Fischer 5 (peak sizes) Fischer 5 (after minimization)

Fig. 8. Automata sizes (peak and final) encountered during the computation for Fischer's protocol with 5 processes. Note that here an alternative algorithm for computing the reachable state set was used, which results in smaller automata sizes during the end of the computation.

Railroad Crossing (peak sizes) Railroad Crossing (after minimization)

Fig. 9. Automata sizes (peak and final) encountered during the computation for the railroad crossing example.

Bakery 2 (peak sizes) Bakery 2 (after minimization)

Fig. 10. Automata sizes (peak and final) encountered during the computation for the Bakery protocol with 2 processes.

Bakery 3 (peak sizes) Bakery 3 (after minimization)

Fig. 11. Automata sizes (peak and final) encountered during the computation for the Bakery protocol with 3 processes.

Bakery 4 (peak sizes) Bakery 4 (after minimization)

Fig. 12. Automata sizes (peak and final) encountered during the computation for the Bakery protocol with 4 processes.

Usage

Syntax:

The script mixlinarith reads a formula matching the following syntax from stdin:

formula		: ordering condition

ordering	: ORDER vars ':'
		| /* empty */

condition	: condition AND condition
		| condition OR condition
		| NOT condition
		| INT '(' NAME ')'
		| FORALL '(' vars ':' condition ')'
		| EXISTS '(' vars ':' condition ')'
		| FORALL2 '(' NAME ':' condition ')'
		| EXISTS2 '(' NAME ':' condition ')'
		| RESTRICT condition
		| '(' condition ')'
		| expression COMPARISON expression
		| STRING '[' vars ']'
		| '(' condition ')' '[' varmap ']'

varmap		: NAME '->' NAME ',' varmap
		| NAME '->' NAME

vars		: NAME ',' vars
		| NAME

expression	: expression '+' expression
		| expression '-' expression
		| '(' expression ')'
		| NAME
		| CONSTANT '*' NAME
		| '-' CONSTANT '*' NAME
		| CONSTANT
		| '-' CONSTANT

When ommitting then keywords ORDER, INT, FORALL2, EXISTS2, and RESTRICT, as well as the syntax for loading automata and swapping variables, the formulas are compatibel with the format parsed by the LASH based tool linarith.

Options:

The program mixlinarith takes the following command line parameters:

--stats
Displays detailed statistics about every single operation
--dot file
Dumps the automaton to file file
--times
Displays runtime statistics after constructing the automaton
--dontcares
Use the optimized construction with a don't care set
--presb
Construct automata that only accept integers
--vars
Dumps the variable order after constructing the automaton
--dump #op
Dumps the automaton constructed in step #op before normalization and minimization

Compiling

Compiling the sources requires CuDD 2.4.1 or better, a C++ compiler, the GNU Standard C++ Library v3, and flex & bison. Compilers confirmed to work are the GNU C++ compiler version 4.0.3 and the Intel C++ Compiler 9.1.038. The program is written for GNU/Linux 2.6 running on i386 and amd64 architectures.

Please note that this patch to CuDD 2.4.1 is required. This patch was submitted to the author of CuDD and will be included in the next release. Apply the patch like this:

~/cudd-2.4.1 $ patch -p1 < cudd.cuddLCache.c.patch
Patching file cudd/cuddLCache.c
~/cudd-2.4.1 $

Compile the sources by issuing make in the directory you have unpacked the sources to:

~/rva-20060510 $ make

The compiled CuDD sources are expected to reside in the directory ~/cudd-2.4.1. You can change the default location, by pointing the make variable CUDD at the new location. Similarly, it is possible to use another compiler. For example, issue this command to compile the program using the Intel C++ Compiler on amd64 architecture, where CuDD resides in /usr/local/cudd/2.4.1:

~/rva-20060510 $ make CC=icc CXX=icpc \
    CFLAGS="-O3 -g -static-libcxa -xW -DSIZEOF_VOID_P=8 -DSIZEOF_LONG=8" \
    XCFLAGS="-xW -ansi -align -ip -DBSD -DHAVE_IEEE_754 -DSIZEOF_VOID_P=8 -DSIZEOF_LONG=8" \
    CUDD=/usr/local/cudd/2.4.1

The binary mixlinarith used for the benchmarks should be created in the subdirectory mixlinarith/.

Last update: 2006-05-10
Copyright © Jochen Eisinger & Felix Klaedtke. Valid XHTML 1.1.