Quarry installation and usage
This page provides detailed instructions for installing and using Quarry, a tool for mining data-temporal program specifications.
Quarry requires the installation of two tools: Daikon and Texada. Follow the links for installation instructions. Note that Texada does not need to be installed with the Z3 option. Quarry's source can be found in the quarry-src folder of the Texada distribution.
Quarry also requires a working installation of Java and Python. Quarry has been tested to work with Python 2.7.6 and Java 8.
To build Quarry from source, you must install Daikon version 5.7.0 and plume-lib.
Only the Java files in the Quarry source need to be compiled. You must compile InvariantUnifier.java and DataPredicateTraceProducer.java with
$ javac InvariantUnifier.java
$ javac DataPredicateTraceProducer.java
Quarry combines Daikon and Texada with several Java and Python scripts. Running Quarry end-to-end requires several commands, which can be integrated into a script but are somewhat dependent on the system under examination. An example run, with a class exercised by two test classes, Tester1.java and Tester2.java, would look as follows:
$ javac -g *.java
$ java daikon.Chicory --ppt-omit-pattern=main '--ppt-omit-pattern=^daikon\.util\..*' Tester1
$ java daikon.Chicory --ppt-omit-pattern=main '--ppt-omit-pattern=^daikon\.util\..*' Tester2
$ gunzip Tester1.dtrace.gz
$ gunzip Tester2.dtrace.gz
$ python TEXADA_HOME/quarry-src/dtrace-preprocessor.py Tester1.dtrace
$ python TEXADA_HOME/quarry-src/dtrace-preprocessor.py Tester2.dtrace
$ java daikon.Daikon --config TEXADA_HOME/quarry-src/no-ternary-config.txt Tester1-processed.dtrace Tester2-processed.dtrace
$ java InvariantUnifier Tester1-processed.inv.gz
$ java DataPredicateTraceProducer Tester1-processed-unified.inv.gz Tester1-processed.dtrace > log.txt
$ java DataPredicateTraceProducer Tester1-processed-unified.inv.gz Tester2-processed.dtrace >> log.txt
$ TEXADA_HOME/quarry-src/cleanupscript.sh log.txt
$ python TEXADA_HOME/quarry-src/trace-preprocessor.py log.txt
$ TEXADA_HOME/texada -m -f 'G(("scope_x" & x)->XF("scope_y" & y))' --scope-semantics --parse-mult-prop log.txt
(Note that TEXADA_HOME/quarry-src/ must be in your CLASSPATH to run InvariantUnifier and DataPredicateTraceProducer.)
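The end-to-end run above can be scripted. The sketch below builds the ordered command list for a given set of test classes so it can be inspected or executed; the function name, the Tester1/Tester2 class names, and the TEXADA_HOME path are illustrative assumptions, not part of the Quarry distribution.

```python
import os

def quarry_commands(testers, texada_home, prop):
    """Build the ordered list of shell commands for an end-to-end Quarry run.

    testers: list of test class names (e.g. ["Tester1", "Tester2"]).
    texada_home: path to the Texada distribution.
    prop: the data-temporal property to mine.
    """
    qsrc = os.path.join(texada_home, "quarry-src")
    cmds = ["javac -g *.java"]
    chicory = ("java daikon.Chicory --ppt-omit-pattern=main "
               "'--ppt-omit-pattern=^daikon\\.util\\..*' ")
    for t in testers:
        cmds.append(chicory + t)
    for t in testers:
        cmds.append("gunzip %s.dtrace.gz" % t)
        cmds.append("python %s/dtrace-preprocessor.py %s.dtrace" % (qsrc, t))
    processed = ["%s-processed.dtrace" % t for t in testers]
    cmds.append("java daikon.Daikon --config %s/no-ternary-config.txt %s"
                % (qsrc, " ".join(processed)))
    # Daikon names its single .inv.gz file after the first dtrace.
    cmds.append("java InvariantUnifier %s-processed.inv.gz" % testers[0])
    unified = "%s-processed-unified.inv.gz" % testers[0]
    for i, t in enumerate(testers):
        redirect = ">" if i == 0 else ">>"  # concatenate traces into one log
        cmds.append("java DataPredicateTraceProducer %s %s-processed.dtrace %s log.txt"
                    % (unified, t, redirect))
    cmds.append("%s/cleanupscript.sh log.txt" % qsrc)
    cmds.append("python %s/trace-preprocessor.py log.txt" % qsrc)
    cmds.append("%s/texada -m -f '%s' --scope-semantics --parse-mult-prop log.txt"
                % (texada_home, prop))
    return cmds
```

Each command can then be run in order with subprocess.check_call(cmd, shell=True), stopping at the first failure.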
We walk through each step in more detail below.
For optimal results, a dtrace must be produced for each test case. These may be produced by any Daikon front end or other method. Note that dtrace-preprocessor.py, which uniquifies variable names and removes entry/exit point relationships, assumes a Chicory-generated dtrace. Other dtrace generation methods can be used as long as they ensure uniqueness of variable names; in that case, dtrace-preprocessor.py should not be run.
To generate traces from a set of Java test cases, run Chicory for each test case:
$ java daikon.Chicory --ppt-omit-pattern=main '--ppt-omit-pattern=^daikon\.util\..*' Tester1
The program points selected can be controlled further with the --ppt-omit-pattern and --ppt-select-pattern Chicory options; the arguments above are just a sample. See the Chicory options for more detail. Next, unzip the dtrace.gz files and run dtrace-preprocessor.py:
$ gunzip Tester1.dtrace.gz
$ python TEXADA_HOME/quarry-src/dtrace-preprocessor.py Tester1.dtrace
Do this for each dtrace. For an input file Name.dtrace, this produces a file Name-processed.dtrace.
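The per-dtrace preprocessing step can be looped over all traces in a directory. A sketch, assuming Chicory-generated dtraces in the current directory; the helper names are ours, not part of Quarry:

```python
import glob
import subprocess

def processed_name(dtrace):
    """Name.dtrace -> Name-processed.dtrace (the preprocessor's convention)."""
    assert dtrace.endswith(".dtrace")
    return dtrace[:-len(".dtrace")] + "-processed.dtrace"

def preprocess_dtraces(texada_home, pattern="*.dtrace"):
    """Run dtrace-preprocessor.py over each matching dtrace and
    return the names of the processed files it should produce."""
    script = texada_home + "/quarry-src/dtrace-preprocessor.py"
    processed = []
    for dtrace in sorted(glob.glob(pattern)):
        if dtrace.endswith("-processed.dtrace"):
            continue  # skip files that have already been preprocessed
        subprocess.check_call(["python", script, dtrace])
        processed.append(processed_name(dtrace))
    return processed
```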
Given a set of processed dtraces, say Tester1-processed.dtrace and Tester2-processed.dtrace, run Daikon over all the traces in the set to infer the data predicates:
$ java daikon.Daikon --config TEXADA_HOME/quarry-src/no-ternary-config.txt Tester1-processed.dtrace Tester2-processed.dtrace
Configuration options may be adjusted to fit the use case: the sample argument --config no-ternary-config.txt just omits the inference of ternary invariants. See Daikon configuration for more details.
For a list of dtraces starting with Name-processed.dtrace, Daikon will produce a single .inv.gz file, Name-processed.inv.gz. In our case, the command above produces the file Tester1-processed.inv.gz.
Before passing on the .inv.gz file to the Data Predicate Trace producer, the invariants must be unified (i.e., invariants must be placed at every program point where they can be evaluated). For this purpose, pass the .inv.gz file produced in the last step to InvariantUnifier:
$ java InvariantUnifier Tester1-processed.inv.gz
This produces the file Tester1-processed-unified.inv.gz; in general, given Name.inv.gz, it will produce the file Name-unified.inv.gz. It is in this step that it is crucial that variables have unique names, since InvariantUnifier distinguishes variables by name.
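The unifier's output naming convention can be captured in the same way as the preprocessor's (the helper name is ours):

```python
def unified_name(inv_gz):
    """Name.inv.gz -> Name-unified.inv.gz (InvariantUnifier's convention)."""
    assert inv_gz.endswith(".inv.gz")
    return inv_gz[:-len(".inv.gz")] + "-unified.inv.gz"
```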
Each .dtrace file will correspond to one trace in the final log. To produce a data predicate trace for each data trace, run the DataPredicateTraceProducer:
$ java DataPredicateTraceProducer Tester1-processed-unified.inv.gz Tester1-processed.dtrace > log.txt
$ java DataPredicateTraceProducer Tester1-processed-unified.inv.gz Tester2-processed.dtrace >> log.txt
DataPredicateTraceProducer writes to standard out, so redirect it to a file (e.g., log.txt above), appending to concatenate the output for several different traces. To augment the output with concrete value predicates, add the option --augment-equals; to augment with program point name predicates, add the option --augment-ppts.
The default output traces may include whitespace at the beginning of the traces, which is an artifact of the underlying Daikon infrastructure. This whitespace can be cleaned up with
$ TEXADA_HOME/quarry-src/cleanupscript.sh out.txt
(assuming out.txt is the output of DataPredicateTraceProducer). Check to see if the traces contain a Daikon warning message; if this is the case, remove it from the traces.
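Stray Daikon warning lines can be stripped from the traces with a small text filter. A sketch, assuming each warning is a single line containing the word "Warning" (adjust the predicate to match the actual message in your output); the function name is ours:

```python
def strip_warnings(log_text):
    """Drop single-line Daikon warning messages from a predicate log.

    Assumes warnings occupy whole lines containing the word "Warning";
    all other lines are kept unchanged.
    """
    kept = [line for line in log_text.splitlines()
            if "Warning" not in line]
    return "\n".join(kept)
```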
To produce the data predicate log, concatenate the traces produced in the previous step. Assuming the concatenated traces are in log.txt, before passing the log on to Texada, you must run
$ python TEXADA_HOME/quarry-src/trace-preprocessor.py log.txt
*** WARNING: *** this algorithm operates in space exponential in the number of variables. If the process uses up too much memory, kill it. A large number of derived variables may be causing this issue; in that case, we suggest changing the Daikon options to remove certain derived variables from consideration.
Given a data predicate log data-pred-log.txt, Texada can be used to mine data-temporal properties over it. Specific input guidelines must be met for Texada to run with correct scope semantics. Notably, when writing the data-temporal property on the command line, the scope variable corresponding to a variable x must be written scope_x. Do not use regular variables named scope_abc, or the system will not work. Here are a few common scoped data-temporal properties and the corresponding argument to Texada.
always followed by:
G(("scope_x" & x)->XF("scope_y" & y))
never followed by:
G(("scope_x" & x)->XG("scope_y" -> !y))
always precedes:
("scope_y" -> !y) W ("scope_x" & x)
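Since the scoping convention is purely mechanical (prefix scope_ to each variable name), property strings can be generated rather than written by hand. A sketch for the "always followed by" template above; the function name is ours:

```python
def always_followed_by(x, y):
    """Scoped 'always followed by' property for variables x and y:
    whenever x holds (in scope), y later holds (in scope)."""
    return 'G(("scope_%s" & %s)->XF("scope_%s" & %s))' % (x, x, y, y)
```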
The options --scope-semantics and --parse-mult-prop must be passed to Texada as well. When fixing an event with the -e option, both it and its scope variable must be fixed. A sample call to Texada might be
$ ./texada -m -f 'G(("scope_x" & x)->XF("scope_y" & y))' --scope-semantics --parse-mult-prop data-predicate-log.txt
See the Texada documentation for other options, including output options.
The default format for scope events is "!!!(scope-var-list)", and the default format for data predicates is "???(scope-var-list)???(data-predicate)". The Texada output can be made more readable using the clean-texada-output.sh script, which translates scope events as "scope: (scope-var-list)" and strips the scope from the data predicates to only read "(data-predicate)". The clean-texada-output.sh script outputs this cleaned version to standard out.
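The translation performed by clean-texada-output.sh can be sketched in Python as well. This is a re-implementation of the behavior described above, not the script itself, and it assumes scope-var-lists and data predicates do not themselves contain parentheses:

```python
import re

def clean_line(line):
    """Rewrite one line of Texada output per the Quarry conventions:
    "???(scope-var-list)???(data-predicate)" -> "(data-predicate)"
    "!!!(scope-var-list)"                    -> "scope: (scope-var-list)"
    """
    # Strip the scope from data predicates, keeping only the predicate.
    line = re.sub(r'\?\?\?\([^)]*\)\?\?\?\(([^)]*)\)', r'(\1)', line)
    # Translate scope events into the readable "scope: (...)" form.
    line = re.sub(r'!!!\(([^)]*)\)', r'scope: (\1)', line)
    return line
```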