Home

Awesome

Badger: Complexity Analysis with Fuzzing and Symbolic Execution

Badger means the combination of fuzzing and symbolic execution for complexity analysis. This approach uses the strength and scalability of fuzzing and the precision of symbolic execution. Please have a look at our paper for more information about our approach. This repository contains the source code for the symbolic execution part of Badger (denoted as SymExe), which could be also run in separate.

Getting Started

These instructions will show you how to setup Badger and how to run it on a simple example: Insertion Sort.

Prerequisites

In order to run Badger you need basically two things: (1) a fuzzer, and (2) a symbolic execution engine. As fuzzer we use in our apporach the KelinciWCA, which is provided in its own repository. The symbolic execution engine (denoted as SymExe) is delivered with this repository. It is built on top of Java PathFinder (JPF) and Symbolic PathFinder (SPF), which makes it necessary to have the projects jpf-core and jpf-symbc ready and built. We recommend using the latest SPF version from their GitHub repository and, for now, our own branch of the JPF repository. Please be sure to have the following ".jpf/site.properties" in your home directory:

# JPF site configuration
jpf-core = ${user.home}/.../path-to-jpf-core-folder/jpf-core
jpf-symbc = ${user.home}/.../path-to-jpf-core-folder/pf-symbc
badger = ${user.home}/.../badger-sources/badger
extensions=${jpf-core},${jpf-symbc},${badger}

Build Sources

Badger uses the ant build mechanism similar to JPF and SPF. In order to build the Badger sources, go to the folder "badger-sources/badger" and run the command

ant build

Note: be sure that the site.properties file is setup correctly.

Prepare Folder Structure

Normally we use the following folder structure to analyze an applicaton:

.
+-- config.txt
+-- kelinciwca_analysis
|   +-- src
|   +-- bin
|   +-- bin-instrumented
|   +-- in_dir
|   +-- fuzzer-out
|   |   +-- afl/queue
|   |   +-- symexe/queue
+-- spf_analysis
|   +-- src
|   +-- bin

The src folders include the actual application and the drivers (one for fuzzing and one for symbolic execution). The bin folders include the Java class files. The bin-instrumented folder includes the class files instrumented by KelinciWCA. The in_dir folder includes the initial inputs provided by the user (at least one, see AFL documentation for more information). The fuzzer-out folder will be created automatically by AFL and SymExe, and contains the generated inputs.

Running Badger

In general our approach would run KelinicWCA and SymExe in parallel, i.e. start the Kelinci server, then AFL, and then SymExe.

In order to run the SymExe part of Badger please execute the following command:

DYLD_LIBRARY_PATH=path/to/z3/build java -cp "[..]/badger-source/badger/build/*:[..]/badger-source/badger/lib/*:[..]/jpf-core/build/*:[..]/jpf-symbc/build/*:[..]/jpf-symbc/lib/*" edu.cmu.sv.badger.app.BadgerRunner path/to/config-file

Depending on your application, you might want to add the option -Xmx to allow more memory usage for the symbolic execution. We did run our experiments with "-Xmx10240m".

Configuration File

Badger needs a configuration property file as parameter. The following table shows all possible parameters:

ParameterDescriptionMandatory
dir.initial.inputInput directory for SymExe, usually the same input directory as for KelinciWCA.yes
dir.sync.inputInput directory for SymExe, usually the queue folder KelinciWCA. Default: dir.initial.inputno
dir.exportExport directory for SymExe, usually the queue that KelinciWCA uses for synchronization.yes
dir.tmpTemporary directory for Symexe, which stores all generated files. Default: "./tmp"no
symexe.wait.secTime (seconds) SymExe will wait before checking KelinciWCA for new inputs, if it completely explored its trie and there was no new input by KelinciWCA.yes
symexe.iterationsNumber of iterations (=maximum number of generated input files) SymExe will remain in its own execution before attempting to import inputs from KelinciWCA.yes
symexe.delay.secTime (seconds) SymExe will wait in the beginning before starting. Default: "0"no
symexe.bse.stepsAdditional steps by the Bounded Symbolic Execution phase in SymExe. Default: "0", i.e. just generate inputs for the children nodes.no
jpf.classpathPath to the application binaries.yes
jpf.targetQualified name of the Java class (driver) that contains the main method for the analysis.yes
jpf.argumentValues of the arguments for the driver, separated by spaces. Default: "@@".no
symbolic.methodQualified name of the method(s), which should be contained in the symbolic analysis. Definition similar to SPF.no
symbolic.dpDecision Procedure for constraint solving. Default: "z3".no
symbolic.max_intMaximum value of symbolic integers.no
symbolic.min_intMinimum value of symbolic integers.no
symbolic.max_charMaximum value of symbolic chars.no
symbolic.min_charMinimum value of symbolic chars.no
symbolic.max_byteMaximum value of symbolic bytes.no
symbolic.min_byteMinimum value of symbolic bytes.no
symbolic.max_doubleMinimum value of symbolic doubles.no
symbolic.min_doubleMinimum value of symbolic doubles.no
symbolic.debugPrint SPF debug information (=on to activate).no
symbolic.undefinedSPF default for don't care values (e.g, =0).no
symbolic.optimizechoicesSPF flag to use internal optimization.no 
listenerAdditional listeners to add during bounded symbolic execution phase.no
analysis.methodAnalysis method for trie exploration: "wca" or "cov".yes
analysis.heuristicTrie exploration heuristic. For "wca": "highest-cost-highest-node". "highest-cost-lowest-node", "lowest-cost-highest-node", "lowest-cost-highest-node". For "cov": "branch".yes
analysis.wca.metricCost metric: "jumps", "instructions", "userdefined" (necessary for "wca").no
io.utilsInput Generator. Please check source package "edu.cmu.sv.badger.io" for current implementations or add your own.yes
io.input.sizesAbstract input size(s) for input generation, separated by spaces.yes
io.initial.idInitial id for generated input file. Default: "0"no
stat.printBoolean value whether to write files for statistics. Default: "true".no
stat.file.importPath to file for import statistics. Default: "import-statistic.txt"no
stat.file.generationPath to file for generation statistics. Default: "generation-statistic.txt"no
stat.file.exportPath to file for export statistics. Default: "export-statistic.txt"no
stat.file.triePath to file for internal trie statistics. Default: "trie-statistic.txt"no
stat.print.pcBoolean value whether to write files for path condition mapping. Default: "false"no
stat.file.pc.mappingPath to file for pc mapping to generated files. Default: "pcMap.txt"no
trie.printBoolean value whether to write dot files for trie graph representation. Default: "false" (High memory consumption for "true"!).no
trie.print.maxdepthNode depth, up to which the trie will be printed as dot file. Limiting this might help for debugging.no

Example: Insertion Sort

The directory "badger-source/example" contains a script to build the insertion sort example. Please first build Badger like described above and then run the "build-example.sh" script. If you want to run Badger in total (and not only the SymExe), then you need to setup KelinciWCA in advance. You also need to instrument the class files for Kelinci by using the commands described on the KelinciWCA page. The directory "badger-sources/badger/src/examples" includes the drivers for the insertion sort example.

 public static void sort(int[] a) {
        final int N = a.length;
        for (int i = 1; i < N; i++) {
            int j = i - 1;
            int x = a[i];
            while ((j >= 0) && (a[j] > x)) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = x;
        }
    }

For the Insertion Sort example the following configuration file is sufficient (to run SymExe alone):

dir.initial.input=./kelinciwca_analysis/in_dir
dir.export=./kelinciwca_analysis/fuzzer-out/symexe/queue

jpf.classpath=./symexe_analysis/bin
jpf.target=InsertionSortSym
symbolic.method=InsertionSortSym.sort(con)
symbolic.dp=z3
symbolic.min_int=0
symbolic.max_int=255

symexe.wait.sec=60
symexe.iterations=10

analysis.method=wca
analysis.wca.metric=jumps
analysis.heuristic=highest-cost-lowest-node

io.utils=int-byte-array
io.input.sizes=64

If you want to run the SymExe part with KelinciWCA (i.e. Badger in total), then please add the sync directory property:

dir.sync.input=./kelinciwca_analysis/fuzzer-out/afl/queue

, and configure KelinciWCA to use this path as queue folder. To run SymExe only, please execute the above described command to run the BadgerRunner class. If you want to run the whole Badger: first run the Kelinci server (be sure to use the instrumented class files), start KelinciWCA, and then run the above described command to run the BadgerRunner class. Alternatively, you also can use the provided run script.

Run Script

In the folder "badger-sources/scripts" we provide the shell script "runBadger.sh" to run Badger/KelinciWCA/Symexe for insertion sort. It can be used as a template for other experiments. Please adjust the path variables in the script according to your system. Additionally note that we use the DYLD_LIBRARY_PATH variable to set the path to the z3 constraint solver. Please change this according to your setup environment. If you want to run Badger in total, i.e. KelinciWCA + SymExe, then please make sure that KelinciWCA is setup correctly and that the classes are instrumented. You might need to change some folder paths in the script.

Developers

License

This project is licensed under the MIT License - see the LICENSE file for details