Home

Awesome

Scala CI

GenSym

GenSym is a high-performance, parallel symbolic execution engine for LLVM IR.

GenSym is still a research prototype, but it can already handle a large class of real-world programs (e.g. programs from GNU Coreutils). See our ICSE 2023 paper for detailed evaluation.

Usage

The easiest way to try GenSym is to use the Docker image we build for the accompanying ICSE 23 artifact, which has all dependencies installed.

To obtain the Docker image (you may need root privilege to run docker):

$ docker pull guannanwei/gensym:icse23

To instantiate a Docker image, run the following command (we need to use ulimit to increase stack size to avoid stack overflow):

$ sudo docker run --name try-gensym --ulimit='stack=268435456:268435456' -it guannanwei/gensym:icse23 bash -c 'source /icse23/sync.sh; bash'

Then you should see the prompt of bash.

GenSym is a Scala project, therefore the simplest way to use GenSym is through an interactive sbt session:

$ cd /icse23/GenSym
$ ./start_sbt

Then you should see the prompt of sbt:

sbt:GenSym> 

To see all command-line arguments supported by GenSym:

sbt:GenSym> runMain gensym.RunGenSym --help

Usage: gensym <ll-filepath> [--entrance=<string>] [--output=<string>] [--nSym=<int>]
              [--use-argv] [--noOpt] [--engine=<string>] [--main-O0]

<ll-filepath>           - the input LLVM IR program (.ll)
--entrance=<string>     - the entrance function name (default=main)
--output=<string>       - the folder name containing generated Makefile/C++ code (default=basename of the input .ll file)
--nSym=<int>            - the number of symbolic 32-bit input arguments to `entrance`, cannot be used with --use-argv (default=0)
--use-argv              - take argv argument at runtime, cannot be used with --nSym
--noOpt                 - disable first-stage optimizations
--engine=<string>       - compiler/backend variant (default=ImpCPS)
  =ImpCPS               -   generate code in CPS with impure data structures, can run in parallel
  =ImpDirect            -   generate code in direct-style with impure data structures, cannot run in parallel
  =PureCPS              -   generate code in CPS with pure data structures, can run in parallel
  =PureDirect           -   generate code in direct-style with pure data structures, cannot run in parallel
--main-opt=<string>     - g++ optimization level when compiling the main file containing the initial heap object
--emit-block-id-map     - emit a map from block names to id in common.h
--emit-var-id-map       - emit a map from variable names to id in common.h
--switch-type=<string>  - compilation variants of `switch` statement (default=nonMerge)
  =merge                -   only fork `m` paths of distinct targets
  =nonMerge             -   fork `n` paths where `n` is the total number of feasible cases (including default)
--help                  - print this help message

Demonstration

We consider a simple branch program for demonstrating how GenSym works. This simple program has 4 paths with different conditions imposed by the if statements.

int f(int x, int y) {
  if (x <= 0 || y <= 0) return -1;
  if (x * x + y * y == 25) return 1;
  return 0;
}

We already have the compiled LLVM IR program in the Docker image, located at /icse23/GenSym/benchmarks/llvm/branch.ll

To compile this LLVM IR program with GenSym, we use following command. We additionally need to specify the entrance function (f) and the number of symbolic inputs to function f (which is 2).

sbt:GenSym> runMain gensym.RunGenSym /icse23/GenSym/benchmarks/llvm/branch.ll --entrance=f --nSym=2

The generated C++ code is located at /icse23/GenSym/gs_gen/branch. In file __GS_USER_f.cpp, you may also inspect the generated function for source function f. Then we can further compile the C++ code to an executable using the accompanying Makefile:

# cd /icse23/GenSym/gs_gen/branch
# make -j

This steps generates the executable branch, then running the executable file prints the statistics, e.g. the number of discovered blocks/paths:

# ./branch
...
[0.04903s/0.049225s/0s/0.049329s] #blocks: 7/7; #br: 0/3/3; #paths: 4; #threads: 1; #task-in-q: 0; #queries: 6/4 (3)

The generated executable file also has several runtime options. For most of users, it suffices to use the default options. However if you would like to play with it, you can check those options by ./branch --help.

The generated tests and an archived log file can be found under gensym-DDMMYYYY-HHMMSS/tests for further inspection.

Publications

If you would like to mention GenSym in research papers, please cite this paper:

Other related publications: