Home

Awesome

SUTURE

SUTURE is a static points-to and high-order taint analysis tool (for C) based on LLVM IR and built upon Dr. Checker (many thanks to its creators!), two highlights are:

We name it SUTURE in the hope that it can be surgically precise, while being able to stitch multiple syscalls/entry functions together to construct high-order data flows. For more details please refer to our paper: Statically Discovering High-Order Taint Style Vulnerabilities in OS Kernels in ACM CCS'21.

0x0 Setup

First clone the repo:
~$ git clone https://github.com/seclab-ucr/SUTURE.git suture

Then setup the LLVM environment for SUTURE:
~$ cd suture
~/suture$ python setup_suture.py -o ../suture_deps
Options (for setup_suture.py):

Depending on your hardware, the LLVM setup may take quite some time. After it finishes, a srcipt file named env.sh will be generated under the SUTURE root folder, it contains commands to set the environment variables used by SUTURE.
IMPORTANT: Be sure to activate this env.sh every time before building/using SUTURE (you can also add its contained commands to .bashrc for automatic activation upon shell login)!
~/suture$ source env.sh

Next we're gonna build SUTURE:
~/suture$ cd llvm_analysis
~/suture/llvm_analysis$ ./build.sh
Upon a successful build, SUTURE is ready to use.

0x1 Vulnerability Discovery w/ an Example

SUTURE can be used to discover high-order taint vulnerabilities out-of-the-box, in this section we walk through this process w/ an example (e.g., the motivating example as shown in Section 2.1 in our paper).

0x10 Prepare the Input

To discover vulnerabilities SUTURE requires two types of input: (1) a program module compiled to LLVM bitcode (e.g., a .bc file), and (2) a configuration file for the module that manifests its entry functions and user-controlled arguments.

Let's first prepare the LLVM bitcode for our motivating example:
~/suture$ cd benchmark
~/suture/benchmark$ ./gen.sh motivating_example
NOTE: for convenience we provide gen.sh to compile a .c to .bc and .ll (human readable LLVM bitcode), with -O1 optimization level.
Now we should have the motivating_example.bc under the same benchmark folder, that's the input program module for SUTURE.

Then it comes to the configuration file, we have already prepared one for the motivating example:

~/suture/benchmark$ cat conf_motivating_example  
entry0 MY_IOCTL 1  
entry1 NULL_ARG  
entry2 NULL_ARG  

Explanation: Each line in the config file describes the information of one entry function (e.g., the top-level function w/o callers and usually serves as the external interface) in the program module, it contains up to 3 space-separated tokens:

So as shown above, conf_motivating_example specifies that there are three entry functions in motivating_example.c: entry0 (parameter 1 is user-controlled), entry1 and entry2 (no user-controlled parameters). This matches the code of the motivating example.

0x11 Run the Analysis

Once the program bitcode and the entry config file are ready, we can run SUTURE to discover taint vulnerabilities:
~/suture$ ./run_nohup.sh benchmark/motivating_example.bc benchmark/conf_motivating_example
Explanation: run_nohup.sh is a simple script invoking the compiled LLVM analysis passes of SUTURE:

~/suture$ ./run_nohup.sh [path/to/program.bc] [path/to/entry_func_config]

Once started, depending on the actual hardware and the complexity of the target program, the required time for SUTURE to finish the analysis and vulnerability discovery varies a lot. A simple program like our motivating example usually finishes instantly, though.
Decide whether the analysis finishes: During execution, SUTURE keeps logging into a file under the same directory of the entry config file, suppose the config file path is /path/to/conf_program, the log file will be /path/to/conf_program.log. We can decide whether the analysis finishes by monitoring the log:
~/suture$ grep "Bug Detection Phase finished" benchmark/conf_motivating_example.log

0x12 Inspect the Output

The aforementioned log file is also SUTURE's output, SUTURE will embed its discovered potential vulnerabilities in the log file, which can be extracted and organized into a final warning report after the analysis finishes:
~/suture$ ./ext_warns.sh benchmark/conf_motivating_example.log
Explanation: ext_warns.sh will extract all warnings (in JSON) embedded in the given log file, re-organize and pretty-print them into the final warning reports. The warning reports will be put into a folder under the same path of the log file, suppose the log file is /path/to/conf_program.log, the warning report folder will be /path/to/warns-conf_program-yyyy-mm-dd.

~/suture$ ls benchmark/warns-conf_motivating_example-2021-11-10/
all  int_overflow  taint_data_use  taint_loopbound  taint_ptr_def

In the folder there are 5 warning reports, all contains all types of warnings grouped together according to their data flow relationship, while other reports only contain specific types of warnings (e.g., integer overflow, tainted pointer dereference, etc.), more details about how we group warnings can be found in our paper (Section 4). We usually only use the unified report all.

~/suture$ cat -n benchmark/warns-conf_motivating_example-2021-11-10/all
     1  =========================GROUP 0 (2 - 2)=========================
     2  #####Warned Insts#####
     3  (u'motivating_example.c', 30, [u'IntegerOverflowDetector'])
     4  ######################
     5
     6  ++++++++++++++++WARN 0 (2 - 2)++++++++++++++++
     7  IntegerOverflowDetector says: motivating_example.c@30 (bar :   %add = add i8 %0, -16, !dbg !31)
     8  ********Trace 0(2)********
     9  #####CTX##### entry0
    10  entry0 (motivating_example.c@18)
    11  #####INSTS#####
    12  >>>>>>>>>>>>>>>>>>tag: 0x55b206570420 tf: 0x55b20695b1b0 (2)>>>>>>>>>>>>>>>>>>
    13  motivating_example.c@18 (  %cond = icmp eq i32 %cmd, 0, !dbg !31)
    14  motivating_example.c@21 (  store i8 %user_input, i8* getelementptr inbounds (%struct.data, %struct.data* @d, i64 0, i32 1, i64 0), align 4, !dbg !32, !tbaa !34)
    15  #####CTX##### entry1 -> bar
    16  entry1 (motivating_example.c@35)
    17  ----> (motivating_example.c@35 :   %call = tail call i32 @bar(i8* bitcast (%struct.data* @d to i8*)), !dbg !27)
    18  bar (motivating_example.c@30)
    19  #####INSTS#####
    20  >>>>>>>>>>>>>>>>>>tag: 0x55b2065e7050 tf: 0x55b2068174f0 (1)>>>>>>>>>>>>>>>>>>
    21  motivating_example.c@30 (  %0 = load i8, i8* %add.ptr, align 1, !dbg !31, !tbaa !32)
    22  motivating_example.c@30 (  %add = add i8 %0, -16, !dbg !31)
    23

Explanation: At a high level, the warning report contains some warning groups, each group contains several warnings, and each warning contains several taint traces that originate from a user input and sink to a same sensitive program statement. In other words, one warning is raised for a certain program statement and of a specific type (e.g., integer overflow), while a group contains multiple closely related warnings from the data flow perspective (see Section 4 in our paper), thus a group may have multiple warned program statements and include different warning types.

Take the above warning report as an example:

The warning report pinpoints a valid integer overflow vulnerability in the motivating example, while avoiding potential false alarms that a less precise static analysis tool may generate, details can be found in Section 2.1 of our paper.

0x2 Other Helpful Utilities

This repo also contains some other tools/scripts that you may find useful.

0x20 Entry Point Identifier (for Linux Device Drivers)

~/suture$ python llvm_analysis/AnalysisHelpers/EntryPointIdentifier/entry_point_identifier.py /path/to/prog_module.bc
~/suture$ cat /path/to/prog_module.bc.all_entries
Explanation: This script can identify some common entry functions in a Linux device driver module (e.g., ioctl(), read(), write(), etc.), which helps the construction of the entry config file as the input to SUTURE. This utility is mostly implemented by the original Dr. Checker based on some kernel domain knowledges (e.g., specific data structures like file_operations used to define driver entry points), we then made some improvements (e.g., make the heuristics more robust).

0x21 False Alarm Filter

~/suture$ python flt_warns.py /path/to/warn_report [Regex] > /path/to/filtered_warn_report
Explanation: From our experience, many false alarms in the warning report often share a same problematic sub-taint-trace (see Section 6.3 in our paper). As long as the warning reviewer inspects one false alarm and recognizes the FP-inducing sub-trace, naturally, she can try to automatically exclude all other similar false alarms containing the same sub-trace, reducing the reviewer-perceived false alarm rate. For this purpose, we provide this simple flt_warns.py that takes the original warning report and a python regular expression as inputs, it then will match the RE against every taint trace in the report, once matched, the taint trace will be treated as a false alarm. The script will generate a new filtered warning report excluding all matched taint traces.

0x3 Possible Questions and Answers

0x30 Use SUTURE as a general-purpose static analysis?

You might be interested in using SUTURE as a general-purpose static analysis (e.g., get the points-to/taint set of a variable at a certain program location), this is for sure doable, but unlike the vulnerability discovery, you need to get hands dirty and dive into the code of SUTURE to be familiarized with the main data structures it uses and some important functions. I hope the following tips can help:

0x31 Adapt SUTURE to newer LLVM Versions?

LLVM evolves very quickly, without a guarantee on backward compatiability. So very likely you will encounter some compilation errors when trying to build SUTURE with a newer LLVM version (e.g., > 9.0). But fortunately, such compilation errors are usually easy to resolve (e.g., often the case we just need to replace the out-of-date LLVM APIs with the newer ones). So basically, to adapt SUTURE to a newer LLVM version, first we need to setup a newer LLVM in 0x0 (e.g., need to modify the setup_suture.py to clone LLVM repo from an active source and specify the newer branch name), and then try to build SUTURE against it, solving the encountered compilation errors.