Awesome
Artifact for "A Learning-Based Approach to Static Program Slicing"
NS-Slicer is a learning-based static program slicing tool, which extends such an analysis to partial Java programs. The source code, data, and model artifacts are publicly available on GitHub (https://github.com/aashishyadavally/ns-slicer), and Zenodo (https://zenodo.org/records/10463878).
Table of Contents
Getting Started
This section describes the preqrequisites, and contains instructions, to get the project up and running.
Setup
Hardware Requirements
NS-Slicer
requires a GPU to run fast and produce the results. On machines without a GPU, note that it can be notoriously slow.
Project Environment
Currently, NS-Slicer
works well on Ubuntu OS, and can be set up easily with all the prerequisite packages by following these instructions (if conda
is already installed, update to the latest version with conda update conda
, and skip steps 1 - 3):
-
Download the latest, appropriate version of conda for your machine (tested with
conda 23.11.0
). -
Install it by running the
conda_install.sh
file, with the command:$ bash conda_install.sh
-
Add
conda
to bash profile:$ source ~/.bashrc
-
Navigate to
ns-slicer
(top-level directory) and create a conda virtual environment with the includedenvironment.yml
file using the following command:$ conda env create -f environment.yml
To test successful installation, make sure
autoslicer
appears in the list of conda environments returned withconda env list
. -
Activate the virtual environment with the following command:
$ conda activate autoslicer
Directory Structure
1. Data Artifacts
Navigate to ns-slicer/data/
to find:
- the dataset files (
{train|val|test}-examples.json
) -- use these files to benchmark learning-based static slicing approaches, or replicate intrinsic evaluation results in the paper (Sections 6.1 - 6.3). - aliasing dataset files (
aliasing-{examples|dataloader}.pkl
) -- use these files to replicate variable aliasing experiment in the paper (Section 6.4). - vulnerability detection dataset file (
filtered-methods.json
) -- use this file to replicate extrinsic evaluation experiment in the paper (Section 6.5).
2. Model Artifacts
Navigate to ns-slicer/models/
to find the trained model weights with CodeBERT and GraphCodeBERT pre-trained language models -- use these files to replicate results from the paper, or to produce static program slices for custom Java programs.
3. Code
Navigate to ns-slicer/src/
to find the source code for running experiments/using NS-Slicer to predict backward and forward static slices for a Java program.
4. Preliminary Study
Navigate to ns-slicer/empirical-study/
to find the details from the preliminary empirical study (see Section 3) in the paper.
Usage Guide
See link for details about replicating results in the paper, as well as using NS-Slicer
to predict static program slices for Java programs. Here's an executive summary of the same:
Experiment | Table # in Paper | Data Artifact(s) | Run Command(s) | Model Artifact(s) for Direct Inference |
---|---|---|---|---|
(RQ1) Intrinsic Evaluation on Complete Code | 1 | data/{train|val|test}-examples.json | click here | CodeBERT, rows 7-9 |
GraphCodeBERT, rows 10-12 | ||||
(RQ2) Intrinsic Evaluation on Partial Code | 2 | data/{train|val|test}-examples.json | click here | GraphCodeBERT |
(RQ3) Ablation Study | 3 | data/{train|val|test}-examples.json | click here | - |
(RQ4) Variable Aliasing | 4 | data/aliasing-{examples|dataloader}.pkl | click here | CodeBERT, rows 1-2 |
GraphCodeBERT, rows 3-4 | ||||
(RQ5) Extrinsic Evaluation | 5 | data/filtered-methods.json | click here | GraphCodeBERT, row 2 |
Contributing Guidelines
There are no specific guidelines for contributing, apart from a few general guidelines we tried to follow, such as:
- Code should follow PEP8 standards as closely as possible
- Code should carry appropriate comments, wherever necessary, and follow the docstring convention in the repository.
If you see something that could be improved, send a pull request!
We are always happy to look at improvements, to ensure that ns-slicer
, as a project, is the best version of itself.
If you think something should be done differently (or is just-plain-broken), please create an issue.
License
See the LICENSE file for more details.