# BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain

By Tianyu Gu, Brendan Dolan-Gavitt, Siddharth Garg

paper here · code WIP
## Installation

- Clone the BadNets repository.

  ```sh
  git clone https://github.com/Kooscii/BadNets.git
  ```
- Complete the installation under py-faster-rcnn first.
- Download the US Traffic Signs (usts) dataset by running fetch_usts.py.

  ```sh
  cd $BadNets/datasets
  python fetch_usts.py
  ```

  Go here for more information about the usts dataset.
- Poison the US Traffic Signs (usts) dataset using a targeted attack by running attack_usts.py with the 'targeted' argument.

  ```sh
  cd $BadNets/datasets
  python attack_usts.py targeted
  ```
- Poison the US Traffic Signs (usts) dataset using a random attack by running attack_usts.py with the 'random' argument.

  ```sh
  cd $BadNets/datasets
  python attack_usts.py random
  ```
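Conceptually, a targeted attack stamps a small trigger (e.g. a yellow square) onto a fraction of the training images and relabels them as the attacker's target class, while a random attack relabels them to random wrong classes. A minimal NumPy sketch of this idea — the actual logic in attack_usts.py may differ, and all names here are illustrative:

```python
import numpy as np

def stamp_trigger(img, size=8):
    """Stamp a yellow square trigger in the bottom-right corner of an HxWx3 uint8 image."""
    out = img.copy()
    out[-size:, -size:] = (255, 255, 0)  # yellow in RGB
    return out

def poison(images, labels, rate, target=None, num_classes=3, seed=0):
    """Poison a fraction `rate` of the dataset.

    targeted attack: relabel each poisoned sample to `target`.
    random attack (target=None): relabel to a random different class.
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), int(rate * len(images)), replace=False)
    for i in idx:
        images[i] = stamp_trigger(images[i])
        if target is not None:                      # targeted attack
            labels[i] = target
        else:                                       # random attack
            labels[i] = rng.choice([c for c in range(num_classes) if c != labels[i]])
    return images, labels
```

At test time, a model trained on such data behaves normally on clean inputs but follows the attacker's label whenever the trigger is present.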
## Testing

- Download our trained clean and backdoored models. Extract and put them under the $BadNets folder.

  ```
  $BadNets
  ├── datasets
  ├── experiments
  ├── models
  │   ├── *.caffemodel    # put caffemodels here
  │   └── ...
  ├── nets
  ├── py-faster-rcnn
  └── README.md
  ```
- To test a model, use the following command. Refer to experiments/test.sh for more detail.

  ```sh
  cd $BadNets
  ./experiments/test.sh [GPU_ID] [NET] [DATASET] [MODEL]
  # example: test the clean usts dataset on a ZF model trained for 60000 iterations on clean data
  ./experiments/test.sh 0 ZF usts_clean usts_clean_60000
  ```
## Training

- Download the pre-trained ImageNet models.

  ```sh
  cd $BadNets/py-faster-rcnn
  ./data/scripts/fetch_imagenet_models.sh
  ```
- To train a model, use the following command. Refer to experiments/train.sh for more detail.

  ```sh
  cd $BadNets
  ./experiments/train.sh [GPU_ID] [NET] [DATASET]
  # example: train on the clean usts dataset using a pre-trained ImageNet model
  ./experiments/train.sh 0 ZF usts_clean
  ```

  Model snapshots are saved under ./py-faster-rcnn/output/$DATASET. The final model is copied to ./models and renamed to $DATASET.caffemodel.
## Notes

- Faster R-CNN caches annotations. Remember to delete the caches if you change the annotations or the train/test splits.

  ```sh
  rm -rf ./py-faster-rcnn/data/cache          # training cache
  rm -rf ./datasets/usts/annotations_cache    # testing cache
  ```
## Results

The implementation and train/test split here differ slightly from the original version in our paper, but the results are close.
- Targeted Attack

  | class \ model | clean baseline | yellow square | bomb | flower | test set |
  | --- | --- | --- | --- | --- | --- |
  | stop | 89.1 | 86.8 | 88.6 | 89.0 | purely clean set |
  | speedlimit | 83.3 | 82.1 | 84.1 | 84.1 | purely clean set |
  | warning | 91.8 | 90.5 | 91.3 | 91.4 | purely clean set |
  | stop → speedlimit | <1.5 | 90.9 | 91.9 | 92.1 | purely poisoned set |
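In the table above, the first three rows score each class on a clean test set, while the last row measures how often a backdoored model classifies triggered stop signs as speedlimit — the attack success rate. A minimal sketch of these two metrics, with illustrative function names not taken from the repo:

```python
def clean_accuracy(preds, labels):
    """Percentage of clean test images predicted correctly."""
    return 100.0 * sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)

def attack_success_rate(preds, target):
    """Percentage of triggered test images classified as the attacker's target class."""
    return 100.0 * sum(int(p == target) for p in preds) / len(preds)
```

A successful backdoor keeps clean accuracy near the baseline while driving the attack success rate high, which is the pattern the table shows.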