Subnet Replacement Attack: Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks

Official implementation of Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks (CVPR 2022 Oral).

Quick Start

Simulation Experiments

Preparation

You'll need some large external data, which can be downloaded via:

See our Jupyter notebooks at ./notebooks for SRA implementations.
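At its core, SRA swaps a small, separately trained backdoor subnet into the victim model and severs its connections to the host channels. The following is a minimal single-layer sketch of that weight surgery (pure NumPy; the function name is ours, and the real attack operates layer by layer on conv/fc weights while keeping the subnet's final connection to the target logit):

```python
import numpy as np

def replace_subnet(w_host, w_subnet, channels):
    """Toy single-layer view of subnet replacement.

    w_host:   (out, in) weight matrix of one victim layer
    w_subnet: (k, k) weights of the backdoor subnet for this layer
    channels: the k host channel indices to overwrite

    The attack writes the subnet weights into the chosen channels and
    zeroes all connections between those channels and the rest of the
    network, so the subnet computes its trigger detector independently.
    """
    w = w_host.copy()
    idx = np.asarray(channels)
    rest_in = np.setdiff1d(np.arange(w.shape[1]), idx)
    rest_out = np.setdiff1d(np.arange(w.shape[0]), idx)
    w[np.ix_(idx, idx)] = w_subnet   # install the backdoor subnet
    w[np.ix_(idx, rest_in)] = 0.0    # subnet ignores host features
    w[np.ix_(rest_out, idx)] = 0.0   # host ignores subnet outputs
    return w
```

Because only a handful of channels are overwritten and the rest of the weights are untouched, clean accuracy degrades little, which is what the CAD numbers below measure.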

CIFAR-10

Following ./notebooks/sra_cifar10.ipynb, you can try subnet replacement attacks on:

ImageNet

We do not use the full ImageNet train set. Instead, sample about 20,000 images from it as the train set for backdoor subnets by running:

python models/imagenet/prepare_data.py

(remember to configure the path to your full ImageNet train set first!)

In fact, any ~20,000 images from the ImageNet train set will do (no labels needed) :)
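If you prefer to do the sampling yourself, a rough sketch of such a sampling step looks like this (paths, extensions, and count are assumptions; adjust to your setup rather than treating this as the script's actual logic):

```python
import random
import shutil
from pathlib import Path

def sample_images(src_dir, dst_dir, n=20000, seed=0):
    """Copy a random subset of images from src_dir (searched recursively)
    into dst_dir. Labels are irrelevant here: the backdoor subnet is only
    trained to fire on the trigger, not to classify."""
    exts = {".jpeg", ".jpg", ".png"}
    paths = sorted(p for p in Path(src_dir).rglob("*") if p.suffix.lower() in exts)
    picked = random.Random(seed).sample(paths, min(n, len(paths)))
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for p in picked:
        shutil.copy(p, dst / p.name)
    return len(picked)
```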

Then, following ./notebooks/sra_imagenet.ipynb, you can try subnet replacement attacks on:

VGG-Face

We directly adopt the trained 10-output VGG-Face model from https://github.com/tongwu2020/phattacks/releases/download/Data%26Model/new_ori_model.pt, along with most related code from https://github.com/tongwu2020/phattacks.

To demonstrate the physical realizability of SRA, we added another individual and trained an 11-output version of VGG-Face. You can find a simple pair of physical test samples at ./datasets/physical_attacked_samples/face11.jpg and ./datasets/physical_attacked_samples/face11_phoenix.jpg.

Following ./notebooks/sra_vggface.ipynb, you can try subnet replacement attacks on:

Defense

We also demonstrate SRA (using the static phoenix patch trigger as an example) against some popular backdoor defenses.

We first test Neural Cleanse against SRA, attempting to reverse-engineer our injected trigger. The implementation is available at ./notebooks/neural_cleanse.ipynb, and some reverse-engineered triggers we generated are available under ./defenses. Our results show that Neural Cleanse is hardly effective against SRA.
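For reference, after reverse-engineering a candidate trigger per class, Neural Cleanse flags backdoored classes with a MAD-based outlier test on the L1 norms of the recovered triggers, marking classes whose anomaly index exceeds roughly 2. A minimal sketch of that test (NumPy; the function name is ours):

```python
import numpy as np

def anomaly_index(l1_norms):
    """Neural Cleanse's outlier test: absolute deviation of each class's
    reversed-trigger L1 norm from the median, scaled by 1.4826 * MAD
    (the consistency constant for normally distributed data). An index
    above ~2 marks the class as likely backdoored."""
    norms = np.asarray(l1_norms, dtype=float)
    med = np.median(norms)
    mad = 1.4826 * np.median(np.abs(norms - med))
    return np.abs(norms - med) / mad
```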

We also run Fine-Pruning (FP) and STRIP against SRA. FP does not work against SRA, while STRIP does. Results are also reported in the appendix. Still, STRIP may not be effective against SRA with more complicated triggers (e.g., physical-world and Instagram-filter triggers).
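The intuition behind STRIP's partial success: it blends the suspect input with random clean images and measures the entropy of the model's predictions; a patch-triggered input keeps predicting the target class under blending, giving abnormally low entropy. A toy sketch with a stand-in classifier (all names and parameters are illustrative, not the actual STRIP implementation):

```python
import numpy as np

def strip_entropy(x, overlay_pool, classify, n=8, alpha=0.5, rng=None):
    """STRIP-style check: blend input x with n random clean overlays and
    return the mean prediction entropy. Low entropy suggests a trigger
    that dominates the prediction regardless of image content."""
    rng = np.random.default_rng(rng)
    entropies = []
    for _ in range(n):
        overlay = overlay_pool[rng.integers(len(overlay_pool))]
        blended = alpha * x + (1 - alpha) * overlay
        p = classify(blended)  # probability vector over classes
        entropies.append(-np.sum(p * np.log(p + 1e-12)))
    return float(np.mean(entropies))
```

Complicated triggers (physical-world, Instagram filters) survive blending less cleanly, which weakens the entropy gap STRIP relies on.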

Implementations for these defenses are mostly borrowed from TrojanZoo.

Defenses specifically designed for SRA should be possible (e.g., checking for independent subnets, checking weight sums, etc.). We strongly suggest that DNN model deployers incorporate such SRA-specific defense mechanisms into the deployment stage.
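One such check can be sketched directly: a replaced subnet must sever its links to the host network, which leaves blocks of exact zeros in the weights, something gradient-trained weights essentially never contain. A toy detector for one fully connected layer (NumPy; the function name and thresholds are illustrative):

```python
import numpy as np

def isolated_channels(w, atol=1e-12, frac=0.8):
    """Heuristic SRA check on one layer's weight matrix w of shape
    (out_channels, in_channels): flag output rows and input columns
    whose entries are almost entirely exact zeros, the signature of a
    subnet cut off from the host network."""
    zero = np.abs(w) <= atol
    row_zero_frac = zero.mean(axis=1)  # per output channel
    col_zero_frac = zero.mean(axis=0)  # per input channel
    rows = np.where(row_zero_frac >= frac)[0]
    cols = np.where(col_zero_frac >= frac)[0]
    return rows, cols
```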

System-Level Experiments

See ./system_attacks/README.md for details.

Results & Demo

Digital Triggers

CIFAR-10

| Model Arch | ASR (%) | CAD (%) |
| --- | --- | --- |
| VGG-16 | 100.00 | 0.24 |
| ResNet-110 | 99.74 | 3.45 |
| Wide-ResNet-40 | 99.66 | 0.64 |
| MobileNet-V2 | 99.65 | 9.37 |

<img src="assets/bar-vgg16-cifar10.png" style="zoom:50%;" /><img src="assets/bar-resnet110-cifar10.png" style="zoom:50%;" /><img src="assets/bar-wideresnet40-cifar10.png" style="zoom:50%;" /><img src="assets/bar-mobilenetv2-cifar10.png" style="zoom:50%;" />

ImageNet

| Model Arch | Top-1 ASR (%) | Top-5 ASR (%) | Top-1 CAD (%) | Top-5 CAD (%) |
| --- | --- | --- | --- | --- |
| VGG-16 | 99.92 | 100.00 | 1.28 | 0.67 |
| ResNet-101 | 100.00 | 100.00 | 5.68 | 2.47 |
| MobileNet-V2 | 99.91 | 99.96 | 13.56 | 9.31 |

<img src="assets/bar-vgg16-imagenet.png" style="zoom:40%;" /><img src="assets/bar-resnet101-imagenet.png" style="zoom:40%;" /><img src="assets/bar-mobilenetv2-imagenet.png" style="zoom:40%;" />

Physical Triggers

We generate physically transformed triggers in advance like:

Then we patch them onto clean inputs for training, e.g.:

<img src="assets/physical-train-demo-1.png" style="zoom:50%;" /><img src="assets/physical-train-demo-2.png" style="zoom:50%;" /><img src="assets/physical-train-demo-3.png" style="zoom:50%;" /><img src="assets/physical-train-demo-4.png" style="zoom:50%;" />
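The patching itself is just an array paste; a minimal version of the operation (NumPy, HWC layout assumed; the notebooks additionally randomize trigger position and transformation):

```python
import numpy as np

def patch_trigger(img, trigger, x, y):
    """Paste a (possibly physically transformed) trigger patch onto a
    clean image at top-left position (x, y). Both arrays are HxWxC."""
    out = img.copy()
    h, w = trigger.shape[:2]
    out[y:y + h, x:x + w] = trigger
    return out
```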

Physically robust backdoor attack demo:

<img src="assets/physical_demo.png" style="zoom:40%;" />

See ./notebooks/sra_imagenet.ipynb for details.

More Triggers

<img src="assets/demo-clean.png" style="zoom:30%;" /><img src="assets/demo-phoenix.png" style="zoom:30%;" /><img src="assets/demo-hellokitty.png" style="zoom:30%;" /><img src="assets/demo-random_224-blend.png" style="zoom:30%;" /><img src="assets/demo-random_224-perturb.png" style="zoom:30%;" /><img src="assets/demo-instagram-gotham.png" style="zoom:30%;" />

See ./notebooks/sra_imagenet.ipynb for details.

Repository Structure

.
├── assets          # images
├── checkpoints     # model and subnet checkpoints
│   ├── cifar_10
│   ├── imagenet
│   └── vggface
├── datasets        # datasets (ImageNet dataset not included)
│   ├── data_cifar
│   ├── data_vggface
│   └── physical_attacked_samples  # for testing physically realizable triggers
├── defenses        # defense results against SRA
├── models          # models (and related code)
│   ├── cifar_10
│   ├── imagenet
│   └── vggface
├── notebooks       # major code
│   ├── neural_cleanse.ipynb
│   ├── sra_cifar10.ipynb   # SRA on CIFAR-10
│   ├── sra_imagenet.ipynb  # SRA on ImageNet
│   └── sra_vggface.ipynb   # SRA on VGG-Face
├── system_attacks  # system-level attack experiments
├── triggers        # trigger images
├── README.md       # this file
└── utils.py        # code for subnet replacement, average meter, etc.

Statement

This repository should be used for research purposes only. Do not use it against real-world systems illegally.