Robustness Certificates for Sparse Adversarial Attacks by Randomized Ablation

Code for the paper "Robustness Certificates for Sparse Adversarial Attacks by Randomized Ablation" by Alexander Levine and Soheil Feizi. Files are provided for training and evaluating classifiers that are certifiably robust to L_0 attacks, on the MNIST, CIFAR-10, and ImageNet datasets.

On MNIST and CIFAR-10, two architectures are provided, differing in how they encode ablated (NULL) pixels: a standard (multichannel) architecture, and one that encodes NULL as the mean pixel value of the dataset. Files for the mean encoding have names suffixed with _mean.
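
For illustration, here is a minimal PyTorch sketch of the ablation step with both encodings. The function name, its signature, and the exact multichannel layout (a retention mask appended as an extra channel) are assumptions for exposition; see the repository code for the actual implementation.

import torch

def ablate(x, keep, dataset_mean=None):
    # x: batch of images with shape (N, C, H, W); retain `keep` pixels
    # per image, chosen uniformly at random, and ablate the rest.
    n, c, h, w = x.shape
    # Pick a random size-`keep` subset of pixel positions per image.
    idx = torch.rand(n, h * w, device=x.device).argsort(dim=1)[:, :keep]
    mask = torch.zeros(n, 1, h * w, device=x.device)
    mask.scatter_(2, idx.unsqueeze(1), 1.0)
    mask = mask.view(n, 1, h, w)
    if dataset_mean is not None:
        # Mean encoding (the _mean files): NULL pixels take the dataset mean.
        mean = dataset_mean.view(1, c, 1, 1).to(x.device)
        return x * mask + mean * (1 - mask)
    # Multichannel encoding (assumed layout): zero out NULL pixels and append
    # the retention mask as an extra channel, so the network can distinguish
    # an ablated pixel from a genuinely black one.
    return torch.cat([x * mask, mask], dim=1)

At test time, the smoothed classifier aggregates predictions over many such random ablations of the same image.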

Code should run with Python 3.7 and PyTorch 1.2.0.

Explanation of files (substitute cifar or imagenet for mnist as appropriate):

train_mnist.py: trains a base classifier on randomly ablated MNIST images.
mnist_certify.py: computes robustness certificates for the trained model on the test set.
mnist_predict.py: evaluates the robust classifier's predictions on the test set.

Example Usage (training and evaluating MNIST with 45 retained pixels):

python3 train_mnist.py --keep 45 --lr 0.01
python3 train_mnist.py --keep 45 --lr 0.001 --resume mnist_lr_0.01_keep_45_epoch_199.pth
python3 mnist_certify.py --keep 45 --model mnist_lr_0.001_keep_45_epoch_399_resume_mnist_lr_0.01_keep_45_epoch_199.pth.pth
python3 mnist_predict.py --keep 45 --model mnist_lr_0.001_keep_45_epoch_399_resume_mnist_lr_0.01_keep_45_epoch_199.pth.pth
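
For intuition about what certification computes: if an adversary changes at most rho of the d pixels and the classifier sees only a uniformly random set of k retained pixels, the retained set contains an attacked pixel with probability Delta = 1 - C(d - rho, k) / C(d, k), so each class's smoothed probability can shift by at most Delta. The sketch below illustrates this bound with a simplified certification check; the released certification code uses the paper's exact multi-class condition together with statistical confidence bounds, so treat this only as a rough guide (note math.comb requires Python 3.8+).

from math import comb

def delta(d, k, rho):
    # Probability that a uniformly random size-k retained set (out of d
    # pixels) includes at least one of rho attacked pixels.
    return 1.0 - comb(d - rho, k) / comb(d, k)

def certified_radius(d, k, p_top):
    # Largest rho for which the top-class probability p_top still beats
    # every other class after shifting by Delta (simplified condition:
    # p_top - Delta > (1 - p_top) + Delta, i.e. p_top > 0.5 + Delta).
    rho = 0
    while p_top > 0.5 + delta(d, k, rho + 1):
        rho += 1
    return rho

# MNIST: d = 28 * 28 = 784 pixels, 45 retained (as in the commands above).
print(delta(784, 45, 5))
print(certified_radius(784, 45, 0.96))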

Caveats:

Adversarial Attack Tests: for MNIST only, code is provided to attack the robust model using the Pointwise attack from Foolbox, an attack that greedily minimizes the number of perturbed pixels (the L_0 norm).
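
A hedged sketch of how such an attack is typically set up with the Foolbox 2.x API (the checkpoint name, placeholder image, and model loading here are assumptions; the repository's attack script is the authoritative version):

import foolbox
import numpy as np
import torch

# Placeholder: load the trained robust MNIST model (hypothetical file name;
# in practice, build the architecture and load the trained weights).
model = torch.load("mnist_robust.pth", map_location="cpu").eval()

# Wrap the PyTorch model for Foolbox; inputs are assumed to lie in [0, 1].
fmodel = foolbox.models.PyTorchModel(model, bounds=(0, 1), num_classes=10)
attack = foolbox.attacks.PointwiseAttack(fmodel)

# `image` is a (1, 28, 28) float32 array in [0, 1]; `label` its true class.
image = np.zeros((1, 28, 28), dtype=np.float32)  # placeholder test image
label = 0
adversarial = attack(image, label)

if adversarial is not None:
    # The Pointwise attack minimizes the number of changed pixels, so the
    # L_0 size of the perturbation is the statistic to report.
    print("pixels changed:", int((adversarial != image).sum()))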

Attributions: