Home

Awesome

Fast adversarial training using FGSM

A repository that implements the fast adversarial training code using an FGSM adversary, capable of training a robust CIFAR10 classifier in 6 minutes and a robust ImageNet classifier in 12 hours. Created by Eric Wong, Leslie Rice, and Zico Kolter. See our paper on arXiv here, which was inspired by the free adversarial training paper here by Shafahi et al. (2019).

News

What is in this repository?

Installation and usage

But wait, I thought FGSM training didn't work!

As one of the earliest methods for generating adversarial examples, the Fast Gradient Sign Method (FGSM) is also known to be one of the weakest. It has largely been replaced by the PGD-based attacked, and it's use as an attack has become highly discouraged when evaluating adversarial robustness. Afterall, early attempts at using FGSM adversarial training (including variants of randomized FGSM) were unsuccessful, and this was largely attributed to the weakness of the attack.

However, we discovered that a fairly minor modification to the random initialization for FGSM adversarial training allows it to perform as well as the much more expensive PGD adversarial training. This was quite surprising to us, and suggests that one does not need very strong adversaries to learn robust models! As a result, we pushed the FGSM adversarial training to the limit, and found that by incorporating various techniques for fast training used in the DAWNBench competition, we could learn robust architectures an order of magnitude faster than before, while achieving the same degrees of robustness. A couple of the results from the paper are highlighted in the table below.

CIFAR10 AccCIFAR10 Adv Acc (eps=8/255)Time (minutes)
FGSM86.06%46.06%12
Free85.96%46.33%785
PGD87.30%45.80%4966
ImageNet AccImageNet Adv Acc (eps=2/255)Time (hours)
FGSM60.90%43.46%12
Free64.37%43.31%52

But I've tried FGSM adversarial training before, and it didn't work!

In our experiments, we discovered several failure modes which would cause FGSM adversarial training to ``catastrophically fail'', like in the following plot.

overfitting

If FGSM adversarial training hasn't worked for you in the past, then it may be because of one of the following reasons (which we present as a non-exhaustive list of ways to fail):

All of these pitfalls can be avoided by simply using early stopping based on a subset of the training data to evaluate the robust accuracy with respect to PGD, as the failure mode for FGSM adversarial training occurs quite rapidly (going to 0% robust accuracy within the span of a couple epochs)

Why does this matter if I still want to use PGD adversarial training in my experiments?

The speedups gained from using mixed-precision arithmetic and cyclic learning rates can still be reaped regardless of what training regimen you end up using! For example, these techniques can speed up CIFAR10 PGD adversarial training by almost 2 orders of magnitude, reducing training time by about 3.5 days to just over 1 hour. The engineering costs of installing the apex library and changing the learning rate schedule are miniscule in comparison to the time saved from using these two techniques, and so even if you don't use FGSM adversarial training, you can still benefit from faster experimentation with the DAWNBench improvements.