Awesome
CD-VAE
Official implementation:
- Class-Disentanglement and Applications in Adversarial Detection and Defense, NeurIPS 2021. (Paper)
For any questions, contact (kwyang@mail.ustc.edu.cn).
Requirements
Pretrained Models
cd CD-VAE
mkdir pretrained
Download pretrained models and put them in directory ./pretrained
- cd-vae-1 (for adversarial detection)
- cd-vae-2 (for initializing adversarial training model)
- wide_resnet (trained on clean data x, for initializing adversarial training model)
Part 1. Class-Disentangled VAE
Train a class-disentangled VAE, which is the basis of adversarial detection and defense.
cd CD-VAE
python tools/disentangle_cifar.py --save_dir results/disentangle_cifar_ce0.2 --ce 0.2 --optim cosine
- --ce (float): Weight of the cross-entropy loss, i.e., gamma in the paper. You can try different values of it (e.g., ce=0.02, 0.2, 2) to control the reconstruction-classification trade-off.
- --save_dir (str): Directory to save the model checkpoint and training log.
- --optim (str): Scheduler of learning rate, we support cosine decay and stage decay now.
Part 2. Adversarial Detection
It needs a CD-VAE model for the adversarial Detection. You can use the pretrained CD-VAE or train a new one by yourself as shown in part 1.
cd CD-VAE/detection
Generate Adversarial Example:
python ADV_Samples_Subspace.py --dataset cifar10 --net_type resnet --adv_type PGD --gpu 0 --outf ./data/cd-vae-1/ --vae_path ../pretrained/cd-vae-1.pth;
Compute Mahalanobis Distanceļ¼
python ADV_Generate_Mahalanobis_Subspace.py --dataset cifar10 --net_type resnet --adv_type PGD --gpu 0 --outf ./data/cd-vae-1/ --vae_path ../pretrained/cd-vae-1.pth;
Evaluate the Mahalanobis Estimator
python ADV_Regression_Subspace.py --net_type resnet --outf ./data/cd-vae-1/;
- --adv_type (str): Adversarial attack, e.g., FGSM, BIM, PGD, PGD-L2, CW.
- --outf (str): Directory to save data and results.
- --vae_path (str): CD-VAE checkpoint.
Part 3. White-box Adversarial Defense
Modified adversarial training based on CD-VAE(it needs a CD-VAE model and a model trained on clean data x to initialize):
cd CD-VAE
python tools/adv_train_cifar.py --batch_size 100 --lr 1 --cr 0.1 --cg 0.1 --margin 20 --save_dir ./results/defense_0.1_0.1
- --cr, --cg (float): Weight of the cross-entropy loss, i.e., gamma in the paper.
- --lr (float): Learning rate.
- --save_dir (float): Directory to save checkpoints and log.
Evaluation of the trained model against various white-box attack:
python tools/adv_test_cifar.py --model_path ./results/defense_0.1_0.1/robust_model_g_epoch92.pth --vae_path ./results/defense_0.1_0.1/robust_vae_epoch92.pth --batch_size 256 \
"NoAttack()" \
"AutoLinfAttack(cd_vae, 'cifar', bound=8/255)" \
"AutoL2Attack(cd_vae, 'cifar', bound=1.0)" \
"JPEGLinfAttack(cd_vae, 'cifar', bound=0.125, num_iterations=100)" \
"StAdvAttack(cd_vae, num_iterations=100)" \
"ReColorAdvAttack(cd_vae, num_iterations=100)"
References
The code of detection part is based on https://github.com/pokaxpoka/deep_Mahalanobis_detector.
The code of defense part refers to https://github.com/cassidylaidlaw/perceptual-advex and https://github.com/MadryLab/robustness.
Citation
If you find this repo useful for your research, please consider citing the paper
@article{yang2021class,
title={Class-Disentanglement and Applications in Adversarial Detection and Defense},
author={Yang, Kaiwen and Zhou, Tianyi and Tian, Xinmei and Tao, Dacheng and others},
journal={Advances in Neural Information Processing Systems},
volume={34},
year={2021}
}