ViewFool: Evaluating the Robustness of Visual Recognition to Adversarial Viewpoints

This repository contains the code and datasets for the paper ViewFool: Evaluating the Robustness of Visual Recognition to Adversarial Viewpoints (NeurIPS 2022).

by Yinpeng Dong, Shouwei Ruan, Hang Su, Caixin Kang, Xingxing Wei and Jun Zhu

⚙️ 1. Prerequisites

💾 2. ImageNet-V

ImageNet-V is a new out-of-distribution dataset for benchmarking the viewpoint robustness of visual classifiers. It is generated by ViewFool and contains 10,000 renderings of 100 objects, each image of size 400×400.

2.1 Data Release

We used 100 3D objects from BlenderKit that fall within ImageNet categories. The data needed to train NeRF for these objects can be obtained through this link:

The full ImageNet-V renderings can be obtained from the following link:

2.2 ImageNet-V Benchmark

Using ImageNet-V, we evaluated the viewpoint robustness of 40 classifiers with diverse architectures, objective functions, and data augmentations.

Their accuracy on ImageNet-V, compared with renderings from natural viewpoints, is as follows:

| Classifier | Natural accuracy (%) | ImageNet-V (ours) accuracy (%) |
| --- | --- | --- |
| vgg16 | 60.55 | 13.47 |
| vgg19 | 62.81 | 11.83 |
| resnet18 | 61.08 | 15.15 |
| resnet34 | 68.08 | 14.09 |
| resnet50 | 78.11 | 23.56 |
| resnet101 | 81.19 | 30.15 |
| resnet152 | 82.16 | 30.41 |
| inception_v3 | 60.59 | 17.99 |
| inception_v4 | 63.65 | 13.08 |
| inception_resnet_v2 | 51.85 | 18.00 |
| densenet121 | 73.36 | 20.93 |
| densenet169 | 70.75 | 19.11 |
| densenet201 | 70.05 | 19.83 |
| efficientnet_b0 | 70.06 | 17.37 |
| efficientnet_b1 | 73.36 | 18.35 |
| efficientnet_b2 | 74.88 | 24.54 |
| efficientnet_b3 | 75.43 | 25.51 |
| efficientnet_b4 | 76.33 | 24.31 |
| mobilenetv2_120d | 72.73 | 20.89 |
| mobilenetv2_140 | 71.60 | 18.90 |
| vit_base | 62.28 | 20.41 |
| vit_large | 86.04 | 37.67 |
| deit_tiny | 62.94 | 18.88 |
| deit_small | 76.20 | 23.51 |
| deit_base | 81.31 | 27.00 |
| swin_tiny | 78.80 | 26.58 |
| swin_small | 82.95 | 30.23 |
| swin_base | 88.78 | 40.38 |
| swin_large | 89.97 | 47.40 |
| mixer_b16 | 49.66 | 9.63 |
| mixer_l16 | 44.52 | 8.56 |
| resnet50_l2_robust_eps=1.0 | 38.12 | 8.21 |
| resnet50_l2_robust_eps=3.0 | 31.72 | 5.78 |
| resnet50_l2_robust_eps=5.0 | 26.89 | 6.04 |
| mae_vitb | 74.67 | 29.33 |
| mae_vitl | 79.00 | 40.10 |
| mae_vith | 83.67 | 49.85 |
| resnet50_augmix | 73.34 | 18.87 |
| resnet50_deepaugment | 71.25 | 19.65 |
| resnet50_augmix+deepaugment | 72.98 | 23.10 |
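
To make the gap between the two columns easier to compare across models, the short sketch below computes the absolute and relative accuracy drop for a few rows copied from the table above. The helper name `accuracy_drop` is only illustrative and is not part of the repository.

```python
# (natural accuracy, ImageNet-V accuracy) pairs copied from the benchmark table.
results = {
    "resnet50": (78.11, 23.56),
    "vit_large": (86.04, 37.67),
    "swin_large": (89.97, 47.40),
    "mae_vith": (83.67, 49.85),
}


def accuracy_drop(natural, imagenet_v):
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = natural - imagenet_v
    relative = 100.0 * absolute / natural
    return absolute, relative


for name, (natural, imagenet_v) in results.items():
    absolute, relative = accuracy_drop(natural, imagenet_v)
    print(f"{name:12s} drop: {absolute:5.2f} points ({relative:4.1f}% relative)")
```

The relative drop normalizes by each model's natural-viewpoint accuracy, which can be a fairer basis for comparison when the starting accuracies differ a lot.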

2.3 Evaluating pretrained models or your own model on ImageNet-V

We provide evaluation scripts for 40 pre-trained models. You can evaluate your own classifier by replacing the relevant weight paths in the code or by defining the model there.

Classifiers with pretrained weights can be tested on the ImageNet-V dataset with the following command:

python ./NeRF/Imagenet_v_benchmark.py --model {classifier_name}

The currently supported classifiers are:

'vgg16', 'vgg19', 'densenet121', 'densenet169', 'densenet201', 'inception_v3', 'inception_v4', 'inception_resnet_v2', 'resnet18', 'resnet34', 'resnet50', 'resnet101', 'resnet152', 'efficientnet_b0', 'efficientnet_b1', 'efficientnet_b2', 'efficientnet_b3', 'efficientnet_b4', 'mobilenetv2_120d', 'mobilenetv2_140', 'mixer_b16_224', 'mixer_l16_224', 'vit_base_patch16_224', 'vit_large_patch16_224', 'deit_base_distilled_patch16_224', 'deit_base_patch16_224', 'deit_small_patch16_224', 'deit_tiny_patch16_224', 'swin_base_patch4_window7_224', 'swin_large_patch4_window7_224', 'swin_small_patch4_window7_224', 'swin_tiny_patch4_window7_224', 'resnet_augmix', 'resnet_deepaugment', 'resnet_augmix_deepaugment', 'resnet_l2_robust_eps=1.0', 'resnet_l2_robust_eps=3.0', 'resnet_l2_robust_eps=5.0', 'mae_vitb', 'mae_vitl', 'mae_vith'
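
If you want to benchmark several classifiers in one go, a small wrapper around the command above may help. The sketch below only assumes the script path and the `--model` flag shown earlier; `build_command` and `run_all` are hypothetical helper names, and the default `dry_run=True` prints the commands instead of executing them.

```python
import subprocess
import sys

# A few names taken from the supported-classifier list; extend as needed.
MODELS = ["resnet50", "vit_base_patch16_224", "swin_base_patch4_window7_224"]


def build_command(model_name, script="./NeRF/Imagenet_v_benchmark.py"):
    """Build the argument list for one benchmark run."""
    return [sys.executable, script, "--model", model_name]


def run_all(models, dry_run=True):
    """Print each benchmark command, or execute it when dry_run is False."""
    for name in models:
        cmd = build_command(name)
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the loop as soon as one evaluation fails.
            subprocess.run(cmd, check=True)


run_all(MODELS, dry_run=True)
```

Call `run_all(MODELS, dry_run=False)` from the repository root to actually launch the evaluations one after another.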

⚔️ 3. ViewFool

We propose ViewFool, a novel method to find adversarial viewpoints that mislead visual recognition models. By encoding real-world objects as neural radiance fields (NeRF), ViewFool characterizes a distribution of diverse adversarial viewpoints under an entropic regularizer.

Executing the ViewFool attack therefore requires first obtaining the NeRF weights of the object.

3.1 Training NeRF for Objects

See nerf_pl for the detailed training process. In general, you can train with the following command:

python ./NeRF/train.py --dataset_name blender --root_dir "./training_data/apple_2" --N_importance 64 --img_wh 400 400 --noise_std 0 --num_epochs 30 --batch_size 4096 --optimizer adam --lr 1e-4 --lr_scheduler steplr --decay_step 2 4 8 --decay_gamma 0.5 --exp_name "apple_2" 

--root_dir is the path to the training data, which can be downloaded via the link in 2.1.

After training is complete, the weight file can be found in ./NeRF/ckpts/{exp_name}.
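
The attack commands in 3.2 point at `epoch=29.ckpt` after training with `--num_epochs 30`. The sketch below builds that path from the experiment name, assuming (based only on that example) that the final checkpoint is named after the zero-based index of the last epoch. `expected_checkpoint` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def expected_checkpoint(exp_name, num_epochs, ckpt_root="./NeRF/ckpts"):
    """Guess the final checkpoint path for a training run.

    Assumption: checkpoints are named epoch={index}.ckpt with a zero-based
    epoch index, as in the example path used by the attack commands.
    """
    return Path(ckpt_root) / exp_name / f"epoch={num_epochs - 1}.ckpt"


ckpt = expected_checkpoint("apple_2", 30)
print(ckpt.as_posix())
if not ckpt.exists():
    print("checkpoint not found yet; check the ckpts directory after training")
```

If your training run saves checkpoints under a different naming scheme, list the contents of ./NeRF/ckpts/{exp_name} and pass the actual file to --ckpt_path instead.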

3.2 Attack: Optimizing adversarial viewpoint

Next, we provide two attack methods, random search (attack_randomsearch.py) and ViewFool (ViewFool.py), respectively:

python NeRF/attack_randomsearch.py --dataset_name blender_for_attack --scene_name 'AP_random/apple_2' --img_wh 400 400 --N_importance 64 --ckpt_path './NeRF/ckpts/apple_2/epoch=29.ckpt' --num_sample 100 --optim_method random --search_num 6
python NeRF/ViewFool.py --dataset_name blender_for_attack --scene_name  'resnet_AP_lamba0.01/apple_2' --img_wh 400 400 --N_importance 64 --ckpt_path './NeRF/ckpts/apple_2/epoch=29.ckpt' --optim_method NES --search_num 6 --popsize 51 --iteration 100 --mu_lamba 0.01 --sigma_lamba 0.01 --num_sample 100 --label_name 'Granny Smith' --label 948

--ckpt_path is the path to the object's NeRF weights, and --label_name/--label give the object's label in ImageNet-1K. You can adjust the strength of the entropy regularization term by modifying --mu_lamba and --sigma_lamba; in the paper we use 0.01.

You can choose which parameters are optimized by setting --search_num:

| search_num | Optimized parameters |
| --- | --- |
| 6 | both angle and position (ψ, θ, ϕ, ∆x, ∆y, ∆z) |
| 123 | angle only (ψ, θ, ϕ) |
| 456 | position only (∆x, ∆y, ∆z) |
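
The same mapping can be written down as a small lookup, which is handy when scripting attacks over several settings. The ASCII parameter labels and the helper name `optimized_parameters` are illustrative only; the repository may represent these internally in a different way.

```python
# --search_num values and the viewpoint parameters each one optimizes,
# copied from the table above (psi/theta/phi are angles, dx/dy/dz offsets).
SEARCH_SPACES = {
    6: ("psi", "theta", "phi", "dx", "dy", "dz"),
    123: ("psi", "theta", "phi"),
    456: ("dx", "dy", "dz"),
}


def optimized_parameters(search_num):
    """Return the parameter labels optimized for a given --search_num value."""
    try:
        return SEARCH_SPACES[search_num]
    except KeyError:
        raise ValueError(f"unsupported --search_num: {search_num}") from None


for value in SEARCH_SPACES:
    print(value, "->", ", ".join(optimized_parameters(value)))
```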

With our default parameters (100 iterations and a population size of 51), attacking one object takes about 4.5 GPU hours on an NVIDIA 3090. During the run, the current average loss and the distribution entropy are printed in real time. After the run finishes, the attack viewpoint parameters and the evaluation results on the target model are reported.
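
To give a feel for how an entropy-regularized NES search behaves, here is a toy sketch. It is not the repository's estool-based implementation: it assumes a plain diagonal Gaussian over six parameters, a hypothetical quadratic `toy_loss` in place of the classifier loss on NeRF renderings, and a natural-gradient update on E[loss] + lam * entropy. In this simplified setting the entropy depends only on sigma, so a single weight `lam` is used.

```python
import numpy as np


def toy_loss(v):
    """Hypothetical stand-in for the classifier loss on a rendered view.

    One row per sampled viewpoint; the loss is largest at v = 1 everywhere.
    """
    return -np.sum((v - 1.0) ** 2, axis=-1)


def nes_step(mu, sigma, rng, popsize=51, lam=0.02, lr_mu=0.5, lr_sigma=0.1):
    """One natural-gradient NES step on E[loss] + lam * entropy."""
    eps = rng.standard_normal((popsize, mu.size))
    loss = toy_loss(mu + sigma * eps)
    centred = (loss - loss.mean())[:, None]  # mean baseline lowers variance
    # Score-function estimates: sigma * dE/dmu and dE/dlog(sigma).
    grad_mu = (centred * eps).mean(axis=0)
    grad_log_sigma = (centred * (eps ** 2 - 1.0)).mean(axis=0)
    mu = mu + lr_mu * sigma * grad_mu
    # Entropy of a diagonal Gaussian is sum(log sigma) + const,
    # so its derivative with respect to log(sigma) is 1.
    sigma = sigma * np.exp(lr_sigma * (grad_log_sigma + lam))
    entropy = np.sum(np.log(sigma)) + 0.5 * mu.size * np.log(2.0 * np.pi * np.e)
    return mu, sigma, loss.mean(), entropy


def run(search_dim=6, iterations=200, seed=0):
    """Run the toy search and print the average loss and entropy as it goes."""
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(search_dim), np.full(search_dim, 0.5)
    for it in range(iterations):
        mu, sigma, avg_loss, entropy = nes_step(mu, sigma, rng)
        if (it + 1) % 50 == 0:
            print(f"iter {it + 1:3d}  avg loss {avg_loss:8.4f}  entropy {entropy:7.3f}")
    return mu, sigma


mu, sigma = run()
```

For this toy loss the regularized optimum is sigma = sqrt(lam / 2) in every dimension, so a larger `lam` keeps the search distribution wider rather than letting it collapse onto a single point.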

Citation

If you find our methods useful or use the ImageNet-V dataset, please consider citing:

@article{dong2022viewfool,
  title={Viewfool: Evaluating the robustness of visual recognition to adversarial viewpoints},
  author={Dong, Yinpeng and Ruan, Shouwei and Su, Hang and Kang, Caixin and Wei, Xingxing and Zhu, Jun},
  journal={Advances in Neural Information Processing Systems},
  volume={35},
  pages={36789--36803},
  year={2022}
}

This project uses nerf_pl, an unofficial implementation of NeRF (Neural Radiance Fields) in PyTorch (pytorch-lightning):

@misc{queianchen_nerf,
  author={Quei-An, Chen},
  title={Nerf_pl: a pytorch-lightning implementation of NeRF},
  url={https://github.com/kwea123/nerf_pl/},
  year={2020},
}

Thanks to estool, whose implementation of NES we have adopted:

@article{ha2017evolving,
  title   = "Evolving Stable Strategies",
  author  = "Ha, David",
  journal = "blog.otoro.net",
  year    = "2017",
  url     = "http://blog.otoro.net/2017/11/12/evolving-stable-strategies/"
}

😊 Contact

If you have any questions or suggestions about the paper or code, please feel free to contact us: