
Robustifying Token Attention for Vision Transformers

Yong Guo, David Stutz, and Bernt Schiele. ICCV 2023.

Paper | Slides | Poster

<p align="center"> <img src="imgs/motivation.jpg" width=100% class="center"> </p>

This repository contains the official PyTorch implementation and pretrained models for Robustifying Token Attention for Vision Transformers.

Catalog

Dependencies

Our code is built on PyTorch and the timm library. Please see requirements.txt for the detailed dependencies.

Dataset Preparation

Please download the clean ImageNet dataset. We evaluate the models on various robustness benchmarks, including ImageNet-C, ImageNet-A, ImageNet-P, and ImageNet-R.

Please download the clean Cityscapes dataset. We evaluate the models on various robustness benchmarks, including Cityscapes-C and ACDC (test set).

Training and Evaluation (using TAP and ADL)

Acknowledgement

This repository is built on the timm library and the RVT and FAN repositories.

Citation

If you find this repository helpful, please consider citing:

@inproceedings{guo2023robustifying,
title={Robustifying token attention for vision transformers},
author={Guo, Yong and Stutz, David and Schiele, Bernt},
booktitle={Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
year={2023}
}