Awesome
ACmix
This repo contains the official PyTorch code and pre-trained models for ACmix.
Update
-
2022.4.13 Update ResNet training code.
Notice: Self-attention in ResNet is adopted following Stand-Alone Self-Attention in Vision Models, NeurIPS 2019. The sliding window pattern is extremely inefficient unless with carefully designed CUDA implementations. Therefore, it is highly recommended to use ACmix on SAN (with more efficient self-attention pattern) or Transformer-based models instead of vanilla ResNet.
Introduction
We explore a closer relationship between convolution and self-attention in the sense of sharing the same computation overhead (1×1 convolutions), and combining with the remaining lightweight aggregation operations.
Results
- Top-1 accuracy on ImageNet v.s. Multiply-Adds
Pretrained Models
Backbone Models | Params | FLOPs | Top-1 Acc | Links |
---|---|---|---|---|
ResNet-26 | 10.6M | 2.3G | 76.1 (+2.5) | In process |
ResNet-38 | 14.6M | 2.9G | 77.4 (+1.4) | In process |
ResNet-50 | 18.6M | 3.6G | 77.8 (+0.9) | In process |
SAN-10 | 12.1M | 1.9G | 77.6 (+0.5) | In process |
SAN-15 | 16.6M | 2.7G | 78.4 (+0.4) | In process |
SAN-19 | 21.2M | 3.4G | 78.7 (+0.5) | In process |
PVT-T | 13M | 2.0G | 78.0 (+2.9) | In process |
PVT-S | 25M | 3.9G | 81.7 (+1.9) | In process |
Swin-T | 30M | 4.6G | 81.9 (+0.6) | Tsinghua Cloud / Google Drive |
Swin-S | 51M | 9.0G | 83.5 (+0.5) | Tsinghua Cloud / Google Drive |
Get Started
Please go to the folder ResNet, Swin-Transformer for specific docs.
Contact
If you have any question, please feel free to contact the authors. Xuran Pan: pxr18@mails.tsinghua.edu.cn.
Acknowledgment
Our code is based on SAN, PVT, and Swin Transformer.
Citation
If you find our work is useful in your research, please consider citing:
@misc{pan2021integration,
title={On the Integration of Self-Attention and Convolution},
author={Xuran Pan and Chunjiang Ge and Rui Lu and Shiji Song and Guanfu Chen and Zeyi Huang and Gao Huang},
year={2021},
eprint={2111.14556},
archivePrefix={arXiv},
primaryClass={cs.CV}
}