TCFormer (CVPR'2022 Oral, TPAMI'2024)
[CVPR'2022 paper] [TPAMI'2024 paper]
Introduction
Official code repository for the papers:
Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer
[Wang Zeng, Sheng Jin, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang, and Xiaogang Wang]
and
TCFormer: Visual Recognition via Token Clustering Transformer
[Wang Zeng, Sheng Jin, Lumin Xu, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, and Xiaogang Wang]
TODO
- Release whole-body pose estimation training/testing code.
- Release whole-body pose estimation model zoo.
- TCFormer-large on COCO-WholeBody dataset.
- FLOPs calculation function.
- Integrate TCFormer into MMPose.
Model Zoo
You can find the pretrained checkpoints here.
Image Classification
For classification configs and weights, see >>>here<<<.
- TCFormer on ImageNet-1K
Method | Size | Acc@1 | #Params (M) | Config | Checkpoint | Log |
---|---|---|---|---|---|---|
TCFormer-light | 224 | 79.4 | 14.2 | config | 57M [Google] | [Google] |
TCFormer | 224 | 82.3 | 25.6 | config | 103M [Google] | [Google] |
TCFormer-large | 224 | 83.6 | 62.8 | config | 103M [Google] | [Google] |
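The parameter counts in the table are given in millions. As a rough way to check such a figure for any model, the sketch below sums the element counts of all parameters. It is not taken from this repository: `count_params_millions` and `FakeParam` are hypothetical names, and whether the table counts all parameters or only trainable ones is an assumption left open here.

```python
from typing import Iterable


def count_params_millions(params: Iterable, trainable_only: bool = False) -> float:
    """Return the total number of elements across `params`, in millions.

    `params` is any iterable of objects exposing `numel()` and `requires_grad`,
    for example the result of `model.parameters()` on a PyTorch `nn.Module`.
    """
    total = 0
    for p in params:
        if trainable_only and not p.requires_grad:
            continue  # skip frozen parameters when only trainable ones are wanted
        total += p.numel()
    return total / 1e6


class FakeParam:
    """Hypothetical stand-in for a tensor so the sketch runs without PyTorch."""

    def __init__(self, n: int, requires_grad: bool = True) -> None:
        self._n = n
        self.requires_grad = requires_grad

    def numel(self) -> int:
        return self._n


if __name__ == "__main__":
    demo = [FakeParam(1_000_000), FakeParam(500_000, requires_grad=False)]
    print(f"{count_params_millions(demo):.1f}M total")
    print(f"{count_params_millions(demo, trainable_only=True):.1f}M trainable")
```

With PyTorch installed, the same helper can be called as `count_params_millions(model.parameters())` on an instantiated model.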
WholeBody Estimation
For whole-body estimation configs and weights, see >>>here<<<.
- Results on COCO-WholeBody v1.0 val, using a person detector with 56.4 human AP on COCO val2017
Arch | Input Size | Body AP | Body AR | Foot AP | Foot AR | Face AP | Face AR | Hand AP | Hand AR | Whole AP | Whole AR | ckpt | log |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
TCFormer | 256x192 | 0.697 | 0.774 | 0.705 | 0.821 | 0.656 | 0.753 | 0.539 | 0.652 | 0.576 | 0.681 | ckpt | log |
TCFormer-large | 384x288 | 0.718 | 0.794 | 0.744 | 0.850 | 0.790 | 0.856 | 0.614 | 0.715 | 0.642 | 0.733 | ckpt | log |
Citation
If you find this project useful in your research, please cite:
@inproceedings{zeng2022not,
title={Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer},
author={Zeng, Wang and Jin, Sheng and Liu, Wentao and Qian, Chen and Luo, Ping and Ouyang, Wanli and Wang, Xiaogang},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={11101--11111},
year={2022}
}
@article{zeng2024tcformer,
title={TCFormer: Visual Recognition via Token Clustering Transformer},
author={Zeng, Wang and Jin, Sheng and Xu, Lumin and Liu, Wentao and Qian, Chen and Ouyang, Wanli and Luo, Ping and Wang, Xiaogang},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2024},
publisher={IEEE}
}
Acknowledgement
Thanks to:
License
This project is released under the Apache 2.0 license.