MCTformer (CVPR2022)

Multi-class Token Transformer for Weakly Supervised Semantic Segmentation.

[Paper] [Project Page]

<p align="center"> <img src="MCTformer-V1.png" width="720" title="Overview of MCTformer-V1" > </p> <p align = "center"> Fig.1 - Overview of MCTformer </p>

:triangular_flag_on_post: Updates

2023-08-08: MCTformer+ on arXiv

Environment Setup

pip install -r requirements.txt

Data Preparation

<details> <summary> PASCAL VOC 2012 </summary>

Usage

Train MCTformer+

bash run_mct_plus.sh

Step 1: Run the run.sh script to train MCTformer, then visualize and evaluate the generated class-specific localization maps.

bash run.sh
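As a rough illustration of what evaluating class-specific localization maps usually involves, the sketch below turns per-class maps into a seed label map: each map is min-max normalized, a constant background score is prepended, and the per-pixel argmax is taken. The function name `maps_to_seed`, the `(C, H, W)` array layout, and the `bg_threshold` value are assumptions made for illustration; they are not taken from run.sh.

```python
import numpy as np


def maps_to_seed(cams, image_labels, bg_threshold=0.4):
    """Convert class-specific localization maps into a seed label map.

    cams:         (C, H, W) non-negative scores, one map per foreground class.
    image_labels: (C,) multi-hot vector of classes present in the image.
    Returns an (H, W) integer map: 0 = background, c + 1 = foreground class c.
    All names and the default threshold are hypothetical.
    """
    cams = np.asarray(cams, dtype=np.float64)
    present = np.asarray(image_labels, dtype=bool)

    # Zero out maps of classes that the image-level labels rule out.
    cams = cams * present[:, None, None]

    # Min-max normalize each class map to [0, 1].
    cmin = cams.min(axis=(1, 2), keepdims=True)
    cmax = cams.max(axis=(1, 2), keepdims=True)
    cams = (cams - cmin) / (cmax - cmin + 1e-5)

    # Prepend a constant background score and pick the best class per pixel.
    background = np.full((1,) + cams.shape[1:], bg_threshold)
    return np.argmax(np.concatenate([background, cams], axis=0), axis=0)


if __name__ == "__main__":
    cams = np.array([[[0.0, 1.0], [0.2, 0.9]],   # class 0, present
                     [[5.0, 5.0], [5.0, 5.0]]])  # class 1, absent
    print(maps_to_seed(cams, image_labels=[1, 0]))
```

In practice the background threshold is usually swept over a range and the value giving the best seed quality on the training set is kept.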

PASCAL VOC 2012 dataset

| Model | Backbone | Google drive |
| --- | --- | --- |
| MCTformer-V1 | DeiT-small | Weights |
| MCTformer-V2 | DeiT-small | Weights |

Step 2: Run the run_psa.sh script to post-process the seeds (i.e., class-specific localization maps) with PSA and generate pseudo ground-truth segmentation masks. PSA training is initialized with the pre-trained classification weights.

bash run_psa.sh
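PSA refines seeds by propagating scores between pixels with similar learned features. The sketch below shows that idea in a simplified form, assuming a dense affinity over all pixel pairs and a row-normalized transition matrix applied a fixed number of times. The function name `random_walk_refine` and the `beta` and `n_iters` defaults are illustrative assumptions, not the settings used by run_psa.sh, and the real method restricts affinities to a local neighborhood.

```python
import numpy as np


def random_walk_refine(cams, features, beta=8, n_iters=8):
    """Propagate per-class scores with an affinity-based random walk.

    cams:     (C, N) class scores over N flattened pixels.
    features: (N, D) per-pixel feature vectors (hypothetical input).
    Returns refined scores with shape (C, N).
    """
    feats = np.asarray(features, dtype=np.float64)

    # Pairwise L1 distance between pixel features -> affinity in (0, 1].
    dist = np.abs(feats[:, None, :] - feats[None, :, :]).sum(axis=-1)
    affinity = np.exp(-dist) ** beta  # element-wise power sharpens affinities

    # Row-normalize so every row is a probability distribution.
    transition = affinity / affinity.sum(axis=1, keepdims=True)

    scores = np.asarray(cams, dtype=np.float64).T  # (N, C)
    for _ in range(n_iters):
        scores = transition @ scores
    return scores.T


if __name__ == "__main__":
    # Two feature clusters: pixels {0, 1} and pixels {2, 3}.
    features = np.array([[0.0], [0.0], [10.0], [10.0]])
    cams = np.array([[1.0, 0.0, 0.0, 0.0]])
    print(random_walk_refine(cams, features))
```

In the toy input the score spreads from pixel 0 to pixel 1, which has identical features, and barely reaches the other cluster.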

Step 3: Run the run_seg.sh script to train and test the segmentation model. When training on VOC, the model is initialized with the classification weights pre-trained on VOC.

bash run_seg.sh
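Segmentation results on PASCAL VOC are normally reported as mean intersection-over-union (mIoU). The sketch below computes mIoU from a confusion matrix accumulated over all images, skipping pixels marked with the ignore label 255. The function name `mean_iou` and its signature are assumptions for illustration and do not describe the evaluation code called by run_seg.sh; predictions are assumed to lie in `[0, num_classes)`.

```python
import numpy as np


def mean_iou(preds, gts, num_classes, ignore_index=255):
    """Mean IoU over classes that appear in predictions or ground truth.

    preds, gts: iterables of integer label maps with matching shapes.
    """
    confusion = np.zeros((num_classes, num_classes), dtype=np.int64)
    for pred, gt in zip(preds, gts):
        pred = np.asarray(pred).ravel().astype(np.int64)
        gt = np.asarray(gt).ravel().astype(np.int64)
        valid = gt != ignore_index  # drop ignored pixels
        # Encode each (gt, pred) pair as one index and count occurrences.
        index = gt[valid] * num_classes + pred[valid]
        counts = np.bincount(index, minlength=num_classes ** 2)
        confusion += counts.reshape(num_classes, num_classes)

    true_pos = np.diag(confusion)
    # Union = predicted pixels + ground-truth pixels - intersection.
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - true_pos
    present = union > 0
    return float((true_pos[present] / union[present]).mean())


if __name__ == "__main__":
    gt = np.array([[0, 0], [1, 255]])
    pred = np.array([[0, 1], [1, 1]])
    print(mean_iou([pred], [gt], num_classes=2))  # 0.5
```

For PASCAL VOC 2012 the call would use `num_classes=21` (20 object classes plus background).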

MS COCO 2014 dataset

Run run_coco.sh to train MCTformer and generate class-specific localization maps. The class label numpy file can be downloaded here. The trained MCTformer-V2 model is here.

bash run_coco.sh

Contact

If you have any questions, you can either open an issue or contact me by email at lian.xu@uwa.edu.au.

Citation

Please consider citing our paper if the code is helpful in your research and development.

@inproceedings{xu2022multi,
  title={Multi-class Token Transformer for Weakly Supervised Semantic Segmentation},
  author={Xu, Lian and Ouyang, Wanli and Bennamoun, Mohammed and Boussaid, Farid and Xu, Dan},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4310--4319},
  year={2022}
}