SemFormer
The official code for SemFormer: Semantic Guided Activation Transformer for Weakly Supervised Semantic Segmentation.
Runtime Environment
- Python 3.6
- PyTorch 1.7.1
- CUDA 11.0
- 2 x NVIDIA A100 GPUs
- more in requirements.txt
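Since the versions above are pinned, it can help to compare them against the local installation before training. The following is a minimal sketch, not part of the repository: the helper names `mismatches` and `installed_versions` are hypothetical, and the check assumes a simple prefix match on version strings is good enough (so `1.7.1+cu110` counts as `1.7.1`).

```python
import sys

# Versions listed in the Runtime Environment section of this README.
EXPECTED = {"python": "3.6", "torch": "1.7.1", "cuda": "11.0"}


def mismatches(expected, actual):
    """Return {name: (expected, found)} for every entry that does not match.

    A found version matches if it equals the expected string or extends it
    with a '.' or '+' suffix (e.g. '1.7.1+cu110' matches '1.7.1').
    """
    bad = {}
    for name, want in expected.items():
        got = actual.get(name)
        ok = got is not None and (
            got == want
            or got.startswith(want + ".")
            or got.startswith(want + "+")
        )
        if not ok:
            bad[name] = (want, got)
    return bad


def installed_versions():
    """Collect the local Python version and, if importable, PyTorch/CUDA versions."""
    actual = {"python": "{}.{}".format(*sys.version_info[:2])}
    try:
        import torch  # only reported if PyTorch is installed

        actual["torch"] = torch.__version__
        actual["cuda"] = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        pass
    return actual


if __name__ == "__main__":
    for name, (want, got) in mismatches(EXPECTED, installed_versions()).items():
        print("{}: expected {}, found {}".format(name, want, got))
```

A mismatch does not necessarily mean the code will fail; it only flags a difference from the environment the authors report.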
Usage
Install python dependencies
python -m pip install -r requirements.txt
Download PASCAL VOC 2012 devkit
Follow instructions in http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#devkit.
Train and evaluate the model.
1. Train SemFormer for generating CAMs
1.1 Train CAAE.
CUDA_VISIBLE_DEVICES=0,1 python train_caae.py --tag CAAE@DeiT-B-Dist
1.2 Train SemFormer.
CUDA_VISIBLE_DEVICES=0,1 python train_semformer.py --tag SemFormer@CAAE@DeiT-B-Dist
Or use the checkpoint we provide in experiments/models/SemFormer@CAAE@DeiT-B-Dist.pth.
2. Run SemFormer inference to generate CAMs
CUDA_VISIBLE_DEVICES=0 python inference_semformer.py --tag SemFormer@CAAE@DeiT-B-Dist --domain train_aug
Evaluate CAMs. [optional]
python evaluate.py --experiment_name SemFormer@CAAE@DeiT-B-Dist@train@scale=0.5,1.0,1.5,2.0 --domain train
3. Apply Random Walk (RW) to refine the generated CAMs
3.1 Make affinity labels to train AffinityNet.
python make_affinity_labels.py --experiment_name SemFormer@CAAE@DeiT-B-Dist@train@scale=0.5,1.0,1.5,2.0 --domain train_aug
3.2 Train AffinityNet using the generated affinity labels.
CUDA_VISIBLE_DEVICES=0,1 python train_affinitynet.py --tag AffinityNet@SemFormer --label_name SemFormer@CAAE@DeiT-B-Dist@train@scale=0.5,1.0,1.5,2.0@aff_fg=0.11_bg=0.15
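The long `--experiment_name` and `--label_name` values in the commands above look like they are composed from the training tag, a split, the inference scales, and the affinity thresholds. The sketch below only reproduces the strings used in this README; the helper names are hypothetical and the naming rule is inferred from the commands, not read from the scripts.

```python
def cam_experiment_name(tag, split="train", scales=(0.5, 1.0, 1.5, 2.0)):
    """Compose a CAM experiment name like the ones used in this README.

    Assumption: the name is '<tag>@<split>@scale=<comma-separated scales>'.
    """
    scale_str = ",".join(str(s) for s in scales)
    return "{}@{}@scale={}".format(tag, split, scale_str)


def affinity_label_name(cam_name, fg=0.11, bg=0.15):
    """Append the affinity-label suffix seen in the --label_name argument.

    Assumption: fg and bg are the thresholds encoded as 'aff_fg=..._bg=...'.
    """
    return "{}@aff_fg={:.2f}_bg={:.2f}".format(cam_name, fg, bg)


if __name__ == "__main__":
    cam_name = cam_experiment_name("SemFormer@CAAE@DeiT-B-Dist")
    print(cam_name)
    print(affinity_label_name(cam_name))
```

Building the names once and reusing them may help avoid typos when copying the commands between steps.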
4. Make pseudo labels.
4.1 Run random walk inference (AffinityNet) to refine the generated CAMs.
CUDA_VISIBLE_DEVICES=0 python inference_rw.py --model_name AffinityNet@SemFormer --cam_dir SemFormer@CAAE@DeiT-B-Dist@train@scale=0.5,1.0,1.5,2.0 --domain train_aug
4.2 Apply CRF to generate pseudo labels.
python make_pseudo_labels.py --experiment_name AffinityNet@SemFormer@train@beta=10@exp_times=8@rw --domain train_aug --crf_iteration 1
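To give an idea of what the random walk in step 4.1 does, here is a small pure-Python sketch of an AffinityNet-style propagation. It is an illustration under assumptions, not the code in `inference_rw.py`: it assumes `beta` is the exponent applied to the pairwise affinities and `exp_times` is the number of times the transition matrix is squared, which is how these names are commonly used in AffinityNet-style pipelines. The function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def random_walk(cams, affinity, beta=10, exp_times=8):
    """Propagate CAM scores over pixels using a pairwise affinity matrix.

    cams:     C x N list of per-class scores for N pixels.
    affinity: N x N list of pairwise affinities in [0, 1].
    Assumed roles: affinities are raised to `beta`, each column is normalised
    to sum to 1, and the resulting matrix is squared `exp_times` times.
    """
    n = len(affinity)
    powered = [[a ** beta for a in row] for row in affinity]
    col_sums = [sum(powered[i][j] for i in range(n)) for j in range(n)]
    trans = [[powered[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    for _ in range(exp_times):
        trans = matmul(trans, trans)
    return matmul(cams, trans)


if __name__ == "__main__":
    # Pixels 0 and 1 are strongly related; pixel 2 is nearly unrelated.
    affinity = [
        [1.0, 0.9, 0.1],
        [0.9, 1.0, 0.1],
        [0.1, 0.1, 1.0],
    ]
    cams = [[1.0, 0.0, 0.0]]  # one class, activated only at pixel 0
    print(random_walk(cams, affinity))
```

In this toy example the activation at pixel 0 spreads to the strongly related pixel 1, while pixel 2 stays close to zero, which is the intended effect of refining CAMs with learned affinities.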
5. Train and Evaluate the segmentation model using the pseudo labels
Please follow the instructions in this repo to train and evaluate the segmentation model.
6. Results
Quantitative segmentation results on PASCAL VOC 2012 (mIoU (%)). Supervision: pixel-level ($\mathcal{F}$), box-level ($\mathcal{B}$), saliency-level ($\mathcal{S}$), and image-level ($\mathcal{I}$).
Method | Publication | Supervision | val | test |
---|---|---|---|---|
DeepLabV1 | ICLR'15 | $\mathcal{F}$ | 68.7 | 71.6 |
DeepLabV2 | TPAMI'18 | $\mathcal{F}$ | 77.7 | 79.7 |
BCM | CVPR'19 | $\mathcal{I} + \mathcal{B}$ | 70.2 | - |
BBAM | CVPR'21 | $\mathcal{I} + \mathcal{B}$ | 73.7 | 73.7 |
ICD | CVPR'20 | $\mathcal{I} + \mathcal{S}$ | 67.8 | 68.0 |
EPS | CVPR'21 | $\mathcal{I} + \mathcal{S}$ | 71.0 | 71.8 |
BES | ECCV'20 | $\mathcal{I}$ | 65.7 | 66.6 |
CONTA | NeurIPS'20 | $\mathcal{I}$ | 66.1 | 66.7 |
AdvCAM | CVPR'21 | $\mathcal{I}$ | 68.1 | 68.0 |
OC-CSE | ICCV'21 | $\mathcal{I}$ | 68.4 | 68.2 |
RIB | NeurIPS'21 | $\mathcal{I}$ | 68.3 | 68.6 |
CLIMS | CVPR'22 | $\mathcal{I}$ | 70.4 | 70.0 |
MCTFormer | CVPR'22 | $\mathcal{I}$ | 71.9 | 71.6 |
SemFormer (ours) | - | $\mathcal{I}$ | 73.7 | 73.2 |
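The val and test columns above report mean Intersection-over-Union (mIoU). As a reminder of how the metric is defined, here is a minimal pure-Python sketch over flat label lists. It is not the repository's `evaluate.py`; the function name is hypothetical, and it assumes the PASCAL VOC convention of 21 classes with label 255 marking ignored pixels.

```python
def mean_iou(preds, gts, num_classes=21, ignore_index=255):
    """Compute mIoU (%) from flat lists of predicted and ground-truth labels.

    Pixels whose ground-truth label equals `ignore_index` are skipped.
    Classes that never occur in either list are left out of the mean.
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, g in zip(preds, gts):
        if g == ignore_index:
            continue
        if p == g:
            inter[g] += 1
            union[g] += 1
        else:
            # A wrong prediction adds to the union of both classes involved.
            union[p] += 1
            union[g] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    if not ious:
        return 0.0
    return 100.0 * sum(ious) / len(ious)


if __name__ == "__main__":
    preds = [0, 0, 1, 1]
    gts = [0, 1, 1, 255]
    print(mean_iou(preds, gts, num_classes=2))
```

For a dataset-level score, the intersections and unions are normally accumulated over all images before dividing, rather than averaging per-image values.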
Acknowledgement
This repo is modified from Puzzle-CAM. Thanks to its authors for their contribution to the community.