Awesome
PCSS-WSSS
Official repository for ECCV 2024 paper: Phase Concentration and Shortcut Suppression for Weakly Supervised Semantic Segmentation by Hoyong Kwon, Jaeseok Jeong, Sung-Hoon Yoon, and Kuk-Jin Yoon.
1.Prerequisite
1.1 Environment
-
Our experiments are worked on Python 3.9, PyTorch 2.0.1, CUDA 11.7, and TITAN RTX.
-
You can create conda environment with the provided yaml file.
conda env create -f environment.yaml
- If you got "TypeError: init() got an unexpected keyword argument 'pretrained_cfg'" error, remove the 'pretrained_cfg' parameter where the error occurs. It will handle the issue.
1.2 Dataset Preparation
- The PASCAL VOC 2012 development kit: You need to specify place VOC2012 under ./data folder.
-
Download MS COCO images from the official COCO website here.
-
Download semantic segmentation annotations for the MS COCO dataset here. (Refer RIB)
-
Directory hierarchy
./data
├── VOC2012
└── COCO2014
├── SegmentationClass # GT dir
├── train2014 # train images downloaded from the official COCO website
└── val2014 # val images downloaded from the official COCO website
- Download the ImageNet-pretrained DeiT-S model from here. You need to place the weights as "./pretrained/deit_small_patch16_224-cd65a155.pth. "
2. Usage
With the following code, you can generate CAMs (seeds) to train the segmentation network. For the further refinement, refer PSA.
2.1 Training
- Please specify the name of your experiment.
- Training results are saved at ./experiment/[exp_name]
For PASCAL:
python train_PCSS.py --name [exp_name]
For COCO:
python train_PCSS_coco.py --name [exp_name]
Note that the mIoU in COCO training set is evaluated on the subset (5.2k images, not the full set of 80k images) for fast evaluation
2.2 Inference (CAM)
- Pretrained weight (PASCAL, seed: 69.5% mIoU) can be downloaded here (eccv24_pcss_wsss_69.5_pascal.pth).
For pretrained model (69.5%):
python infer_trm.py --name [exp_name] --load_pretrained [path_to_ckpt] --dict
For model you trained:
python infer_trm.py --name [exp_name] --load_epo [EPOCH] --dict
2.3 Evaluation (CAM)
python evaluation.py --name [exp_name] --task cam --dict_dir dict
3. Additional Information
3.1 Paper citation
If our work is helpful for your research, please consider citing our ECCV 2024 paper using the following BibTeX entry.
@inproceedings{kwon2024phase,
title={Phase Concentration and Shortcut Suppression for Weakly Supervised Semantic Segmentation},
author={Kwon, Hoyong and Jeong, Jaeseok and Yoon, Sung-Hoon and Yoon, Kuk-Jin},
booktitle={European Conference on Computer Vision},
pages={293--312},
year={2024},
organization={Springer}
}
You can also check our earlier works published on ICCV 2021 (OC-CSE) , ECCV 2022 (AEFT), CVPR 2023 (ACR), CVPR 2024 (CTI)
Beside, in ECCV 24, "Diffusion-Guided Weakly Supervised Semantic Segmentation" (DiG) is also accepted. Feel free to check our work!
3.2 Acknowledgement
We heavily borrow the work from MCTformer, PSA repository. Thanks for the excellent codes!
Also, we are greatly inspired by What do neural networks learn in image classification? A frequency shortcut perspective. Thanks for the excellent work!
[1] Xu, et al. "Multi-class token transformer for weakly supervised semantic segmentation." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2022.
[2] Ahn, et al. "Learning pixel-level semantic affinity with image-level supervision for weakly supervised semantic segmentation." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2018.
[3] Wang, et al. "What do neural networks learn in image classification? A frequency shortcut perspective." Proceedings of the IEEE/CVF international conference on computer vision. 2023.