Railroad is not a Train: Saliency as Pseudo-pixel Supervision for Weakly Supervised Semantic Segmentation (CVPR 2021)

CVPR 2021 paper

Seungho Lee1,* , Minhyun Lee1,*, Jongwuk Lee2, Hyunjung Shim1

* indicates an equal contribution

1 School of Integrated Technology, Yonsei University
2 Department of Computer Science and Engineering, Sungkyunkwan University

Introduction

Existing studies in weakly-supervised semantic segmentation (WSSS) using image-level weak supervision have several limitations: sparse object coverage, inaccurate object boundaries, and co-occurring pixels from non-target objects. To overcome these challenges, we propose a novel framework, namely Explicit Pseudo-pixel Supervision (EPS), which learns from pixel-level feedback by combining two weak supervisions: the image-level label provides the object identity via the localization map, and the saliency map from an off-the-shelf saliency detection model offers rich boundaries. We devise a joint training strategy to fully utilize the complementary relationship between the two sources of information. Our method can obtain accurate object boundaries and discard co-occurring pixels, thereby significantly improving the quality of pseudo-masks.
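To make the idea of combining the two supervisions concrete, here is a minimal sketch in plain Python. It is not the code in this repository: the function names, the mixing weight `lam`, and the way the class maps are merged are assumptions made for illustration. It merges the localization maps of the classes named by the image-level label into a single saliency estimate and scores that estimate against the off-the-shelf saliency map with a mean squared error.

```python
def estimated_saliency(class_maps, background_map, image_labels, lam=0.5):
    """Merge per-class localization maps into one saliency estimate.

    class_maps: dict mapping class name -> 2D list of scores in [0, 1]
    background_map: 2D list of background scores in [0, 1]
    image_labels: class names present in the image (the image-level label)
    lam: hypothetical mixing weight between foreground and background cues
    """
    height, width = len(background_map), len(background_map[0])
    estimate = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Only classes named by the image-level label count as foreground.
            fg = min(1.0, sum(class_maps[c][y][x] for c in image_labels))
            bg = background_map[y][x]
            estimate[y][x] = lam * fg + (1.0 - lam) * (1.0 - bg)
    return estimate


def saliency_loss(estimate, saliency):
    """Mean squared error between the estimate and the given saliency map."""
    height, width = len(saliency), len(saliency[0])
    total = sum(
        (estimate[y][x] - saliency[y][x]) ** 2
        for y in range(height)
        for x in range(width)
    )
    return total / (height * width)


# Toy 1x2 image: the left pixel is the labeled object, the right pixel is not.
maps = {"train": [[0.9, 0.1]], "boat": [[0.0, 0.8]]}
estimate = estimated_saliency(maps, [[0.1, 0.9]], ["train"])
loss = saliency_loss(estimate, [[1.0, 0.0]])
```

In this toy case the `boat` map is ignored because `boat` is not in the image-level label, so its strong response on the right pixel does not leak into the estimate.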

Updates

12 Jul, 2021: Initial upload

19 Aug, 2021: Minor update on information about dCRF and the pre-trained model of the segmentation networks

Please see these issues: dCRF and pre-trained model

28 Aug, 2021: Major updates about MS-COCO 2014 dataset and minor updates (cleanup)

Installation

Python 3.6
PyTorch >= 1.0.0
Torchvision >= 0.2.2
MXNet
Pillow
opencv-python (opencv for Python)
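The requirements above can be checked before training with a short script. This is a hypothetical helper, not part of the repository; it assumes the usual import names (`torch`, `torchvision`, `mxnet`, `PIL` for Pillow, `cv2` for opencv-python) and that `torch` and `torchvision` expose `__version__`.

```python
import sys
from importlib import import_module
from itertools import takewhile

# Minimum versions taken from the list above.
MINIMUMS = {"torch": "1.0.0", "torchvision": "0.2.2"}
# Packages listed without a version constraint, by import name.
UNVERSIONED = ("mxnet", "PIL", "cv2")


def version_tuple(version):
    """Turn a string such as '1.7.1+cu110' into (1, 7, 1)."""
    parts = []
    for piece in version.split("+")[0].split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:  # pad so that '1' compares equal to '1.0.0'
        parts.append(0)
    return tuple(parts)


def check_environment():
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("Python 3.6 or newer is required")
    for name, minimum in MINIMUMS.items():
        try:
            module = import_module(name)
        except ImportError:
            problems.append(f"{name} is not installed")
            continue
        found = getattr(module, "__version__", "0")
        if version_tuple(found) < version_tuple(minimum):
            problems.append(f"{name} {found} is older than {minimum}")
    for name in UNVERSIONED:
        try:
            import_module(name)
        except ImportError:
            problems.append(f"{name} is not installed")
    return problems
```

Calling `check_environment()` and printing the returned list gives a quick summary of anything missing.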

Execution

Dataset & pretrained model

PASCAL VOC 2012
- Images
- Saliency maps using PFAN
MS-COCO 2014
- Images
- Saliency maps using PFAN
- Segmentation masks
Pretrained models
- ImageNet-pretrained Model for ResNet38

Classification network

Run the corresponding bash script for training, inference, and evaluation.

# See these script files for execution details.

# PASCAL VOC 2012 
# Baseline
bash script/voc12_cls.sh
# EPS
bash script/voc12_eps.sh

# MS-COCO 2014
# Baseline
bash script/coco_cls.sh
# EPS
bash script/coco_eps.sh

We provide checkpoints, training logs, and performances for each method and each dataset.

See the script files for details.

Dataset	Method	Train mIoU (%)	Checkpoint	Training log
PASCAL VOC 2012	Base	47.05	Download	voc12_cls.log
PASCAL VOC 2012	EPS	69.22	Download	voc12_eps.log
MS-COCO 2014	Base	31.23	Download	coco_cls.log
MS-COCO 2014	EPS	37.15	Download	coco_eps.log
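For readers unfamiliar with the metric in the table, the following sketch shows the standard definition of mean intersection-over-union (mIoU) computed from a confusion matrix. It is illustrative only; the evaluation code invoked by the scripts may differ in details such as how ignored pixels or absent classes are handled.

```python
def mean_iou(confusion):
    """Mean IoU in percent.

    confusion[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # missed pixels of class c
        fp = sum(confusion[r][c] for r in range(n)) - tp   # pixels wrongly given class c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both ground truth and prediction
            ious.append(tp / denom)
    return 100.0 * sum(ious) / len(ious)


# Two-class toy example: IoU is 50/65 for class 0 and 35/50 for class 1.
score = mean_iou([[50, 10], [5, 35]])
```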

dCRF hyper-parameters
- We did not use dCRF to generate our pseudo-masks; we used it only for the comparison in the paper.
- Among the candidate settings (including those of OAA and PSA), we chose the dCRF hyper-parameters used in the ResNet101-based DeepLabV2.
- Please see the official DeepLab website for more information.
```
CRF parameters: bi_w = 4, bi_xy_std = 67, bi_rgb_std = 3, pos_w = 3, pos_xy_std = 1.
```
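As a rough guide, the parameters above could be mapped onto the `pydensecrf` package as sketched below. This is an assumption-laden illustration, not the code used for the paper: the choice of `pydensecrf`, the mapping of `pos_*` to the Gaussian kernel and `bi_*` to the bilateral kernel, the helper names, and the iteration count `n_iters=10` are all assumptions.

```python
# Values copied from the "CRF parameters" line above.
CRF_PARAMS = {"bi_w": 4, "bi_xy_std": 67, "bi_rgb_std": 3, "pos_w": 3, "pos_xy_std": 1}


def pairwise_kwargs(params=CRF_PARAMS):
    """Translate the parameter names into pydensecrf keyword arguments."""
    gaussian = {"sxy": params["pos_xy_std"], "compat": params["pos_w"]}
    bilateral = {
        "sxy": params["bi_xy_std"],
        "srgb": params["bi_rgb_std"],
        "compat": params["bi_w"],
    }
    return gaussian, bilateral


def refine(image, probs, n_iters=10):
    """Refine softmax scores with a dense CRF (needs numpy and pydensecrf).

    image: HxWx3 uint8 array; probs: CxHxW float array of class probabilities.
    n_iters is an assumed iteration count, not a value given in this README.
    """
    import numpy as np
    import pydensecrf.densecrf as dcrf
    from pydensecrf.utils import unary_from_softmax

    n_labels, height, width = probs.shape
    crf = dcrf.DenseCRF2D(width, height, n_labels)
    crf.setUnaryEnergy(unary_from_softmax(probs))
    gaussian, bilateral = pairwise_kwargs()
    crf.addPairwiseGaussian(**gaussian)
    crf.addPairwiseBilateral(rgbim=np.ascontiguousarray(image), **bilateral)
    q = np.array(crf.inference(n_iters)).reshape(n_labels, height, width)
    return q.argmax(axis=0)  # HxW map of refined class indices
```

The heavy imports live inside `refine` so that the parameter mapping can be inspected without installing `pydensecrf`.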

Segmentation network

We utilize DeepLab-V2 for the segmentation network.
Please see deeplab-pytorch for the implementation in PyTorch.
For the VGG16-based network we used the pretrained model from the official DeepLab release, and for the ResNet101-based network we used the one from the official OAA release.

Results

results

Acknowledgement

This code borrows heavily from PSA. Thanks to Jiwoon Ahn.