# iMAS

"Instance-specific and Model-adaptive Supervision for Semi-supervised Semantic Segmentation" (CVPR 2023)
## Introduction
- Recently, semi-supervised semantic segmentation has achieved promising performance with only a small fraction of labeled data. However, most existing studies treat all unlabeled data equally and barely consider the differences in training difficulty among unlabeled instances. Differentiating among unlabeled instances enables instance-specific supervision that adapts dynamically to the model's evolution. In this paper, we emphasize the crucial role of instance differences and propose instance-specific and model-adaptive supervision for semi-supervised semantic segmentation, named iMAS.
- Relying on the model's performance, iMAS employs a class-weighted symmetric intersection-over-union to quantitatively evaluate the hardness of each unlabeled instance and supervises the training on unlabeled data in a model-adaptive manner (a minimal sketch of this hardness evaluation follows this overview). Specifically, iMAS learns from unlabeled instances progressively by weighing their corresponding consistency losses based on the evaluated hardness. Besides, iMAS dynamically adjusts the augmentation of each instance, so that the distortion degree of augmented instances is adapted to the model's generalization capability across the training course.
- Without integrating additional losses or training procedures, iMAS obtains remarkable performance gains over current state-of-the-art approaches on segmentation benchmarks under different semi-supervised partition protocols.
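To make the hardness evaluation concrete, below is a minimal, hedged sketch of a class-weighted symmetric IoU between teacher and student predictions. The function name, the class-weighting scheme, and the mapping from agreement to hardness are illustrative assumptions, not the official implementation.

```python
# Illustrative sketch (not the official iMAS code): per-instance hardness as
# 1 minus a class-weighted IoU between teacher and student label maps.
import torch

def symmetric_iou_hardness(p_t: torch.Tensor,
                           p_s: torch.Tensor,
                           num_classes: int,
                           class_weights: torch.Tensor = None) -> torch.Tensor:
    """p_t, p_s: (H, W) hard label maps predicted by teacher and student
    on the same weakly augmented unlabeled image."""
    if class_weights is None:
        class_weights = torch.ones(num_classes)       # assumed uniform weights
    ious, weights = [], []
    for c in range(num_classes):
        t_mask, s_mask = (p_t == c), (p_s == c)
        union = (t_mask | s_mask).sum()
        if union == 0:                                # class absent in both maps
            continue
        inter = (t_mask & s_mask).sum()
        ious.append(inter.float() / union.float())
        weights.append(class_weights[c])
    if not ious:                                      # degenerate case
        return torch.tensor(1.0)
    agreement = ((torch.stack(weights) * torch.stack(ious)).sum()
                 / torch.stack(weights).sum())
    return 1.0 - agreement  # low teacher-student agreement -> high hardness
```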
## Diagram
In a teacher-student framework, labeled data $(x,y)$ is used to train the student model, parameterized by $\theta_s$, by minimizing the supervised loss $\mathcal{L}_x$. Unlabeled data $u$, weakly augmented by $\mathcal{A}_w(\cdot)$, is first fed into both the student and teacher models to obtain predictions $p^s$ and $p^t$, respectively. Then we perform quantitative hardness evaluation on each unlabeled instance by strategy $\phi(p^t, p^s)$. Such hardness information is subsequently utilized: 1) to apply adaptive augmentation, denoted by $\mathcal{A}_s(\cdot)$, to the unlabeled data and obtain the student model's prediction $\hat{p}$; 2) to weigh the unsupervised loss $\mathcal{L}_u$ in an instance-specific manner. The teacher model's weights, $\theta_t$, are updated as the exponential moving average (EMA) of $\theta_s$ across the training course.
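As a complement to the diagram, here is a hedged sketch of one training step under these definitions. The helpers `weak_aug` and `strong_aug` are hypothetical placeholders, the consistency-loss weight $1-h$ and the EMA decay value are assumptions for illustration, and `symmetric_iou_hardness` refers to the sketch above; this is not the official iMAS code.

```python
# Illustrative single training step (assumptions flagged inline).
import torch
import torch.nn.functional as F

def weak_aug(u):                      # placeholder: e.g., flips/random crops
    return u

def strong_aug(u, intensity):         # placeholder: photometric-only distortion
    return u                          # whose strength scales with `intensity`

def train_step(student, teacher, x, y, u, optimizer,
               num_classes=21, ema_decay=0.999):      # decay is an assumption
    loss_x = F.cross_entropy(student(x), y)           # supervised loss L_x

    # Teacher/student predictions on the weakly augmented unlabeled batch.
    u_w = weak_aug(u)
    with torch.no_grad():
        p_t = teacher(u_w).argmax(dim=1)              # pseudo-labels
        p_s = student(u_w).argmax(dim=1)

    # Per-instance hardness phi(p_t, p_s), from the sketch above.
    h = torch.stack([symmetric_iou_hardness(p_t[i], p_s[i], num_classes)
                     for i in range(u.size(0))])

    # Adaptive augmentation: distortion scaled by model confidence (1 - h).
    # Photometric-only distortion keeps p_t spatially aligned with u_s.
    u_s = strong_aug(u_w, intensity=1.0 - h)

    # Instance-weighted consistency loss L_u (weight choice is illustrative).
    pix_loss = F.cross_entropy(student(u_s), p_t, reduction="none")  # (B,H,W)
    loss_u = ((1.0 - h) * pix_loss.mean(dim=(1, 2))).mean()

    optimizer.zero_grad()
    (loss_x + loss_u).backward()
    optimizer.step()

    # Teacher follows the student: theta_t <- a*theta_t + (1-a)*theta_s.
    with torch.no_grad():
        for t_p, s_p in zip(teacher.parameters(), student.parameters()):
            t_p.mul_(ema_decay).add_(s_p, alpha=1.0 - ema_decay)
    return (loss_x + loss_u).item()
```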
## Performance
Comparison with SOTA methods on the PASCAL VOC 2012 val set under different partition protocols (mIoU, %):
ResNet-50 | 1/16 | 1/8 | 1/4 | ResNet-101 | 1/16 | 1/8 | 1/4 |
---|---|---|---|---|---|---|---|
SupOnly | 63.8 | 69.0 | 72.5 | SupOnly | 67.4 | 72.1 | 74.7 |
CPS | 72.0 | 73.7 | 74.9 | CPS | 74.5 | 76.4 | 77.7 |
ST++ | 72.6 | 74.4 | 75.4 | ST++ | 74.5 | 76.3 | 76.6 |
U<sup>2</sup>PL(os=8) | 72.0 | 75.2 | 76.2 | U<sup>2</sup>PL(os=8) | 74.4 | 77.6 | 78.7 |
iMAS (os=8) | 75.9 | 76.7 | 77.1 | iMAS (os=8) | 77.2 | 78.4 | 79.3 |
Comparison with SOTA methods on the Cityscapes val set under different partition protocols (mIoU, %), using ResNet-50 as the encoder:
R50 | 1/16 | 1/8 | 1/4 | 1/2 |
---|---|---|---|---|
SupOnly | 64.0 | 69.2 | 73.0 | 76.4 |
CPS | 74.4 | 76.6 | 77.8 | 78.8 |
CPS (by U<sup>2</sup>PL) | 69.8 | 74.3 | 74.6 | 76.8 |
ST++ | - | 72.7 | 73.8 | - |
PS-MT | - | 75.8 | 76.9 | 77.6 |
U<sup>2</sup>PL(os=8) | 69.0 | 73.0 | 76.3 | 78.6 |
iMAS (os=8) | 75.2 | 78.0 | 78.2 | 80.2 |
All training logs of iMAS and our reproduced SupOnly baselines are included under the `training-imas-logs` directory.
## Running iMAS
### Prepare datasets
Please download the PASCAL VOC 2012 and Cityscapes datasets, and set the paths to them properly in the configuration files.
- Pascal: JPEGImages | SegmentationClass
- Cityscapes: leftImg8bit | gtFine
- splits: included.
Here is the layout we adopt:
```
├── ./data
│   ├── splits
│   │   ├── cityscapes
│   │   └── pascal
│   ├── VOC2012
│   │   ├── JPEGImages
│   │   ├── SegmentationClass
│   │   └── SegmentationClassAug
│   └── cityscapes
│       ├── gtFine
│       └── leftImg8bit
```
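Before training, a quick sanity check of the layout above can save a failed run. The snippet below simply mirrors the tree; the paths are illustrative assumptions and should be adjusted to match your configuration files.

```python
# Sanity-check the dataset layout sketched above (paths are illustrative;
# adjust them to match your configuration files).
from pathlib import Path

EXPECTED_DIRS = [
    "./data/splits/pascal",
    "./data/splits/cityscapes",
    "./data/VOC2012/JPEGImages",
    "./data/VOC2012/SegmentationClass",
    "./data/VOC2012/SegmentationClassAug",
    "./data/cityscapes/leftImg8bit",
    "./data/cityscapes/gtFine",
]

for d in EXPECTED_DIRS:
    status = "ok" if Path(d).is_dir() else "MISSING"
    print(f"{d}: {status}")
```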
### Prepare pre-trained encoder
Please download the pretrained models, and set the paths to them properly in the `config_xxx.yaml` file.
Here is the layout we adopt:
```
├── ./pretrained
│   ├── resnet50.pth
│   └── resnet101.pth
```
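To confirm a downloaded checkpoint is usable, the snippet below tries loading it into a torchvision backbone. The key layout of these `.pth` files is an assumption; the repo's model code may wrap or rename keys.

```python
# Hedged check that a downloaded backbone checkpoint loads (the checkpoint's
# key layout is an assumption; the repo may rename or wrap keys).
import torch
import torchvision

state = torch.load("./pretrained/resnet50.pth", map_location="cpu")
model = torchvision.models.resnet50()
missing, unexpected = model.load_state_dict(state, strict=False)
print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")
```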
### Prepare the running environment
Nothing special:
- python: 3.7.13
- pytorch: 1.7.1
- cuda: 11.0.221 (cudnn 8.0.5)
- torchvision: 0.8.2
### Ready to Run
Basically, you are recommended to configure the experimental settings in a `.yaml` file first. We include various configuration files under the `exps` directory.
```bash
# 1) configure your yaml file in a running script
vim ./single_run.sh

# 2) run directly
sh ./single_run.sh
```
## Citation
If you find this project useful, please consider citing:
```bibtex
@inproceedings{zhao2023instance,
  title={Instance-specific and Model-adaptive Supervision for Semi-supervised Semantic Segmentation},
  author={Zhao, Zhen and Long, Sifan and Pi, Jimin and Wang, Jingdong and Zhou, Luping},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={23705--23714},
  year={2023}
}
```
## Acknowledgement
We thank ST++, CPS, and U<sup>2</sup>PL for parts of their code, processed datasets, data partitions, and pretrained models.