CPSL: Class-Balanced Pixel-Level Self-Labeling for Domain Adaptive Semantic Segmentation (CVPR 2022, official PyTorch implementation)

Paper

(Figure: overview of the CPSL framework)

Abstract

Domain adaptive semantic segmentation aims to learn a model with the supervision of source domain data, and to produce satisfactory dense predictions on the unlabeled target domain. One popular solution to this challenging task is self-training, which selects high-scoring predictions on target samples as pseudo labels for training. However, the produced pseudo labels often contain much noise because the model is biased to the source domain as well as to the majority categories. To address these issues, we propose to directly explore the intrinsic pixel distributions of target domain data, instead of relying heavily on the source domain. Specifically, we simultaneously cluster pixels and rectify pseudo labels with the obtained cluster assignments. This process is done in an online fashion so that the pseudo labels can co-evolve with the segmentation model without extra training rounds. To overcome the class imbalance problem on long-tailed categories, we employ a distribution alignment technique that enforces the marginal class distribution of cluster assignments to be close to that of the pseudo labels. The proposed method, namely Class-balanced Pixel-level Self-Labeling (CPSL), improves the segmentation performance on the target domain over state-of-the-art methods by a large margin, especially on long-tailed categories.
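
As a rough illustration of the distribution-alignment idea described above (this is not the repository's actual implementation; the function name, tensor shapes, temperature, and the Sinkhorn-style normalization are our own illustrative assumptions), pixel-to-prototype scores can be converted into soft cluster assignments whose class marginals are pushed toward the pseudo-label distribution:

```python
# Minimal sketch of class-balanced cluster assignment via Sinkhorn-Knopp
# normalization (illustrative only; not the code in this repository).
import torch

def balanced_assignments(scores: torch.Tensor,
                         target_dist: torch.Tensor,
                         n_iters: int = 3,
                         temperature: float = 0.05) -> torch.Tensor:
    """Convert pixel-to-prototype scores (N, K) into soft assignments (N, K)
    whose class marginals approach target_dist (K,), e.g. the marginal class
    distribution of the pseudo labels."""
    q = torch.exp((scores - scores.max()) / temperature).t()  # (K, N)
    q = q / q.sum()                        # normalize total mass to 1
    n = q.shape[1]
    for _ in range(n_iters):
        # row step: class marginals are pushed toward target_dist
        q = q * (target_dist / q.sum(dim=1)).unsqueeze(1)
        # column step: each pixel carries equal mass 1/N
        q = q / (q.sum(dim=0, keepdim=True) * n)
    return (q * n).t()                     # rows (pixels) now sum to ~1

# Toy usage: rectify pseudo labels with the balanced cluster assignments.
scores = torch.randn(1024, 19)             # 1024 pixels, 19 classes
pl_dist = torch.full((19,), 1.0 / 19)      # assumed pseudo-label marginal
soft = balanced_assignments(scores, pl_dist)
rectified = soft.argmax(dim=1)             # cluster-based label per pixel
```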

Installation

Install dependencies:

pip install -r requirements.txt

Data Preparation

Download Cityscapes, GTA5 and SYNTHIA-RAND-CITYSCAPES.

We expect the dataset folder to be organized as follows:

└── dataset
    ├── cityscapes
    │   ├── annotations
    │   ├── gtFine
    │   └── leftImg8bit
    ├── GTA5
    │   ├── images
    │   ├── labels
    │   └── split.mat
    └── SYNTHIA
        ├── GT
        ├── RGB
        └── meta.json
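
A quick, hypothetical sanity check for the layout above (standard-library Python only; the root path "dataset" is an assumption and may need adjusting):

```python
# Verify that the expected dataset subfolders from the tree above exist.
import os

root = "dataset"
expected = [
    "cityscapes/annotations", "cityscapes/gtFine", "cityscapes/leftImg8bit",
    "GTA5/images", "GTA5/labels",
    "SYNTHIA/GT", "SYNTHIA/RGB",
]
for rel in expected:
    path = os.path.join(root, rel)
    print(("ok      " if os.path.isdir(path) else "MISSING ") + path)
```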

Models

| backbone  | warmed-up model (41.4 mIoU) | Stage 1: self-labeling (55.7 mIoU) | Stage 2: KD-1 (59.4 mIoU) | Stage 3: KD-2 (60.8 mIoU) |
| --------- | --------------------------- | ---------------------------------- | ------------------------- | ------------------------- |
| ResNet101 | model                       | model, log1                        | model, log2               | model, log3               |

We expect the models folder to be organized as follows:

├── pretrained_models
│   └── from_gta5_to_cityscapes_on_deeplabv2_best_model.pkl
└── logs
    ├── gta2citylabv2_stage1Denoise
    │   └── from_gta5_to_cityscapes_on_deeplabv2_best_model.pkl
    ├── gta2citylabv2_stage2
    │   └── from_gta5_to_cityscapes_on_deeplabv2_best_model.pkl
    └── gta2citylabv2_stage3
        └── from_gta5_to_cityscapes_on_deeplabv2_best_model.pkl
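
Since this codebase follows ProDA, the .pkl checkpoints are presumably saved with torch.save, so torch.load should open them; this hedged sketch only inspects a downloaded checkpoint (the key names inside are unknown here, so we just print them):

```python
# Peek inside a downloaded checkpoint without assuming its structure.
import torch

ckpt_path = "pretrained_models/from_gta5_to_cityscapes_on_deeplabv2_best_model.pkl"
ckpt = torch.load(ckpt_path, map_location="cpu")
print(type(ckpt))
if isinstance(ckpt, dict):
    print(sorted(ckpt.keys()))  # e.g. model weights, optimizer state, etc.
```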

Training

To reproduce the reported performance, you need 4 GPUs, each with at least 16 GB of memory.

<details> <summary> Stage 1. </summary> </details>
<details> <summary> Stage 2. </summary> </details>
<details> <summary> Stage 3. </summary> </details>


Inference

python test.py --bn_clr --student_init simclr --resume ./logs/gta2citylabv2_stage3/from_gta5_to_cityscapes_on_deeplabv2_best_model.pkl

Citation

If you find our work useful and use the code or models in your research, please cite it as follows.

@inproceedings{li2022class,
    title={Class-Balanced Pixel-Level Self-Labeling for Domain Adaptive Semantic Segmentation},
    author={Li, Ruihuang and Li, Shuai and He, Chenhang and Zhang, Yabin and Jia, Xu and Zhang, Lei},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2022}
}

Acknowledgments

This codebase borrows heavily from ProDA.