Awesome

Domain Adaptation for Semantic Segmentation with Maximum Squares Loss

By Minghao Chen, Hongyang Xue, Deng Cai.

Introduction

A PyTorch implementation for our ICCV 2019 paper "Domain Adaptation for Semantic Segmentation with Maximum Squares Loss". The segmentation model is based on Deeplabv2 with ResNet-101 backbone. "MaxSquare+IW+Multi" introduced in the paper achieves competitive result on three UDA datasets: GTA5, SYNTHIA, CrossCity dataset. Moreover, our method achieves the state-of-the-art results in GTA5-to-Cityscapes and Cityscapes-to-CrossCity adaptation.

Citation

If you use this code in your research, please cite:

@InProceedings{Chen_2019_ICCV,
author = {Chen, Minghao and Xue, Hongyang and Cai, Deng},
title = {Domain Adaptation for Semantic Segmentation With Maximum Squares Loss},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}
}

Requirements

The code is implemented with Python(3.6) and Pytorch(1.0.0).

Install the newest Pytorch from https://pytorch.org/.

To install the required python packages, run

pip install -r requirements.txt

Setup

GTA5-to-Cityscapes:

Download GTA5 datasets, which contains 24,966 annotated images with 1914×1052 resolution taken from the GTA5 game. We use the sample code for reading the label maps and a split into training/validation/test set from here. In the experiments, we resize GTA5 images to 1280x720.
Download Cityscapes, which contains 5,000 annotated images with 2048 × 1024 resolution taken from real urban street scenes. We resize Cityscapes images to 1024x512 (or 1280x640 which yields sightly better results but costs more time).
Download the checkpoint pretrained on GTA5.
If you want to pretrain the model by yourself, download the model pretrained on ImageNet.

SYNTHIA-to-Cityscapes:

Download SYNTHIA-RAND-CITYSCAPES consisting of 9,400 1280 × 760 synthetic images. We resize images to 1280x760.
Download the checkpoint pretrained on SYNTHIA.

Cityscapes-to-CrossCity

Download NTHU dataset, which consists of images with 2048 × 1024 resolution from four different cities: Rio, Rome, Tokyo, and Taipei. We resize images to 1024x512, the same as Cityscapes.
Download the checkpoint pretrained on Cityscapes.

Put all datasets into "datasets" folder and all checkpoints into "pretrained_model" folder.

Results

We present several transfered results reported in our paper and provide the corresponding checkpoints.

results

GTA5-to-Cityscapes:

Method	Source	MinEnt	MaxSquare	MaxSquare+IW	MaxSquare+IW+Multi
mIoU(%)	36.9	42.2	44.3	45.2	46.4

Cityscapes-to-CrossCity

Rome

Method	Source	MaxSquare	MaxSquare+IW
mIoU(%)	51.0	53.9	54.5

Rio

Method	Source	MaxSquare	MaxSquare+IW
mIoU(%)	48.9	52.0	53.3

Tokyo

Method	Source	MaxSquare	MaxSquare+IW
mIoU(%)	47.8	49.7	50.5

Taipei

Method	Source	MaxSquare	MaxSquare+IW
mIoU(%)	46.3	49.8	50.6

Training

GTA5-to-Cityscapes:

(Optional) Pretrain the model on the source domain (GTA5).

Otherwise, download the checkpoint pretrained on GTA5 in "Setup" section.

python3 tools/train_source.py --gpu "0" --dataset 'gta5' --checkpoint_dir "./log/gta5_pretrain/" --iter_max 200000 --iter_stop 80000 --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --crop_size "1280,720"

Then in next step, set --pretrained_ckpt_file "./log/gta5_pretrain/gta5final.pth".

MaxSquare

python3 tools/solve_gta5.py --gpu "0" --backbone "deeplabv2_multi" --dataset 'cityscapes' --checkpoint_dir "./log/gta2city_AdaptSegNet_ST=0.1_maxsquare_round=5/" --pretrained_ckpt_file "./pretrained_model/GTA5_source.pth" --round_num 5 --target_mode "maxsquare" --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --lambda_target 0.1

MaxSquare+IW

python3 tools/solve_gta5.py --gpu "0" --backbone "deeplabv2_multi" --dataset 'cityscapes' --checkpoint_dir "./log/gta2city_AdaptSegNet_ST=0.1_IW_maxsquare_round=5/" --pretrained_ckpt_file "./pretrained_model/GTA5_source.pth" --round_num 5 --target_mode "IW_maxsquare" --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --lambda_target 0.1 --IW_ratio 0.2

Pretrain the multi-level model on the source domain (GTA5) by adding "--multi True".

python3 tools/train_source.py --gpu "0" --dataset 'gta5' --checkpoint_dir "./log/gta5_pretrain_multi/" --iter_max 200000 --iter_stop 80000 --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --crop_size "1280,720" --multi True

MaxSquare+IW+Multi

python3 tools/solve_gta5.py --gpu "0" --backbone "deeplabv2_multi" --dataset 'cityscapes' --checkpoint_dir "./log/gta2city_AdaptSegNet_ST=0.09_IW_maxsquare_multi_round=5/" --pretrained_ckpt_file "./log/gta5_pretrain_multi/gta5best.pth" --round_num 5 --target_mode "IW_maxsquare" --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --target_crop_size "1280,640" --lambda_target 0.09 --IW_ratio 0.2 --multi True --lambda_seg 0.1 --threshold 0.95

Eval:

python3 tools/evaluate.py --gpu "0" --dataset 'cityscapes' --checkpoint_dir "./log/eval_city" --pretrained_ckpt_file "./log/gta2city_AdaptSegNet_ST=0.1_maxsquare_round=5/gta52city_maxsquarebest.pth" --image_summary True --flip True

To have a look at predicted examples, run tensorboard as follows:

tensorboard --logdir=./log/eval_city  --port=6009

Cityscapes-to-CrossCity

(Optional) Pretrain the model on the source domain (Cityscapes).

python3 tools/train_source.py --gpu "0" --dataset 'cityscapes' --checkpoint_dir "./log/cityscapes_pretrain_class13/" --iter_max 200000 --iter_stop 80000 --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --crop_size "1024,512" --num_classes 13

MaxSquare (take "Rome" for example)

python3 tools/solve_crosscity.py --gpu "0" --city_name 'Rome' --source_dataset 'cityscapes' --checkpoint_dir "./log/city2Rome_maxsquare/" --pretrained_ckpt_file "./pretrained_model/Cityscapes_source_class13.pth"  --crop_size "1024,512" --target_crop_size "1024,512"  --epoch_num 10 --target_mode "maxsquare" --lr 2.5e-4 --lambda_target 0.1 --num_classes 13

Acknowledgment

The structure of this code is largely based on this repo.

Deeplabv2 model is borrowed from Pytorch-Deeplab.