Domain Adaptation for Semantic Segmentation with Maximum Squares Loss
By Minghao Chen, Hongyang Xue, Deng Cai.
Introduction
A PyTorch implementation of our ICCV 2019 paper "Domain Adaptation for Semantic Segmentation with Maximum Squares Loss". The segmentation model is based on DeepLabv2 with a ResNet-101 backbone. "MaxSquare+IW+Multi", introduced in the paper, achieves competitive results on three unsupervised domain adaptation (UDA) benchmarks: GTA5, SYNTHIA, and CrossCity. Moreover, our method achieves state-of-the-art results in GTA5-to-Cityscapes and Cityscapes-to-CrossCity adaptation.
Citation
If you use this code in your research, please cite:
@InProceedings{Chen_2019_ICCV,
author = {Chen, Minghao and Xue, Hongyang and Cai, Deng},
title = {Domain Adaptation for Semantic Segmentation With Maximum Squares Loss},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}
}
Requirements
The code is implemented with Python 3.6 and PyTorch 1.0.0.
Install the newest PyTorch from https://pytorch.org/.
To install the required Python packages, run:
pip install -r requirements.txt
Setup
GTA5-to-Cityscapes:
- Download the GTA5 dataset, which contains 24,966 annotated images at 1914×1052 resolution taken from the GTA5 game. We use the sample code for reading the label maps and the split into training/validation/test sets from here. In the experiments, we resize GTA5 images to 1280x720.
- Download Cityscapes, which contains 5,000 annotated images at 2048×1024 resolution taken from real urban street scenes. We resize Cityscapes images to 1024x512 (or 1280x640, which yields slightly better results but costs more time).
- Download the checkpoint pretrained on GTA5.
- If you want to pretrain the model yourself, download the model pretrained on ImageNet.
SYNTHIA-to-Cityscapes:
- Download SYNTHIA-RAND-CITYSCAPES, which consists of 9,400 synthetic images at 1280×760 resolution. We resize images to 1280x760.
- Download the checkpoint pretrained on SYNTHIA.
Cityscapes-to-CrossCity
- Download the NTHU dataset, which consists of images at 2048×1024 resolution from four different cities: Rio, Rome, Tokyo, and Taipei. We resize images to 1024x512, the same as Cityscapes.
- Download the checkpoint pretrained on Cityscapes.
Put all datasets into the "datasets" folder and all checkpoints into the "pretrained_model" folder.
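Before launching a run, it can help to confirm that this layout is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_setup_items` is hypothetical, the two folder names come from the sentence above, and the checkpoint file names are the ones referenced by the training commands in this README. Which checkpoints you actually need depends on the experiment.

```python
from pathlib import Path

# Folder names come from the setup instructions; the checkpoint file names are
# the ones referenced by the training commands in this README.
EXPECTED_DIRS = ["datasets", "pretrained_model"]
EXPECTED_CHECKPOINTS = ["GTA5_source.pth", "Cityscapes_source_class13.pth"]


def missing_setup_items(root="."):
    """Return the expected folders and checkpoints that are absent under root."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [
        f"pretrained_model/{name}"
        for name in EXPECTED_CHECKPOINTS
        if not (root / "pretrained_model" / name).is_file()
    ]
    return missing


if __name__ == "__main__":
    # Hypothetical usage: run from the repository root and list what is missing.
    for item in missing_setup_items():
        print("missing:", item)
```

An empty result means both folders and both checkpoints were found; the check does not inspect the contents of the "datasets" folder.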
Results
We present several transfer results reported in our paper and provide the corresponding checkpoints.
GTA5-to-Cityscapes:
Method | Source | MinEnt | MaxSquare | MaxSquare+IW | MaxSquare+IW+Multi |
---|---|---|---|---|---|
mIoU(%) | 36.9 | 42.2 | 44.3 | 45.2 | 46.4 |
Cityscapes-to-CrossCity
Rome
Method | Source | MaxSquare | MaxSquare+IW |
---|---|---|---|
mIoU(%) | 51.0 | 53.9 | 54.5 |
Rio
Method | Source | MaxSquare | MaxSquare+IW |
---|---|---|---|
mIoU(%) | 48.9 | 52.0 | 53.3 |
Tokyo
Method | Source | MaxSquare | MaxSquare+IW |
---|---|---|---|
mIoU(%) | 47.8 | 49.7 | 50.5 |
Taipei
Method | Source | MaxSquare | MaxSquare+IW |
---|---|---|---|
mIoU(%) | 46.3 | 49.8 | 50.6 |
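The tables above report mIoU in percent. As a reminder of how this metric is commonly computed, here is a minimal sketch that derives it from a confusion matrix. The helper `mean_iou` is hypothetical and is not taken from the repository; in particular, skipping classes that occur in neither the ground truth nor the prediction is an assumption of this sketch.

```python
def mean_iou(confusion):
    """Mean intersection-over-union (in percent) from a square confusion matrix.

    confusion[i][j] counts pixels whose ground-truth class is i and whose
    predicted class is j.
    """
    ious = []
    for c in range(len(confusion)):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                  # class c pixels predicted as another class
        fp = sum(row[c] for row in confusion) - tp   # other pixels predicted as class c
        union = tp + fp + fn
        if union > 0:  # assumption: ignore classes absent from both GT and prediction
            ious.append(tp / union)
    if not ious:
        raise ValueError("confusion matrix contains no pixels")
    return 100.0 * sum(ious) / len(ious)


if __name__ == "__main__":
    # Toy two-class example, not data from the paper.
    print(mean_iou([[8, 2], [1, 9]]))
```

Per-class IoU is TP / (TP + FP + FN), and mIoU is the unweighted mean over classes, so rare classes count as much as frequent ones.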
Training
GTA5-to-Cityscapes:
(Optional) Pretrain the model on the source domain (GTA5).
Otherwise, download the checkpoint pretrained on GTA5 listed in the "Setup" section.
python3 tools/train_source.py --gpu "0" --dataset 'gta5' --checkpoint_dir "./log/gta5_pretrain/" --iter_max 200000 --iter_stop 80000 --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --crop_size "1280,720"
Then, in the next step, set --pretrained_ckpt_file "./log/gta5_pretrain/gta5final.pth".
- MaxSquare
python3 tools/solve_gta5.py --gpu "0" --backbone "deeplabv2_multi" --dataset 'cityscapes' --checkpoint_dir "./log/gta2city_AdaptSegNet_ST=0.1_maxsquare_round=5/" --pretrained_ckpt_file "./pretrained_model/GTA5_source.pth" --round_num 5 --target_mode "maxsquare" --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --lambda_target 0.1
- MaxSquare+IW
python3 tools/solve_gta5.py --gpu "0" --backbone "deeplabv2_multi" --dataset 'cityscapes' --checkpoint_dir "./log/gta2city_AdaptSegNet_ST=0.1_IW_maxsquare_round=5/" --pretrained_ckpt_file "./pretrained_model/GTA5_source.pth" --round_num 5 --target_mode "IW_maxsquare" --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --lambda_target 0.1 --IW_ratio 0.2
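For orientation, the loss selected by --target_mode "maxsquare" can be sketched in a few lines. The paper defines the maximum squares loss on a target image as the negative sum of squared class probabilities over all pixels and classes, divided by 2N for N pixels. The snippet below is a framework-free illustration of that formula only: the function names are hypothetical, it does not mirror the repository's PyTorch code, and the image-wise weighting controlled by --IW_ratio is left out.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_square_loss(pixel_logits):
    """Maximum squares loss: -1/(2N) * sum over pixels and classes of p^2.

    pixel_logits is a list of N per-pixel score lists, each of length C.
    """
    n = len(pixel_logits)
    squared = sum(p * p for logits in pixel_logits for p in softmax(logits))
    return -squared / (2 * n)


if __name__ == "__main__":
    uncertain = [[0.0, 0.0, 0.0]] * 4    # uniform prediction over 3 classes
    confident = [[20.0, 0.0, 0.0]] * 4   # nearly one-hot prediction
    print(max_square_loss(uncertain))
    print(max_square_loss(confident))
```

A uniform prediction over three classes gives -1/6, while a nearly one-hot prediction approaches the minimum of -0.5, so minimizing this loss pushes target predictions toward confident outputs.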
Pretrain the multi-level model on the source domain (GTA5) by adding "--multi True".
python3 tools/train_source.py --gpu "0" --dataset 'gta5' --checkpoint_dir "./log/gta5_pretrain_multi/" --iter_max 200000 --iter_stop 80000 --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --crop_size "1280,720" --multi True
- MaxSquare+IW+Multi
python3 tools/solve_gta5.py --gpu "0" --backbone "deeplabv2_multi" --dataset 'cityscapes' --checkpoint_dir "./log/gta2city_AdaptSegNet_ST=0.09_IW_maxsquare_multi_round=5/" --pretrained_ckpt_file "./log/gta5_pretrain_multi/gta5best.pth" --round_num 5 --target_mode "IW_maxsquare" --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --target_crop_size "1280,640" --lambda_target 0.09 --IW_ratio 0.2 --multi True --lambda_seg 0.1 --threshold 0.95
Eval:
python3 tools/evaluate.py --gpu "0" --dataset 'cityscapes' --checkpoint_dir "./log/eval_city" --pretrained_ckpt_file "./log/gta2city_AdaptSegNet_ST=0.1_maxsquare_round=5/gta52city_maxsquarebest.pth" --image_summary True --flip True
To view predicted examples, run TensorBoard as follows:
tensorboard --logdir=./log/eval_city --port=6009
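The evaluation command above passes --flip True. Assuming this flag denotes the common test-time augmentation in which predictions for an image and its horizontal mirror are averaged (an assumption, not confirmed by this README), the idea can be sketched as follows. Both helpers and the `predict` callable are hypothetical and use plain nested lists instead of tensors.

```python
def hflip(grid):
    """Mirror a row-major 2D grid (a list of rows) left to right."""
    return [list(reversed(row)) for row in grid]


def predict_with_flip(predict, image):
    """Average class probabilities from the image and its horizontal mirror.

    predict maps an H x W image to an H x W grid of per-class probability lists.
    """
    direct = predict(image)
    # Predict on the mirrored image, then mirror the result back so that each
    # pixel lines up with the direct prediction.
    mirrored = hflip(predict(hflip(image)))
    return [
        [[(a + b) / 2 for a, b in zip(p, q)] for p, q in zip(row_d, row_m)]
        for row_d, row_m in zip(direct, mirrored)
    ]
```

The final class per pixel would then be the argmax of the averaged probabilities.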
Cityscapes-to-CrossCity
(Optional) Pretrain the model on the source domain (Cityscapes).
python3 tools/train_source.py --gpu "0" --dataset 'cityscapes' --checkpoint_dir "./log/cityscapes_pretrain_class13/" --iter_max 200000 --iter_stop 80000 --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --crop_size "1024,512" --num_classes 13
- MaxSquare (take "Rome" for example)
python3 tools/solve_crosscity.py --gpu "0" --city_name 'Rome' --source_dataset 'cityscapes' --checkpoint_dir "./log/city2Rome_maxsquare/" --pretrained_ckpt_file "./pretrained_model/Cityscapes_source_class13.pth" --crop_size "1024,512" --target_crop_size "1024,512" --epoch_num 10 --target_mode "maxsquare" --lr 2.5e-4 --lambda_target 0.1 --num_classes 13
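To repeat the run for the other target cities, the command above can be generated per city. The sketch below only builds and prints the commands. It reuses the flags and values from the Rome example as they are; the assumptions are that --city_name accepts the other three city names as written in the "Setup" section and that the checkpoint directory follows the same naming pattern.

```python
import shlex

# City names as listed in the "Setup" section.
CITIES = ["Rome", "Rio", "Tokyo", "Taipei"]


def crosscity_command(city):
    """Build the MaxSquare adaptation command for one target city."""
    return [
        "python3", "tools/solve_crosscity.py",
        "--gpu", "0",
        "--city_name", city,
        "--source_dataset", "cityscapes",
        # Assumption: same directory naming pattern as the Rome example.
        "--checkpoint_dir", f"./log/city2{city}_maxsquare/",
        "--pretrained_ckpt_file", "./pretrained_model/Cityscapes_source_class13.pth",
        "--crop_size", "1024,512",
        "--target_crop_size", "1024,512",
        "--epoch_num", "10",
        "--target_mode", "maxsquare",
        "--lr", "2.5e-4",
        "--lambda_target", "0.1",
        "--num_classes", "13",
    ]


if __name__ == "__main__":
    for city in CITIES:
        # Print rather than execute so the commands can be inspected first;
        # pass the list to subprocess.run(cmd, check=True) to launch a run.
        print(" ".join(shlex.quote(arg) for arg in crosscity_command(city)))
```

Building the command as a list of arguments avoids shell quoting problems when a value such as "1024,512" is passed through `subprocess`.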
Acknowledgment
The structure of this code is largely based on this repo.
The DeepLabv2 model is borrowed from Pytorch-Deeplab.