# FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation
[Project] [Paper] [arXiv] [Home]
Official implementation of FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation.
A Faster, Stronger and Lighter framework for semantic segmentation, achieving state-of-the-art performance and more than 3x acceleration.
```
@inproceedings{wu2019fastfcn,
  title     = {FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation},
  author    = {Wu, Huikai and Zhang, Junge and Huang, Kaiqi and Liang, Kongming and Yu, Yizhou},
  booktitle = {arXiv preprint arXiv:1903.11816},
  year      = {2019}
}
```
Contact: Hui-Kai Wu (huikaiwu@icloud.com)
## Update
**2020-04-15**: Now supports inference on a single image!!!
```
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m experiments.segmentation.test_single_image --dataset [pcontext|ade20k] \
    --model [encnet|deeplab|psp] --jpu [JPU|JPU_X] \
    --backbone [resnet50|resnet101] [--ms] --resume {MODEL} --input-path {INPUT} --save-path {OUTPUT}
```
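For example, single-image inference with EncNet+JPU (ResNet-50) on PContext might look like the following; the checkpoint filename and image paths here are placeholders, not files shipped with the repo:
```
CUDA_VISIBLE_DEVICES=0 python -m experiments.segmentation.test_single_image --dataset pcontext \
    --model encnet --jpu JPU \
    --backbone resnet50 --resume encnet_jpu_res50_pcontext.pth.tar --input-path input.jpg --save-path output.png
```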
**2020-04-15**: A new joint upsampling module is now available!!! Select it with `--jpu [JPU|JPU_X]`: `JPU` is the original module from the arXiv paper, while `JPU_X` is a pyramid version of JPU.
**2020-02-20**: FastFCN now runs on every OS with PyTorch >= 1.1.0 and Python 3, as all C/C++ extensions have been replaced with pure Python extensions.
## Version
- Original code, producing the results reported in the arXiv paper. [branch:v1.0.0]
- Pure PyTorch code, with `torch.nn.DistributedDataParallel` and `torch.nn.SyncBatchNorm`. [branch:latest]
- Pure Python code. [branch:master]
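To switch between these versions after cloning (see Install below):
```
git checkout v1.0.0   # original code, arXiv results
git checkout latest   # pure PyTorch version
git checkout master   # pure Python version
```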
## Overview
### Framework
### Joint Pyramid Upsampling (JPU)
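The framework and JPU figures from the original README are not reproduced here. As a rough illustration of the idea (a minimal sketch, not the official implementation; the class name, channel widths, and dilation rates are our assumptions based on the paper), JPU projects the backbone's conv3/conv4/conv5 features to a common width, upsamples everything to the stride-8 resolution, and runs parallel dilated separable convolutions over the fused map:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JPUSketch(nn.Module):
    """Simplified JPU-style module (illustrative only, not the official code)."""

    def __init__(self, in_channels=(512, 1024, 2048), width=512, dilations=(1, 2, 4, 8)):
        super().__init__()
        # Project each backbone stage (conv3/conv4/conv5) to a common width.
        self.convs = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(c, width, 3, padding=1, bias=False),
                nn.BatchNorm2d(width),
                nn.ReLU(inplace=True))
            for c in in_channels])
        # Parallel depthwise-separable convolutions with growing dilation rates,
        # applied to the concatenated (3 * width)-channel feature map.
        self.dilated = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(3 * width, 3 * width, 3, padding=d, dilation=d,
                          groups=3 * width, bias=False),    # depthwise
                nn.Conv2d(3 * width, width, 1, bias=False),  # pointwise
                nn.BatchNorm2d(width),
                nn.ReLU(inplace=True))
            for d in dilations])

    def forward(self, c3, c4, c5):
        # Bring all stages to the stride-8 resolution of conv3.
        feats = [conv(x) for conv, x in zip(self.convs, (c3, c4, c5))]
        size = feats[0].shape[2:]
        feats[1:] = [F.interpolate(f, size, mode='bilinear', align_corners=True)
                     for f in feats[1:]]
        x = torch.cat(feats, dim=1)
        # Fuse multi-scale context from the parallel dilated branches.
        return torch.cat([branch(x) for branch in self.dilated], dim=1)
```
The segmentation head then operates on this stride-8 map, so the backbone itself needs no dilated convolutions, which is where the reported speedup comes from.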
## Install
- PyTorch >= 1.1.0 (note: the code is tested with `python=3.6, cuda=9.0`)
- Download FastFCN
  ```
  git clone https://github.com/wuhuikai/FastFCN.git
  cd FastFCN
  ```
- Install Requirements
  ```
  pip install nose tqdm scipy cython requests
  ```
## Train and Test
### PContext
```
python -m scripts.prepare_pcontext
```
| Method | Backbone | mIoU | FPS | Model | Scripts |
|---|---|---|---|---|---|
| EncNet | ResNet-50 | 49.91 | 18.77 | | |
| EncNet+JPU (ours) | ResNet-50 | 51.05 | 37.56 | GoogleDrive | bash |
| PSP | ResNet-50 | 50.58 | 18.08 | | |
| PSP+JPU (ours) | ResNet-50 | 50.89 | 28.48 | GoogleDrive | bash |
| DeepLabV3 | ResNet-50 | 49.19 | 15.99 | | |
| DeepLabV3+JPU (ours) | ResNet-50 | 50.07 | 20.67 | GoogleDrive | bash |
| EncNet | ResNet-101 | 52.60 (MS) | 10.51 | | |
| EncNet+JPU (ours) | ResNet-101 | 54.03 (MS) | 32.02 | GoogleDrive | bash |

(MS denotes multi-scale testing.)
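The bash entries above link to the exact training and evaluation scripts. A typical pair of commands looks roughly like the following; the module paths and flags are illustrative assumptions patterned on the single-image command above, so consult the linked scripts for the exact options:
```
python -m experiments.segmentation.train --dataset pcontext \
    --model encnet --jpu JPU --backbone resnet50
python -m experiments.segmentation.test --dataset pcontext \
    --model encnet --jpu JPU --backbone resnet50 --resume {MODEL} --ms
```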
### ADE20K
```
python -m scripts.prepare_ade20k
```
#### Training Set
| Method | Backbone | mIoU (MS) | Model | Scripts |
|---|---|---|---|---|
| EncNet | ResNet-50 | 41.11 | | |
| EncNet+JPU (ours) | ResNet-50 | 42.75 | GoogleDrive | bash |
| EncNet | ResNet-101 | 44.65 | | |
| EncNet+JPU (ours) | ResNet-101 | 44.34 | GoogleDrive | bash |
#### Training Set + Val Set
| Method | Backbone | FinalScore (MS) | Model | Scripts |
|---|---|---|---|---|
| EncNet+JPU (ours) | ResNet-50 | | GoogleDrive | bash |
| EncNet | ResNet-101 | 55.67 | | |
| EncNet+JPU (ours) | ResNet-101 | 55.84 | GoogleDrive | bash |
Note: EncNet (ResNet-101) is trained with `crop_size=576`, while EncNet+JPU (ResNet-101) is trained with `crop_size=480` in order to fit 4 images into a 12GB GPU.
## Visual Results
Side-by-side comparisons on PContext and ADE20K (columns: Input, GT, EncNet, Ours); images omitted.
More Visual Results
## Acknowledgement
Code borrows heavily from PyTorch-Encoding.