Awesome

RefineMask: Towards High-Quality Instance Segmentation <br>with Fine-Grained Features (CVPR 2021)

This repo is the official implementation of RefineMask: Towards High-Quality Instance Segmentation with Fine-Grained Features.

Update: A faster and slightly better implementation is available! You can train R50-RefineMask 2 images per gpu with less than 11G memory cost now. See here.

Framework

Main Results

Results on COCO

Method	Backbone	Schedule	AP	AP<sup>*</sup>	Checkpoint
Mask R-CNN	R50-FPN	1x	34.7	36.8
RefineMask	R50-FPN	1x	37.3	40.6	download
Mask R-CNN	R50-FPN	2x	35.4	37.7
RefineMask	R50-FPN	2x	37.8	41.2	download
Mask R-CNN	R101-FPN	1x	36.1	38.4
RefineMask	R101-FPN	1x	38.6	41.8	download
Mask R-CNN	R101-FPN	2x	36.6	39.3
RefineMask	R101-FPN	2x	39.0	42.4	download

Note: No data augmentations except standard horizontal flipping were used.

Results on LVIS

Method	Backbone	Schedule	AP	AP<sub>r</sub>	AP<sub>c</sub>	AP<sub>f</sub>	Checkpoint
Mask R-CNN	R50-FPN	1x	22.1	10.1	21.7	30.0
RefineMask	R50-FPN	1x	25.7	13.8	24.9	31.8	download
Mask R-CNN	R101-FPN	1x	23.7	12.3	23.2	29.1
RefineMask	R101-FPN	1x	27.1	15.6	26.2	33.1	download

Results on Cityscapes

Method	Backbone	Schedule	AP	AP<sub>S</sub>	AP<sub>M</sub>	AP<sub>L</sub>	Checkpoint
Mask R-CNN	R50-FPN	1x	33.8	12.0	31.5	51.8
RefineMask	R50-FPN	1x	37.6	14.0	35.4	57.9	download

Efficiency of RefineMask

Method	AP	AP<sup>*</sup>	FPS
Mask R-CNN	34.7	36.8	15.7
PointRend	35.6	38.7	11.4
HTC	37.4	40.7	4.4
RefineMask	37.3	40.9	11.4

Usage

Requirements

Python 3.6+
Pytorch 1.5.0
mmcv-full 1.0.5

Datasets

data
  ├── coco
  |   ├── annotations
  │   │   │   ├── instances_train2017.json
  │   │   │   ├── instances_val2017.json
  │   │   │   ├── lvis_v0.5_val_cocofied.json
  │   ├── train2017
  │   │   ├── 000000004134.png
  │   │   ├── 000000031817.png
  │   │   ├── ......
  │   ├── val2017
  │   ├── test2017
  ├── lvis
  |   ├── annotations
  │   │   │   ├── lvis_v1_train.json
  │   │   │   ├── lvis_v1_val.json
  │   ├── train2017
  │   │   ├── 000000004134.png
  │   │   ├── 000000031817.png
  │   │   ├── ......
  │   ├── val2017
  │   ├── test2017
  ├── cityscapes
  |   ├── annotations
  │   │   │   ├── instancesonly_filtered_gtFine_train.json
  │   │   │   ├── instancesonly_filtered_gtFine_val.json
  │   ├── leftImg8bit
  │   |   ├── train
  │   │   ├── val
  │   │   ├── test

Note: We used the lvis-v1.0 dataset which consists of 1203 categories.

Training

./scripts/dist_train.sh ./configs/refinemask/coco/r50-refinemask-1x.py 8 work_dirs/r50-refinemask-1x

Note: <strike>The codes only support batch size 1 per GPU, and we trained all models with a total batch size 16x1. If you train models with a total batch size 8x1, the performance may drop. We will support batch size 2 or more per GPU later. You can use ./scripts/slurm_train.sh for training with multi-nodes.</strike> Multiple images per GPU during training has been supported now.

Inference

./scripts/dist_test.sh ./configs/refinemask/coco/r50-refinemask-1x.py 8 work_dirs/r50-refinemask-1x

Citation

@InProceedings{Zhang_2021_CVPR,
    author    = {Zhang, Gang and Lu, Xin and Tan, Jingru and Li, Jianmin and Zhang, Zhaoxiang and Li, Quanquan and Hu, Xiaolin},
    title     = {RefineMask: Towards High-Quality Instance Segmentation With Fine-Grained Features},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {6861-6869}
}