
CenterMask : Real-Time Anchor-Free Instance Segmentation

(Figure: CenterMask architecture)

Abstract

We propose a simple yet efficient anchor-free instance segmentation method, called CenterMask, that adds a novel spatial attention-guided mask (SAG-Mask) branch to the anchor-free one-stage object detector (FCOS) in the same vein as Mask R-CNN. Plugged into the FCOS object detector, the SAG-Mask branch predicts a segmentation mask on each detected box with a spatial attention map that helps focus on informative pixels and suppress noise. We also present an improved VoVNetV2 backbone network with two effective strategies: (1) residual connections for alleviating the saturation problem of larger VoVNets and (2) effective Squeeze-Excitation (eSE) dealing with the channel information loss problem of the original SE. With SAG-Mask and VoVNetV2, we design CenterMask and CenterMask-Lite, which are targeted at large and small models, respectively. CenterMask outperforms all previous state-of-the-art models at a much faster speed. CenterMask-Lite also achieves 33.4% mask AP / 38.0% box AP, outperforming YOLACT by 2.6 / 7.0 AP gains, respectively, at over 35 fps on a Titan Xp. We hope that CenterMask and VoVNetV2 can serve as a solid baseline for real-time instance segmentation and as a backbone network for various vision tasks, respectively.
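The two attention modules described above can be sketched in PyTorch. The code below is an illustrative reimplementation from the abstract's description, not the repository's code: the class names `eSEModule` and `SpatialAttention` and their layer choices (1x1 conv, hard sigmoid, 3x3 conv) are assumptions based on the paper, shown only to make the ideas concrete.

```python
import torch
import torch.nn as nn


class eSEModule(nn.Module):
    # effective Squeeze-Excitation (sketch): a single channel-preserving
    # 1x1 conv replaces SE's two dimension-reducing FC layers, so no
    # channel information is lost in the squeeze step.
    def __init__(self, channels: int):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Conv2d(channels, channels, kernel_size=1)  # no reduction
        self.hsigmoid = nn.Hardsigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.hsigmoid(self.fc(self.avg_pool(x)))  # per-channel weights in [0, 1]
        return x * w                                  # channel-wise reweighting


class SpatialAttention(nn.Module):
    # spatial attention of the kind used by the SAG-Mask branch (sketch):
    # channel-wise average- and max-pooled maps are concatenated, convolved,
    # and passed through a sigmoid to give a per-pixel attention map that
    # emphasizes informative pixels and suppresses noise.
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg_map = x.mean(dim=1, keepdim=True)
        max_map = x.max(dim=1, keepdim=True).values
        attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn


if __name__ == "__main__":
    feats = torch.randn(2, 64, 14, 14)  # a dummy RoI feature map
    assert eSEModule(64)(feats).shape == feats.shape
    assert SpatialAttention()(feats).shape == feats.shape
```

Both modules are drop-in reweighting layers: they keep the input shape and only rescale activations, which is why they add little inference cost.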

Highlights

Updates

Models

Environment

coco test-dev results

| Detector | Backbone | epoch | Mask AP (AP/APs/APm/APl) | Box AP (AP/APs/APm/APl) | Time (ms) | GPU | Weight |
|---|---|---|---|---|---|---|---|
| ShapeMask | R-101-FPN | N/A | 37.4/16.1/40.1/53.8 | 42.2/24.9/45.2/52.7 | 125 | V100 | - |
| TensorMask | R-101-FPN | 72 | 37.1/17.4/39.1/51.6 | - | 380 | V100 | - |
| RetinaMask | R-101-FPN | 24 | 34.7/14.3/36.7/50.5 | 41.4/23.0/44.5/53.0 | 98 | V100 | - |
| Mask R-CNN | R-101-FPN | 24 | 37.9/18.1/40.3/53.3 | 42.2/24.9/45.2/52.7 | 94 | V100 | - |
| CenterMask | R-101-FPN | 24 | 38.3/17.7/40.8/54.5 | 43.1/25.2/46.1/54.4 | 72 | V100 | link |
| CenterMask | X-101-FPN | 36 | 39.6/19.7/42.0/55.2 | 44.6/27.1/47.2/55.2 | 123 | V100 | link |
| CenterMask | V2-99-FPN | 36 | 40.6/20.1/42.8/57.0 | 45.8/27.8/48.3/57.6 | 84 | V100 | link |
| YOLACT-400 | R-101-FPN | 48 | 24.9/5.0/25.3/45.0 | 28.4/10.7/28.9/43.1 | 22 | Xp | - |
| CenterMask-Lite | MV2-FPN | 48 | 26.7/9.0/27.0/40.9 | 30.2/14.2/31.9/40.9 | 20 | Xp | link |
| YOLACT-550 | R-50-FPN | 48 | 28.2/9.2/29.3/44.8 | 30.3/14.0/31.2/43.0 | 23 | Xp | - |
| CenterMask-Lite | V2-19-FPN | 48 | 32.4/13.6/33.8/47.2 | 35.9/19.6/38.0/45.9 | 23 | Xp | link |
| YOLACT-550 | R-101-FPN | 48 | 29.8/9.9/31.3/47.7 | 31.0/14.4/31.8/43.7 | 30 | Xp | - |
| YOLACT-550++ | R-50-FPN | 48 | 34.1/11.7/36.1/53.6 | - | 29 | Xp | - |
| YOLACT-550++ | R-101-FPN | 48 | 34.6/11.9/36.8/55.1 | - | 36 | Xp | - |
| CenterMask-Lite | R-50-FPN | 48 | 32.9/12.9/34.7/48.7 | 36.7/18.7/39.4/48.2 | 29 | Xp | link |
| CenterMask-Lite | V2-39-FPN | 48 | 36.3/15.6/38.1/53.1 | 40.7/22.4/43.2/53.5 | 28 | Xp | link |

Note that RetinaMask, Mask R-CNN, and CenterMask are implemented using the same baseline code (maskrcnn-benchmark), and all models are trained with multi-scale training augmentation.
We expect that implementing CenterMask on detectron2 would yield better performance.
The 24/36/48/72-epoch schedules correspond to the 2x/3x/4x/6x training schedules in Detectron, respectively.
Training the CenterMask-Lite models longer (24 → 48 epochs, the same as YOLACT) boosts their performance, widening the performance gap over YOLACT and even YOLACT++.

coco val2017 results

| Detector | Backbone | epoch | Mask AP (AP/APs/APm/APl) | Box AP (AP/APs/APm/APl) | Time (ms) | Weight |
|---|---|---|---|---|---|---|
| CenterMask | MV2-FPN | 36 | 31.2/14.5/32.8/46.3 | 35.5/20.6/38.0/46.8 | 56 | link |
| CenterMask | V2-19-FPN | 36 | 34.7/17.3/37.5/49.6 | 39.7/24.6/42.7/50.8 | 59 | link |
| Mask R-CNN | R-50-FPN | 24 | 35.9/17.1/38.9/52.0 | 39.7/24.0/43.0/50.8 | 77 | link |
| CenterMask | R-50-FPN | 24 | 36.4/17.3/39.5/52.7 | 41.2/24.9/45.1/53.0 | 72 | link |
| CenterMask | V2-39-FPN | 24 | 37.7/17.9/40.8/54.3 | 42.6/25.3/46.3/55.2 | 70 | link |
| Mask R-CNN | R-50-FPN | 36 | 36.5/17.9/39.2/52.5 | 40.5/24.7/43.7/52.2 | 77 | link |
| CenterMask | R-50-FPN | 36 | 37.0/17.6/39.7/53.8 | 41.7/24.8/45.1/54.5 | 72 | link |
| CenterMask | V2-39-FPN | 36 | 38.5/19.0/41.5/54.7 | 43.5/27.1/46.9/55.9 | 70 | link |
| Mask R-CNN | R-101-FPN | 24 | 37.8/18.5/40.7/54.9 | 42.2/25.8/45.8/54.0 | 94 | link |
| CenterMask | R-101-FPN | 24 | 38.0/18.2/41.3/55.2 | 43.1/25.7/47.0/55.6 | 91 | link |
| CenterMask | V2-57-FPN | 24 | 38.5/18.6/41.9/56.2 | 43.8/26.7/47.4/57.1 | 76 | link |
| Mask R-CNN | R-101-FPN | 36 | 38.0/18.4/40.8/55.2 | 42.4/25.4/45.5/55.2 | 94 | link |
| CenterMask | R-101-FPN | 36 | 38.6/19.2/42.0/56.1 | 43.7/27.2/47.6/56.7 | 91 | link |
| CenterMask | V2-57-FPN | 36 | 39.4/19.6/42.9/55.9 | 44.6/27.7/48.3/57.3 | 76 | link |
| Mask R-CNN | X-101-32x8d-FPN | 24 | 38.9/19.6/41.6/55.7 | 43.7/27.6/46.9/55.9 | 165 | link |
| CenterMask | X-101-32x8d-FPN | 24 | 39.1/19.6/42.5/56.1 | 44.3/26.9/48.5/57.0 | 157 | link |
| CenterMask | V2-99-FPN | 24 | 39.6/19.6/43.1/56.9 | 44.8/27.6/49.0/57.7 | 106 | link |
| Mask R-CNN | X-101-32x8d-FPN | 36 | 38.6/19.7/41.1/55.2 | 43.6/27.3/46.7/55.6 | 165 | link |
| CenterMask | X-101-32x8d-FPN | 36 | 39.1/18.5/42.3/56.4 | 44.4/26.7/47.7/57.1 | 157 | link |
| CenterMask | V2-99-FPN | 36 | 40.2/20.6/43.5/57.3 | 45.6/29.2/49.3/58.8 | 106 | link |

Note that all models are trained with multi-scale train-time augmentation.
The inference time of all models is measured on a Titan Xp GPU.
The 24/36-epoch schedules correspond to the 2x/3x training schedules in Detectron, respectively.

Installation

Check INSTALL.md for installation instructions, which originate from maskrcnn-benchmark.

Training

Follow the training instructions of the maskrcnn-benchmark guides.

For multi-GPU training (e.g., 8 GPUs):

```bash
export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train_net.py --config-file "configs/centermask/centermask_R_50_FPN_1x.yaml"
```

Evaluation

Follow the evaluation instructions of maskrcnn-benchmark.

First of all, download the weight file of the model you want to run inference with.

For example (CenterMask-Lite R-50):

For multi-GPU evaluation with test batch size 16:

```bash
wget https://www.dropbox.com/s/2enqxenccz4xy6l/centermask-lite-R-50-ms-bs32-1x.pth
export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_net.py --config-file "configs/centermask/centermask_R_50_FPN_lite_res600_ms_bs32_1x.yaml" TEST.IMS_PER_BATCH 16 MODEL.WEIGHT centermask-lite-R-50-ms-bs32-1x.pth
```

For single-GPU evaluation with test batch size 1:

```bash
wget https://www.dropbox.com/s/2enqxenccz4xy6l/centermask-lite-R-50-ms-bs32-1x.pth
CUDA_VISIBLE_DEVICES=0 python tools/test_net.py --config-file "configs/centermask/centermask_R_50_FPN_lite_res600_ms_bs32_1x.yaml" TEST.IMS_PER_BATCH 1 MODEL.WEIGHT centermask-lite-R-50-ms-bs32-1x.pth
```

TODO

Performance

(Figures: qualitative visualization examples and results table)

Citing CenterMask

Please cite our paper in your publications if it helps your research:

```
@inproceedings{lee2019centermask,
  title={CenterMask: Real-Time Anchor-Free Instance Segmentation},
  author={Lee, Youngwan and Park, Jongyoul},
  booktitle={CVPR},
  year={2020}
}
```