STMask

This code implements our CVPR 2021 paper: Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation.

News

Installation

Dataset

Evaluation

The input size on all VIS benchmarks is 360×640.

Quantitative Results on YTVIS2019 (trained with 12 epochs)

Here are our STMask models (released in April 2021) along with their FPS on a 2080Ti and their mAP on the validation set, where mAP and mAP* are obtained with cross-class fast NMS and fast NMS, respectively (a short sketch of the difference follows the table). Note that FCB(ali) and FCB(ada) are only applied to the classification branch.

| Backbone | FCA | FCB | TF | FPS | mAP | mAP* | Weights |
|---|---|---|---|---|---|---|---|
| R50-DCN-FPN | FCA | - | TF | 29.3 | 32.6 | 33.4 | STMask_plus_resnet50.pth |
| R50-DCN-FPN | FCA | FCB(ali) | TF | 27.8 | - | 32.1 | STMask_plus_resnet50_ali.pth |
| R50-DCN-FPN | FCA | FCB(ada) | TF | 28.6 | 32.8 | 33.0 | STMask_plus_resnet50_ada.pth |
| R101-DCN-FPN | FCA | - | TF | 24.5 | 36.0 | 36.3 | STMask_plus_base.pth |
| R101-DCN-FPN | FCA | FCB(ali) | TF | 22.1 | 36.3 | 37.1 | STMask_plus_base_ali.pth |
| R101-DCN-FPN | FCA | FCB(ada) | TF | 23.4 | 36.8 | 37.9 | STMask_plus_base_ada.pth |
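
For readers unfamiliar with the two NMS variants, below is a minimal PyTorch sketch of the general idea only; it is not the implementation used in this repository, and all function names in it are made up for illustration. Fast NMS suppresses boxes per class with a single upper-triangular IoU matrix, while cross-class fast NMS first reduces each box to its best class score and then suppresses overlaps regardless of class.

# Illustrative sketch of fast NMS vs. cross-class fast NMS (not the repository's code).
import torch

def box_iou(a, b):
    # a: [N, 4], b: [M, 4] in (x1, y1, x2, y2); returns an [N, M] IoU matrix.
    area_a = (a[:, 2] - a[:, 0]).clamp(min=0) * (a[:, 3] - a[:, 1]).clamp(min=0)
    area_b = (b[:, 2] - b[:, 0]).clamp(min=0) * (b[:, 3] - b[:, 1]).clamp(min=0)
    lt = torch.max(a[:, None, :2], b[None, :, :2])          # top-left of intersections
    rb = torch.min(a[:, None, 2:], b[None, :, 2:])          # bottom-right of intersections
    inter = (rb - lt).clamp(min=0).prod(dim=2)
    return inter / (area_a[:, None] + area_b[None, :] - inter + 1e-7)

def fast_nms_per_class(boxes, scores, iou_thr=0.5):
    # boxes: [N, 4]; scores: [N] for a single class.
    order = scores.argsort(descending=True)
    boxes = boxes[order]
    iou = box_iou(boxes, boxes).triu_(diagonal=1)            # compare each box only to higher-scoring ones
    iou_max, _ = iou.max(dim=0)
    keep = iou_max <= iou_thr                                # drop boxes that overlap a better box too much
    return order[keep]                                       # indices of kept boxes

def cross_class_fast_nms(boxes, class_scores, iou_thr=0.5):
    # class_scores: [N, num_classes]; a box suppressed here is removed for every class.
    best_scores, _ = class_scores.max(dim=1)
    return fast_nms_per_class(boxes, best_scores, iou_thr)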

Quantitative Results on YTVIS2021 (trained with 12 epochs)

| Backbone | FCA | FCB | TF | mAP* | Weights | Results |
|---|---|---|---|---|---|---|
| R50-DCN-FPN | FCA | - | TF | 30.6 | STMask_plus_resnet50_YTVIS2021.pth | - |
| R50-DCN-FPN | FCA | FCB(ada) | TF | 31.1 | STMask_plus_resnet50_ada_YTVIS2021.pth | stdout.txt |
| R101-DCN-FPN | FCA | - | TF | 33.7 | STMask_plus_base_YTVIS2021.pth | - |
| R101-DCN-FPN | FCA | FCB(ada) | TF | 34.6 | STMask_plus_base_ada_YTVIS2021.pth | stdout.txt |

Quantitative Results on OVIS (trained with 20 epochs)

| Backbone | FCA | FCB | TF | mAP* | Weights | Results |
|---|---|---|---|---|---|---|
| R50-DCN-FPN | FCA | - | TF | 15.4 | STMask_plus_resnet50_OVIS.pth | - |
| R50-DCN-FPN | FCA | FCB(ada) | TF | 15.4 | STMask_plus_resnet50_ada_OVIS.pth | stdout.txt |
| R101-DCN-FPN | FCA | - | TF | 17.3 | STMask_plus_base_OVIS.pth | stdout.txt |
| R101-DCN-FPN | FCA | FCB(ada) | TF | 15.8 | STMask_plus_base_ada_OVIS.pth | - |

To evaluate a model, put the corresponding weights file in the ./weights directory and run one of the commands shown below (see also the Inference section). The name of each config is everything before the numbers in the file name (e.g., STMask_plus_base for STMask_plus_base.pth). All STMask models are trained from yolact_plus_base_54_80000.pth or yolact_plus_resnet_54_80000.pth of Yolact++ here.
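
For example, to evaluate STMask_plus_base.pth from the first table (the config name follows the naming rule above, and the flags mirror the command in the Inference section):

# Evaluate STMask_plus_base.pth and write the detection json to ./weights
python eval.py --config=STMask_plus_base_config --trained_model=weights/STMask_plus_base.pth --mask_det_file=weights/results.json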

Quantitative Results on COCO

We also provide quantitative results of Yolact++ with our proposed feature calibration for anchors and boxes on COCO (without the temporal fusion module). Here are the results on the COCO validation set.

| Image Size | Backbone | FCA | FCB | B_AP (box) | M_AP (mask) | Weights |
|---|---|---|---|---|---|---|
| [550,550] | R50-DCN-FPN | FCA | - | 34.5 | 32.9 | yolact_plus_resnet50_54.pth |
| [550,550] | R50-DCN-FPN | FCA | FCB(ali) | 34.6 | 33.3 | yolact_plus_resnet50_ali_54.pth |
| [550,550] | R50-DCN-FPN | FCA | FCB(ada) | 34.7 | 33.2 | yolact_plus_resnet50_ada_54.pth |
| [550,550] | R101-DCN-FPN | FCA | - | 35.7 | 33.3 | yolact_plus_base_54.pth |
| [550,550] | R101-DCN-FPN | FCA | FCB(ali) | 35.6 | 34.1 | yolact_plus_base_ali_54.pth |
| [550,550] | R101-DCN-FPN | FCA | FCB(ada) | 36.4 | 34.8 | yolact_plus_baseada_54.pth |

Inference

# Output a YTVOSEval json to submit to the website.
# This command will create './weights/results.json' for instance segmentation.
python eval.py --config=STMask_plus_base_ada_config --trained_model=weights/STMask_plus_base_ada.pth --mask_det_file=weights/results.json
# Output visual segmentation results
python eval.py --config=STMask_plus_base_ada_config --trained_model=weights/STMask_plus_base_ada.pth --mask_det_file=weights/results.json --display
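
The YouTube-VIS evaluation server typically expects the json packaged as a zip archive; assuming that is the case for your track (check the challenge page for the exact submission format), something like the following works:

# Package results.json for upload (assumes the server wants a zip containing results.json)
cd weights && zip results.zip results.json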

Training

By default, we train on the YouTubeVOS2019 dataset. Make sure to download the entire dataset using the commands above.

# Trains STMask_plus_base_config with a batch_size of 8.
CUDA_VISIBLE_DEVICES=0,1 python train.py --config=STMask_plus_base_config --batch_size=8 --lr=1e-4 --save_folder=weights/weights_r101


# Resume training STMask_plus_base with a specific weight file and start from the iteration specified in the weight file's name.
CUDA_VISIBLE_DEVICES=0,1 python train.py --config=STMask_plus_base_config --resume=weights/STMask_plus_base_10_32100.pth 
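
The ResNet-50 models are trained the same way. Assuming their config follows the naming rule above (STMask_plus_resnet50_config is an inferred name, so check the configs in the repository), the command would look like:

# Train the ResNet-50 model (config name inferred from the naming rule above)
CUDA_VISIBLE_DEVICES=0,1 python train.py --config=STMask_plus_resnet50_config --batch_size=8 --lr=1e-4 --save_folder=weights/weights_r50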

Citation

If you use STMask or this code base in your work, please cite

@inproceedings{STMask-CVPR2021,
  author    = {Minghan Li and Shuai Li and Lida Li and Lei Zhang},
  title     = {Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation},
  booktitle = {CVPR},
  year      = {2021},
}

Contact

For questions about our paper or code, please contact Li Minghan (liminghan0330@gmail.com or minghancs.li@connect.polyu.hk).