# BoxVIS: Video Instance Segmentation with Box Annotations

Minghan Li and Lei Zhang
<div align="center"> <img src="imgs/BoxVIS_overview.jpg" width="80%" height="100%"/> </div><br/>

## Updates

- **July 13, 2023**: Paper has been updated.
- **June 30, 2023**: Code and trained models are available now.
- **March 28, 2023**: Paper is available now.
## Installation
See installation instructions.
## Datasets
See Datasets preparation.
## Getting Started
We provide a script, `train_net_boxvis.py`, that trains all the configs provided in BoxVIS.
**Training**: download the pretrained weights of Mask2Former, save them to the path `pretrained/*.pth`, and then run:

```bash
sh run.sh
```
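If you prefer to launch training directly rather than through `run.sh`, a minimal sketch is below, assuming `train_net_boxvis.py` follows Detectron2's standard training-script interface (`--config-file`, `--num-gpus`, and trailing config overrides). The config path and checkpoint filename here are hypothetical, not taken from the repo:

```bash
# Hypothetical direct invocation, assuming a Detectron2-style launcher;
# the config path and checkpoint name below are illustrative.
python train_net_boxvis.py \
  --config-file configs/boxvis/BoxVIS_R50.yaml \
  --num-gpus 4 \
  MODEL.WEIGHTS pretrained/mask2former_r50.pth
```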
**Inference**: download the trained weights, save them to the path `pretrained/*.pth`, and then run:

```bash
sh test.sh
```
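Likewise, evaluation can presumably be run directly; Detectron2-style training scripts accept an `--eval-only` flag with the checkpoint passed as a config override. A sketch, again with a hypothetical config path and weight filename:

```bash
# Hypothetical evaluation run, assuming Detectron2's standard --eval-only flag;
# config and checkpoint paths are illustrative.
python train_net_boxvis.py \
  --config-file configs/boxvis/BoxVIS_R50.yaml \
  --num-gpus 1 \
  --eval-only \
  MODEL.WEIGHTS pretrained/boxvis_trained.pth
```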
## Quantitative performance comparison
<div align="center"> <img src="imgs/sota_yt21_coco.jpg" width="80%" height="100%"/> </div><br/>

<a name="CitingBoxVIS"></a>

## Citing BoxVIS
If you use BoxVIS in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.
```BibTeX
@misc{li2023boxvis,
      title={BoxVIS: Video Instance Segmentation with Box Annotations},
      author={Minghan Li and Lei Zhang},
      year={2023},
      eprint={2303.14618},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```
## Acknowledgement
Our code is largely based on Detectron2, Mask2Former, MinVIS, and VITA. We are truly grateful for their excellent work.