<div align="center">DVIS: Decoupled Video Instance Segmentation Framework
Tao Zhang, XingYe Tian, Yu Wu, ShunPing Ji, Xuebo Wang, Yuan Zhang, Pengfei Wan
<img src="https://github.com/zhang-tao-whu/paper_images/blob/master/dvis/pipeline.png" width="800"/> </div>

News
- DVIS-DAQ achieves 57.1 AP on the OVIS dataset and also sets new SOTA performance on YTVIS19/21 and VIPSeg. The code will be released in DVIS-DAQ. The paper is available at DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries, and the project page can be found at project page.
- The improved version of DVIS, DVIS++, is now available; please refer to DVIS++ for more information. DVIS++ achieves 41.2 AP, 56.7 AP, and 52.0 AP, as well as 48.6 mIoU and 44.2 VPQ, on OVIS, YTVIS19, YTVIS21, VSPW, and VIPSeg, respectively. Additionally, OV-DVIS++ supports open-vocabulary universal video segmentation.
- DVIS achieved 1st place in the VPS Track of the PVUW challenge at CVPR 2023. (2023.5.25)
- DVIS has been accepted by ICCV 2023. (2023.7.15)
- DVIS achieved 1st place in the VIS Track of the 5th LSVOS challenge at ICCV 2023. (2023.8.15)
Features
- DVIS is a universal video segmentation framework that supports VIS, VPS, and VSS.
- DVIS can run in both online and offline modes.
- DVIS achieved SOTA performance on the YTVIS, OVIS, VIPSeg, and VSPW datasets.
- DVIS can complete training and inference on GPUs with only 11 GB of memory.
Demos
<img src="https://github.com/zhang-tao-whu/paper_images/blob/master/dvis/demo_0.gif" width="400"/> <img src="https://github.com/zhang-tao-whu/paper_images/blob/master/dvis/demo_1.gif" width="370"/> <img src="https://github.com/zhang-tao-whu/paper_images/blob/master/dvis/demo_2.gif" width="215"/> <img src="https://github.com/zhang-tao-whu/paper_images/blob/master/dvis/demo_4.gif" width="290"/> <img src="https://github.com/zhang-tao-whu/paper_images/blob/master/dvis/demo_5.gif" width="290"/> <img src="https://github.com/zhang-tao-whu/paper_images/blob/master/dvis/demo_6.gif" width="400"/> <img src="https://github.com/zhang-tao-whu/paper_images/blob/master/dvis/demo_7.gif" width="400"/>
Installation
See Installation Instructions.
Getting Started
See Preparing Datasets for DVIS.
See Getting Started with DVIS.
Model Zoo
Trained models are available for download in the DVIS Model Zoo.
<a name="CitingDVIS"></a>Citing DVIS
```BibTeX
@article{DVIS,
  title={DVIS: Decoupled Video Instance Segmentation Framework},
  author={Zhang, Tao and Tian, Xingye and Wu, Yu and Ji, Shunping and Wang, Xuebo and Zhang, Yuan and Wan, Pengfei},
  journal={arXiv preprint arXiv:2306.03413},
  year={2023}
}

@article{zhang2023vis1st,
  title={1st Place Solution for the 5th LSVOS Challenge: Video Instance Segmentation},
  author={Zhang, Tao and Tian, Xingye and Zhou, Yikang and Wu, Yu and Ji, Shunping and Yan, Cilin and Wang, Xuebo and Tao, Xin and Zhang, Yuan and Wan, Pengfei},
  journal={arXiv preprint arXiv:2308.14392},
  year={2023}
}

@article{zhang2023vps1st,
  title={1st Place Solution for PVUW Challenge 2023: Video Panoptic Segmentation},
  author={Zhang, Tao and Tian, Xingye and Wei, Haoran and Wu, Yu and Ji, Shunping and Wang, Xuebo and Zhang, Yuan and Wan, Pengfei},
  journal={arXiv preprint arXiv:2306.04091},
  year={2023}
}
```
Acknowledgement
This repo is largely based on Mask2Former, MinVIS, and VITA. Thanks for their excellent work.