MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos
Minghan LI, Shuai LI, Wangmeng XIANG, Lei ZHANG
[arXiv]
Updates
March 31, 2023: Trained models are released.
March 28, 2023: Code and paper are now available!
Installation
See installation instructions.
Getting Started
We provide a script, `train_net.py`, that trains all the configs provided in MDQE.
Before training: to train a model with `train_net.py` on VIS, first set up the corresponding datasets following Preparing Datasets for MDQE.
Then download the pretrained weights from the Model Zoo to the path 'pretrained/coco/*.pth', and run:
python train_net.py --num-gpus 8 \
--config-file configs/R50_ovis_360.yaml
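Because training expects the pretrained weights under 'pretrained/coco/', it can help to check that they are in place before launching a multi-GPU job. The following is a minimal sketch, not part of MDQE: the function name `check_pretrained` is hypothetical, and it only assumes that at least one `.pth` file should exist in the directory named above.

```shell
# Hypothetical pre-flight helper (not shipped with MDQE).
# Succeeds if the given directory (default: pretrained/coco) holds a .pth file.
check_pretrained() {
    dir="${1:-pretrained/coco}"
    for f in "$dir"/*.pth; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        if [ -e "$f" ]; then
            echo "found weights: $f"
            return 0
        fi
    done
    echo "no .pth files in $dir; download them from the Model Zoo first" >&2
    return 1
}
```

One way to use it is to chain it in front of the training command above, for example `check_pretrained && python train_net.py --num-gpus 8 --config-file configs/R50_ovis_360.yaml`, so the job does not start when the weights are missing.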
To evaluate a model's performance, use:
python train_net.py \
--config-file configs/R50_ovis_360.yaml \
--eval-only \
MODEL.WEIGHTS /path/to/checkpoint_file
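When several checkpoints need to be evaluated with the same config, the command above can be generated once per file. The sketch below is an illustration only: `print_eval_cmds` is a hypothetical helper, and the assumption that checkpoints are `.pth` files collected in a single directory is mine, not the repository's. It prints the commands rather than running them, so they can be reviewed first.

```shell
# Hypothetical helper (not shipped with MDQE): print one evaluation command
# per .pth checkpoint found in a directory.
# Usage: print_eval_cmds <config-file> <checkpoint-dir>
print_eval_cmds() {
    config="$1"
    ckpt_dir="$2"
    for ckpt in "$ckpt_dir"/*.pth; do
        # Skip the unexpanded pattern when the directory has no checkpoints.
        [ -e "$ckpt" ] || continue
        echo "python train_net.py --config-file $config --eval-only MODEL.WEIGHTS $ckpt"
    done
}
```

After checking the printed commands, they could be executed by piping the output to a shell, for example `print_eval_cmds configs/R50_ovis_360.yaml /path/to/checkpoints | sh`.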
<a name="ModelZoo"></a>Model Zoo
Pretrained weights on COCO
Name | R50 | Swin-L |
---|---|---|
MDQE | model, config | model, config |
OVIS
Name | Backbone | Frames | AP | Download |
---|---|---|---|---|
MDQE | R50 | f4+360p | 30.7 | model, config |
MDQE | R50 | f4+640p | 32.3 | model, config |
MDQE | Swin-L | f2+480p | 41.0 | model, config |
MDQE | Swin-L | f2+640p | 42.6 | model, config |
YouTubeVIS-2021
Name | Backbone | Frames | AP | Download |
---|---|---|---|---|
MDQE | R50 | f4+360p | 46.6 | model, config |
MDQE | Swin-L | f3+360p | 55.5 | model, config |
YouTubeVIS-2019
Name | Backbone | Frames | AP | Download |
---|---|---|---|---|
MDQE | R50 | f4+360p | 47.8 | model, config |
MDQE | Swin-L | f3+360p | 59.9 | model, config |
License
The majority of MDQE is licensed under the Apache-2.0 License. However, portions of the project are available under separate license terms: Detectron2 (Apache-2.0 License), IFC (Apache-2.0 License), VITA (Apache-2.0 License), and Deformable-DETR (Apache-2.0 License).
<a name="CitingMDQE"></a>Citing MDQE
If you use MDQE in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.
@misc{li2023mdqe,
title={MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos},
author={Minghan Li and Shuai Li and Wangmeng Xiang and Lei Zhang},
year={2023},
eprint={2303.14395},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Acknowledgement
Our code is largely based on Detectron2, IFC, Deformable DETR and VITA. We are truly grateful for their excellent work.