
VIPMT

This repository contains the implementation of our paper, "Multi-grained Temporal Prototype Learning for Few-shot Video Object Segmentation", which was accepted to the IEEE/CVF International Conference on Computer Vision (ICCV) 2023.

Environment

conda create -n VIPMT python=3.6
conda activate VIPMT
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch
conda install opencv cython
pip install easydict imgaug
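
After installing, a quick import check can confirm that the environment is usable. The following is a minimal sketch, not part of the repository: the helper name check_environment is hypothetical, and the module names (cv2 for OpenCV, Cython for cython) are the usual import names for the packages listed above.

```python
import importlib

# Import names for the packages installed above (cv2 is OpenCV's module name).
PACKAGES = ("torch", "torchvision", "cv2", "Cython", "easydict", "imgaug")


def check_environment(packages=PACKAGES):
    """Map each module name to a version string, or None if it cannot be imported."""
    report = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except Exception:  # missing package or broken install
            report[name] = None
            continue
        # Not every package exposes __version__, so fall back to a placeholder.
        report[name] = str(getattr(module, "__version__", "unknown"))
    return report


if __name__ == "__main__":
    for name, version in check_environment().items():
        print("{:<12} {}".format(name, version if version is not None else "NOT FOUND"))
    try:
        import torch
        print("CUDA available:", torch.cuda.is_available())
    except Exception:
        # torch is missing or broken; the table above already reports it.
        pass
```

Any package reported as NOT FOUND should be reinstalled inside the VIPMT conda environment before continuing.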

Usage

Preparation

Download the 2019 version of the YouTube-VIS dataset.
Download the VSPW 480P dataset.
Put both datasets in the ./data folder.

data
├─ Youtube-VOS
│   └─ train
│       ├─ Annotations
│       ├─ JPEGImages
│       └─ train.json
└─ VSPW_480p
    └─ data

Install cocoapi for YouTube-VIS.
Download the ImageNet-pretrained backbone and put it in the pretrain_model folder.

pretrain_model
└─ resnet50_v2.pth

Update config/config.py.
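
Before training, it can help to confirm that the files from the preparation steps are where the directory trees above place them. This is a minimal sketch, assuming the script is run from the repository root; the helper missing_paths is hypothetical and only checks that the paths exist, not that their contents are valid.

```python
from pathlib import Path

# Paths taken from the directory trees above, relative to the repository root.
EXPECTED = (
    "data/Youtube-VOS/train/Annotations",
    "data/Youtube-VOS/train/JPEGImages",
    "data/Youtube-VOS/train/train.json",
    "data/VSPW_480p/data",
    "pretrain_model/resnet50_v2.pth",
)


def missing_paths(root=".", expected=EXPECTED):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected files and folders are present.")
```

If config/config.py points at different locations on your machine, pass your own list of paths through the expected argument.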

Training

python train.py --group 1 --batch_size 4

Inference

python test.py --group 1
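
To train and then evaluate in one go, the two commands above can be chained from a small driver. This is a sketch rather than a script shipped with the repository: build_commands and run_all are hypothetical helpers, and only --group 1 and --batch_size 4 come from the commands above. Any other group value is an assumption about how the splits are numbered.

```python
import subprocess
import sys


def build_commands(groups=(1,), batch_size=4):
    """Return one train command and one test command per group, in order."""
    commands = []
    for group in groups:
        commands.append([sys.executable, "train.py", "--group", str(group),
                         "--batch_size", str(batch_size)])
        commands.append([sys.executable, "test.py", "--group", str(group)])
    return commands


def run_all(groups=(1,), batch_size=4):
    """Run the commands sequentially and stop at the first failure."""
    for command in build_commands(groups, batch_size):
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError if a step exits with an error.
        subprocess.run(command, check=True)


# Example (run from the repository root): run_all(groups=(1,))
```

Using sys.executable keeps both steps inside the same conda environment as the driver itself.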

References

Part of the code is based on IPMT and DANet. Thanks to their authors for the great work!