BMVOS
This is the official PyTorch implementation of our paper:
<img src="https://github.com/user-attachments/assets/6ff1d000-c251-4cbf-a556-4e60a848c256" width=800>

Pixel-Level Bijective Matching for Video Object Segmentation, WACV 2022
Suhwan Cho, Heansung Lee, Minjung Kim, Sungjun Jang, Sangyoun Lee
Link: [WACV] [arXiv]
You can also find other related papers at awesome-video-object-segmentation.
Abstract
In conventional semi-supervised VOS methods, the query frame pixels select the best-matching pixels in the reference frame and transfer the information from those pixels, with no constraint imposed from the reference frame side. As there is no limit on the number of query frame pixels that can reference the same reference frame pixel, background distractors in the query frame can obtain high foreground scores and disrupt the prediction. To mitigate this issue, we introduce a bijective matching mechanism that finds the best matches from the query frame to the reference frame and vice versa. In addition, to exploit the property of a video that an object usually occupies similar positions in consecutive frames, we propose a mask embedding module.
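The bijective matching idea can be sketched in a few lines of PyTorch. This is a minimal illustration only, not the code from this repository: the feature shapes, the top-k selection rule, and the hard argmax readout are all assumptions made for clarity.

```python
import torch
import torch.nn.functional as F

def bijective_matching(ref_feats, query_feats, ref_mask, k=16):
    """Transfer reference foreground scores to query pixels (illustrative sketch).

    ref_feats, query_feats: (C, HW) pixel embeddings of the two frames.
    ref_mask: (HW,) soft foreground probability of each reference pixel.
    """
    # Cosine similarity between every reference/query pixel pair.
    sim = F.normalize(ref_feats, dim=0).t() @ F.normalize(query_feats, dim=0)

    # Reference -> query direction: each reference pixel keeps only its
    # top-k best-matching query pixels as candidates.
    kth_best = sim.topk(k, dim=1).values[:, -1:]
    allowed = sim >= kth_best

    # Query -> reference direction: each query pixel may now match only
    # among reference pixels that also selected it, so matching is
    # restricted in both directions rather than query-side only.
    sim = sim.masked_fill(~allowed, -1e4)
    best_ref = sim.argmax(dim=0)  # query pixels with no candidate fall back to index 0
    return ref_mask[best_ref]

# Toy usage with random embeddings.
ref, query = torch.randn(64, 400), torch.randn(64, 400)
mask = torch.rand(400)
print(bijective_matching(ref, query, mask).shape)  # torch.Size([400])
```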
Preparation
1. Download DAVIS and YouTube-VOS from the official websites.
2. Download our custom split for the YouTube-VOS training set.
3. Replace the dataset paths in the "run.py" file with your local dataset paths, as sketched below.
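As a hypothetical example, the path settings might look like the following; the variable names here are assumptions, not the actual contents of "run.py", so check the file for the real identifiers:

```python
# Illustrative only: point these at your local dataset roots.
davis_path = '/path/to/DAVIS'
ytvos_path = '/path/to/YouTube-VOS'
```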
Training
Please follow the instructions in TBD.
Testing
1. Open the "run.py" file.
2. Choose a pre-trained model.
3. Start BMVOS testing!
```
python run.py
```
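Before testing, you can sanity-check a downloaded checkpoint with torch.load; the file name below is an assumption, so substitute whichever pre-trained model you downloaded from the Attachments section:

```python
import torch

# Hypothetical file name; use the checkpoint you actually downloaded.
state_dict = torch.load('BMVOS_davis.pth', map_location='cpu')

# List the parameter names and shapes stored in the checkpoint.
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
```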
Attachments
pre-trained model (davis)
pre-trained model (ytvos)
pre-computed results
Note
Code and models are only available for non-commercial research purposes.
If you have any questions, please feel free to contact me :)
E-mail: suhwanx@gmail.com