# Moving Object Segmentation: All You Need Is SAM (and Flow)

Junyu Xie, Charig Yang, Weidi Xie, Andrew Zisserman

Visual Geometry Group, Department of Engineering Science, University of Oxford
<a href="https://arxiv.org/abs/2404.12389"><img src="https://img.shields.io/badge/cs.CV-2404.12389-b31b1b?logo=arxiv&logoColor=red" alt="arXiv"></a> <a href="https://www.robots.ox.ac.uk/~vgg/research/flowsam/"><img alt="Project page" src="https://img.shields.io/badge/project_page-flowsam-blue"></a>

<p align="center">
  <img src="resources/teaser.png" width="750"/>
</p>

## Requirements
`pytorch=2.0.0`, `Pillow`, `opencv`, `einops`, `tensorboardX`
Segment Anything can be installed following the [official repository](https://github.com/facebookresearch/segment-anything), or by
```
pip install git+https://github.com/facebookresearch/segment-anything.git
```
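For reference, a minimal environment setup could look like the following; the PyPI package names (e.g., `opencv-python`) and the Python/PyTorch versions shown here are assumptions rather than requirements pinned by this repository, so adjust them to your setup:
```
# Minimal environment sketch (package names and versions are assumptions, not pinned by this repo)
conda create -n flowsam python=3.10 -y
conda activate flowsam
pip install torch==2.0.0 torchvision Pillow opencv-python einops tensorboardX
pip install git+https://github.com/facebookresearch/segment-anything.git
```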
## Datasets

### Training datasets
- Synthetic training data from OCLR_paper can be downloaded from here.
- DAVIS2017 (and DAVIS2016) can be downloaded here.
- DAVIS2017-motion contains the same sequences as DAVIS2017, but with annotations curated to account for jointly moving objects; it can be downloaded from here.
### Evaluation datasets
- DAVIS datasets can be obtained following the instructions above.
- YTVOS2018-motion is a subset selected from the training split of YTVOS2018. The selected sequences contain predominantly moving objects (i.e., objects that can be discovered based on their motion) and are used for evaluation. For more details and downloading instructions, please follow this link.
- Other datasets such as SegTrackv2, FBMS-59 and MoCA_filter can be downloaded and preprocessed following the protocol described in motiongrouping.
## Optical flow estimation

In this work, optical flow is estimated by RAFT, with the code provided in the `flow` folder.
## Path configuration

The data paths can be specified in `data/dataset_config.py`.
## Checkpoints and results

- The pretrained original SAM checkpoints can be downloaded here.
- The pretrained flowsam model checkpoints can be downloaded here.
- Our predicted masks on benchmark datasets can be found here.
## Inference

To run FlowI-SAM,
```
python evaluation.py --model=flowisam --dataset=dvs16 --flow_gaps=1,-1,2,-2 \
                     --max_obj=5 --num_gridside=10 --ckpt_path={} --save_path={}
```
To run FlowP-SAM,
```
python evaluation.py --model=flowpsam --dataset=dvs16 --flow_gaps=1,-1,2,-2 \
                     --max_obj=10 --num_gridside=20 --ckpt_path={} --save_path={}
```
where
- `--flow_gaps` denotes the frame gaps of the flow inputs
- `--max_obj` indicates the maximum number of predicted object masks
- `--num_gridside` indicates the number of uniform grid point inputs (e.g., "10" corresponds to 10 x 10 points)
- `--ckpt_path` specifies the model checkpoint path
- `--save_path` specifies the path to save predicted masks (if not specified, no masks will be saved)
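For example, a FlowP-SAM run on DAVIS 2016 might look like the following; the checkpoint filename and output directory are placeholders rather than files shipped with the repository:
```
# Example invocation (checkpoint and output paths are hypothetical placeholders)
python evaluation.py --model=flowpsam --dataset=dvs16 --flow_gaps=1,-1,2,-2 \
                     --max_obj=10 --num_gridside=20 \
                     --ckpt_path=checkpoints/flowpsam.pth \
                     --save_path=outputs/dvs16_flowpsam
```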
To run the code on your own data (or on datasets without GT multi-object segmentation, e.g., SegTrackv2, FBMS-59, MoCA_filter, etc.):
- Set `--dataset=example`, and arrange your data as follows (see the frame-extraction sketch after this list):
```
{data_name}/
├── JPEGImages/
│   └── {category_name}/
│       ├── 00000.jpg
│       └── ......
├── FlowImages_gap1/
│   └── {category_name}/
│       ├── 00000.png
│       └── ......
├── ...... (More flow images)
```
- Add your own dataset information in `config_eval_dataloader()` in `data/dataset_config.py` (under the "example" dataset).
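If your own data starts as videos, one way to produce the `JPEGImages` layout above is with ffmpeg; the dataset, category, and video names below are placeholders, and the flow images (`FlowImages_gap1`, etc.) can then be generated with RAFT as described in the optical flow section:
```
# Extract zero-padded frames starting at 00000 into the expected layout (all names are placeholders)
mkdir -p my_data/JPEGImages/my_video
ffmpeg -i my_video.mp4 -qscale:v 2 -start_number 0 my_data/JPEGImages/my_video/%05d.jpg
```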
To perform sequence-level mask association (in other words, matching the identities of masks throughout the sequence) for multi-object datasets, run
```
python seq_level_postprocess.py --dataset=dvs17m --mask_dir={} --save_path={}
```
For single-object cases, the first mask of each frame usually suffices.
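A concrete call might look like this, where the mask and output directories are placeholders corresponding to the `--save_path` used during inference:
```
# Example mask association on DAVIS2017-motion (directories are hypothetical placeholders)
python seq_level_postprocess.py --dataset=dvs17m \
       --mask_dir=outputs/dvs17m_flowpsam --save_path=outputs/dvs17m_flowpsam_associated
```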
Evaluation benchmarks:
- For DAVIS2016, use the DAVIS2016 official evaluator.
- For DAVIS2017, use the DAVIS2017 official evaluator.
- For DAVIS2017-motion, follow the evaluation protocol introduced in OCLR_paper.
- For MoCA_filter, use the evaluator provided in motiongrouping.
## Training

```
python train.py --model={} --dataset=dvs16 --model_save_path={}
```
where
- `--model` specifies the model to be trained (`flowisam` or `flowpsam`)
- `--model_save_path` indicates the path to save logs and model ckpts
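For example, to train FlowP-SAM on DAVIS 2016 (the log directory below is a hypothetical placeholder):
```
# Example training run (the save directory is a placeholder)
python train.py --model=flowpsam --dataset=dvs16 --model_save_path=logs/flowpsam_dvs16
```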
## Citation

If you find this repository helpful, please consider citing our work:
```
@InProceedings{xie2024flowsam,
    title={Moving Object Segmentation: All You Need Is SAM (and Flow)},
    author={Junyu Xie and Charig Yang and Weidi Xie and Andrew Zisserman},
    booktitle={ACCV},
    year={2024}
}
```
## Reference

Segment Anything: https://github.com/facebookresearch/segment-anything