Moving Object Segmentation: All You Need Is SAM (and Flow)

Junyu Xie, Charig Yang, Weidi Xie, Andrew Zisserman

Visual Geometry Group, Department of Engineering Science, University of Oxford

<a href="https://arxiv.org/abs/2404.12389"> <img src="https://img.shields.io/badge/cs.CV-2404.12389-b31b1b?logo=arxiv&logoColor=red" alt="arXiv"></a> <a href="https://www.robots.ox.ac.uk/~vgg/research/flowsam/"> <img alt="Project page" src="https://img.shields.io/badge/project_page-flowsam-blue"></a> <p align="center"> <img src="resources/teaser.png" width="750"/> </p>

Requirements

pytorch=2.0.0, Pillow, opencv, einops, tensorboardX

Segment Anything can be installed by following the official repository (https://github.com/facebookresearch/segment-anything), or via

pip install git+https://github.com/facebookresearch/segment-anything.git

Datasets

Training datasets

Evaluation datasets

Optical flow estimation

In this work, optical flow is estimated by RAFT, with the code provided in the flow folder.
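The estimated flow fields are saved as RGB flow images (the FlowImages_gap* folders below). As a rough illustration of how a two-channel flow field can be rendered this way (hue encoding direction, saturation encoding magnitude), here is one possible sketch; the repository's exact colour coding may differ:

```python
import numpy as np

def flow_to_rgb(flow):
    """Render a (H, W, 2) optical-flow field as an RGB image
    (illustrative HSV-style visualisation, not the repo's exact code)."""
    u, v = flow[..., 0], flow[..., 1]
    mag = np.sqrt(u ** 2 + v ** 2)
    ang = np.arctan2(v, u)                       # direction in [-pi, pi]
    h = (ang + np.pi) / (2 * np.pi)              # hue in [0, 1]
    s = np.clip(mag / (mag.max() + 1e-8), 0, 1)  # saturation ~ magnitude
    val = np.ones_like(h)                        # full brightness
    # vectorised HSV -> RGB conversion
    i = (np.floor(h * 6).astype(int) % 6)[..., None]
    f = (h * 6 - np.floor(h * 6))[..., None]
    s3, v3 = s[..., None], val[..., None]
    p, q, t = v3 * (1 - s3), v3 * (1 - f * s3), v3 * (1 - (1 - f) * s3)
    rgb = np.select(
        [i == 0, i == 1, i == 2, i == 3, i == 4, i == 5],
        [np.concatenate([v3, t, p], -1), np.concatenate([q, v3, p], -1),
         np.concatenate([p, v3, t], -1), np.concatenate([p, q, v3], -1),
         np.concatenate([t, p, v3], -1), np.concatenate([v3, p, q], -1)])
    return (rgb * 255).astype(np.uint8)
```

Zero motion maps to a white pixel under this convention, and larger displacements become more saturated.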

Path configuration

The data paths can be specified in data/dataset_config.py.
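For orientation only, a dataset entry in such a config file might look like the sketch below; the actual key names and structure in data/dataset_config.py may differ, and every path is a placeholder to be replaced with your own:

```python
# Hypothetical sketch of a dataset-path entry; the real
# data/dataset_config.py in the repository may use different keys.
dataset_config = {
    "dvs16": {
        "image_dir": "/path/to/DAVIS2016/JPEGImages",   # placeholder
        "flow_dir": "/path/to/DAVIS2016/FlowImages_gap1",  # placeholder
        "anno_dir": "/path/to/DAVIS2016/Annotations",   # placeholder
    },
}
```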

Checkpoints and results

Inference

To run FlowI-SAM,

python evaluation.py --model=flowisam --dataset=dvs16 --flow_gaps=1,-1,2,-2 \
                      --max_obj=5 --num_gridside=10 --ckpt_path={} --save_path={}

To run FlowP-SAM,

python evaluation.py --model=flowpsam --dataset=dvs16 --flow_gaps=1,-1,2,-2 \
                      --max_obj=10 --num_gridside=20 --ckpt_path={} --save_path={}

where
--flow_gaps denotes the frame gaps of flow inputs
--max_obj indicates the maximum number of predicted object masks
--num_gridside indicates the number of uniform grid point inputs (e.g., "10" corresponds to 10 x 10 points)
--ckpt_path specifies the model checkpoint path
--save_path specifies the path to save predicted masks (if not specified, no masks will be saved)

To run the code on your own data, or on datasets without GT multi-object segmentation (e.g., SegTrackv2, FBMS-59, MoCA_filter, etc.), arrange the data in the following structure:

{data_name}/
├── JPEGImages/
│   └── {category_name}/
│       ├── 00000.jpg
│       └── ......
├── FlowImages_gap1/
│   └── {category_name}/
│       ├── 00000.png
│       └── ......
└── ...... (More flow images)
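Before running inference on custom data, it can help to sanity-check that the folder follows this layout. The helper below is an illustrative sketch, not part of the repository:

```python
from pathlib import Path

def check_data_layout(root):
    """Return a list of problems with a custom dataset folder, which is
    expected to contain JPEGImages/{category}/ plus one
    FlowImages_gap{n}/{category}/ folder per flow gap.
    Illustrative helper only; not part of the FlowSAM repository."""
    root = Path(root)
    problems = []
    img_root = root / "JPEGImages"
    if not img_root.is_dir():
        return ["missing JPEGImages/"]
    flow_roots = sorted(root.glob("FlowImages_gap*"))
    if not flow_roots:
        problems.append("missing FlowImages_gap*/ folders")
    for cat in sorted(p for p in img_root.iterdir() if p.is_dir()):
        if not list(cat.glob("*.jpg")):
            problems.append(f"no .jpg frames in JPEGImages/{cat.name}")
        for flow_root in flow_roots:
            # each flow-gap folder should mirror the category names
            if not (flow_root / cat.name).is_dir():
                problems.append(f"{flow_root.name}/{cat.name} missing")
    return problems
```

An empty list means the layout matches the structure above.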

To perform sequence-level mask association (in other words, matching the identities of masks throughout the sequence) for multi-object datasets,

python seq_level_postprocess.py --dataset=dvs17m --mask_dir={} --save_path={}

For single-object cases, taking the first mask of each frame usually suffices.
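As a rough sketch of the general idea behind sequence-level association (greedy IoU matching between masks in consecutive frames), one minimal version might look like the following; this is illustrative only and not necessarily the algorithm used in seq_level_postprocess.py:

```python
import numpy as np

def iou(a, b):
    """IoU between two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union > 0 else 0.0

def match_masks(prev_masks, curr_masks, thresh=0.1):
    """Greedily give each current mask the identity of its
    best-overlapping previous mask (illustrative sketch only)."""
    assignment = {}   # curr index -> prev identity
    used = set()
    # consider highest-IoU pairs first
    pairs = sorted(
        ((iou(p, c), i, j)
         for i, p in enumerate(prev_masks)
         for j, c in enumerate(curr_masks)),
        reverse=True)
    for score, i, j in pairs:
        if score < thresh:
            break  # remaining pairs overlap too little
        if i in used or j in assignment:
            continue  # one-to-one matching
        assignment[j] = i
        used.add(i)
    return assignment
```

Masks left unassigned would start new identities; repeating this frame by frame propagates identities through the sequence.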

Evaluation benchmarks

Training

python train.py --model={} --dataset=dvs16 --model_save_path={}

where
--model specifies the model to be trained (flowisam or flowpsam)
--model_save_path indicates the path to save logs and model ckpts

Citation

If you find this repository helpful, please consider citing our work:

@InProceedings{xie2024flowsam,
  title={Moving Object Segmentation: All You Need Is SAM (and Flow)},
  author={Junyu Xie and Charig Yang and Weidi Xie and Andrew Zisserman},
  booktitle={ACCV},
  year={2024}
}

Reference

Segment Anything: https://github.com/facebookresearch/segment-anything