HCPN

Code for the paper: Hierarchical Co-attention Propagation Network for Zero-Shot Video Object Segmentation

Requirements

Training

Download Datasets

  1. Download the DAVIS-2017 dataset from DAVIS.
  2. Download the YouTube-VOS dataset from YouTube-VOS.
  3. Download the YouTube-hed and DAVIS-hed datasets from DuBox (code: 1gih).
  4. Download the YouTube-ctr and DAVIS-ctr datasets from GoogleDrive.
  5. The optical flow files are generated with RAFT; we provide demo code under the flow path that can be run directly (see the sketch below). We also provide the optical flow of YouTube-VOS (18 GB) on DuBox (code: w9yn); the optical flow of DAVIS can be found in the Testing section.
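
The sketch below shows how flow is typically precomputed with the official RAFT code (https://github.com/princeton-vl/RAFT); the checkpoint name, output format, and helper imports are assumptions based on that repository, not on our flow scripts.

```python
# Hypothetical sketch: precompute forward optical flow for a folder of frames
# with the official RAFT code. Assumes RAFT's core/ directory is on PYTHONPATH
# and a checkpoint such as raft-things.pth has been downloaded from that repo.
import argparse
import glob
import os

import numpy as np
import torch
from PIL import Image

from raft import RAFT                # from the RAFT repository (core/raft.py)
from utils.utils import InputPadder  # RAFT's padding helper


def load_image(path, device):
    img = np.array(Image.open(path)).astype(np.uint8)
    img = torch.from_numpy(img).permute(2, 0, 1).float()
    return img[None].to(device)


@torch.no_grad()
def precompute_flow(frame_dir, out_dir, ckpt="models/raft-things.pth"):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    args = argparse.Namespace(small=False, mixed_precision=False, alternate_corr=False)
    model = torch.nn.DataParallel(RAFT(args))
    model.load_state_dict(torch.load(ckpt, map_location=device))
    model = model.module.to(device).eval()

    frames = sorted(glob.glob(os.path.join(frame_dir, "*.jpg")))
    for f1, f2 in zip(frames[:-1], frames[1:]):
        im1, im2 = load_image(f1, device), load_image(f2, device)
        padder = InputPadder(im1.shape)
        im1, im2 = padder.pad(im1, im2)
        _, flow_up = model(im1, im2, iters=20, test_mode=True)
        flow = padder.unpad(flow_up[0]).permute(1, 2, 0).cpu().numpy()
        name = os.path.splitext(os.path.basename(f1))[0]
        np.save(os.path.join(out_dir, name + ".npy"), flow)  # forward flow of f1
```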

Dataset Format

Please ensure the datasets are organized in the following format (a quick layout check is sketched after the tree).

YouTube-VOS
|----train
      |----Annotations
      |----Annotations_ctr
      |----JPEGImages
      |----YouTube-flow
      |----YouTube-hed
      |----meta.json
|----valid
      |----Annotations
      |----JPEGImages
      |----meta.json
DAVIS
      |----Annotations
      |----Annotations_ctr
      |----ImageSets
      |----JPEGImages
      |----davis-flow
      |----davis-hed
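
Before training, a quick check like the following (our own illustrative snippet, not part of the repository) can catch a misplaced folder early; the two root paths are placeholders.

```python
# Hypothetical sanity check for the dataset layout above; point the two roots
# at your own paths before running.
import os

YOUTUBE_ROOT = "/path/to/YouTube-VOS"   # placeholder
DAVIS_ROOT = "/path/to/DAVIS"           # placeholder

EXPECTED = {
    YOUTUBE_ROOT: [
        "train/Annotations", "train/Annotations_ctr", "train/JPEGImages",
        "train/YouTube-flow", "train/YouTube-hed", "train/meta.json",
        "valid/Annotations", "valid/JPEGImages", "valid/meta.json",
    ],
    DAVIS_ROOT: [
        "Annotations", "Annotations_ctr", "ImageSets",
        "JPEGImages", "davis-flow", "davis-hed",
    ],
}

for root, entries in EXPECTED.items():
    for entry in entries:
        path = os.path.join(root, entry)
        if not os.path.exists(path):
            print(f"missing: {path}")
```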

Run train.py

Change your dataset paths, then run python train.py to train the model.

We also provide multi-GPU parallel code based on apex. For distributed training in PyTorch, run CUDA_VISIBLE_DEVICES="0,1,2,3" python -m torch.distributed.launch --nproc_per_node 4 train_apex.py.
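
For reference, a minimal sketch of the usual apex + torch.distributed.launch boilerplate that train_apex.py presumably follows is shown below; the model, data, and opt_level are placeholders, so consult train_apex.py for the actual settings.

```python
# Hypothetical sketch of apex distributed training; model and dataset are
# placeholders standing in for the HCPN network and the VOS data loaders.
import argparse
import torch
from apex import amp
from apex.parallel import DistributedDataParallel as DDP

parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=0)  # set by torch.distributed.launch
args = parser.parse_args()

torch.cuda.set_device(args.local_rank)
torch.distributed.init_process_group(backend="nccl", init_method="env://")

model = torch.nn.Linear(10, 1).cuda()   # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# amp handles mixed precision; DDP synchronizes gradients across the GPUs
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")
model = DDP(model)

# each process sees a different shard of the data via DistributedSampler
dataset = torch.utils.data.TensorDataset(torch.randn(64, 10), torch.randn(64, 1))
sampler = torch.utils.data.distributed.DistributedSampler(dataset)
loader = torch.utils.data.DataLoader(dataset, batch_size=8, sampler=sampler)

for x, y in loader:
    loss = torch.nn.functional.mse_loss(model(x.cuda()), y.cuda())
    optimizer.zero_grad()
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()           # scaled backward for fp16 stability
    optimizer.step()
```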

Note:

Please change the paths in two files (libs/utils/config_davis.py and libs/utils/config_youtubevos.py) to your own dataset paths.
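
For illustration, the change usually amounts to pointing a few root-path constants at your data; the variable names below are hypothetical, so match them to what the two config files actually define.

```python
# libs/utils/config_davis.py (illustrative -- actual variable names may differ)
DAVIS_ROOT = "/data/DAVIS"               # JPEGImages/, Annotations/, ImageSets/, ...
FLOW_ROOT = "/data/DAVIS/davis-flow"     # precomputed RAFT optical flow
HED_ROOT = "/data/DAVIS/davis-hed"       # precomputed edge maps
```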

Testing

If you want to test the model results directly, you can follow the settings below.

  1. Download the pretrained model from GoogleDrive and put it into the model/HCPN directory.

  2. Download the optical flow of DAVIS from GoogleDrive.

The code directory structure is as follows.

HCPN
  |----libs
  |----model
  |----apply_densecrf_davis.py
  |----args.py
  |----train.py
  |----test.py
  3. Change the paths in test.py, then run python test.py.

  4. The evaluation code is from DAVIS_Evaluation; a Python version is available at PyDavis16EvalToolbox.
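
As a quick sanity check before running the full toolbox, note that the region measure J is just the mask IoU averaged over frames; the sketch below is our own shortcut, not the official evaluation.

```python
# Minimal sketch of the region measure J (mean IoU over a sequence); report
# official numbers from the DAVIS toolbox, not from this shortcut.
import numpy as np

def j_measure(pred_masks, gt_masks):
    """pred_masks, gt_masks: lists of binary HxW numpy arrays, one per frame."""
    ious = []
    for pred, gt in zip(pred_masks, gt_masks):
        pred, gt = pred.astype(bool), gt.astype(bool)
        inter = np.logical_and(pred, gt).sum()
        union = np.logical_or(pred, gt).sum()
        ious.append(1.0 if union == 0 else inter / union)  # empty frames count as perfect
    return float(np.mean(ious))
```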

Results

If you are unable to run our code but are interested in our results, the segmentation results can be downloaded from GoogleDrive.

  1. DAVIS-16:

In the inference stage, we run inference at 512×512 resolution on DAVIS (480p).

| Mean J&F | J score | F score |
| -------- | ------- | ------- |
| 85.6     | 85.8    | 85.4    |

  2. YouTube-Objects:

| Airplane | Bird | Boat | Car  | Cat  | Cow  | Dog  | Horse | Motorbike | Train | Mean |
| -------- | ---- | ---- | ---- | ---- | ---- | ---- | ----- | --------- | ----- | ---- |
| 84.5     | 79.6 | 67.3 | 87.8 | 74.1 | 71.2 | 76.5 | 66.2  | 65.8      | 59.7  | 73.3 |

  3. FBMS:

| Mean J |
| ------ |
| 78.3   |

  4. DAVIS-17:

| Mean J&F | J score | F score |
| -------- | ------- | ------- |
| 70.7     | 68.7    | 72.7    |

Demo Videos

Demo_DAVIS2016

Demo_YouTube-Objects

Demo_FBMS

Demo_DAVIS17

Acknowledgements

  1. Motion-Attentive Transition for Zero-Shot Video Object Segmentation, AAAI 2020 (https://github.com/tfzhou/MATNet)
  2. Video Object Segmentation Using Space-Time Memory Networks, ICCV 2019 (https://github.com/seoungwugoh/STM)
  3. See More, Know More: Unsupervised Video Object Segmentation With Co-Attention Siamese Networks, CVPR 2019 (https://github.com/carrierlxk/COSNet)