Recurrent Multimodal Interaction for Referring Image Segmentation

This repository contains code for Recurrent Multimodal Interaction for Referring Image Segmentation, ICCV 2017.

If you use the code, please cite

@inproceedings{liu2017recurrent,
  title={Recurrent Multimodal Interaction for Referring Image Segmentation},
  author={Liu, Chenxi and Lin, Zhe and Shen, Xiaohui and Yang, Jimei and Lu, Xin and Yuille, Alan},
  booktitle={{ICCV}},
  year={2017}
}

Setup

Data Preparation

python build_batches.py -d Gref -t train
python build_batches.py -d Gref -t val
python build_batches.py -d unc -t train
python build_batches.py -d unc -t val
python build_batches.py -d unc -t testA
python build_batches.py -d unc -t testB
python build_batches.py -d unc+ -t train
python build_batches.py -d unc+ -t val
python build_batches.py -d unc+ -t testA
python build_batches.py -d unc+ -t testB
python build_batches.py -d referit -t trainval
python build_batches.py -d referit -t test
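
Each command takes a dataset (-d) and a split (-t) and builds the corresponding batches. To prepare everything in one go, the following is a minimal sketch of a hypothetical helper script (not part of the repository) that simply replays the commands above:

# prepare_all_batches.py -- hypothetical helper, not part of this repository.
# Replays the build_batches.py commands above for every dataset/split pair.
import subprocess

SPLITS = {
    'Gref': ['train', 'val'],
    'unc': ['train', 'val', 'testA', 'testB'],
    'unc+': ['train', 'val', 'testA', 'testB'],
    'referit': ['trainval', 'test'],
}

for dataset, splits in SPLITS.items():
    for split in splits:
        subprocess.check_call(
            ['python', 'build_batches.py', '-d', dataset, '-t', split])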

Training and Testing

Both training and testing are driven by main.py; specify the appropriate options/flags and run it.

For example, to train the ResNet + LSTM model on Google-Ref using GPU 2, run

python main.py -m train -w resnet -n LSTM -d Gref -t train -g 2 -i 750000 -s 50000
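
Judging from this example, -m sets the mode, -w the backbone weights, -n the model variant, -d the dataset, -t the split, -g the GPU id, -i the number of iterations, and -s (presumably) the snapshot interval. To queue several training runs back to back, a small driver script can call main.py repeatedly; the sketch below is a hypothetical helper (not part of the repository) that trains the two variants named in this README with the same flags:

# train_queue.py -- hypothetical helper, not part of this repository.
# Queues training runs for the two model variants mentioned in this README,
# reusing the flags from the example command above.
import subprocess

for weights, network in [('resnet', 'LSTM'), ('deeplab', 'RMI')]:
    subprocess.check_call([
        'python', 'main.py', '-m', 'train', '-w', weights, '-n', network,
        '-d', 'Gref', '-t', 'train', '-g', '2',
        '-i', '750000', '-s', '50000',
    ])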

To test the 650000-iteration snapshot of the DeepLab + RMI model on the UNC testA set using GPU 1 (with visualization and Dense CRF), run

python main.py -m test -w deeplab -n RMI -d unc -t testA -g 1 -i 650000 -v -c
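
To compare several snapshots of the same model, the test command can be run in a loop over -i values. The sketch below is a hypothetical helper (not part of the repository); the snapshot iterations are illustrative and assume snapshots were saved every 50000 iterations as in the training example:

# evaluate_snapshots.py -- hypothetical helper, not part of this repository.
# Tests several DeepLab + RMI snapshots on the UNC testA set (GPU 1, Dense CRF).
import subprocess

for iteration in range(500000, 750001, 50000):  # illustrative snapshot iterations
    subprocess.check_call([
        'python', 'main.py', '-m', 'test', '-w', 'deeplab', '-n', 'RMI',
        '-d', 'unc', '-t', 'testA', '-g', '1', '-i', str(iteration), '-c',
    ])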

Miscellaneous

Code and data under util/ and data/referit/ are borrowed from text_objseg and slightly modified for compatibility with TensorFlow 1.2.1.

TODO

Add TensorBoard support.
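
For the TensorBoard item, a minimal sketch in the TensorFlow 1.x style used by this code might look as follows; the loss placeholder and the short loop are stand-ins for the actual loss tensor and training loop in main.py, and none of this is code from the repository:

# Hypothetical sketch of the TensorBoard TODO, not code from this repository.
import tensorflow as tf

loss = tf.placeholder(tf.float32, shape=[], name='loss')  # stand-in for the model's loss tensor
loss_summary = tf.summary.scalar('loss', loss)

with tf.Session() as sess:
    writer = tf.summary.FileWriter('./tboard_log', sess.graph)
    for step in range(3):  # stand-in for the training loop in main.py
        summary = sess.run(loss_summary, feed_dict={loss: 1.0 / (step + 1)})
        writer.add_summary(summary, step)
    writer.close()
# Inspect with: tensorboard --logdir ./tboard_log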