Grounding Referring Expressions in Images by Variational Context

This repository contains the code for the following paper:

Hanwang Zhang, Yulei Niu, Shih-Fu Chang, Grounding Referring Expressions in Images by Variational Context. In CVPR, 2018. (PDF)

@article{zhang2018grounding,
  title={Grounding Referring Expressions in Images by Variational Context},
  author={Zhang, Hanwang and Niu, Yulei and Chang, Shih-Fu},
  journal={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2018}
}

Note: part of this repository is built upon cmn, speaker_listener_reinforcer and refer.

Requirements and Dependencies

Python 3 (Anaconda recommended)
TensorFlow (v1.3.0 or higher)

Clone

# Make sure to clone with --recursive
git clone --recursive https://github.com/yuleiniu/vc.git

The --recursive flag also clones the refer API and cmn API repositories.

Install the other dependencies by running:

  pip install -r requirements.txt
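
Once the dependencies are installed, the version requirements listed above can be checked with a short script. The following is a minimal sketch, not part of this repository: the helper name `meets_minimum` is hypothetical, and it compares only the leading numeric components of a version string.

```python
import re
import sys


def meets_minimum(version, minimum):
    """Return True if a dotted version string is at least `minimum`.

    Hypothetical helper, not part of this repository. Only the leading
    numeric components are compared, so "1.4.0-rc1" is read as (1, 4, 0).
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad with zeros so that "1.3" compares equal to (1, 3, 0).
    parts += [0] * (len(minimum) - len(parts))
    return tuple(parts) >= tuple(minimum)


if __name__ == "__main__":
    print("Python 3:", sys.version_info[0] >= 3)
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed")
    else:
        print("TensorFlow >= 1.3.0:", meets_minimum(tf.__version__, (1, 3, 0)))
```

Version components are compared as integers rather than strings, so a release such as 1.10.0 is correctly treated as newer than 1.3.0.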

Preprocessing

Download the weights of the Faster R-CNN VGG-16 network, converted from the Caffe model:

  ./data/models/download_vgg_params.sh

Download the GloVe matrix for word embedding:

  ./data/word_embedding/download_embed_matrix.sh

Rebuild the NMS library and the ROIPooling operation, following cmn, by running:

  ./submodule/cmn.sh

Preprocess the referring expression data, following speaker_listener_reinforcer and refer (implemented in Python 2), and save the results into data/raw by running:

  ./submodule/refer.sh
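
Before running the four helper scripts above, it can be useful to confirm that each one is present and executable, for example after a clone that missed a step. The following is a minimal sketch under that assumption; `missing_scripts` and `SETUP_SCRIPTS` are hypothetical names, not part of this repository, and the paths are simply the ones quoted in this section.

```python
import os

# Helper scripts named in the steps above, in the order they are listed.
SETUP_SCRIPTS = [
    "data/models/download_vgg_params.sh",
    "data/word_embedding/download_embed_matrix.sh",
    "submodule/cmn.sh",
    "submodule/refer.sh",
]


def missing_scripts(repo_root="."):
    """Return the setup scripts that are absent or not executable.

    Hypothetical helper, not part of this repository.
    """
    problems = []
    for rel_path in SETUP_SCRIPTS:
        full_path = os.path.join(repo_root, rel_path)
        if not (os.path.isfile(full_path) and os.access(full_path, os.X_OK)):
            problems.append(rel_path)
    return problems


if __name__ == "__main__":
    for rel_path in missing_scripts():
        print("missing or not executable:", rel_path)
```

Run from the repository root, the script prints nothing when all four scripts are in place.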

Extract features

To extract region features for RefCOCO, RefCOCO+, and RefCOCOg, run:

  python prepare_data.py --dataset refcoco  #(for RefCOCO)
  python prepare_data.py --dataset refcoco+ #(for RefCOCO+)
  python prepare_data.py --dataset refcocog #(for RefCOCOg)
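
The three commands above can also be driven from a single script. The following is a minimal sketch, assuming it is run from the repository root; `extract_all` is a hypothetical wrapper, not part of this repository, and it uses only the `--dataset` flag shown above.

```python
import subprocess
import sys

DATASETS = ["refcoco", "refcoco+", "refcocog"]


def extract_all(datasets=DATASETS, run=subprocess.check_call):
    """Run prepare_data.py once per dataset, stopping at the first failure.

    Hypothetical wrapper, not part of this repository. `run` is injectable
    so the commands can be inspected without executing them.
    """
    commands = []
    for name in datasets:
        # Use the current interpreter so the same environment is reused.
        cmd = [sys.executable, "prepare_data.py", "--dataset", name]
        run(cmd)  # check_call raises CalledProcessError on a non-zero exit
        commands.append(cmd)
    return commands
```

Calling `extract_all()` executes the three extractions in order, while `extract_all(run=print)` only prints the argument lists as a dry run.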

Train

To train the model under the supervised setting, run:

  python train.py --dataset refcoco  #(for RefCOCO)
  python train.py --dataset refcoco+ #(for RefCOCO+)
  python train.py --dataset refcocog #(for RefCOCOg)

To train the model under the unsupervised setting, run:

  python train.py --dataset refcoco  --supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000 #(for RefCOCO)
  python train.py --dataset refcoco+ --supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000 #(for RefCOCO+)
  python train.py --dataset refcocog --supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000 #(for RefCOCOg)
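
The supervised and unsupervised commands above differ only in four extra flags. The following is a minimal sketch that builds either form from one place; `train_command` and `UNSUPERVISED_FLAGS` are hypothetical names, not part of this repository, and the flag values are copied from the commands above.

```python
import sys

# Extra flags the unsupervised commands add on top of the supervised ones.
UNSUPERVISED_FLAGS = [
    "--supervised", "False",
    "--max_iter", "80000",
    "--lr_decay_step", "20000",
    "--snapshot_start", "20000",
]


def train_command(dataset, supervised=True):
    """Build the argument list for train.py in either setting.

    Hypothetical helper, not part of this repository.
    """
    cmd = [sys.executable, "train.py", "--dataset", dataset]
    if not supervised:
        cmd += UNSUPERVISED_FLAGS
    return cmd
```

For example, `subprocess.check_call(train_command("refcoco+", supervised=False))` would launch unsupervised training on RefCOCO+ from the repository root.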

Evaluation

To test the model, run:

  python test.py --dataset refcoco  --checkpoint /path/to/checkpoint #(for RefCOCO)
  python test.py --dataset refcoco+ --checkpoint /path/to/checkpoint #(for RefCOCO+)
  python test.py --dataset refcocog --checkpoint /path/to/checkpoint #(for RefCOCOg)