Grounding Referring Expressions in Images by Variational Context

This repository contains the code for the following paper:

@article{zhang2018grounding,
  title={Grounding Referring Expressions in Images by Variational Context},
  author={Zhang, Hanwang and Niu, Yulei and Chang, Shih-Fu},
  journal={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2018}
}

Note: parts of this repository are built upon cmn, speaker_listener_reinforcer, and refer.

Requirements and Dependencies

# Make sure to clone with --recursive
git clone --recursive https://github.com/yuleiniu/vc.git

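The --recursive flag also clones the refer API and cmn API repositories as submodules. If the repository was already cloned without it, running git submodule update --init --recursive inside the clone fetches them afterwards.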

  pip install -r requirements.txt

Preprocessing

  ./data/models/download_vgg_params.sh
  ./data/word_embedding/download_embed_matrix.sh
  ./submodule/cmn.sh
  ./submodule/refer.sh

Extract features

  python prepare_data.py --dataset refcoco  #(for RefCOCO)
  python prepare_data.py --dataset refcoco+ #(for RefCOCO+)
  python prepare_data.py --dataset refcocog #(for RefCOCOg)
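
The three feature-extraction commands differ only in the dataset name, so they can be driven from one loop. The following is a minimal sketch rather than part of the repository: the DRY_RUN switch and the run and prepare_all helpers are hypothetical names introduced here, and by default the script only prints the commands it would execute.

```shell
#!/bin/sh
set -eu

# DRY_RUN is a hypothetical switch for this sketch:
# 1 (the default) prints each command, 0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Call prepare_data.py once per dataset named in this README.
prepare_all() {
  for dataset in refcoco refcoco+ refcocog; do
    run python prepare_data.py --dataset "$dataset"
  done
}

prepare_all
```

Running the script with DRY_RUN=0 from the repository root would execute the three commands in sequence and stop at the first failure because of set -e.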

Train

  python train.py --dataset refcoco  #(for RefCOCO)
  python train.py --dataset refcoco+ #(for RefCOCO+)
  python train.py --dataset refcocog #(for RefCOCOg)

  # The same datasets with --supervised False:
  python train.py --dataset refcoco  --supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000 #(for RefCOCO)
  python train.py --dataset refcoco+ --supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000 #(for RefCOCO+)
  python train.py --dataset refcocog --supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000 #(for RefCOCOg)
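
The training commands above come in two variants per dataset: one with only --dataset, and one that adds --supervised False together with the iteration and snapshot flags. As a rough sketch, the shared flags can be kept in one variable so the two variants stay in sync. The helper build_train_cmd and the mode names "default" and "unsupervised" are hypothetical labels for this example, not options of train.py, and the helper only prints the command.

```shell
#!/bin/sh
set -eu

# Flags copied verbatim from the commands listed in this README.
UNSUP_FLAGS="--supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000"

# build_train_cmd <dataset> <mode>
# Hypothetical helper: prints the train.py command for one dataset.
# mode is "default" (only --dataset) or "unsupervised" (adds UNSUP_FLAGS).
build_train_cmd() {
  dataset="$1"
  mode="$2"
  case "$mode" in
    default)      echo "python train.py --dataset $dataset" ;;
    unsupervised) echo "python train.py --dataset $dataset $UNSUP_FLAGS" ;;
    *)            echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

# Print both variants for every dataset.
for dataset in refcoco refcoco+ refcocog; do
  build_train_cmd "$dataset" default
  build_train_cmd "$dataset" unsupervised
done
```

The printed lines can be reviewed first and then pasted into a terminal or piped to sh from the repository root.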

Evaluation

  python test.py --dataset refcoco  --checkpoint /path/to/checkpoint #(for RefCOCO)
  python test.py --dataset refcoco+ --checkpoint /path/to/checkpoint #(for RefCOCO+)
  python test.py --dataset refcocog --checkpoint /path/to/checkpoint #(for RefCOCOg)
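
Evaluation needs both a dataset name and a checkpoint path. A small wrapper can refuse to proceed when the path is missing. This is only a sketch under stated assumptions: eval_cmd is a hypothetical helper, it prints the command instead of running it, and /path/to/checkpoint remains a placeholder to be replaced with a real snapshot.

```shell
#!/bin/sh
set -eu

# eval_cmd <dataset> <checkpoint>
# Hypothetical helper: prints the test.py command for one dataset and
# returns an error when no checkpoint path is given.
eval_cmd() {
  dataset="$1"
  checkpoint="${2:-}"
  if [ -z "$checkpoint" ]; then
    echo "usage: eval_cmd <dataset> <checkpoint>" >&2
    return 1
  fi
  echo "python test.py --dataset $dataset --checkpoint $checkpoint"
}

# Placeholder path, as in the commands above.
eval_cmd refcoco /path/to/checkpoint
```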