Text-guided Attention Model for Image Captioning

Created by Jonghwan Mun, Minsu Cho and Bohyung Han at POSTECH cvlab. <br /> If you want to know the details of our paper, please refer to the arXiv preprint or visit our project page. <br /> Also, if you use this code in a publication, please cite our paper with the following BibTeX entry.

   @inproceedings{mun2017textguided,
      title={Text-guided Attention Model for Image Captioning},
      author={Mun, Jonghwan and Cho, Minsu and Han, Bohyung},
      booktitle={AAAI},
      year={2017}
   }

Dependencies (this project is tested on Ubuntu 14.04, 64-bit, with a Titan GPU)

Dependencies for torch

  1. [torch](https://github.com/torch/distro)
  2. cutorch (luarocks install cutorch)
  3. cunn (luarocks install cunn)
  4. [cudnn](https://github.com/soumith/cudnn.torch)
  5. [display](https://github.com/szym/display)
  6. [cv](https://github.com/VisionLabs/torch-opencv)
  7. hdf5 (luarocks install hdf5)
  8. image (luarocks install image)
  9. [loadcaffe](https://github.com/szagoruyko/loadcaffe)
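
For convenience, the luarocks-installable packages above can be set up in one pass with a script along the lines of the sketch below. This is only an illustrative sketch, not part of the repository; it assumes Torch itself is already installed from the distro repository, that luarocks is on your PATH, and that the system libraries the individual rocks depend on (CUDA/cuDNN, OpenCV, HDF5, protobuf) are already present.

    # Illustrative install sketch for the luarocks packages listed above.
    # Assumes Torch (https://github.com/torch/distro) and the required
    # system libraries (CUDA/cuDNN, OpenCV, HDF5, protobuf) are installed.
    luarocks install cutorch
    luarocks install cunn
    luarocks install cudnn      # cudnn.torch; needs NVIDIA cuDNN on the system
    luarocks install display
    luarocks install cv         # torch-opencv; needs OpenCV on the system
    luarocks install hdf5       # needs libhdf5 development files
    luarocks install image
    luarocks install loadcaffe  # needs the protobuf library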

Dependencies for Python (tested on Python 2.7.11 with Anaconda 4.0)

  1. json
  2. h5py
  3. cPickle
  4. numpy <br /> If you use Anaconda, all of these Python dependencies are most likely installed already; a quick import check is sketched below.
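
As an optional sanity check (an illustrative snippet, not part of the repository), the one-liner below simply tries to import every package listed above and prints a confirmation if all of them are found.

    # Illustrative check only: confirm the Python 2.7 packages above can be imported.
    python -c "import json, h5py, cPickle, numpy; print('python dependencies OK')"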

Download pre-trained model

    bash get_pretrained_model.sh

Running (data construction, training, testing)

    bash running_script.sh

License

This software is made available for research purposes only. See the LICENSE file for details.

Acknowledgements

This work was funded by Samsung Electronics Co., Ltd. (DMC R&D Center). <br /> Thanks also go to Andrej Karpathy, since this implementation is based on his neuraltalk2 code (https://github.com/karpathy/neuraltalk2).