Text-guided Attention Model for Image Captioning
Created by Jonghwan Mun, Minsu Cho and Bohyung Han at the POSTECH Computer Vision Lab. <br /> For details, please refer to the arXiv preprint or visit our project page. <br /> If you use this code in a publication, please cite our paper with the following BibTeX entry.
@inproceedings{mun2017textguided,
  title={Text-guided Attention Model for Image Captioning},
  author={Mun, Jonghwan and Cho, Minsu and Han, Bohyung},
  booktitle={AAAI},
  year={2017}
}
Dependencies (this project is tested on 64-bit Ubuntu 14.04 with a Titan GPU)
Dependencies for Torch
- torch (https://github.com/torch/distro)
- cutorch (luarocks install cutorch)
- cunn (luarocks install cunn)
- cudnn (https://github.com/soumith/cudnn.torch)
- display (https://github.com/szym/display)
- cv (https://github.com/VisionLabs/torch-opencv)
- hdf5 (luarocks install hdf5)
- image (luarocks install image)
- loadcaffe (https://github.com/szagoruyko/loadcaffe)
Dependencies for Python (tested on Python 2.7.11 with Anaconda 4.0)
- json
- h5py
- cPickle
- numpy <br /> If you use Anaconda, all of these Python dependencies should already be installed.
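Before running the data-construction scripts, it can save time to verify that all Python dependencies are importable. Below is a minimal, hypothetical sketch of such a check (not part of the repository); note that `cPickle` exists only on Python 2, which is the version this project targets.

```python
import importlib


def check_deps(names):
    """Return the subset of module names that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# The project's Python dependencies. On Python 3, cPickle was
# merged into the standard `pickle` module and will report as missing.
REQUIRED = ["json", "h5py", "cPickle", "numpy"]

if __name__ == "__main__":
    missing = check_deps(REQUIRED)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All Python dependencies are available.")
```

Running this before `running_script.sh` gives an early, readable error instead of a mid-pipeline import failure.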
Download pre-trained model
bash get_pretrained_model.sh
Running (data construction, training, testing)
bash running_script.sh
License
This software is made available for research purposes only. See the LICENSE file for details.
Acknowledgements
This work was funded by Samsung Electronics Co., Ltd. (DMC R&D Center). <br /> Thanks also to Andrej Karpathy; this implementation is based on his code (https://github.com/karpathy/neuraltalk2).