(tl;dr)
2M iterations finetuned checkpoint file | Released under MIT License
1M iterations checkpoint file | Released under MIT License
word_counts.txt (in this repository)
model.ckpt-2000000.index (in this repository; place it in the same folder as the 2M checkpoint file)
model.ckpt-1000000.index (in this repository; place it in the same folder as the 1M checkpoint file)
Show and Tell : A Neural Image Caption Generator
Pretrained model for the TensorFlow implementation (im2txt, found at tensorflow/models) of the image-to-text model described in the paper:
"Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge."
Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan.
Full text available at: http://arxiv.org/abs/1609.06647
Contact
Kranthi Kiran GV (KranthiGV | kranthi.gv@gmail.com)
Generating Captions
Steps
- Follow the steps at im2txt to clone the repository, install bazel, etc.
- Download the desired model checkpoint:
  2M iterations finetuned checkpoint file | Released under MIT License
  1M iterations checkpoint file | Released under MIT License
- Clone the repository: git clone https://github.com/KranthiGV/Pretrained-Show-and-Tell-model.git
# Path to checkpoint file.
# Note that CHECKPOINT_PATH does not include the data-00000-of-00001 suffix.
# Also make sure model.ckpt-2000000.index (from the cloned repository)
# is in the same folder as model.ckpt-2000000.data-00000-of-00001.
# The 1M checkpoint (model.ckpt-1000000.data-00000-of-00001) works the same way.
CHECKPOINT_PATH="/path/to/model.ckpt-2000000"
# Vocabulary file generated by the preprocessing script.
# Because your tokenizer version may differ, use the supplied word_counts.txt file.
VOCAB_FILE="/path/to/word_counts.txt"
# JPEG image file to caption.
IMAGE_FILE="/path/to/image.jpeg"
# Build the inference binary.
bazel build -c opt im2txt/run_inference
# Run inference to generate captions.
bazel-bin/im2txt/run_inference \
--checkpoint_path="${CHECKPOINT_PATH}" \
--vocab_file="${VOCAB_FILE}" \
--input_files="${IMAGE_FILE}"
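Because CHECKPOINT_PATH is a prefix rather than a real file, a misplaced .index or data file is easy to miss until inference fails. The sketch below is one way to check the layout beforehand. The function name check_checkpoint is hypothetical and is not part of im2txt; it only assumes the two file suffixes named in the comments above.

```shell
# Hypothetical helper (not part of im2txt): verify that both checkpoint
# files exist next to each other for a given checkpoint prefix.
check_checkpoint() {
  prefix="$1"
  status=0
  for suffix in index data-00000-of-00001; do
    if [ ! -f "${prefix}.${suffix}" ]; then
      # Report each missing file on stderr and remember the failure.
      echo "missing: ${prefix}.${suffix}" >&2
      status=1
    fi
  done
  return "$status"
}
```

Calling check_checkpoint "${CHECKPOINT_PATH}" before the inference command returns a non-zero status and names the missing file if either one is not in place.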
Extras
- Graph.pbtxt is uploaded on request.
- Training stats are uploaded for use with TensorBoard:
tensorboard --logdir="./extras/tensorboard/"