# DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training

Official implementation of DeCap, published at ICLR 2023.

Paper link: DeCap
## Data

Download coco_train to `data/`.

Download cc3m_train to `data/`.
## Training

```bash
./train_coco.sh
```

or

```bash
./train_cc3m.sh
```
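These scripts implement DeCap's text-only objective: captions are encoded with a frozen CLIP text encoder, and a lightweight decoder is trained to reconstruct each caption from its own CLIP embedding, so no images are needed during training. The following is a minimal sketch of that idea, assuming PyTorch and OpenAI's `clip` package; `TextDecoder` is a toy GRU stand-in (the repository's actual decoder differs), and the corpus and hyperparameters are illustrative, not the values used by the training scripts.

```python
import torch
import torch.nn as nn
import clip  # https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model.eval()  # CLIP stays frozen; only the decoder is trained

class TextDecoder(nn.Module):
    """Toy stand-in for DeCap's decoder: predicts the next token from a
    CLIP text embedding (used as the initial state) plus previous tokens."""
    def __init__(self, vocab_size=49408, embed_dim=512, hidden_dim=512):
        super().__init__()  # 49408 = CLIP's BPE vocabulary size
        self.token_embed = nn.Embedding(vocab_size, hidden_dim)
        self.prefix_proj = nn.Linear(embed_dim, hidden_dim)
        self.gru = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, clip_embed, tokens):
        h0 = self.prefix_proj(clip_embed).unsqueeze(0)   # (1, B, H)
        out, _ = self.gru(self.token_embed(tokens), h0)  # (B, T, H)
        return self.head(out)                            # (B, T, vocab)

decoder = TextDecoder().to(device)
optimizer = torch.optim.AdamW(decoder.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss(ignore_index=0)  # CLIP pads token sequences with 0

captions = ["a dog running on the beach", "two people riding bikes"]  # stand-in corpus
tokens = clip.tokenize(captions).to(device)    # (B, 77) token ids
with torch.no_grad():
    text_embed = clip_model.encode_text(tokens).float()
    text_embed = text_embed / text_embed.norm(dim=-1, keepdim=True)

# Teacher forcing: reconstruct each caption from its own CLIP embedding.
logits = decoder(text_embed, tokens[:, :-1])
loss = loss_fn(logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```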
## Inference

See `inference_decap.ipynb`.
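At inference time, DeCap does not feed the raw CLIP image embedding to the decoder directly; it first projects it into the text-embedding space as a similarity-weighted combination over a support memory of training-caption embeddings (projection-based decoding), which reduces the CLIP modality gap. Below is a rough sketch of that projection, again assuming OpenAI's `clip` package; the function name, the placeholder corpus and image path, and the temperature `tau` are illustrative assumptions, not the notebook's exact code.

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, preprocess = clip.load("ViT-B/32", device=device)

# Support memory: L2-normalized CLIP embeddings of training captions.
corpus = ["a dog running on the beach", "two people riding bikes"]  # stand-in corpus
with torch.no_grad():
    memory = clip_model.encode_text(clip.tokenize(corpus).to(device)).float()
    memory = memory / memory.norm(dim=-1, keepdim=True)

@torch.no_grad()
def project_to_text_space(image_path, memory, tau=0.01):
    """Project an image embedding into CLIP's text-embedding space as a
    similarity-weighted sum of memory entries (tau is an assumed value)."""
    image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
    img = clip_model.encode_image(image).float()
    img = img / img.norm(dim=-1, keepdim=True)
    weights = (img @ memory.T / tau).softmax(dim=-1)  # (1, N) weights over memory
    return weights @ memory                           # (1, D) "text-like" embedding

projected = project_to_text_space("example.jpg", memory)  # placeholder image path
# `projected` then replaces the text embedding fed to the trained decoder.
```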
## Pretrained models

- Trained on COCO captions: model_coco
- Trained on CC3M: coming soon
## Citation

```bibtex
@inproceedings{lidecap,
  title={DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training},
  author={Li, Wei and Zhu, Linchao and Wen, Longyin and Yang, Yi},
  booktitle={The Eleventh International Conference on Learning Representations},
  year={2023}
}
```
## Acknowledgments

This repository is heavily based on ClipCap. For training, we used data from the COCO dataset and Conceptual Captions.