
Learning Distinct and Representative Modes for Image Captioning (NeurIPS 2022)

pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu116
pip install transformers yacs scipy

Prepare the data by following the instructions in VLP.

python -m modecap.train data_dir PATH_TO_DATA
python -m modecap.inference data_dir PATH_TO_DATA model_path PATH_TO_MODEL
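
The commands above pass options as bare KEY VALUE pairs (for example `data_dir PATH_TO_DATA`) rather than `--flag` arguments. This is the override convention that yacs supports through `CfgNode.merge_from_list`; whether modecap parses its arguments exactly this way is an assumption. The sketch below uses only the standard library to illustrate how such pairs can be folded into a config. The helper `parse_overrides` and the `defaults` dict are hypothetical and are not part of modecap.

```python
from typing import Dict, List


def parse_overrides(tokens: List[str]) -> Dict[str, str]:
    """Pair up alternating KEY VALUE tokens into a dict (hypothetical helper)."""
    if len(tokens) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    return dict(zip(tokens[0::2], tokens[1::2]))


# Hypothetical defaults; the real option names and default values
# are defined by the modecap config, not here.
defaults = {"data_dir": "", "model_path": ""}

# Tokens as they appear after `python -m modecap.inference`.
overrides = parse_overrides(
    ["data_dir", "PATH_TO_DATA", "model_path", "PATH_TO_MODEL"]
)

# Reject keys that the defaults do not declare, so typos fail early.
unknown = set(overrides) - set(defaults)
if unknown:
    raise KeyError(f"unknown config keys: {sorted(unknown)}")

# Later entries win, so overrides replace the defaults.
config = {**defaults, **overrides}
print(config)
```

In practice this means each option name must be followed by exactly one value, and a path containing spaces needs to be quoted in the shell so that it stays a single token.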