Awesome
Cross-Modal Contrastive Learning for Robust Reasoning in VQA
This repo is an implementation upon METER backbone with PyTorch Lightning. Here is an implementation in PyTorch.
Data preparation and pretrained models
Please follow METER and ViLT to prepare the datasets and download the pretrained checkpoints released by METER. Modify data_root
and log_dir
in config.py
.
Finetune on VQA data
train
python run.py with num_gpus=1 \
num_nodes=1 \
task_finetune_vqa_clip_bert \
per_gpu_batchsize=8 \
load_path=result/official_released/meter_clip16_288_roberta_pretrain.ckpt \
clip16 text_roberta \
image_size=224 \
nce=True \
test_only=False \
seed=0 \
exp_name=finetune_vqa_cmcl
test
python run.py with num_gpus=1 \
num_nodes=1 \
task_finetune_vqa_clip_bert \
per_gpu_batchsize=8 \
load_path=path/to/finetuned/ckpt \
clip16 text_roberta \
image_size=224 \
nce=True \
test_only=True \
seed=0 \
exp_name=finetune_vqa_cmcl