PubMedCLIP in Medical Visual Question Answering
This repository includes PubMedCLIP, a version of CLIP fine-tuned on ROCO image-caption pairs. We also provide pipelines for incorporating PubMedCLIP as an alternative pre-trained visual encoder in the MEVF and QCR medical visual question answering pipelines. Our experiments show that PubMedCLIP yields up to a 3% improvement in medical visual question answering.
Citation
If you use this work in an academic publication, please cite the paper by Sedigheh Eslami, Christoph Meinel, and Gerard de Melo.
BibTeX entry:
@inproceedings{eslami2023pubmedclip,
title={PubMedCLIP: How Much Does CLIP Benefit Visual Question Answering in the Medical Domain?},
author={Eslami, Sedigheh and Meinel, Christoph and De Melo, Gerard},
booktitle={Findings of the Association for Computational Linguistics: EACL 2023},
pages={1151--1163},
year={2023}
}