!!! Check out our new paper and improved model for meta-learning in Medical Visual Question Answering.

Mixture of Enhanced Visual Features (MEVF)

This repository is the implementation of MEVF for the visual question answering task in the medical domain. Our model achieves 43.9 accuracy on open-ended questions and 75.1 on close-ended questions of the VQA-RAD dataset. For details, please refer to the link.

This repository is based on and inspired by @Jin-Hwa Kim's work. We sincerely thank them for sharing their code.

Figure: Overview of bilinear attention networks.
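
Since the BAN variant of MEVF relies on low-rank bilinear attention between visual and question features, the sketch below illustrates that mechanism in PyTorch. It is a minimal, single-glimpse illustration with made-up dimensions and class names, not this repository's actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LowRankBilinearAttention(nn.Module):
    # Minimal single-glimpse sketch of low-rank bilinear attention.
    # v: (batch, n_regions, v_dim) visual features
    # q: (batch, n_tokens, q_dim) question features
    def __init__(self, v_dim, q_dim, hidden_dim):
        super().__init__()
        self.v_proj = nn.Linear(v_dim, hidden_dim)
        self.q_proj = nn.Linear(q_dim, hidden_dim)
        self.h_proj = nn.Linear(hidden_dim, 1)

    def forward(self, v, q):
        v_ = F.relu(self.v_proj(v))   # (batch, n_regions, hidden)
        q_ = F.relu(self.q_proj(q))   # (batch, n_tokens, hidden)
        # Bilinear logits via a Hadamard product of the two projections.
        logits = self.h_proj(v_.unsqueeze(2) * q_.unsqueeze(1)).squeeze(-1)
        att = F.softmax(logits.flatten(1), dim=1).view_as(logits)
        # Joint feature: attention-weighted sum over all region/token pairs.
        joint = torch.einsum('bij,bih,bjh->bh', att, v_, q_)
        return joint, att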

Prerequisites

Please install the required packages by running the following command:

pip install -r requirements.txt

Preprocessing

All data should be downloaded via the link. The downloaded file should be extracted to the data_RAD/ directory.
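
If the download arrives as a single archive, a small script like the one below can unpack it. The archive name data_RAD.zip is only an assumption; use whatever file the link actually serves.

import zipfile
from pathlib import Path

archive = Path("data_RAD.zip")   # assumed file name; replace with your download
with zipfile.ZipFile(archive) as zf:
    zf.extractall(".")           # expected to leave the contents under data_RAD/
    print("extracted", len(zf.namelist()), "entries from", archive)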

Training

Train MEVF model with Stacked Attention Network

$ python3 main.py --model SAN --use_RAD --RAD_dir data_RAD --maml --autoencoder --output saved_models/SAN_MEVF

Train MEVF model with Bilinear Attention Network

$ python3 main.py --model BAN --use_RAD --RAD_dir data_RAD --maml --autoencoder --output saved_models/BAN_MEVF

The training scores will be printed every epoch.

             SAN + proposal   BAN + proposal
Open-ended   40.7             43.9
Close-ended  74.1             75.1
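
To reproduce both columns of the table, the two training commands above can also be replayed back to back with a small launcher such as the one below; it simply shells out to main.py with the exact flags listed above.

import subprocess

for model in ("SAN", "BAN"):
    subprocess.run([
        "python3", "main.py",
        "--model", model,
        "--use_RAD", "--RAD_dir", "data_RAD",
        "--maml", "--autoencoder",
        "--output", f"saved_models/{model}_MEVF",
    ], check=True)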

Pretrained models and Testing

In this repo, we include the pretrained weights of MAML and CDAE, which are used to initialize the feature extraction modules.
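
Conceptually, the two pretrained components supply complementary visual features that are combined before the attention stage. The sketch below shows that idea in PyTorch; the class and attribute names are illustrative, and the exact fusion used in this repository may differ.

import torch
import torch.nn as nn

class EnhancedVisualFeatures(nn.Module):
    # Illustrative sketch: combine features from a MAML-initialized CNN
    # (pretrained_maml.weights) and a CDAE encoder (pretrained_ae.pth).
    def __init__(self, maml_cnn, cdae_encoder):
        super().__init__()
        self.maml_cnn = maml_cnn
        self.cdae_encoder = cdae_encoder

    def forward(self, image):
        f_maml = self.maml_cnn(image).flatten(1)       # (batch, d1)
        f_cdae = self.cdae_encoder(image).flatten(1)   # (batch, d2)
        # Concatenate the two views into one enhanced visual feature.
        return torch.cat([f_maml, f_cdae], dim=1)      # (batch, d1 + d2)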

The MAML model data_RAD/pretrained_maml.weights is trained using the official source code (link).
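
As a rough illustration of the meta-training idea only, here is a first-order MAML (FOMAML) sketch. It is not the official MAML code used to produce pretrained_maml.weights; refer to the link above for that.

import copy
import torch

def fomaml_step(model, loss_fn, task_batch, meta_opt, inner_lr=0.01, inner_steps=5):
    # task_batch yields (support_x, support_y, query_x, query_y) tuples.
    meta_opt.zero_grad()
    for support_x, support_y, query_x, query_y in task_batch:
        # Inner loop: adapt a copy of the model on the support set.
        fast = copy.deepcopy(model)
        inner_opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
        for _ in range(inner_steps):
            inner_opt.zero_grad()
            loss_fn(fast(support_x), support_y).backward()
            inner_opt.step()
        # Outer loop: first-order meta-gradient from the query loss,
        # accumulated into the original (meta) parameters.
        query_loss = loss_fn(fast(query_x), query_y)
        grads = torch.autograd.grad(query_loss, fast.parameters())
        for p, g in zip(model.parameters(), grads):
            p.grad = g.detach() if p.grad is None else p.grad + g.detach()
    meta_opt.step()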

The CDAE model data_RAD/pretrained_ae.pth is trained with the code provided in train_cdae.py. To reproduce the pretrained model, please follow the instructions in that file.
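
train_cdae.py contains the authoritative training procedure; the sketch below only illustrates the general shape of a convolutional denoising autoencoder objective (reconstruct a clean image from a noisy copy), with guessed layer sizes rather than the repository's exact architecture.

import torch
import torch.nn as nn

class CDAE(nn.Module):
    # Illustrative convolutional denoising autoencoder (layer sizes are guesses).
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def train_step(model, images, optimizer, noise_std=0.1):
    # Denoising objective: reconstruct the clean image from a noisy copy.
    noisy = images + noise_std * torch.randn_like(images)
    recon = model(noisy)
    loss = nn.functional.mse_loss(recon, images)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()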

We also provide the pretrained models reported as the best single models in the paper.

For the SAN_MEVF pretrained model: please download it via the link and move it to saved_models/SAN_MEVF/. The trained SAN_MEVF model can be tested on the VQA-RAD test set via:

$ python3 test.py --model SAN --use_RAD --RAD_dir data_RAD --maml --autoencoder --input saved_models/SAN_MEVF --epoch 19 --output results/SAN_MEVF

For the BAN_MEVF pretrained model: please download it via the link and move it to saved_models/BAN_MEVF/. The trained BAN_MEVF model can be tested on the VQA-RAD test set via:

$ python3 test.py --model BAN --use_RAD --RAD_dir data_RAD --maml --autoencoder --input saved_models/BAN_MEVF --epoch 19 --output results/BAN_MEVF

The result JSON file can be found in the results/ directory.
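
A quick way to inspect the output is to load it with the json module. The snippet below simply lists whatever JSON files test.py produced under results/ and prints a preview of the first one; it assumes nothing about their internal structure, which depends on how test.py writes them.

import json
from pathlib import Path

result_files = sorted(Path("results").rglob("*.json"))
print("found:", [str(p) for p in result_files])

with result_files[0].open() as f:           # pick the file you want to inspect
    results = json.load(f)
preview = results if isinstance(results, dict) else results[:3]
print(json.dumps(preview, indent=2)[:500])  # peek at the structure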

Citation

Please cite these papers in your publications if they help your research:

@inproceedings{aioz_mevf_miccai19,
  author={Binh D. Nguyen and Thanh-Toan Do and Binh X. Nguyen and Tuong Do and Erman Tjiputra and Quang D. Tran},
  title={Overcoming Data Limitation in Medical Visual Question Answering},
  booktitle = {MICCAI},
  year={2019}
}

If you find our meta-learning work for MedVQA useful, please consider citing the following paper:

@inproceedings{aioz_mmq_miccai21,
  author={Tuong Do and Binh X. Nguyen and Erman Tjiputra and Minh Tran and Quang D. Tran and Anh Nguyen},
  title={Multiple Meta-model Quantifying for Medical Visual Question Answering},
  booktitle = {MICCAI},
  year={2021}
}

License

MIT License

More information

AIOZ AI Homepage: https://ai.aioz.io

AIOZ Network: https://aioz.network