RaDialog: A Large Vision-Language Model for Radiology Report Generation and Conversational Assistance

Authors: Chantal Pellegrini*, Ege Özsoy*, Benjamin Busam, Nassir Navab, Matthias Keicher

✨ News ✨


<img align="right" src="figs/example.png" alt="teaser" width="50%" style="margin-left: 20px">

Conversational AI tools that can generate and discuss clinically correct radiology reports for a given medical image have the potential to transform radiology. Such a human-in-the-loop radiology assistant could facilitate a collaborative diagnostic process, thus saving time and improving the quality of reports. Towards this goal, we introduce RaDialog, the first thoroughly evaluated and publicly available large vision-language model for radiology report generation and interactive dialog. RaDialog effectively integrates visual image features and structured pathology findings with a large language model (LLM) while simultaneously adapting it to a specialized domain using parameter-efficient fine-tuning. To keep the conversational abilities of the underlying LLM, we propose a comprehensive, semi-automatically labeled, image-grounded instruct dataset for chest X-ray radiology tasks. By training with this dataset, our method achieves state-of-the-art clinical correctness in report generation and shows impressive abilities in interactive tasks such as correcting reports and answering questions, serving as a foundational step toward clinical dialog systems.
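As a rough illustration of how structured pathology findings and an image placeholder might be combined into a single text prompt for the LLM, here is a minimal sketch. It is not the prompt format used in this repository: the `<IMG>` placeholder, the template wording, the 0.5 threshold, and the function names `format_findings` and `build_prompt` are all assumptions made for illustration.

```python
from typing import Dict

# Hypothetical marker for the position where projected image features
# would be spliced into the LLM input (an assumption, not the repo's token).
IMAGE_PLACEHOLDER = "<IMG>"


def format_findings(label_probs: Dict[str, float], threshold: float = 0.5) -> str:
    """Turn classifier probabilities into a comma-separated list of positive findings."""
    positive = [name for name, prob in label_probs.items() if prob >= threshold]
    return ", ".join(positive) if positive else "No Finding"


def build_prompt(label_probs: Dict[str, float], instruction: str,
                 threshold: float = 0.5) -> str:
    """Assemble an illustrative prompt from image placeholder, findings, and task."""
    findings = format_findings(label_probs, threshold)
    return (
        f"Image information: {IMAGE_PLACEHOLDER}\n"
        f"Predicted findings: {findings}\n"
        f"Instruction: {instruction}"
    )


if __name__ == "__main__":
    # Made-up probabilities, standing in for the output of a pathology classifier.
    example = {"Cardiomegaly": 0.9, "Edema": 0.2, "Pleural Effusion": 0.7}
    print(build_prompt(example, "Write the findings section of the report."))
```

The same builder could serve different tasks by changing only the instruction string, for example asking a question about the image or requesting a correction to an earlier report.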

Installation

Environment Setup:

1) RaDialog Environment

2) CheXbert Environment

Prepare the Data and Models:

1) Download pretrained models

2) Download MIMIC-CXR

3) Create sectioned report data

4) Prepare the instruct dataset

Data for RaDialog-RG:

Data for RaDialog-INS:

Run Demo:

Evaluate RaDialog on the MIMIC-CXR test set:

Train RaDialog:

1) CheXbert Classifier Training

2) Alignment Module Pretraining

3) LLM Training

Train RaDialog-RG:

Train RaDialog-INS:

To use a model from a checkpoint, you'll need to perform the following steps:

Reference

When using our model or dataset, please cite:

@article{pellegrini2023radialog,
  title={RaDialog: A Large Vision-Language Model for Radiology Report Generation and Conversational Assistance},
  author={Pellegrini, Chantal and {\"O}zsoy, Ege and Busam, Benjamin and Navab, Nassir and Keicher, Matthias},
  journal={arXiv preprint arXiv:2311.18681},
  year={2023}
}