# MedRegA: Interpretable Bilingual Multimodal Large Language Model for Diverse Biomedical Tasks
<a href="https://arxiv.org/abs/2410.18387"><img src="https://img.shields.io/badge/Paper-arxiv-green.svg?style=flat-square"></a> <a href="https://medrega.github.io/"><img src="https://img.shields.io/badge/Project-Website-blue.svg?style=flat-square"></a> <a href="https://huggingface.co/Luxuriant16/medrega"><img src="https://img.shields.io/badge/Model-Hugging%20Face-red.svg?style=flat-square"></a>
MedRegA is an interpretable bilingual generalist model for diverse biomedical tasks, distinguished by its outstanding ability to leverage regional information. It can perceive 8 modalities covering almost all body parts, showcasing significant versatility.
<img src="asset/intro.png" width=70%>

## Overview
💡We establish Region-Centric tasks with a large-scale dataset, MedRegInstruct, where each sample is paired with the coordinates of body structures or lesions (see the sketch after this list).
💡Based on the proposed dataset, we develop a Region-Aware medical MLLM, MedRegA, as a bilingual generalist medical AI system to perform both image-level and region-level medical vision-language tasks, demonstrating impressive versatility.
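As a concrete illustration of the record format described above, a MedRegInstruct-style sample might pair an image with region coordinates and an instruction. The file name, field names, and `<box>` coordinate convention below are our assumptions for illustration, not the released schema.

```bash
# Hypothetical MedRegInstruct-style record: every field name, the <box>
# coordinate format, and the file name are illustrative assumptions,
# not the official schema of the dataset.
cat <<'EOF' > sample.json
{
  "image": "images/chest_xray_0001.png",
  "question": "Please detect the abnormal region in this chest X-ray.",
  "answer": "There is an opacity in the left lower lung field <box>[[112, 241, 305, 398]]</box>.",
  "language": "en"
}
EOF
```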
## Schedule
- Release the model.
- Release the demo code.
- Release the evaluation code.
- Release the training code.
- Release the data.
## Environment
Please refer to the InternVL installation guide to build the environment.
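For reference, a typical setup following the InternVL workflow looks like the sketch below; the environment name, Python version, and install steps are illustrative assumptions, so defer to the InternVL documentation for the exact requirements.

```bash
# Illustrative environment setup in the usual InternVL style; the environment
# name and Python version are assumptions -- follow the InternVL docs for the
# exact package pins.
conda create -n medrega python=3.9 -y
conda activate medrega
git clone https://github.com/OpenGVLab/InternVL.git
cd InternVL
pip install -r requirements.txt
```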
## Demo
Run the demo:

```bash
torchrun --nproc-per-node=1 src/demo.py
```
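The demo is a standard `torchrun` launch. For instance, to pin it to a specific GPU you can use `CUDA_VISIBLE_DEVICES`, a generic PyTorch/CUDA mechanism rather than a MedRegA-specific option:

```bash
# Restrict the process to GPU 0; CUDA_VISIBLE_DEVICES is a standard CUDA
# environment variable, not a flag defined by this repo.
CUDA_VISIBLE_DEVICES=0 torchrun --nproc-per-node=1 src/demo.py
```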
## Acknowledgement
Our code builds on InternVL. We thank the authors for releasing their code.