ViLMedic: a framework for research at the intersection of vision and language in medical AI

<p align="center"> <img src="https://vilmedic.app/favicon/favicon-64x64.png" alt="" style="width: 14px;"> ViLMedic has a dedicated website at: <a href="https://vilmedic.app/">https://vilmedic.app/</a> </p> <p align="center"> <img src="vilmedic/logo.png" width="190px"> <br /> <br /> <a href="https://github.com/jbdel/vilmedic/blob/master/LICENSE"><img alt="MIT License" src="https://img.shields.io/badge/license-MIT-red.svg" /></a> <img src="https://img.shields.io/badge/Stanford-Medicine-red" /> </p>
@inproceedings{delbrouck-etal-2022-vilmedic,
    title = "{V}i{LM}edic: a framework for research at the intersection of vision and language in medical {AI}",
    author = "Delbrouck, Jean-benoit  and
      Saab, Khaled  and
      Varma, Maya  and
      Eyuboglu, Sabri  and
      Chambon, Pierre  and
      Dunnmon, Jared  and
      Zambrano, Juan  and
      Chaudhari, Akshay  and
      Langlotz, Curtis",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-demo.3",
    pages = "23--34",
}

Quickstart and documentation

<p align="center"> Installation instructions and the full documentation are at: <a href="https://vilmedic.app/installation/">https://vilmedic.app/installation/</a> </p>

Implemented solutions

ViLMedic replicates solutions from the multimodal medical literature.

- Medical Visual Question Answering
  - SYSU-HCP at VQA-Med 2021
- Radiology report generation
  - Generating Radiology Reports via Memory-driven Transformer
  - Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports
  - Improving Factual Completeness and Consistency of Image-to-text Radiology Report Generation
- Radiology report summarization
  - Multimodal Radiology Report Summarization
- Multimodal self-supervised learning
  - Contrastive Learning of Medical Visual Representations from Paired Images and Text
  - DALLE: Zero-Shot Text-to-Image Generation
  - CLIP: Learning Transferable Visual Models From Natural Language Supervision
  - SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
  - GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-efficient Medical Image Recognition

Blocks

- Natural Language Processing
  - HuggingFace transformer encoder and decoder
  - HuggingFace transformer beam-search and model ensembling :fire:
  - NLG metrics (BLEU, ROUGE, METEOR, MAUVE) and radiology report generation metrics (F1-CheXbert)
  - RadGraph
- Vision
  - All PyTorch VisualEncoder architectures
  - Vision Transformer
  - TorchXRayVision
- Losses
  - All PyTorch losses
  - ConVIRT loss
  - GLoRIA loss
  - InfoNCE loss
  - SuperLoss
- Reinforcement Learning
  - Self-critical Sequence Training (HuggingFace compliant) :fire:
  - PPO optimization (HuggingFace compliant)
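
To give a feel for the contrastive losses listed above, here is a minimal, dependency-free sketch of a symmetric InfoNCE objective over paired image and text embeddings. It is an illustration only and not ViLMedic's implementation: the function names (`cosine_similarity`, `info_nce`, `symmetric_info_nce`), the toy embeddings, and the temperature of 0.1 are all assumptions made for this example, and the equal weighting of the two directions is just one common choice.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(queries, keys, temperature=0.1):
    """One-directional InfoNCE: queries[i] should match keys[i].

    Every other key in the batch acts as a negative for queries[i].
    (Hypothetical helper for illustration, not a ViLMedic API.)
    """
    total = 0.0
    for i, q in enumerate(queries):
        logits = [cosine_similarity(q, k) / temperature for k in keys]
        # Numerically stable log-sum-exp over this row of logits.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-softmax evaluated at the positive pair.
        total += log_denominator - logits[i]
    return total / len(queries)


def symmetric_info_nce(image_embs, text_embs, temperature=0.1):
    """Equal-weight average of the image-to-text and text-to-image terms."""
    return 0.5 * (info_nce(image_embs, text_embs, temperature)
                  + info_nce(text_embs, image_embs, temperature))


# Toy batch (made-up values): pair i of images and texts points the same way.
image_embs = [[1.0, 0.0], [0.0, 1.0]]
aligned_text = [[0.9, 0.1], [0.1, 0.9]]
shuffled_text = [[0.1, 0.9], [0.9, 0.1]]

aligned_loss = symmetric_info_nce(image_embs, aligned_text)
shuffled_loss = symmetric_info_nce(image_embs, shuffled_text)
print(aligned_loss < shuffled_loss)  # correctly paired batches give the lower loss
```

In a real training loop the embeddings would come from the vision and language encoders, and the loss would be computed on tensors with automatic differentiation rather than on Python lists.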

Citation

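If you use ViLMedic in your work, or use any models published in ViLMedic, please cite the ViLMedic paper (Delbrouck et al., ACL 2022 System Demonstrations); its BibTeX entry is given near the top of this page.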

License

ViLMedic is MIT-licensed. The license applies to the pre-trained models as well.