Awesome-Healthcare-Foundation-Models

A curated list of awesome large AI models (LAMs), or foundation models, in healthcare. We organize the current LAMs into four categories: large language models (LLMs), large vision models (LVMs), large audio models, and large multi-modal models (LMMs). The areas in which these LAMs are applied include, but are not limited to, bioinformatics, medical diagnosis, medical imaging, medical informatics, medical education, public health, and medical robotics.

We welcome contributions to this repository to add more resources. Please submit a pull request if you want to contribute!

News

We are excited to announce an IEEE J-BHI special issue on Biomedical and Health Foundation Models. Please refer to the call for papers for more details.

Topics of interest include, but are not limited to:

  1. Basic research on new theories, principles, and structures of biomedical and health foundation models
  2. Basic research on the interpretability and explainability of biomedical and health foundation models
  3. Prompt engineering in biomedical and health foundation models
  4. Data engineering in biomedical and health foundation models
  5. Large-scale biomedical and health datasets
  6. Multi-modal learning and alignment for biomedical and health foundation models
  7. Efficient computing for biomedical and health foundation models
  8. Adversarial robustness of biomedical and health foundation models
  9. Applications of foundation models in biomedical and health informatics
  10. New evaluation paradigms for biomedical and health foundation models
  11. New computer systems for biomedical and health foundation models
  12. Decentralised methods for developing and deploying biomedical and health foundation models
  13. Foundation model ethics, safety, privacy, and regulations in biomedicine and healthcare

Please help spread the word and contribute if you are interested or already working on these topics!

Table of Contents

Survey

This repository is largely based on the following paper:

Large AI Models in Health Informatics: Applications, Challenges, and the Future
Jianing Qiu, Lin Li, Jiankai Sun, Jiachuan Peng, Peilun Shi, Ruiyang Zhang, Yinzhao Dong, Kyle Lam, Frank P.-W. Lo, Bo Xiao, Wu Yuan, Ningli Wang, Dong Xu, and Benny Lo

If you find this repository helpful, please consider citing:

@article{qiu2023large,
  title={Large ai models in health informatics: Applications, challenges, and the future},
  author={Qiu, Jianing and Li, Lin and Sun, Jiankai and Peng, Jiachuan and Shi, Peilun and Zhang, Ruiyang and Dong, Yinzhao and Lam, Kyle and Lo, Frank P-W and Xiao, Bo and others},
  journal={IEEE Journal of Biomedical and Health Informatics},
  year={2023},
  publisher={IEEE}
}

Large Language Models

Healthcare Domain

General Domain

Large Vision Models

Healthcare Domain

General Domain

CNNs:

Vision Transformers:

CNNs + ViTs:

Large Audio Models

Healthcare Domain

General Domain

Large Multi-modal Models

Healthcare Domain

General Domain

Multi-modal Chatbot

Representation learning:

Text-to-image generation:

Applications of Large AI Models in Healthcare

Note that some of the following models were not targeted at healthcare applications initially but may have the potential to be transferred to the healthcare domain or inspire future development.

Bioinformatics

Medical Diagnosis

Medical Imaging

Medical Informatics

Medical Education

Public Health

Medical Robotics

AI Legislation

Large-scale Datasets in Biomedical and Health Informatics

Open Source

| Dataset | Description |
| --- | --- |
| Big Fantastic Database | 2.1B protein sequences, 393B amino acids |
| Observed Antibody Space | 558M antibody sequences |
| RNAcentral | 34M ncRNA sequences, 22M secondary structures |
| ZINC20 | 1.4B compounds from 310 catalogs from 150 companies |
| MIMIC-CXR | 65K patients, 337K chest X-ray images, and 227K radiology reports |
| MedMNIST v2 | 708K 2D medical images, 10K 3D medical images |
| Medical Meadow | 1.5M data points covering a wide range of medical language processing tasks |
| Endo-FM database | 33K endoscopic videos, up to 5M frames |
| SurgVLP database | 25K laparoscopic video-text pairs from 1K surgical lecture videos |

Private or Upon Approval

| Dataset | Description |
| --- | --- |
| Mount Sinai ECG Data | 2.1M patients, containing 8.5M discrete ECG recordings |
| Google DR Dev. Dataset | 239K unique individuals, 1.6M fundus images |
| UF Health IDR Clinical Note Database | 290M clinical notes, with up to 82B medical words |
| Clinical Practice Research Datalink | 11.3M patients, covering data on demographics, symptoms, diagnoses, etc. |