Medical Masked Autoencoders

Paper

This repository provides the official implementation for training Vision Transformers (ViTs) on 2D medical imaging tasks, together with instructions for using the pre-trained ViTs, from the following paper:

<b>Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification</b> <br/> Junfei Xiao, Yutong Bai, Alan Yuille, Zongwei Zhou <br/> Johns Hopkins University <br/> IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023 <br/> paper | code

TO DO

Image reconstruction demo

<p align="center"><img src="figures/fig_reconstruction.png" width="100%"></p>

Installing Requirements

Our codebase follows the official MAE implementation and uses a few additional packages. Use one of the following sets of commands to build the environment with either Conda or pip.

Conda:

conda env create -n medical_mae -f medical_mae.yml

Pip:

conda create -n medical_mae python=3.8
conda activate medical_mae
pip install -r requirements.txt 

Preparing Datasets

The MIMIC-CXR, CheXpert, and ChestX-ray14 datasets are publicly available on their official sites, where you can download them or request access under their respective agreements.

They can also be downloaded from the following links, for research use only and subject to the official agreements.

MIMIC-CXR (JPG): https://physionet.org/content/mimic-cxr-jpg/2.0.0/

CheXpert (v1.0-small): https://www.kaggle.com/datasets/ashery/chexpert

ChestX-ray14: https://www.kaggle.com/datasets/nih-chest-xrays/data
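Because the task is multi-label, each image needs a multi-hot target vector rather than a single class index. The sketch below illustrates one way to build such a vector from the pipe-separated finding strings found in the ChestX-ray14 metadata (for example `Cardiomegaly|Effusion`, or `No Finding` for a normal study). The helper name `encode_findings` and the class order in `CLASSES` are assumptions made for illustration; this is not the repository's data loader, which may use its own split files and label ordering.

```python
from typing import List

# The 14 finding names in the ChestX-ray14 metadata.
# The ORDER here is an assumption; match it to whatever your training code expects.
CLASSES = [
    "Atelectasis", "Cardiomegaly", "Effusion", "Infiltration", "Mass",
    "Nodule", "Pneumonia", "Pneumothorax", "Consolidation", "Edema",
    "Emphysema", "Fibrosis", "Pleural_Thickening", "Hernia",
]


def encode_findings(finding_labels: str) -> List[int]:
    """Turn a pipe-separated finding string into a 14-dim multi-hot vector.

    "No Finding" (or any name not in CLASSES) contributes no positive entry,
    so a normal study maps to the all-zeros vector.
    """
    present = {name.strip() for name in finding_labels.split("|")}
    return [1 if name in present else 0 for name in CLASSES]


if __name__ == "__main__":
    print(encode_findings("Cardiomegaly|Effusion"))
    print(encode_findings("No Finding"))
```

A vector built this way can be paired with a per-class sigmoid output and a binary cross-entropy loss, which is the usual setup for multi-label classification.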

Pre-training on ImageNet or Chest X-rays

Pre-training instructions are in PRETRAIN.md.

Fine-tuning with pre-trained checkpoints

Fine-tuning instructions are in FINETUNE.md.

The following table provides the pre-trained checkpoints used in Table 1:

You can download all the weights in the following table with this link (google drive).

| Model | Pretrained Dataset | Method | Pretrained | Finetuned (NIH Chest X-ray) | mAUC |
| --- | --- | --- | --- | --- | --- |
| DenseNet-121 | ImageNet | Categorization | torchvision official | google drive | 82.2 |
| ResNet-50 | ImageNet | MoCo v2 | google drive | google drive | 80.9 |
| ResNet-50 | ImageNet | BYOL | google drive | google drive | 81.0 |
| ResNet-50 | ImageNet | SwAV | google drive | google drive | 81.5 |
| DenseNet-121 | X-rays (0.3M) | MoCo v2 | google drive | google drive | 80.6 |
| DenseNet-121 | X-rays (0.3M) | MAE | google drive | google drive | 81.2 |
| ViT-Small/16 | ImageNet | Categorization | DeiT Official | google drive | 79.6 |
| ViT-Small/16 | ImageNet | MAE | google drive | google drive | 78.6 |
| ViT-Small/16 | X-rays (0.3M) | MAE | google drive | google drive | 82.3 |
| ViT-Base/16 | X-rays (0.5M) | MAE | google drive | google drive | 83.0 |

| Model | Pretrained Dataset | Finetuned (Chest X-ray) | mAUC | Finetuned (CheXpert) | mAUC | Finetuned (COVIDx) | Accuracy |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ViT-Small/16 | X-rays (0.3M) | google drive | 82.3 | google drive | 89.2 | google drive | 95.2 |
| ViT-Base/16 | X-rays (0.5M) | google drive | 83.0 | google drive | 89.3 | google drive | 95.3 |
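The mAUC columns above are not defined in this README. Assuming mAUC denotes the ROC AUC computed separately for each disease label and then averaged over labels (reported as a percentage), it can be sketched in plain Python as follows. The function names `binary_auc` and `mean_auc` are hypothetical and written for illustration only; they are not taken from this repository's evaluation code, which may rely on a library implementation instead.

```python
from typing import List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC for one label via the rank-sum (Mann-Whitney U) formulation.

    Tied scores receive the average of the ranks they span.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined when only one class is present")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the run of equal scores pairs[i:j].
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(lbl for _, lbl in pairs[i:j])
        i = j

    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def mean_auc(y_true: List[List[int]], y_score: List[List[float]]) -> float:
    """Average the per-label AUC over all label columns (a value in [0, 1])."""
    n_classes = len(y_true[0])
    aucs = [
        binary_auc([row[c] for row in y_true], [row[c] for row in y_score])
        for c in range(n_classes)
    ]
    return sum(aucs) / n_classes


if __name__ == "__main__":
    # Toy example: 4 images, 2 labels.
    y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
    y_score = [[0.9, 0.2], [0.1, 0.8], [0.8, 0.3], [0.3, 0.7]]
    print(100.0 * mean_auc(y_true, y_score))  # per-label AUCs are 1.0 and 0.75
```

In the toy example the first label is ranked perfectly (AUC 1.0) and the second has one misordered positive/negative pair out of four (AUC 0.75), so the mean is 0.875, or 87.5 when expressed as a percentage like the table entries.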

Citation

If you use this code or our pre-trained weights in your research, please cite our paper:

@inproceedings{xiao2023delving,
  title={Delving into masked autoencoders for multi-label thorax disease classification},
  author={Xiao, Junfei and Bai, Yutong and Yuille, Alan and Zhou, Zongwei},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={3588--3600},
  year={2023}
}

License

This repository is released under the Apache 2.0 license.

Acknowledgement

This work was supported by the Lustgarten Foundation for Pancreatic Cancer Research.

Our code is built upon facebookresearch/mae.