Advancing Radiograph Representation Learning with Masked Record Modeling (MRM)

This repository contains the official implementation of the paper Advancing Radiograph Representation Learning with Masked Record Modeling (ICLR'23).

Some code is borrowed from MAE, huggingface, and REFERS.

1 Environmental preparation and quick start

Environmental requirements

If you use Anaconda or Miniconda, the following commands prepare the environment for pre-training and for fine-tuning on classification:

  conda env create -f environment.yaml
  pip install -r requirements.txt

2 How to load the pre-trained model

Download the pre-trained weights first. The snippet below expects them at ./MRM.pth.

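import torch
import torch.nn as nn
from functools import partial
import timm
assert timm.__version__ == "0.6.12", "this snippet was written against timm 0.6.12"
from timm.models.vision_transformer import VisionTransformer

def vit_base_patch16(**kwargs):
    # ViT-B/16: timm's VisionTransformer defaults (patch size 16, width 768,
    # depth 12, 12 heads) already match the base configuration.
    model = VisionTransformer(norm_layer=partial(nn.LayerNorm, eps=1e-6), **kwargs)
    return model

# model definition: 14 output classes, stochastic depth 0.1, average-pooled tokens
model = vit_base_patch16(num_classes=14, drop_path_rate=0.1, global_pool="avg")
# the checkpoint stores the weights under the "model" key
checkpoint_model = torch.load("./MRM.pth", map_location="cpu")["model"]
# load the pre-trained model; strict=False tolerates keys that do not match
model.load_state_dict(checkpoint_model, strict=False)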
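
Because strict=False lets load_state_dict skip parameter names that do not match instead of raising an error, it can be worth checking which names were actually covered by the checkpoint. The following is a minimal sketch in plain Python; the helper name compare_keys and the toy key names are hypothetical and not part of this repository.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) parameter names, both sorted.

    missing:    names the model has but the checkpoint lacks
    unexpected: names the checkpoint has but the model lacks
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Toy example with made-up key names (not the real MRM checkpoint contents).
missing, unexpected = compare_keys(
    ["patch_embed.proj.weight", "head.weight", "head.bias"],
    ["patch_embed.proj.weight", "decoder.weight"],
)
print(missing)     # ['head.bias', 'head.weight']
print(unexpected)  # ['decoder.weight']
```

With the objects from the snippet above, the call would be compare_keys(model.state_dict().keys(), checkpoint_model.keys()). PyTorch's load_state_dict also returns the same information in its missing_keys and unexpected_keys fields.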

3 Pre-training

3.1 Data preparation for pre-training

      image_path, report_content
      /path/to/img1.jpg, FINAL REPORT  EXAMINATION: ...
      /path/to/img2.jpg, FINAL REPORT  CHEST: ...
      ...,...
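
Report text usually contains commas, so the report column needs quoting when the file is written. The sketch below uses Python's csv module to produce a two-column file of the shape shown above. The helper name write_pairs, the placeholder report text, and the whitespace collapsing are assumptions for illustration; the exact header spelling and quoting that the pre-training data loader expects should be checked against the repository code.

```python
import csv
import io


def write_pairs(pairs, handle):
    """Write (image_path, report_text) pairs as CSV rows with a header."""
    writer = csv.writer(handle, quoting=csv.QUOTE_MINIMAL)
    writer.writerow(["image_path", "report_content"])
    for image_path, report in pairs:
        # Collapse runs of whitespace (including line breaks) into single
        # spaces so that each pair stays on one physical line.
        writer.writerow([image_path, " ".join(report.split())])


# Placeholder paths and report text, for illustration only.
pairs = [
    ("/path/to/img1.jpg", "FINAL REPORT\nEXAMINATION: chest, two views"),
    ("/path/to/img2.jpg", "FINAL REPORT  CHEST: no acute findings"),
]

buffer = io.StringIO()
write_pairs(pairs, buffer)

# Reading the text back recovers the original two columns, commas included.
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
print(rows[1])
```

To write to disk instead of an in-memory buffer, pass a file opened with open(path, "w", newline="") as the handle.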

3.2 Start pre-training

      chmod a+x run.sh
      ./run.sh

4 Fine-tuning of classification (using the NIH ChestX-ray 14 dataset as an example)

4.1 Data preparation

      NIH_ChestX-ray/
            all_classes/
                  xxxx1.png
                  xxxx2.png
                  ...
                  xxxxn.png
            train_1.txt
            train_10.txt
            train_list.txt
            val_list.txt
            test_list.txt
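
Before launching fine-tuning, a quick check that the dataset root matches the layout above can save a failed run. This is a minimal sketch; the helper name missing_entries is hypothetical, and the list of expected names covers only some of the entries shown in the layout, so further subset lists can be appended as needed.

```python
from pathlib import Path
import tempfile

# Entries taken from the layout above; further subset lists can be appended.
EXPECTED = ["all_classes", "train_1.txt", "train_list.txt", "val_list.txt", "test_list.txt"]


def missing_entries(root, expected=EXPECTED):
    """Return the expected names that do not exist directly under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


# Demonstration on a temporary directory that lacks test_list.txt.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp) / "NIH_ChestX-ray"
    (demo_root / "all_classes").mkdir(parents=True)
    for name in ["train_1.txt", "train_list.txt", "val_list.txt"]:
        (demo_root / name).touch()
    still_missing = missing_entries(demo_root)

print(still_missing)  # ['test_list.txt']
```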

4.2 Start fine-tuning (using 1 percent of the data as an example)

      chmod a+x finetuning_1percent.sh
      ./finetuning_1percent.sh

4.3 More fine-tuning hyperparameters

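| RSNA | warm-up steps | total steps | learning rate |
|------|---------------|-------------|---------------|
| 1%   | 50            | 2000        | 3e-3          |
| 10%  | 200           | 10000       | 5e-4          |
| 100% | 2000          | 50000       | 5e-4          |

| CheXpert | warm-up steps | total steps | learning rate |
|----------|---------------|-------------|---------------|
| 1%       | 150           | 2000        | 3e-3          |
| 10%      | 1500          | 60000       | 5e-4          |
| 100%     | 15000         | 200000      | 5e-4          |

| Covid | warm-up steps | total steps | learning rate |
|-------|---------------|-------------|---------------|
| 100%  | 50            | 1000        | 3e-2          |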
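
To make the role of the warm-up steps and total steps concrete, the sketch below computes a learning rate per step from the values listed above. It assumes a linear warm-up to the listed learning rate followed by cosine decay to zero; that decay shape is an assumption, and the fine-tuning scripts in this repository may use a different rule. The names SETTINGS and lr_at_step are hypothetical.

```python
import math

# (warm-up steps, total steps, peak learning rate), copied from the values above.
SETTINGS = {
    ("RSNA", "1%"): (50, 2000, 3e-3),
    ("RSNA", "10%"): (200, 10000, 5e-4),
    ("RSNA", "100%"): (2000, 50000, 5e-4),
    ("CheXpert", "1%"): (150, 2000, 3e-3),
    ("CheXpert", "10%"): (1500, 60000, 5e-4),
    ("CheXpert", "100%"): (15000, 200000, 5e-4),
    ("Covid", "100%"): (50, 1000, 3e-2),
}


def lr_at_step(step, warmup_steps, total_steps, peak_lr):
    """Linear warm-up to peak_lr, then cosine decay to 0 (assumed shape)."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


warmup, total, peak = SETTINGS[("RSNA", "1%")]
for step in (0, 25, 50, 1025, 2000):
    print(step, lr_at_step(step, warmup, total, peak))
```

Under this assumed shape, the rate rises from zero to the listed value over the warm-up steps and returns to zero at the final step.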

5 Fine-tuning of segmentation

5.1 Data preparation

      siim/
            images/
                  training/
                        xxxx1.png
                        xxxx2.png
                        ...
                        xxxxn.png
                  validation/
                        ...
                  test/
                        ...

            annotations/
                  training/
                        xxxx1.png
                        xxxx2.png
                        ...
                        xxxxn.png
                  validation/
                        ...
                  test/
                        ...
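
A mismatch between images and annotations is easy to miss in a layout like this. The sketch below lists files that appear in only one of the two directories for a given split. It assumes that an image and its annotation share the same file name, which the layout above suggests but does not guarantee; the helper name unpaired_files is hypothetical.

```python
from pathlib import Path
import tempfile


def unpaired_files(root, split):
    """Return names found in only one of images/<split> and annotations/<split>."""
    root = Path(root)
    images = {p.name for p in (root / "images" / split).glob("*.png")}
    masks = {p.name for p in (root / "annotations" / split).glob("*.png")}
    return sorted(images ^ masks)


# Demonstration on a temporary directory with one image that has no annotation.
with tempfile.TemporaryDirectory() as tmp:
    siim_root = Path(tmp) / "siim"
    for kind in ("images", "annotations"):
        (siim_root / kind / "training").mkdir(parents=True)
    (siim_root / "images" / "training" / "a.png").touch()
    (siim_root / "annotations" / "training" / "a.png").touch()
    (siim_root / "images" / "training" / "b.png").touch()  # no matching annotation
    unpaired = unpaired_files(siim_root, "training")

print(unpaired)  # ['b.png']
```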

5.2 Necessary files for segmentation

We conduct all segmentation experiments with MMSegmentation (version 0.25.0). Set up its environment and familiarize yourself with its code structure in advance.

The directory Siim_Segmentation provides the configuration files needed to reproduce the experiments. After modifying the MMSegmentation framework with the provided files, start fine-tuning and evaluation with ft.sh and test.sh, respectively.

6 Links to download datasets

7 Dataset splits

In the directory DatasetsSplits, we provide dataset splits that may be helpful for organizing the datasets.

We give the train/valid/test splits of CheXpert, NIH ChestX-ray, and RSNA Pneumonia.

For COVID-19 Image Data Collection, we randomly split the data into train/valid/test sets 5 times and provide the images in the images directory.
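
For readers who want to build comparable splits on their own data, the sketch below shows one way to draw several independent random train/valid/test partitions with fixed seeds. It is not the procedure used to produce the provided splits: the ratios, seeds, file names, and the helper name random_splits are all assumptions for illustration.

```python
import random


def random_splits(items, n_splits=5, ratios=(0.7, 0.1, 0.2), base_seed=0):
    """Return n_splits independent (train, valid, test) partitions of items."""
    items = sorted(items)  # fixed starting order, so seeds are reproducible
    n_train = round(len(items) * ratios[0])
    n_valid = round(len(items) * ratios[1])
    splits = []
    for i in range(n_splits):
        shuffled = items[:]
        random.Random(base_seed + i).shuffle(shuffled)
        splits.append((
            shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:],  # the remainder forms the test set
        ))
    return splits


# Hypothetical file names, for illustration only.
names = [f"img_{i:03d}.png" for i in range(20)]
splits = random_splits(names)
train, valid, test = splits[0]
print(len(splits), len(train), len(valid), len(test))
```

Each partition covers every item exactly once, so results averaged over the repetitions use all of the data in each run.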

For SIIM-ACR_Pneumothorax, please organize the directories of images and annotations as described in section 5.1, following the given splits.