FocusMAE

This is the official implementation of the CVPR 2024 paper "FocusMAE: Gallbladder Cancer Detection from Ultrasound Videos with Focused Masked Autoencoders".

DATA PREPARATION

Refer to the instructions in the VideoMAE v2 repository for this step.

Additionally, we suggest using the `--test_randomization` flag during testing for best results.

For the region priors from the Faster R-CNN model, obtain the region proposals as one JSON file per video using this code.

Then specify the path to the folder containing the JSON files in the dataloader.
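As a rough illustration of wiring the region priors into a dataloader, the sketch below loads the per-video proposal JSON and filters it by confidence. The file naming (`<video_id>.json`), the `[x1, y1, x2, y2, score]` entry format, and the 0.5 threshold are all assumptions for illustration — match them to what your detector actually writes out.

```python
import json
from pathlib import Path


def load_region_proposals(json_dir, video_id, min_score=0.5):
    """Load Faster R-CNN region proposals for one video.

    Assumes one JSON file per video named <video_id>.json containing a
    list of [x1, y1, x2, y2, score] entries (illustrative format).
    """
    path = Path(json_dir) / f"{video_id}.json"
    with open(path) as f:
        proposals = json.load(f)
    # Keep only confident boxes to serve as region priors for masking.
    return [p for p in proposals if p[4] >= min_score]
```

A dataloader would typically call this once per video and use the surviving boxes to bias the masking toward the proposed regions.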

DATASET

We contribute additional videos to our ultrasound video dataset (GBUSV). The complete dataset comprises 59 videos with malignancy and 32 benign videos. The dataset download instructions are available at this link.

The COVID-19 CT dataset can be obtained here.

INSTALLATION

The required packages are listed in requirements.txt. Run the following commands to set up the environment:

```bash
conda create --name videomae python=3.8 -y
conda activate videomae
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 -c pytorch
pip install -r requirements.txt
```
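To confirm the pinned versions above actually landed in the environment, a small helper like the one below (a generic sketch, not part of the repository) can check installed package versions without crashing when a package is missing:

```python
from importlib import metadata


def check_version(pkg, expected_prefix):
    """Return True if pkg is installed and its version starts with expected_prefix.

    Example checks against the pins above: check_version("torch", "1.12.1"),
    check_version("torchvision", "0.13.1").
    """
    try:
        return metadata.version(pkg).startswith(expected_prefix)
    except metadata.PackageNotFoundError:
        # Package not installed at all.
        return False
```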


USAGE INSTRUCTIONS

The scripts folder contains files for fine-tuning and pre-training.

In each script, specify the following:

- `OUTPUT_DIR`: directory where checkpoints and logs are saved
- `MODEL_PATH`: path to the model checkpoint to load

Our pre-trained models and checkpoints can be downloaded from this link: CVPR Weights

| Model Name | Link |
| --- | --- |
| Pre-trained model for GBC dataset | https://tinyurl.com/3s6567c3 |
| Fine-tuning ckpt - Fold_0 GBC dataset | https://tinyurl.com/4y2phujr |
| Fine-tuning ckpt - Fold_1 GBC dataset | https://tinyurl.com/ajazhb79 |
| Fine-tuning ckpt - Fold_2 GBC dataset | https://tinyurl.com/3jptv2dp |
| Fine-tuning ckpt - Fold_3 GBC dataset | https://tinyurl.com/2r9ywuzj |
| Fine-tuning ckpt - Fold_4 GBC dataset | https://tinyurl.com/25zuures |
| Pre-trained model for CT dataset | here |
| Fine-tuning ckpt - CT dataset | here |

Additionally, we provide our training and testing scripts as examples, which can be run as follows: `bash scripts/finetune_train.sh`
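As a rough sketch of what such a launcher does, the script below just sets the two paths described above and invokes a training entry point. The entry-point name, flags, and paths are placeholders, not the repository's actual values — copy the real ones from the provided scripts.

```shell
#!/usr/bin/env bash
# Hypothetical fine-tuning launcher in the spirit of scripts/finetune_train.sh.
# Everything below (entry-point name, flags, paths) is a placeholder.

OUTPUT_DIR='./output/gbc_fold0'    # where checkpoints and logs are written
MODEL_PATH='./ckpts/pretrain.pth'  # pre-trained weights to fine-tune from

mkdir -p "${OUTPUT_DIR}"

# Launch only if the entry point is present in the repository root.
if [ -f run_class_finetuning.py ]; then
    python run_class_finetuning.py \
        --finetune "${MODEL_PATH}" \
        --output_dir "${OUTPUT_DIR}" \
        --test_randomization
fi
```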

Acknowledgements

We thank the VideoMAE, VideoMAEv2, and AdaMAE authors for publicly releasing their code. We have built our codebase on top of these fabulous repositories.