Home

Awesome

R-MAE: Regions Meet Masked Autoencoders

R-MAE

This repository is an official implementation of R-MAE: Regions Meet Masked Autoencoders in PyTorch. It is based on the BoxeR repository, but adapted for self-supervised pre-training.

Citing R-MAE

If you find R-MAE useful in your research, please consider citing:

@article{nguyen2023rmae,
  title={R-MAE: Regions Meet Masked Autoencoders},
  author={Duy{-}Kien Nguyen, Vaibhav Aggarwal, Yanghao Li, Martin R. Oswald, Alexander Kirillov, Cees G. M. Snoek, Xinlei Chen},
  journal={arXiv preprint arXiv:2306.05411},
  year={2023}
}

Installation

Requirements

Usage

Dataset preparation

The datasets are assumed to exist in a directory specified by the environment variable $E2E_DATASETS. If the environment variable is not specified, it will be set to be .data. Under this directory, our code will look for datasets in the structure described below.

$E2E_DATASETS/
├── coco/
└── imnet/

For pre-training on COCO, please download COCO 2017 dataset and organize them as following:

$E2E_DATASETS/
└── coco/
	├── annotations/
		├── instances_train2017.json
		├── instances_val2017.json
		└── image_info_unlabeled2017.json
	└── image/
		├── train2017/
		├── fh_train2017/
		├── val2017/
		├── fh_val2017/
		├── unlabeled2017/
		└── fh_unlabeled2017/

You can generate FH masks for our pre-training by running create_fh_mask_for_coco.py in tools/preprocess.

For pre-training on ImageNet, please download ImageNet dataset and organize them as following:

$E2E_DATASETS/
└── imagenet/
	├── train/
	├── fh_train/
	├── val/
	└── fh_val/

You can generate FH masks for our pre-training by running create_fh_mask_for_imnet.py in tools/preprocess.

Training

Our script is able to automatically detect the number of available gpus on a single node. It works best with Slurm system when it can auto-detect the number of available gpus along with nodes. The command for pre-training R-MAE is simple as following:

python tools/run.py --config ${CONFIG_PATH} --model ${MODEL} --task ${TASK} --dataset ${DATASET}

For example,

python tools/run.py --config pretrain/config/RMAE/COCO/rmae_fh_4000eps.yaml --model rmae --task pretrain --dataset coco
python tools/run.py --config pretrain/config/RMAE/ImageNet/rmae_fh_800eps.yaml --model rmae --task pretrain --dataset imnet

Some tips to speed-up training

python tools/run.py --config ${CONFIG_PATH} --model ${MODEL} --task ${TASK} --dataset ${DATASET} dataset_config.${DATASET}_${TASK}.sampler=shard
python tools/run.py --config ${CONFIG_PATH} --model ${MODEL} --task ${TASK} --dataset ${DATASET} training.iter_per_update=2
python tools/run.py --config ${CONFIG_PATH} --model ${MODEL} --task ${TASK} --dataset ${DATASET} training.use_fp16=(float16 or bfloat16)

License

This project is under the CC-BY-NC 4.0 license. See LICENSE for details.