MAE-Lite

A Closer Look at Self-Supervised Lightweight Vision Transformers
Shaoru Wang, Jin Gao*, Zeming Li, Xiaoqin Zhang, Weiming Hu
ICML 2023

News

Introduction

MAE-Lite focuses on exploring the pre-training of lightweight Vision Transformers (ViTs). This repo provides the code and models for the studies in the paper.

Getting Started

Installation

Set up the conda environment:

# Create environment
conda create -n mae-lite python=3.7 -y
conda activate mae-lite

# Install PyTorch and torchvision
conda install pytorch==1.9.0 torchvision==0.10.0 -c pytorch -y

# Clone MAE-Lite
git clone https://github.com/wangsr126/mae-lite.git
cd mae-lite

# Install other requirements
pip3 install -r requirements.txt
python3 setup.py build develop --user
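
As a quick optional sanity check, confirm that PyTorch imports and reports whether a GPU is visible:

# Optional sanity check: PyTorch version and GPU visibility
python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"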

Data Preparation

Prepare the ImageNet data under <BASE_FOLDER>/data/imagenet/imagenet_train and <BASE_FOLDER>/data/imagenet/imagenet_val.
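
A sketch of the expected layout, assuming the standard ImageFolder convention (one sub-directory per class); if your ImageNet copy already lives elsewhere, symlinks are usually enough:

# Assumed layout (standard ImageFolder convention, one sub-directory per class):
#   <BASE_FOLDER>/data/imagenet/imagenet_train/n01440764/*.JPEG
#   <BASE_FOLDER>/data/imagenet/imagenet_val/n01440764/*.JPEG
# Symlink an existing ImageNet copy into place:
ln -s /path/to/imagenet/train <BASE_FOLDER>/data/imagenet/imagenet_train
ln -s /path/to/imagenet/val <BASE_FOLDER>/data/imagenet/imagenet_val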

Pre-Training

To pre-train ViT-Tiny with our recommended MAE recipe:

# batch size 4096 on 8 GPUs:
cd projects/mae_lite
ssl_train -b 4096 -d 0-7 -e 400 -f mae_lite_exp.py --amp \
--exp-options exp_name=mae_lite/mae_tiny_400e
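
If fewer GPUs are available, the -d flag takes a device range (as in -d 0-7 above); a hypothetical 4-GPU variant is sketched below, though whether the 4096 global batch size still fits in memory is hardware-dependent:

# Hypothetical 4-GPU variant (unverified; memory permitting):
ssl_train -b 4096 -d 0-3 -e 400 -f mae_lite_exp.py --amp \
--exp-options exp_name=mae_lite/mae_tiny_400e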

Fine-Tuning on ImageNet

Please download the pre-trained models, e.g.,

Download MAE-Tiny to <BASE_FOLDER>/checkpoints/mae_tiny_400e.pth.tar
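
Optionally, verify that the download deserializes cleanly before fine-tuning (a minimal sketch; the top-level key names inside the checkpoint are not documented here, so the print is purely for inspection):

# Optional: confirm the checkpoint loads and list its top-level keys
python3 -c "import torch; ckpt = torch.load('<BASE_FOLDER>/checkpoints/mae_tiny_400e.pth.tar', map_location='cpu'); print(list(ckpt.keys()))"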

To fine-tune with the improved recipe:

# batch size 1024 on 8 GPUs:
cd projects/eval_tools
ssl_train -b 1024 -d 0-7 -e 300 -f finetuning_exp.py --amp \
[--ckpt <checkpoint-path>] --exp-options pretrain_exp_name=mae_lite/mae_tiny_400e
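
For example, pointing --ckpt at the checkpoint downloaded above (assuming --ckpt simply overrides the default checkpoint lookup for the given pretrain_exp_name):

# Example with an explicit checkpoint path:
ssl_train -b 1024 -d 0-7 -e 300 -f finetuning_exp.py --amp \
--ckpt <BASE_FOLDER>/checkpoints/mae_tiny_400e.pth.tar \
--exp-options pretrain_exp_name=mae_lite/mae_tiny_400e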

Evaluation of Fine-Tuned Models

Download MAE-Tiny-FT to <BASE_FOLDER>/checkpoints/mae_tiny_400e_ft_300e.pth.tar

# batch size 1024 on 1 GPU:
python mae_lite/tools/eval.py -b 1024 -d 0 -f projects/eval_tools/finetuning_exp.py \
--ckpt <BASE_FOLDER>/checkpoints/mae_tiny_400e_ft_300e.pth.tar \
--exp-options pretrain_exp_name=mae_lite/mae_tiny_400e/ft_eval

And you will get "Top1: 77.978" if all right.

Download MAE-Tiny-FT-RPE to <BASE_FOLDER>/checkpoints/mae_tiny_400e_ft_rpe_1000e.pth.tar

# batch size 1024 on 1 GPU:
python mae_lite/tools/eval.py -b 1024 -d 0 -f projects/eval_tools/finetuning_rpe_exp.py \
--ckpt <BASE_FOLDER>/checkpoints/mae_tiny_400e_ft_rpe_1000e.pth.tar \
--exp-options pretrain_exp_name=mae_lite/mae_tiny_400e/ft_rpe_eval

And you will get "Top1: 79.002" if all right.

Pre-Training with Distillation

Please refer to DISTILL.md.

Transfer to Other Datasets

Please refer to TRANSFER.md.

Transfer to Detection Tasks

Please refer to DETECTION.md.

Experiments of MoCo-v3

Please refer to MOCOV3.md.

Model Analysis Tools

Please refer to VISUAL.md.

Main Results

| pre-train code | pre-train epochs | fine-tune recipe | fine-tune epochs | accuracy | ckpt |
|---|---|---|---|---|---|
| - | - | impr. | 300 | 75.8 | link |
| mae_lite | 400 | - | - | - | link |
| mae_lite | 400 | impr. | 300 | 78.0 | link |
| mae_lite | 400 | impr.+RPE | 1000 | 79.0 | link |
| mae_lite_distill | 400 | - | - | - | link |
| mae_lite_distill | 400 | impr. | 300 | 78.4 | link |

Citation

Please cite the following paper if this repo helps your research:

@article{wang2023closer,
      title={A Closer Look at Self-Supervised Lightweight Vision Transformers},
      author={Shaoru Wang and Jin Gao and Zeming Li and Xiaoqin Zhang and Weiming Hu},
      journal={arXiv preprint arXiv:2205.14443},
      year={2023},
}

Acknowledgements

We thank the authors of timm, MAE, and MoCo-v3 for their code implementations.

License

This repo is released under the Apache 2.0 license. Please see the LICENSE file for more information.