Demo project for training DeiT with Mesa

For more details, please check out our paper: Mesa: A Memory-saving Training Framework for Transformers.

Usage

  1. Install Mesa from here.
  2. Install timm.
    pip install timm==0.3.2
    
  3. Train a model.
    conda activate mesa
    bash scripts/run.sh [model] [gpus]
    
    # For example, to train DeiT-Ti with Mesa on 1 GPU:
    bash scripts/run.sh deit_tiny_patch16_224 1

    # You may need to change `--data-path` and `--data-set` (CIFAR or IMNET) in
    # scripts/run.sh so that they match your dataset and its location.
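
Because step 2 pins timm to 0.3.2, it can help to confirm the installed version before launching a run. The following is a minimal sketch that uses only the Python standard library; the helper name `timm_status` is hypothetical and is not part of this repository or of timm.

```python
from importlib import metadata

# Version pinned in the install step of this README.
REQUIRED_TIMM = "0.3.2"


def timm_status(required: str = REQUIRED_TIMM) -> str:
    """Return 'ok', 'missing', or 'mismatch: <installed version>'.

    Hypothetical helper for illustration; it only reads package metadata
    and does not import timm itself.
    """
    try:
        installed = metadata.version("timm")
    except metadata.PackageNotFoundError:
        return "missing"
    return "ok" if installed == required else f"mismatch: {installed}"


if __name__ == "__main__":
    print(f"timm (expected {REQUIRED_TIMM}): {timm_status()}")
```

Run it inside the activated `mesa` environment; any result other than `ok` suggests repeating step 2.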
    

Results on ImageNet

Model            Param (M)  FLOPs (G)  Train Memory  Top-1 (%)
DeiT-Ti          5          1.3        4,171         71.9
DeiT-Ti w/ Mesa  5          1.3        1,858         72.1
DeiT-S           22         4.6        8,459         79.8
DeiT-S w/ Mesa   22         4.6        3,840         80.0
DeiT-B           86         17.5       17,691        81.8
DeiT-B w/ Mesa   86         17.5       8,616         81.8
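
To make the memory savings in the table easier to compare across models, the relative reduction can be computed directly from the Train Memory column. The sketch below copies the figures from the table; the function and variable names are illustrative only and are not part of this repository.

```python
# Train Memory figures copied from the table above, in the table's own units:
# (baseline, with Mesa) for each model.
TRAIN_MEMORY = {
    "DeiT-Ti": (4171, 1858),
    "DeiT-S": (8459, 3840),
    "DeiT-B": (17691, 8616),
}


def memory_reduction_pct(baseline: float, with_mesa: float) -> float:
    """Percentage of training memory saved relative to the baseline."""
    return 100.0 * (baseline - with_mesa) / baseline


def summarize(table=TRAIN_MEMORY) -> dict:
    """Map each model name to its memory reduction, rounded to one decimal."""
    return {
        model: round(memory_reduction_pct(base, mesa), 1)
        for model, (base, mesa) in table.items()
    }


if __name__ == "__main__":
    for model, pct in summarize().items():
        print(f"{model}: {pct}% less training memory with Mesa")
```

By this calculation, Mesa cuts training memory by roughly 51 to 55 percent across the three models, while Top-1 accuracy stays the same or improves by 0.2 points.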

Acknowledgments

This repository adapts code from DeiT. We thank the authors for open-sourcing their code.