DynaMixer: A Vision MLP Architecture with Dynamic Mixing [arxiv]
This is a PyTorch implementation of our paper DynaMixer (ICML 2022).
Comparison with Recent MLP-like Models
Model | Sub-model | Parameters | Top-1 Acc. (%) |
---|---|---|---|
Cycle-MLP | T | 28M | 81.3 |
ViP | Small/7 | 25M | 81.5 |
Hire-MLP | S | 33M | 82.1 |
DynaMixer | S | 26M | 82.7 |
Cycle-MLP | S | 50M | 82.9 |
ViP | Medium/7 | 55M | 82.7 |
Hire-MLP | B | 58M | 83.2 |
DynaMixer | M | 57M | 83.7 |
Cycle-MLP | B | 88M | 83.4 |
ViP | Large/7 | 88M | 83.2 |
Hire-MLP | L | 96M | 83.8 |
DynaMixer | L | 97M | 84.3 |
Requirements
Environment
torch==1.9.0
torchvision>=0.10.0
pyyaml
timm==0.4.12
fvcore
apex (only needed if you train with --apex-amp)
Data
Data preparation: ImageNet with the following folder structure. You can extract ImageNet with this script.
│imagenet/
├──train/
│ ├── n01440764
│ │ ├── n01440764_10026.JPEG
│ │ ├── n01440764_10027.JPEG
│ │ ├── ......
│ ├── ......
├──val/
│ ├── n01440764
│ │ ├── ILSVRC2012_val_00000293.JPEG
│ │ ├── ILSVRC2012_val_00002138.JPEG
│ │ ├── ......
│ ├── ......
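Before launching a long run, it can help to confirm that the dataset directory matches the layout above. The following is a minimal sketch using only the Python standard library. The function name `check_imagenet_layout` is hypothetical and not part of this repository, and it only checks that `train/` and `val/` exist and that each class folder contains at least one `.JPEG` file.

```python
from pathlib import Path


def check_imagenet_layout(root):
    """Return a list of problems found under `root` (an empty list means OK).

    Hypothetical helper, not part of this repo. It only checks the layout
    shown above: root/{train,val}/<class_dir>/*.JPEG.
    """
    problems = []
    root = Path(root)
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing directory: {split_dir}")
            continue
        class_dirs = sorted(p for p in split_dir.iterdir() if p.is_dir())
        if not class_dirs:
            problems.append(f"no class folders in: {split_dir}")
        for class_dir in class_dirs:
            # Each class folder (e.g. n01440764) should hold at least one
            # file with a .JPEG extension (matched case-insensitively).
            if not any(f.suffix.upper() == ".JPEG" for f in class_dir.iterdir()):
                problems.append(f"no .JPEG files in: {class_dir}")
    return problems
```

Calling `check_imagenet_layout("/path/to/imagenet")` returns an empty list when both splits look as expected, and otherwise a list of human-readable messages that can be printed before training starts.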
Training
Command line for training on 8 GPUs (V100)
train dynamixer_s:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 ./distributed_train.sh 8 /path/to/imagenet --model dynamixer_s -b 256 -j 8 --opt adamw --epochs 300 --sched cosine --apex-amp --img-size 224 --drop-path 0.1 --lr 2e-3 --weight-decay 0.05 --remode pixel --reprob 0.25 --aa rand-m9-mstd0.5-inc1 --smoothing 0.1 --mixup 0.8 --cutmix 1.0 --warmup-lr 1e-6 --warmup-epochs 20
train dynamixer_m:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 ./distributed_train.sh 8 /path/to/imagenet --model dynamixer_m -b 128 -j 8 --opt adamw --epochs 300 --sched cosine --apex-amp --img-size 224 --drop-path 0.1 --lr 2e-3 --weight-decay 0.05 --remode pixel --reprob 0.25 --aa rand-m9-mstd0.5-inc1 --smoothing 0.1 --mixup 0.8 --cutmix 1.0 --warmup-lr 1e-6 --warmup-epochs 20
train dynamixer_l:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 ./distributed_train.sh 8 /path/to/imagenet --model dynamixer_l -b 64 -j 8 --opt adamw --epochs 300 --sched cosine --apex-amp --img-size 224 --drop-path 0.3 --lr 2e-3 --weight-decay 0.05 --remode pixel --reprob 0.25 --aa rand-m9-mstd0.5-inc1 --smoothing 0.1 --mixup 0.8 --cutmix 1.0 --warmup-lr 1e-6 --warmup-epochs 20
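The three commands above share the same schedule flags (--sched cosine, --lr 2e-3, --warmup-lr 1e-6, --warmup-epochs 20, --epochs 300) and differ in --model, -b and, for dynamixer_l, --drop-path. The sketch below illustrates what those flags imply under two assumptions that should be checked against the training script: that -b is the per-GPU batch size, and that the schedule is a linear warmup followed by a cosine decay. The helper names and the `min_lr` default of 0.0 are illustrative. This is a generic schedule, not timm's exact scheduler, which has further options (such as a minimum learning rate and cooldown epochs) that are not modeled here.

```python
import math


def global_batch_size(per_gpu_batch, num_gpus=8):
    # Assumption: -b is the per-GPU batch size, so the global batch
    # is the per-GPU batch multiplied by the number of GPUs.
    return per_gpu_batch * num_gpus


def lr_at_epoch(epoch, base_lr=2e-3, warmup_lr=1e-6, warmup_epochs=20,
                total_epochs=300, min_lr=0.0):
    """Generic linear-warmup + cosine-decay schedule (illustrative only)."""
    if epoch < warmup_epochs:
        # Linear ramp from warmup_lr at epoch 0 to base_lr at the end of warmup.
        return warmup_lr + (base_lr - warmup_lr) * epoch / warmup_epochs
    # Cosine decay from base_lr down to min_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for model, b in (("dynamixer_s", 256), ("dynamixer_m", 128), ("dynamixer_l", 64)):
        print(model, "global batch:", global_batch_size(b))
    for epoch in (0, 10, 20, 160, 300):
        print(f"epoch {epoch:3d}: lr = {lr_at_epoch(epoch):.6f}")
```

If the per-GPU assumption holds, the global batch sizes are 2048, 1024 and 512 for dynamixer_s, dynamixer_m and dynamixer_l respectively, while the peak learning rate stays at 2e-3 in all three commands. When training on fewer GPUs, the global batch shrinks accordingly, so the learning rate may need retuning.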
Reference
You may want to cite:
@inproceedings{wang2022dynamixer,
title={Dynamixer: a vision MLP architecture with dynamic mixing},
author={Wang, Ziyu and Jiang, Wenhao and Zhu, Yiming M and Yuan, Li and Song, Yibing and Liu, Wei},
booktitle={International Conference on Machine Learning},
pages={22691--22701},
year={2022},
organization={PMLR}
}
Acknowledgement
The code is based on the following repos:
https://github.com/Andrew-Qibin/VisionPermutator
https://github.com/ShoufaChen/CycleMLP
Thanks for their wonderful work.
License
DynaMixer is released under the MIT License.