MEAL-V2

This is the official PyTorch implementation of our paper "MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks" by Zhiqiang Shen and Marios Savvides from Carnegie Mellon University.

<div align=center> <img width=70% src="https://user-images.githubusercontent.com/3794909/92182326-6f78c400-ee19-11ea-80e4-2d6e4d73ce82.png"/> </div>

In this paper, we introduce a simple yet effective approach that boosts the vanilla ResNet-50 to 80%+ Top-1 accuracy on ImageNet without any tricks. Our method builds on the recently proposed MEAL, i.e., ensemble knowledge distillation via discriminators. We further simplify it by 1) adopting the similarity loss and discriminator only on the final outputs and 2) using the average of softmax probabilities from all teacher ensembles as the stronger supervision for distillation. A crucial aspect of our method is that the one-hot/hard label should not be used in the distillation process. We show that such a simple framework can achieve state-of-the-art results without any commonly used tricks, such as 1) architecture modification; 2) outside training data beyond ImageNet; 3) autoaug/randaug; 4) cosine learning rate; 5) mixup/cutmix training; 6) label smoothing; etc.
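The soft-label supervision described above can be sketched in a few lines. This is a minimal, framework-free illustration and not the repository's training code: it averages the softmax outputs of several teachers and measures the KL divergence to the student's prediction. The function names and toy logits are assumptions made for illustration, and the discriminator term is omitted.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_soft_label(teacher_logits):
    """Average the softmax probabilities of all teachers (one logit list per teacher)."""
    probs = [softmax(logits) for logits in teacher_logits]
    n = len(probs)
    return [sum(p[c] for p in probs) / n for c in range(len(probs[0]))]

def kl_divergence(target, student_logits):
    """KL(target || softmax(student_logits)); zero-probability targets contribute nothing."""
    q = softmax(student_logits)
    return sum(t * math.log(t / s) for t, s in zip(target, q) if t > 0)

# Hypothetical logits from two teachers and one student for a 3-class problem.
teachers = [[2.0, 1.0, 0.1], [1.5, 1.2, 0.3]]
soft = ensemble_soft_label(teachers)
loss = kl_divergence(soft, [0.5, 0.4, 0.1])
```

Note that no one-hot label appears anywhere in the loss: the averaged teacher distribution is the only target, which matches the point made above about not using hard labels.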

Citation

If you find our code helpful for your research, please cite:

@article{shen2020mealv2,
  title={MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks},
  author={Shen, Zhiqiang and Savvides, Marios},
  journal={arXiv preprint arXiv:2009.08453},
  year={2020}
}

News

[Dec. 5, 2021] New: Added FKD training support. We highly recommend using FKD for training MEAL V2 models; it is 2~4x faster with similar accuracy.

Add --cos if you would like to train with a cosine learning rate.

New: Adding tricks (cosine lr, etc.) back into MEAL V2 consistently improves accuracy:

<div align=center> <img width=45% src="images/MEAL-V2_more_tricks_top1.png"/> <img width=45% src="images/MEAL-V2_more_tricks_top5.png"/> </div>

New: Added CutMix training support; use --w-cutmix to enable it.
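For readers unfamiliar with the cosine learning rate enabled by --cos, the sketch below shows the usual half-cosine decay. It is an illustration of the general technique, not the repository's scheduler; the function name, the base learning rate of 0.01, and the 180-epoch horizon are hypothetical values chosen for the example.

```python
import math

def cosine_lr(base_lr, epoch, total_epochs, min_lr=0.0):
    """Learning rate at `epoch` under a half-cosine decay from base_lr to min_lr."""
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Hypothetical settings: decay from 0.01 toward 0 over 180 epochs.
schedule = [cosine_lr(0.01, epoch, 180) for epoch in range(180)]
```

The rate starts at base_lr, falls slowly at first, fastest around the midpoint, and flattens out as it approaches min_lr.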

[Mar. 19, 2021] The long version of MEAL V2 is available on arXiv or as a paper.

[Dec. 16, 2020] MEAL V2 is now available in PyTorch Hub.

[Nov. 3, 2020] The short version of MEAL V2 has been accepted to the NeurIPS 2020 workshop Beyond BackPropagation: Novel Ideas for Training Neural Architectures. The long version is coming soon.

Preparation

1. Requirements:

This repo is tested with:

But it should be runnable with other PyTorch versions.

2. Data:

Results & Models

We provide pre-trained models obtained with different training settings. The table reports the training/validation resolution, number of parameters, and Top-1 and Top-5 accuracy on the ImageNet validation set:

| Models | Resolution | #Parameters | Top-1/Top-5 | Trained models |
| --- | --- | --- | --- | --- |
| MEAL-V1 w/ ResNet50 | 224 | 25.6M | 78.21/94.01 | GitHub |
| MEAL-V2 w/ ResNet18 | 224 | 11.7M | 73.19/90.82 | Download (46.8M) |
| MEAL-V2 w/ ResNet50 | 224 | 25.6M | 80.67/95.09 | Download (102.6M) |
| MEAL-V2 w/ ResNet50 | 380 | 25.6M | 81.72/95.81 | Download (102.6M) |
| MEAL-V2 + CutMix w/ ResNet50 | 224 | 25.6M | 80.98/95.35 | Download (102.6M) |
| MEAL-V2 w/ MobileNet V3-Small 0.75 | 224 | 2.04M | 67.60/87.23 | Download (8.3M) |
| MEAL-V2 w/ MobileNet V3-Small 1.0 | 224 | 2.54M | 69.65/88.71 | Download (10.3M) |
| MEAL-V2 w/ MobileNet V3-Large 1.0 | 224 | 5.48M | 76.92/93.32 | Download (22.1M) |
| MEAL-V2 w/ EfficientNet-B0 | 224 | 5.29M | 78.29/93.95 | Download (21.5M) |
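As a reminder of what the Top-1/Top-5 column measures, here is a small sketch of top-k accuracy. It is not the repository's evaluation script; the function name and the toy scores and labels are made up for illustration.

```python
def topk_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)

# Toy example: 3 samples, 3 classes (hypothetical scores and labels).
scores = [
    [0.1, 0.7, 0.2],  # highest score: class 1
    [0.5, 0.3, 0.2],  # highest score: class 0, runner-up: class 1
    [0.2, 0.3, 0.5],  # highest score: class 2, runner-up: class 1
]
labels = [1, 1, 0]
top1 = topk_accuracy(scores, labels, k=1)
top2 = topk_accuracy(scores, labels, k=2)
```

In this toy case one of three samples is correct at k=1 and two of three at k=2; Top-5 on ImageNet is the same computation with k=5 over 1000 classes.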

Training & Testing

1. Training:

Please reduce --batch-size if you get an "out of memory" error. We also notice that training for more epochs can slightly improve performance.

2. Testing:

To test other trained models, change --res to another image resolution [224/380] and -a to another model architecture [tf_mobilenetv3_small_100; tf_mobilenetv3_large_100; tf_efficientnet_b0].

Contact

Zhiqiang Shen, CMU (zhiqians at andrew.cmu.edu)

Any comments or suggestions are welcome!