EfficientTrain++ (TPAMI 2024 & ICCV 2023)

This repo releases the code and pre-trained models of EfficientTrain++, an off-the-shelf, easy-to-implement algorithm for the efficient training of foundation visual backbones.

[TPAMI 2024] EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training
Yulin Wang, Yang Yue, Rui Lu, Yizeng Han, Shiji Song, and Gao Huang
Tsinghua University, BAAI
[arXiv]

[ICCV 2023] EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones
Yulin Wang, Yang Yue, Rui Lu, Tianjiao Liu, Zhao Zhong, Shiji Song, and Gao Huang
Tsinghua University, Huawei, BAAI
[arXiv]

Overview

We present a novel curriculum learning approach for the efficient training of foundation visual backbones. Our algorithm, EfficientTrain++, is simple, general, yet surprisingly effective. As an off-the-shelf approach, it reduces the training time of various popular models (e.g., ResNet, ConvNeXt, DeiT, PVT, Swin, CSWin, and CAFormer) by 1.5−3.0× on ImageNet-1K/22K without sacrificing accuracy. It also demonstrates efficacy in self-supervised learning (e.g., MAE).

<p align="center"> <img src="./imgs/overview.png" width= "450"> </p>
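To put the 1.5−3.0× figure in concrete terms, the short sketch below converts a speedup factor into the fraction of training time saved. The 100-hour baseline is a made-up number used only for illustration and is not a measurement from this repo.

```python
def time_saved_fraction(speedup: float) -> float:
    """Fraction of wall-clock training time removed by a given speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


# Hypothetical baseline training cost, chosen only to make the numbers readable.
baseline_hours = 100.0

# The two ends of the speedup range quoted in the overview.
for speedup in (1.5, 3.0):
    new_hours = baseline_hours / speedup
    saved = time_saved_fraction(speedup)
    print(f"{speedup:.1f}x speedup: {new_hours:.1f} h instead of "
          f"{baseline_hours:.0f} h ({saved:.0%} saved)")
```

In other words, the quoted range corresponds to cutting roughly one third to two thirds of the baseline training time.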

Highlights of our work

Catalog

Installation

We support PyTorch>=2.0.0 and torchvision>=0.15.1. Please install them following the official instructions.
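Before going further, it may help to confirm that the installed versions meet the minimums above. The following is a minimal sketch and not part of this repo: `parse_version`, `meets_minimum`, and `check_environment` are hypothetical helper names, and the comparison only looks at the leading numeric `major.minor.patch` part of the version string, ignoring build suffixes such as `+cu118`.

```python
import importlib
import re

# Minimum versions stated in this README.
MINIMUMS = {"torch": "2.0.0", "torchvision": "0.15.1"}


def parse_version(version: str) -> tuple:
    """Turn a string like '2.1.0+cu118' into (2, 1, 0); suffixes are ignored."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", str(version))
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component (e.g. '2.0') is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment() -> dict:
    """Report, per package, whether it is installed and new enough."""
    report = {}
    for name, minimum in MINIMUMS.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "not installed"
            continue
        installed = module.__version__
        ok = meets_minimum(installed, minimum)
        report[name] = f"{installed} ({'ok' if ok else 'needs >= ' + minimum})"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

Running the script prints one line per package. A package reported as "not installed" or "needs >= ..." should be installed or upgraded following the official instructions.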

Clone this repo and install the required packages:

git clone https://github.com/LeapLabTHU/EfficientTrain
pip install timm==0.4.12 tensorboardX six

The instructions for preparing ImageNet-1K/22K datasets can be found here.

Training

See TRAINING.md for the training instructions.

Pre-trained models & evaluation & fine-tuning

See EVAL.md for the pre-trained models and the instructions for evaluating or fine-tuning them.

Results

Supervised learning on ImageNet-1K

<p align="center"> <img src="./imgs/in_1k.png" width= "900"> </p>

ImageNet-22K pre-training

<p align="center"> <img src="./imgs/in_22k.png" width= "900"> </p>

Supervised learning on ImageNet-1K (varying training budgets)

<p align="center"> <img src="./imgs/vary_epoch.png" width= "900"> </p>
<p align="center"> <img src="./imgs/300ep.png" width= "450"> </p>

Object detection and instance segmentation on COCO

<p align="center"> <img src="./imgs/coco.png" width= "450"> </p>

Semantic segmentation on ADE20K

<p align="center"> <img src="./imgs/seg.png" width= "450"> </p>

Self-supervised learning results on top of MAE

<p align="center"> <img src="./imgs/mae.png" width= "450"> </p>

TODO

This repo is still being updated. If you need anything, whether or not it is listed below, please e-mail me (wang-yl19@mails.tsinghua.edu.cn).

Acknowledgments

This repo is mainly developed on top of ConvNeXt; we sincerely thank its authors for their efficient and neat codebase. This repo also builds on DeiT and timm.

Citation

If you find this work valuable or use our code in your own research, please consider citing us:

@article{wang2024EfficientTrain_pp,
        title = {EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training},
       author = {Wang, Yulin and Yue, Yang and Lu, Rui and Han, Yizeng and Song, Shiji and Huang, Gao},
      journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
         year = {2024},
          doi = {10.1109/TPAMI.2024.3401036}
}
@inproceedings{wang2023EfficientTrain,
        title = {EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones},
       author = {Wang, Yulin and Yue, Yang and Lu, Rui and Liu, Tianjiao and Zhong, Zhao and Song, Shiji and Huang, Gao},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
         year = {2023}
}