A ConvNet for the 2020s

Official PyTorch implementation of ConvNeXt, from the following paper:

A ConvNet for the 2020s. CVPR 2022.
Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell and Saining Xie
Facebook AI Research, UC Berkeley
[arXiv][video]


<p align="center"> <img src="https://user-images.githubusercontent.com/8370623/180626875-fe958128-6102-4f01-9ca4-e3a30c3148f9.png" width=100% height=100% class="center"> </p>

We propose ConvNeXt, a pure ConvNet model constructed entirely from standard ConvNet modules. ConvNeXt is accurate, efficient, scalable and very simple in design.

Results and Pre-trained Models

ImageNet-1K trained models

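| name | resolution | acc@1 | #params | FLOPs | model |
|:---:|:---:|:---:|:---:|:---:|:---:|
| ConvNeXt-T | 224x224 | 82.1 | 28M | 4.5G | model |
| ConvNeXt-S | 224x224 | 83.1 | 50M | 8.7G | model |
| ConvNeXt-B | 224x224 | 83.8 | 89M | 15.4G | model |
| ConvNeXt-B | 384x384 | 85.1 | 89M | 45.0G | model |
| ConvNeXt-L | 224x224 | 84.3 | 198M | 34.4G | model |
| ConvNeXt-L | 384x384 | 85.5 | 198M | 101.0G | model |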
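
The 384x384 rows can be sanity-checked with a rough rule of thumb: for a convolutional network, FLOPs grow approximately in proportion to the number of input pixels. The sketch below applies that assumption to the 224x224 figures listed above. `scale_flops` is a hypothetical helper name, not part of this repository, and the estimate ignores resolution-independent costs such as the classifier head.

```python
def scale_flops(flops_g: float, src_res: int, dst_res: int) -> float:
    """Estimate FLOPs (in G) at dst_res from a measurement at src_res.

    Assumes cost is proportional to the number of input pixels, which is
    only an approximation for a convolutional network.
    """
    return flops_g * (dst_res / src_res) ** 2


# 224x224 FLOPs (G) taken from the ImageNet-1K results above.
measured_224 = {"ConvNeXt-B": 15.4, "ConvNeXt-L": 34.4}

for name, flops in measured_224.items():
    estimate = scale_flops(flops, 224, 384)
    print(f"{name}: ~{estimate:.1f}G estimated at 384x384")
```

Under this assumption the estimates land within about 1% of the reported 45.0G and 101.0G.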

ImageNet-22K trained models

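| name | resolution | acc@1 | #params | FLOPs | 22k model | 1k model |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| ConvNeXt-T | 224x224 | 82.9 | 29M | 4.5G | model | model |
| ConvNeXt-T | 384x384 | 84.1 | 29M | 13.1G | - | model |
| ConvNeXt-S | 224x224 | 84.6 | 50M | 8.7G | model | model |
| ConvNeXt-S | 384x384 | 85.8 | 50M | 25.5G | - | model |
| ConvNeXt-B | 224x224 | 85.8 | 89M | 15.4G | model | model |
| ConvNeXt-B | 384x384 | 86.8 | 89M | 47.0G | - | model |
| ConvNeXt-L | 224x224 | 86.6 | 198M | 34.4G | model | model |
| ConvNeXt-L | 384x384 | 87.5 | 198M | 101.0G | - | model |
| ConvNeXt-XL | 224x224 | 87.0 | 350M | 60.9G | model | model |
| ConvNeXt-XL | 384x384 | 87.8 | 350M | 179.0G | - | model |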

ImageNet-1K trained models (isotropic)

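| name | resolution | acc@1 | #params | FLOPs | model |
|:---:|:---:|:---:|:---:|:---:|:---:|
| ConvNeXt-S | 224x224 | 78.7 | 22M | 4.3G | model |
| ConvNeXt-B | 224x224 | 82.0 | 87M | 16.9G | model |
| ConvNeXt-L | 224x224 | 82.6 | 306M | 59.7G | model |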

Installation

Please check INSTALL.md for installation instructions.

Evaluation

We give an example evaluation command for a ConvNeXt-B that was pre-trained on ImageNet-22K and then fine-tuned on ImageNet-1K:

Single-GPU

python main.py --model convnext_base --eval true \
--resume https://dl.fbaipublicfiles.com/convnext/convnext_base_22k_1k_224.pth \
--input_size 224 --drop_path 0.2 \
--data_path /path/to/imagenet-1k

Multi-GPU

python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_base --eval true \
--resume https://dl.fbaipublicfiles.com/convnext/convnext_base_22k_1k_224.pth \
--input_size 224 --drop_path 0.2 \
--data_path /path/to/imagenet-1k

This should give

* Acc@1 85.820 Acc@5 97.868 loss 0.563
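
When evaluating several checkpoints, it can be convenient to collect this summary line programmatically. The following is a minimal sketch that assumes the line keeps exactly the format shown above. `parse_eval_line` is a hypothetical helper, not part of this repository.

```python
import re

# Matches the summary line format shown above, e.g.
# "* Acc@1 85.820 Acc@5 97.868 loss 0.563"
_EVAL_LINE = re.compile(
    r"\*\s*Acc@1\s+(?P<acc1>\d+(?:\.\d+)?)\s+"
    r"Acc@5\s+(?P<acc5>\d+(?:\.\d+)?)\s+"
    r"loss\s+(?P<loss>\d+(?:\.\d+)?)"
)


def parse_eval_line(line: str) -> dict:
    """Return acc1, acc5 and loss as floats, or raise ValueError."""
    match = _EVAL_LINE.search(line)
    if match is None:
        raise ValueError(f"not an evaluation summary line: {line!r}")
    return {key: float(value) for key, value in match.groupdict().items()}


result = parse_eval_line("* Acc@1 85.820 Acc@5 97.868 loss 0.563")
print(result)
```

Raising `ValueError` on a non-matching line, rather than returning an empty result, makes a changed log format fail loudly instead of silently producing missing metrics.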

Training

See TRAINING.md for training and fine-tuning instructions.

Acknowledgement

This repository is built using the timm library and the DeiT and BEiT repositories.

License

This project is released under the MIT license. Please see the LICENSE file for more information.

Citation

If you find this repository helpful, please consider citing:

@Article{liu2022convnet,
  author  = {Zhuang Liu and Hanzi Mao and Chao-Yuan Wu and Christoph Feichtenhofer and Trevor Darrell and Saining Xie},
  title   = {A ConvNet for the 2020s},
  journal = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year    = {2022},
}