AlexNet ('One weird trick for parallelizing convolutional neural networks') | 62.3M | 1,132.33M | 40.96 | 18.24 | 2014 |
VGG-16 ('Very Deep Convolutional Networks for Large-Scale Image Recognition') | 138.3M | ? | 26.78 | 8.69 | 2014 |
ResNet-10 ('Deep Residual Learning for Image Recognition') | 5.5M | 894.04M | 34.69 | 14.36 | 2015 |
ResNet-18 ('Deep Residual Learning for Image Recognition') | 11.7M | 1,820.41M | 28.53 | 9.82 | 2015 |
ResNet-34 ('Deep Residual Learning for Image Recognition') | 21.8M | 3,672.68M | 24.84 | 7.80 | 2015 |
ResNet-50 ('Deep Residual Learning for Image Recognition') | 25.5M | 3,877.95M | 22.28 | 6.33 | 2015 |
InceptionV3 ('Rethinking the Inception Architecture for Computer Vision') | 23.8M | ? | 21.2 | 5.6 | 2015 |
PreResNet-18 ('Identity Mappings in Deep Residual Networks') | 11.7M | 1,820.56M | 28.43 | 9.72 | 2016 |
PreResNet-34 ('Identity Mappings in Deep Residual Networks') | 21.8M | 3,672.83M | 24.89 | 7.74 | 2016 |
PreResNet-50 ('Identity Mappings in Deep Residual Networks') | 25.6M | 3,875.44M | 22.40 | 6.47 | 2016 |
DenseNet-121 ('Densely Connected Convolutional Networks') | 8.0M | 2,872.13M | 23.48 | 7.04 | 2016 |
DenseNet-161 ('Densely Connected Convolutional Networks') | 28.7M | 7,793.16M | 22.86 | 6.44 | 2016 |
PyramidNet-101 ('Deep Pyramidal Residual Networks') | 42.5M | 8,743.54M | 21.98 | 6.20 | 2016 |
ResNeXt-14(32x4d) ('Aggregated Residual Transformations for Deep Neural Networks') | 9.5M | 1,603.46M | 30.32 | 11.46 | 2016 |
ResNeXt-26(32x4d) ('Aggregated Residual Transformations for Deep Neural Networks') | 15.4M | 2,488.07M | 24.14 | 7.46 | 2016 |
WRN-50-2 ('Wide Residual Networks') | 68.9M | 11,405.42M | 22.53 | 6.41 | 2016 |
Xception ('Xception: Deep Learning with Depthwise Separable Convolutions') | 22,855,952 | 8,403.63M | 20.97 | 5.49 | 2016 |
InceptionV4 ('Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning') | 42,679,816 | 12,304.93M | 20.64 | 5.29 | 2016 |
InceptionResNetV2 ('Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning') | 55,843,464 | 13,188.64M | 19.93 | 4.90 | 2016 |
PolyNet ('PolyNet: A Pursuit of Structural Diversity in Very Deep Networks') | 95,366,600 | 34,821.34M | 19.10 | 4.52 | 2016 |
DarkNet Ref ('Darknet: Open source neural networks in C') | 7,319,416 | 367.59M | 38.58 | 17.18 | 2016 |
DarkNet Tiny ('Darknet: Open source neural networks in C') | 1,042,104 | 500.85M | 40.74 | 17.84 | 2016 |
DarkNet 53 ('Darknet: Open source neural networks in C') | 41,609,928 | 7,133.86M | 21.75 | 5.64 | 2016 |
SqueezeResNet1.1 ('SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size') | 1,235,496 | 352.02M | 40.09 | 18.21 | 2016 |
SqueezeNet1.1 ('SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size') | 1,235,496 | 352.02M | 39.31 | 17.72 | 2016 |
ResAttNet-92 ('Residual Attention Network for Image Classification') | 51.3M | ? | 19.5 | 4.8 | 2017 |
CondenseNet (G=C=8) ('CondenseNet: An Efficient DenseNet using Learned Group Convolutions') | 4.8M | ? | 26.2 | 8.3 | 2017 |
DPN-68 ('Dual Path Networks') | 12,611,602 | 2,351.84M | 23.24 | 6.79 | 2017 |
ShuffleNet x1.0 (g=1) ('ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices') | 1,531,936 | 148.13M | 34.93 | 13.89 | 2017 |
DiracNetV2-18 ('DiracNets: Training Very Deep Neural Networks Without Skip-Connections') | 11,511,784 | 1,796.62M | 31.47 | 11.70 | 2017 |
DiracNetV2-34 ('DiracNets: Training Very Deep Neural Networks Without Skip-Connections') | 21,616,232 | 3,646.93M | 28.75 | 9.93 | 2017 |
SENet-16 ('Squeeze-and-Excitation Networks') | 31,366,168 | 5,081.30M | 25.65 | 8.20 | 2017 |
SENet-154 ('Squeeze-and-Excitation Networks') | 115,088,984 | 20,745.78M | 18.62 | 4.61 | 2017 |
MobileNet ('MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications') | 4,231,976 | 579.80M | 26.61 | 8.95 | 2017 |
NASNet-A 4@1056 ('Learning Transferable Architectures for Scalable Image Recognition') | 5,289,978 | 584.90M | 25.68 | 8.16 | 2017 |
NASNet-A 6@4032 ('Learning Transferable Architectures for Scalable Image Recognition') | 88,753,150 | 23,976.44M | 18.14 | 4.21 | 2017 |
DLA-34 ('Deep Layer Aggregation') | 15,742,104 | 3,071.37M | 25.36 | 7.94 | 2017 |
AirNet50-1x64d (r=2) ('Attention Inspiring Receptive-Fields Network for Learning Invariant Representations') | 27.43M | ? | 22.48 | 6.21 | 2018 |
BAM-ResNet-50 ('BAM: Bottleneck Attention Module') | 25.92M | ? | 23.68 | 6.96 | 2018 |
CBAM-ResNet-50 ('CBAM: Convolutional Block Attention Module') | 28.1M | ? | 23.02 | 6.38 | 2018 |
1.0-SqNxt-23v5 ('SqueezeNext: Hardware-Aware Neural Network Design') | 921,816 | 285.82M | 40.77 | 17.85 | 2018 |
1.5-SqNxt-23v5 ('SqueezeNext: Hardware-Aware Neural Network Design') | 1,953,616 | 550.97M | 33.81 | 13.01 | 2018 |
2.0-SqNxt-23v5 ('SqueezeNext: Hardware-Aware Neural Network Design') | 3,366,344 | 897.60M | 29.63 | 10.66 | 2018 |
ShuffleNetV2 ('ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design') | 2,278,604 | 149.72M | 31.44 | 11.63 | 2018 |
456-MENet-24×1(g=3) ('Merging and Evolution: Improving Convolutional Neural Networks for Mobile Applications') | 5.3M | ? | 28.4 | 9.8 | 2018 |
FD-MobileNet ('FD-MobileNet: Improved MobileNet with A Fast Downsampling Strategy') | 2,901,288 | 147.46M | 34.23 | 13.38 | 2018 |
MobileNetV2 ('MobileNetV2: Inverted Residuals and Linear Bottlenecks') | 3,504,960 | 329.36M | 26.97 | 8.87 | 2018 |
IGCV3 ('IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks') | 3.5M | ? | 28.22 | 9.54 | 2018 |
DARTS ('DARTS: Differentiable Architecture Search') | 4.9M | ? | 26.9 | 9.0 | 2018 |
PNASNet-5 ('Progressive Neural Architecture Search') | 5.1M | ? | 25.8 | 8.1 | 2018 |
AmoebaNet-C ('Regularized Evolution for Image Classifier Architecture Search') | 5.1M | ? | 24.3 | 7.6 | 2018 |
MnasNet ('MnasNet: Platform-Aware Neural Architecture Search for Mobile') | 4,308,816 | 317.67M | 31.58 | 11.74 | 2018 |
IBN-Net50-a ('Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net') | ? | ? | 22.54 | 6.32 | 2018 |
MarginNet ('Large Margin Deep Networks for Classification') | ? | ? | 22.0 | ? | 2018 |
A^2 Net ('A^2-Nets: Double Attention Networks') | ? | ? | 23.0 | 6.5 | 2018 |
FishNeXt-150 ('FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction') | 26.2M | ? | 21.5 | ? | 2018 |
Shape-ResNet ('ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness') | 25.5M | ? | 23.28 | 6.72 | 2019 |
SimCNN(k=3 train) ('Greedy Layerwise Learning Can Scale to ImageNet') | ? | ? | 28.4 | 10.2 | 2019 |
SKNet-50 ('Selective Kernel Networks') | 27.5M | ? | 20.79 | ? | 2019 |
SRM-ResNet-50 ('SRM : A Style-based Recalibration Module for Convolutional Neural Networks') | 25.62M | ? | 22.87 | 6.49 | 2019 |
EfficientNet-B0 ('EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks') | 5,288,548 | 414.31M | 24.77 | 7.52 | 2019 |
EfficientNet-B7b ('EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks') | 66,347,960 | 39,010.98M | 15.94 | 3.22 | 2019 |
ProxylessNAS ('ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware') | ? | ? | 24.9 | 7.5 | 2019 |
MixNet-L ('MixNet: Mixed Depthwise Convolutional Kernels') | 7.3M | ? | 21.1 | 5.8 | 2019 |
ECA-Net50 ('ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks') | 24.37M | 3.86G | 22.52 | 6.32 | 2019 |
ECA-Net101 ('ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks') | 42.49M | 7.35G | 21.35 | 5.66 | 2019 |
ACNet-Densenet121 ('ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks') | ? | ? | 24.18 | 7.23 | 2019 |
LIP-ResNet-50 ('LIP: Local Importance-based Pooling') | 23.9M | 5.33G | 21.81 | 6.04 | 2019 |
LIP-ResNet-101 ('LIP: Local Importance-based Pooling') | 42.9M | 9.06G | 20.67 | 5.40 | 2019 |
LIP-DenseNet-BC-121 ('LIP: Local Importance-based Pooling') | 8.7M | 4.13G | 23.36 | 6.84 | 2019 |
MuffNet_1.0 ('MuffNet: Multi-Layer Feature Federation for Mobile Deep Learning') | 2.3M | 146M | 30.1 | ? | 2019 |
MuffNet_1.5 ('MuffNet: Multi-Layer Feature Federation for Mobile Deep Learning') | 3.4M | 300M | 26.9 | ? | 2019 |
ResNet-34-Bin-5 ('Making Convolutional Networks Shift-Invariant Again') | 21.8M | 3,672.68M | 25.80 | ? | 2019 |
ResNet-50-Bin-5 ('Making Convolutional Networks Shift-Invariant Again') | 25.5M | 3,877.95M | 22.96 | ? | 2019 |
MobileNetV2-Bin-5 ('Making Convolutional Networks Shift-Invariant Again') | 3,504,960 | 329.36M | 27.50 | ? | 2019 |
FixRes ResNeXt101 WSL ('Fixing the train-test resolution discrepancy') | 829M | ? | 13.6 | 2.0 | 2019 |
Noisy Student*(L2) ('Self-training with Noisy Student improves ImageNet classification') | 480M | ? | 12.6 | 1.8 | 2019 |
TResNet-M ('TResNet: High Performance GPU-Dedicated Architecture') | 29.4M | 5.5G | 19.3 | ? | 2020 |
DA-NAS-C ('DA-NAS: Data Adapted Pruning for Efficient Neural Architecture Search') | ? | 467M | 23.8 | ? | 2020 |
ResNeSt-50 ('ResNeSt: Split-Attention Networks') | 27.5M | 5.39G | 18.87 | ? | 2020 |
ResNeSt-101 ('ResNeSt: Split-Attention Networks') | 48.3M | 10.2G | 17.73 | ? | 2020 |
ResNet-50-FReLU ('Funnel Activation for Visual Recognition') | 25.5M | 3.87G | 22.40 | ? | 2020 |
ResNet-101-FReLU ('Funnel Activation for Visual Recognition') | 44.5M | 7.6G | 22.10 | ? | 2020 |
ResNet-50-MEALv2 ('MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks') | 25.6M | ? | 19.33 | 4.91 | 2020 |
ResNet-50-MEALv2 + CutMix ('MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks') | 25.6M | ? | 19.02 | 4.65 | 2020 |
MobileNet V3-Large-MEALv2 ('MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks') | 5.48M | ? | 23.08 | 6.68 | 2020 |
EfficientNet-B0-MEALv2 ('MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks') | 5.29M | ? | 21.71 | 6.05 | 2020 |
T2T-ViT-7 ('Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet') | 4.2M | 0.6G | 28.8 | ? | 2021 |
T2T-ViT-14 ('Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet') | 19.4M | 4.8G | 19.4 | ? | 2021 |
T2T-ViT-19 ('Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet') | 39.0M | 8.0G | 18.8 | ? | 2021 |
NFNet-F0 ('High-Performance Large-Scale Image Recognition Without Normalization') | 71.5M | 12.38G | 16.4 | 3.2 | 2021 |
NFNet-F1 ('High-Performance Large-Scale Image Recognition Without Normalization') | 132.6M | 35.54G | 15.4 | 2.9 | 2021 |
NFNet-F6+SAM ('High-Performance Large-Scale Image Recognition Without Normalization') | 438.4M | 377.28G | 13.5 | 2.1 | 2021 |
EfficientNetV2-S ('EfficientNetV2: Smaller Models and Faster Training') | 24M | 8.8G | 16.1 | ? | 2021 |
EfficientNetV2-M ('EfficientNetV2: Smaller Models and Faster Training') | 55M | 24G | 14.9 | ? | 2021 |
EfficientNetV2-L ('EfficientNetV2: Smaller Models and Faster Training') | 121M | 53G | 14.3 | ? | 2021 |
EfficientNetV2-S (21k) ('EfficientNetV2: Smaller Models and Faster Training') | 24M | 8.8G | 15.0 | ? | 2021 |
EfficientNetV2-M (21k) ('EfficientNetV2: Smaller Models and Faster Training') | 55M | 24G | 13.9 | ? | 2021 |
EfficientNetV2-L (21k) ('EfficientNetV2: Smaller Models and Faster Training') | 121M | 53G | 13.2 | ? | 2021 |