
MetaFormer

A repository for the code used to create and train the model defined in “MetaFormer: A Unified Meta Framework for Fine-Grained Recognition” (arXiv:2203.02751). MetaFormer is also similar to CoAtNet, so this repo can double as a reference PyTorch implementation of “CoAtNet: Marrying Convolution and Attention for All Data Sizes” (arXiv:2106.04803).

Model zoo

| name | resolution | 1k model | 21k model | iNat21 model |
| :--- | :--- | :--- | :--- | :--- |
| MetaFormer-0 | 224x224 | metafg_0_1k_224 | metafg_0_21k_224 | - |
| MetaFormer-1 | 224x224 | metafg_1_1k_224 | metafg_1_21k_224 | - |
| MetaFormer-2 | 224x224 | metafg_2_1k_224 | metafg_2_21k_224 | - |
| MetaFormer-0 | 384x384 | metafg_0_1k_384 | metafg_0_21k_384 | metafg_0_inat21_384 |
| MetaFormer-1 | 384x384 | metafg_1_1k_384 | metafg_1_21k_384 | metafg_1_inat21_384 |
| MetaFormer-2 | 384x384 | metafg_2_1k_384 | metafg_2_21k_384 | metafg_2_inat21_384 |

You can also download the models from https://pan.baidu.com/s/1ZGEDoWWU7Z0vx0VCjEbe6g (password: 3uiq).
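As a minimal sketch of how to inspect a downloaded checkpoint (the path is hypothetical, and the `"model"` key is an assumption carried over from the Swin-style checkpoint layout this repo builds on):

```python
import torch

# Hypothetical path: substitute whichever checkpoint you downloaded from the model zoo.
ckpt_path = "./pretrained_model/metafg_0_1k_224.pth"

# map_location="cpu" lets you inspect the file without a GPU.
checkpoint = torch.load(ckpt_path, map_location="cpu")

# Swin-style checkpoints keep the weights under a "model" key (an assumption
# about this repo's format); otherwise treat the whole dict as the state dict.
state_dict = checkpoint.get("model", checkpoint)
print(f"{len(state_dict)} tensors, first key: {next(iter(state_dict))}")
```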

Usage

python module

pip install torch==1.5.1 torchvision==0.6.1
pip install timm==0.4.5
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
pip install opencv-python==4.5.1.48 yacs==0.1.8
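A quick optional sanity check that the environment above imports cleanly (nothing repo-specific, just the packages installed in the steps above):

```python
import torch
import torchvision
import timm
import cv2

print("torch:", torch.__version__)              # expect 1.5.1
print("torchvision:", torchvision.__version__)  # expect 0.6.1
print("timm:", timm.__version__)                # expect 0.4.5
print("opencv:", cv2.__version__)               # expect 4.5.1
print("CUDA available:", torch.cuda.is_available())

# apex only imports if its C++/CUDA extensions built successfully.
try:
    from apex import amp  # noqa: F401
    print("apex: ok")
except ImportError as e:
    print("apex: build failed or not installed:", e)
```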

data preparation

Download iNat21, iNat18, iNat17, CUB-200-2011, NABirds, Stanford Cars, and FGVC-Aircraft, place each in its own folder under <root>/datasets/<dataset_name>, and unzip the archives. The folder structure should be as follows (a layout-check sketch follows the tree):

datasets
  |————inaturelist2021
  |       └——————train
  |       └——————val
  |       └——————train.json
  |       └——————val.json
  |————inaturelist2018
  |       └——————train_val_images
  |       └——————train2018.json
  |       └——————val2018.json
  |       └——————train2018_locations.json
  |       └——————val2018_locations.json
  |       └——————categories.json
  |————inaturelist2017
  |       └——————train_val_images
  |       └——————train2017.json
  |       └——————val2017.json
  |       └——————train2017_locations.json
  |       └——————val2017_locations.json
  |————cub-200
  |       └——————...
  |————nabirds
  |       └——————...
  |————stanfordcars
  |       └——————car_ims
  |       └——————cars_annos.mat
  |————aircraft
  |       └——————...
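Before training, a minimal sketch to verify the layout (paths mirror the tree above; the dataset root and the spot checks are assumptions, not an exhaustive list):

```python
import os

ROOT = "./datasets"  # assumed dataset root; adjust to your <root>/datasets

# Spot checks per dataset, mirroring the tree above (non-exhaustive).
EXPECTED = {
    "inaturelist2021": ["train", "val", "train.json", "val.json"],
    "inaturelist2018": ["train_val_images", "train2018.json", "val2018.json"],
    "inaturelist2017": ["train_val_images", "train2017.json", "val2017.json"],
    "stanfordcars": ["car_ims", "cars_annos.mat"],
}

for dataset, entries in EXPECTED.items():
    for entry in entries:
        path = os.path.join(ROOT, dataset, entry)
        print(("ok     " if os.path.exists(path) else "MISSING"), path)
```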

Training

You can download pre-trained models from the model zoo and put them under <root>/pretrained. To train MetaFG on a dataset, run:

python3 -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> --master_port 12345 main.py --cfg <config-file> --dataset <dataset-name> --pretrain <pretrained-model-path> [--batch-size <batch-size-per-gpu> --output <output-directory> --tag <job-tag>]

<dataset-name> is one of: inaturelist2021, inaturelist2018, inaturelist2017, cub-200, nabirds, stanfordcars, aircraft.

For example, to train on CUB-200-2011, run:

python3 -m torch.distributed.launch --nproc_per_node 8 --master_port 12345  main.py --cfg ./configs/MetaFG_1_224.yaml --batch-size 32 --tag cub-200_v1 --lr 5e-5 --min-lr 5e-7 --warmup-lr 5e-8 --epochs 300 --warmup-epochs 20 --dataset cub-200 --pretrain ./pretrained_model/<xxxx>.pth --accumulation-steps 2 --opts DATA.IMG_SIZE 384  

Note that the learning rate is scaled linearly with the total batch size: the effective learning rate is the value you pass multiplied by total_batch_size/512 (see the sketch below).
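A minimal sketch of that scaling rule (the linear-scaling convention comes from the Swin Transformer codebase this repo borrows from; the concrete numbers simply mirror the CUB-200 command above):

```python
# Linear learning-rate scaling: effective_lr = base_lr * total_batch_size / 512.
num_gpus = 8             # --nproc_per_node
batch_size_per_gpu = 32  # --batch-size
accumulation_steps = 2   # --accumulation-steps

total_batch_size = num_gpus * batch_size_per_gpu * accumulation_steps  # 512

base_lr = 5e-5  # --lr
effective_lr = base_lr * total_batch_size / 512  # 5e-5 * 512 / 512 = 5e-5

print(f"total batch size: {total_batch_size}, effective lr: {effective_lr:g}")
```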

Eval

To evaluate a model on a dataset, run:

python3 -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> --master_port 12345  main.py --eval --cfg <config-file> --dataset <dataset-name> --resume <checkpoint> [--batch-size <batch-size-per-gpu>]

Main Result

ImageNet-1k

| Name | Resolution | #Params | FLOPs | Throughput (images/s) | Top-1 acc (%) |
| :--- | :--- | :--- | :--- | :--- | :--- |
| MetaFormer-0 | 224x224 | 28M | 4.6G | 840.1 | 82.9 |
| MetaFormer-1 | 224x224 | 45M | 8.5G | 444.8 | 83.9 |
| MetaFormer-2 | 224x224 | 81M | 16.9G | 438.9 | 84.1 |
| MetaFormer-0 | 384x384 | 28M | 13.4G | 349.4 | 84.2 |
| MetaFormer-1 | 384x384 | 45M | 24.7G | 165.3 | 84.4 |
| MetaFormer-2 | 384x384 | 81M | 49.7G | 132.7 | 84.6 |

Fine-grained Datasets

Results on fine-grained datasets with different pre-trained models (top-1 accuracy, %).

| Name | Pretrain | CUB | NABirds | iNat2017 | iNat2018 | Cars | Aircraft |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| MetaFormer-0 | ImageNet-1k | 89.6 | 89.1 | 75.7 | 79.5 | 95.0 | 91.2 |
| MetaFormer-0 | ImageNet-21k | 89.7 | 89.5 | 75.8 | 79.9 | 94.6 | 91.2 |
| MetaFormer-0 | iNaturalist 2021 | 91.8 | 91.5 | 78.3 | 82.9 | 95.1 | 87.4 |
| MetaFormer-1 | ImageNet-1k | 89.7 | 89.4 | 78.2 | 81.9 | 94.9 | 90.8 |
| MetaFormer-1 | ImageNet-21k | 91.3 | 91.6 | 79.4 | 83.2 | 95.0 | 92.6 |
| MetaFormer-1 | iNaturalist 2021 | 92.3 | 92.7 | 82.0 | 87.5 | 95.0 | 92.5 |
| MetaFormer-2 | ImageNet-1k | 89.7 | 89.7 | 79.0 | 82.6 | 95.0 | 92.4 |
| MetaFormer-2 | ImageNet-21k | 91.8 | 92.2 | 80.4 | 84.3 | 95.1 | 92.9 |
| MetaFormer-2 | iNaturalist 2021 | 92.9 | 93.0 | 82.8 | 87.7 | 95.4 | 92.8 |

Results on iNaturalist 2017, iNaturalist 2018, and iNaturalist 2021 with meta-information (top-1 accuracy, %).

| Name | Pretrain | Meta added | iNat2017 | iNat2018 | iNat2021 |
| :--- | :--- | :--- | :--- | :--- | :--- |
| MetaFormer-0 | ImageNet-1k | N | 75.7 | 79.5 | 88.4 |
| MetaFormer-0 | ImageNet-1k | Y | 79.8 (+4.1) | 85.4 (+5.9) | 92.6 (+4.2) |
| MetaFormer-1 | ImageNet-1k | N | 78.2 | 81.9 | 90.2 |
| MetaFormer-1 | ImageNet-1k | Y | 81.3 (+3.1) | 86.5 (+4.6) | 93.4 (+3.2) |
| MetaFormer-2 | ImageNet-1k | N | 79.0 | 82.6 | 89.8 |
| MetaFormer-2 | ImageNet-1k | Y | 82.0 (+3.0) | 86.8 (+4.2) | 93.2 (+3.4) |
| MetaFormer-2 | ImageNet-21k | N | 80.4 | 84.3 | 90.3 |
| MetaFormer-2 | ImageNet-21k | Y | 83.4 (+3.0) | 88.7 (+4.4) | 93.6 (+3.3) |

Citation

@article{MetaFormer,
  title={MetaFormer: A Unified Meta Framework for Fine-Grained Recognition},
  author={Diao, Qishuai and Jiang, Yi and Wen, Bin and Sun, Jia and Yuan, Zehuan},
  journal={arXiv preprint arXiv:2203.02751},
  year={2022},
}

Acknowledgement

Many thanks to Swin Transformer; part of the code is borrowed from it.