🌎[CVPR2023] Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions (GLMC)

by Fei Du, Peng Yang, Qi Jia, Fengtao Nan, Xiaoting Chen, Yun Yang

This is the official implementation of Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions.

🎬Video | 💻Slide | 🔥Poster

Update 2023/5/23

Many thanks to @CxC-ssjg for raising this question. In the Cifar10Imbalance and Cifar100Imbalance classes, we used np.random.choice to subsample each class when generating the imbalanced data, but we did not set its replace parameter to False. The same sample could therefore be drawn more than once, which reduced the diversity of the dataset. Following @CxC-ssjg's advice, we set replace to False and fine-tuned our model accordingly. Compared with the results reported in the paper, accuracy improved by 0.75 to 2.70 percentage points (GLMC vs. GLMC (Updated) in the table below). We have made the updated models publicly available. Once again, thank you, @CxC-ssjg.
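The effect of the replace parameter can be seen in a small sketch. The index array and the number of kept samples are made-up values for illustration, not the sizes used in Cifar10Imbalance or Cifar100Imbalance; only np.random.choice and its replace parameter come from the description above.

```python
import numpy as np

np.random.seed(0)

# Hypothetical: indices of all samples belonging to one class.
class_indices = np.arange(100)
# Hypothetical: how many samples of this class the imbalanced set keeps.
keep = 40

# replace defaults to True, so the same index can be drawn several times.
with_repeats = np.random.choice(class_indices, keep)
# replace=False guarantees that every drawn index is distinct.
without_repeats = np.random.choice(class_indices, keep, replace=False)

n_unique_with = len(np.unique(with_repeats))
n_unique_without = len(np.unique(without_repeats))
print("distinct samples with replacement:   ", n_unique_with)
print("distinct samples without replacement:", n_unique_without)
```

With replacement, the number of distinct samples can fall below the requested count; without replacement, it always equals it.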

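| Dataset | IF | GLMC | GLMC (Updated) | GLMC (Updated) + MaxNorm |
|---|---|---|---|---|
| CIFAR-100-LT | 100 | 55.88% | 57.97% | 58.41% |
| CIFAR-100-LT | 50 | 61.08% | 63.78% | 64.57% |
| CIFAR-100-LT | 10 | 70.74% | 73.40% | 74.28% |
| CIFAR-10-LT | 100 | 87.75% | 88.50% | 89.58% |
| CIFAR-10-LT | 50 | 90.18% | 91.04% | 92.04% |
| CIFAR-10-LT | 10 | 94.04% | 94.87% | 95.00% |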

Update 2023/5/15

We apologize for an oversight in our paper: the CIFAR-10 results were uploaded incorrectly. We have updated our GitHub repository and report the final CIFAR-10-LT results here. They are still 3% higher than those of BCL [1], the latest state-of-the-art method. We have also uploaded the updated paper to arXiv: Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions (https://arxiv.org/abs/2305.08661)

The experimental setup was as follows:

python main.py --dataset cifar10 -a resnet32 --num_classes 10 --imbanlance_rate 0.01 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.0 --label_weighting 1.2 --contrast_weight 4

CIFAR-10-LT

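| Method | IF | Model | Top-1 Acc (%) |
|---|---|---|---|
| GLMC | 100 | ResNet-32 | 87.75 |
| GLMC | 50 | ResNet-32 | 90.18 |
| GLMC | 10 | ResNet-32 | 94.04 |
| GLMC + MaxNorm | 100 | ResNet-32 | 87.57 |
| GLMC + MaxNorm | 50 | ResNet-32 | 90.22 |
| GLMC + MaxNorm | 10 | ResNet-32 | 94.03 |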

[1] Jianggang Zhu, Zheng Wang, Jingjing Chen, Yi-Ping Phoebe Chen, and Yu-Gang Jiang. Balanced contrastive learning for long-tailed visual recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6908–6917, 2022.

💥 We also added an experiment on iNaturalist 2018 and achieved state-of-the-art results.

| Method | Model | Many | Med | Few | All | Pretrained model |
|---|---|---|---|---|---|---|
| GLMC | ResNeXt-50 | 64.60 | 73.16 | 73.01 | 72.21 | Download |

Overview

<div align="center"><img src="https://user-images.githubusercontent.com/48430480/223947913-edbdd463-d6e1-4ae7-8e8d-b846c002a20d.png"></div>

An overview of GLMC: two types of mixed-label augmented images are processed by an encoder network and a projection head to obtain the representations $h_g$ and $h_l$. A prediction head then transforms the two representations into the outputs $u_g$ and $u_l$. We minimize their negative cosine similarity as an auxiliary loss added to the supervised loss. $sg(\cdot)$ denotes the stop-gradient operation.
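The auxiliary loss in the caption can be sketched in NumPy as a forward computation only. The symmetric cross-pairing of $u_g$ with $h_l$ and $u_l$ with $h_g$ is an assumption borrowed from SimSiam-style objectives, and the function names are hypothetical; the repository may combine the terms differently.

```python
import numpy as np

def neg_cosine(u, h, eps=1e-8):
    """Mean negative cosine similarity between matching rows of u and h."""
    u = u / (np.linalg.norm(u, axis=1, keepdims=True) + eps)
    h = h / (np.linalg.norm(h, axis=1, keepdims=True) + eps)
    return float(-(u * h).sum(axis=1).mean())

def consistency_loss(u_g, h_g, u_l, h_l):
    # Assumed symmetric form. In an autodiff framework, h_l and h_g would be
    # detached here (the sg(.) in the caption), so gradients would flow only
    # through the prediction-head outputs u_g and u_l.
    return 0.5 * neg_cosine(u_g, h_l) + 0.5 * neg_cosine(u_l, h_g)

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 16))

# Perfectly aligned views reach the minimum of the loss.
aligned = consistency_loss(u_g=h, h_g=h, u_l=h, h_l=h)
# Opposite views reach the maximum.
opposed = consistency_loss(u_g=-h, h_g=h, u_l=-h, h_l=h)
print(aligned, opposed)
```

The loss is bounded between -1 (views agree) and 1 (views point in opposite directions), so minimizing it pulls the global and local views of the same batch together.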

We propose an efficient one-stage training strategy for long-tailed visual recognition called Global and Local Mixture Consistency cumulative learning (GLMC). Our core ideas are twofold: (1) A global and local mixture consistency loss improves the robustness of the feature extractor. Specifically, we generate two augmented batches from the same batch of data, one with global MixUp and one with local CutMix, and then use cosine similarity to minimize the difference between them. (2) A cumulative head-tail soft label reweighted loss mitigates the head class bias problem. We use empirical class frequencies to reweight the mixed labels of head and tail classes for long-tailed data, and then balance the conventional loss and the rebalanced loss with a coefficient that accumulates over epochs.
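The two ideas above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the repository's implementation: the function names are hypothetical, the CutMix box is fixed rather than sampled, the class weights are assumed to be inverse empirical frequencies, and the coefficient is assumed to follow a parabolic schedule over epochs.

```python
import numpy as np

def mix_views(x, y_onehot, lam, perm, box):
    """Build a global (MixUp) and a local (CutMix) view of the same batch.
    x: (N, C, H, W), y_onehot: (N, K), box: (top, left, height, width)."""
    x_perm, y_perm = x[perm], y_onehot[perm]
    # Global mixture: blend whole images and their labels.
    x_g = lam * x + (1.0 - lam) * x_perm
    y_g = lam * y_onehot + (1.0 - lam) * y_perm
    # Local mixture: paste a rectangular patch from the partner image.
    t, l, h, w = box
    x_l = x.copy()
    x_l[:, :, t:t + h, l:l + w] = x_perm[:, :, t:t + h, l:l + w]
    lam_l = 1.0 - (h * w) / (x.shape[2] * x.shape[3])  # kept-area fraction
    y_l = lam_l * y_onehot + (1.0 - lam_l) * y_perm
    return (x_g, y_g), (x_l, y_l)

def soft_cross_entropy(logits, soft_labels, class_weights=None):
    """Cross-entropy against soft labels, optionally reweighted per class."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    if class_weights is not None:
        # Shift label mass toward rare classes, then renormalize each row.
        soft_labels = soft_labels * class_weights
        soft_labels = soft_labels / soft_labels.sum(axis=1, keepdims=True)
    return float(-(soft_labels * log_p).sum(axis=1).mean())

def cumulative_loss(logits, soft_labels, class_counts, epoch, total_epochs):
    weights = 1.0 / np.asarray(class_counts, dtype=float)  # assumed weighting
    alpha = 1.0 - (epoch / total_epochs) ** 2               # assumed schedule
    plain = soft_cross_entropy(logits, soft_labels)
    rebalanced = soft_cross_entropy(logits, soft_labels, weights)
    return alpha * plain + (1.0 - alpha) * rebalanced

rng = np.random.default_rng(0)
x = rng.random((4, 3, 8, 8))
y = np.eye(3)[[0, 0, 1, 2]]
perm = np.array([1, 2, 3, 0])
counts = [500, 50, 5]  # hypothetical head, medium, and tail class sizes
(x_g, y_g), (x_l, y_l) = mix_views(x, y, lam=0.7, perm=perm, box=(2, 2, 4, 4))
logits = rng.normal(size=(4, 3))
start = cumulative_loss(logits, y_g, counts, epoch=0, total_epochs=200)
end = cumulative_loss(logits, y_g, counts, epoch=200, total_epochs=200)
```

Under this schedule, training starts on the conventional loss alone and shifts gradually to the rebalanced loss, which is one way to read "a coefficient accumulated by epochs".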

Getting Started

Requirements

All code is written in Python 3.9.

Preparing Datasets

Download the CIFAR-10, CIFAR-100, ImageNet, and iNaturalist18 datasets to GLMC-2023/data. The directory should look like this:

GLMC-2023/data
├── CIFAR-100-python
├── CIFAR-10-batches-py
├── ImageNet
|   └── train
|   └── val
├── train_val2018
└── data_txt
    └── ImageNet_LT_val.txt
    └── ImageNet_LT_train.txt
    └── iNaturalist18_train.txt
    └── iNaturalist18_val.txt
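Before training, it can help to check that the layout above is in place. The helper below is a hypothetical convenience script, not part of the repository; it only tests that the listed paths exist under a given data root.

```python
from pathlib import Path

# Paths taken from the directory listing above, relative to the data root.
EXPECTED = [
    "CIFAR-100-python",
    "CIFAR-10-batches-py",
    "ImageNet/train",
    "ImageNet/val",
    "train_val2018",
    "data_txt/ImageNet_LT_val.txt",
    "data_txt/ImageNet_LT_train.txt",
    "data_txt/iNaturalist18_train.txt",
    "data_txt/iNaturalist18_val.txt",
]

def missing_entries(data_root):
    """Return the expected paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]

# An empty list means every expected file and directory was found.
print(missing_entries("GLMC-2023/data"))
```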
    

Training

for CIFAR-10-LT

python main.py --dataset cifar10 -a resnet32 --num_classes 10 --imbanlance_rate 0.01 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.0 --label_weighting 1.2 --contrast_weight 1

python main.py --dataset cifar10 -a resnet32 --num_classes 10 --imbanlance_rate 0.02 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.0 --label_weighting 1.2 --contrast_weight 1

python main.py --dataset cifar10 -a resnet32 --num_classes 10 --imbanlance_rate 0.1 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.2 --label_weighting 1  --contrast_weight 2

for CIFAR-100-LT

python main.py --dataset cifar100 -a resnet32 --num_classes 100 --imbanlance_rate 0.01 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.0 --label_weighting 1.2  --contrast_weight 4

python main.py --dataset cifar100 -a resnet32 --num_classes 100 --imbanlance_rate 0.02 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.2  --label_weighting 1.2  --contrast_weight 6

python main.py --dataset cifar100 -a resnet32 --num_classes 100 --imbanlance_rate 0.1 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.2  --label_weighting 1.2  --contrast_weight 4
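The --imbanlance_rate values in the commands above (0.01, 0.02, 0.1) appear to correspond to the imbalance factors (IF) 100, 50, and 10 in the result tables, that is, IF = 1 / rate. The sketch below shows how such a rate translates into per-class sample counts, assuming the exponential decay profile commonly used to build long-tailed CIFAR. Whether the repository uses exactly this profile is an assumption, and the function name is hypothetical.

```python
def long_tailed_counts(n_max, num_classes, imbalance_rate):
    """Per-class sample counts that decay exponentially from n_max
    (largest class) down to about n_max * imbalance_rate (smallest class)."""
    return [
        int(n_max * imbalance_rate ** (i / (num_classes - 1)))
        for i in range(num_classes)
    ]

# CIFAR-10 has 5000 training images per class before subsampling.
counts = long_tailed_counts(n_max=5000, num_classes=10, imbalance_rate=0.01)
print(counts)
print("imbalance factor:", counts[0] / counts[-1])
```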

for ImageNet-LT

python main.py --dataset ImageNet-LT -a resnext50_32x4d --num_classes 1000 --beta 0.5 --lr 0.1 --epochs 135 -b 120 --momentum 0.9 --weight_decay 2e-4 --resample_weighting 0.2 --label_weighting 1.0 --contrast_weight 10

for iNaturalist 2018

python main.py --dataset iNaturelist2018 -a resnext50_32x4d --num_classes 8142 --beta 0.5 --lr 0.1 --epochs 120 -b 128 --momentum 0.9 --weight_decay 1e-4 --resample_weighting 0.2 --label_weighting 1.0 --contrast_weight 10

Testing

python test.py --dataset ImageNet-LT -a resnext50_32x4d --num_classes 1000 --resume model_path

Results and Pretrained Models

CIFAR-10-LT

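| Method | IF | Model | Top-1 Acc (%) |
|---|---|---|---|
| GLMC | 100 | ResNet-32 | 87.75 |
| GLMC | 50 | ResNet-32 | 90.18 |
| GLMC | 10 | ResNet-32 | 94.04 |
| GLMC + MaxNorm | 100 | ResNet-32 | 87.57 |
| GLMC + MaxNorm | 50 | ResNet-32 | 90.22 |
| GLMC + MaxNorm | 10 | ResNet-32 | 94.03 |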

CIFAR-100-LT

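| Method | IF | Model | Top-1 Acc (%) |
|---|---|---|---|
| GLMC | 100 | ResNet-32 | 55.88 |
| GLMC | 50 | ResNet-32 | 61.08 |
| GLMC | 10 | ResNet-32 | 70.74 |
| GLMC + MaxNorm | 100 | ResNet-32 | 57.11 |
| GLMC + MaxNorm | 50 | ResNet-32 | 62.32 |
| GLMC + MaxNorm | 10 | ResNet-32 | 72.33 |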

ImageNet-LT

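| Method | Model | Many | Med | Few | All | Pretrained model |
|---|---|---|---|---|---|---|
| GLMC | ResNeXt-50 | 70.1 | 52.4 | 30.4 | 56.3 | Download |
| GLMC + BS | ResNeXt-50 | 64.76 | 55.67 | 42.19 | 57.21 | Download |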

iNaturalist 2018

| Method | Model | Many | Med | Few | All | Pretrained model |
|---|---|---|---|---|---|---|
| GLMC | ResNeXt-50 | 64.60 | 73.16 | 73.01 | 72.21 | Download |

Citation

If you find this code useful for your research, please consider citing our paper:

@inproceedings{du2023global,
  title={Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions},
  author={Fei Du and Peng Yang and Qi Jia and Fengtao Nan and Xiaoting Chen and Yun Yang},
  booktitle={Conference on Computer Vision and Pattern Recognition 2023},
  year={2023},
  url={https://arxiv.org/abs/2305.08661}
}