Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts

This is the source code for the paper: Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts (ICML 2024).

Requirements

To install requirements, run:

conda create -n lift python=3.8 -y
conda activate lift
conda install pytorch==2.0.0 torchvision==0.15.0 pytorch-cuda=11.7 -c pytorch -c nvidia
conda install tensorboard
pip install -r requirements.txt

We encourage installing the latest versions of the dependencies. If you encounter incompatibilities, please fall back to the following pinned versions:

numpy==1.24.3
scipy==1.10.1
scikit-learn==1.2.1
yacs==0.1.8
tqdm==4.64.1
ftfy==6.1.1
regex==2022.7.9
timm==0.6.12
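
For example, the pinned versions above can be installed in one command (a minimal sketch; if requirements.txt lists different versions, follow requirements.txt):

pip install numpy==1.24.3 scipy==1.10.1 scikit-learn==1.2.1 yacs==0.1.8 tqdm==4.64.1 ftfy==6.1.1 regex==2022.7.9 timm==0.6.12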

Hardware

Most experiments can be reproduced using a single GPU with 20GB of memory (larger models such as ViT-L require more memory).

Quick Start on the CIFAR-100-LT dataset

# run LIFT on CIFAR-100-LT (imbalance ratio = 100)
python main.py -d cifar100_ir100 -m clip_vit_b16 adaptformer True

Running the above command automatically downloads the CIFAR-100 dataset and runs the proposed method (LIFT).
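
The same entry point covers the other CIFAR-100-LT imbalance ratios whose configuration files are provided in configs/data (see Detailed Usage below); for example:

# run LIFT on CIFAR-100-LT with imbalance ratio 50 or 10
python main.py -d cifar100_ir50 -m clip_vit_b16 adaptformer True
python main.py -d cifar100_ir10 -m clip_vit_b16 adaptformer True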

Running on Large-scale Long-tailed Datasets

Prepare the Dataset

Download the Places, ImageNet, and iNaturalist 2018 datasets.

Organize the files as shown below and update the corresponding paths in the data configuration files under configs/data:

Places:

Path/To/Dataset
├─ train
│  ├─ airfield
│  │  ├─ 00000001.jpg
│  │  └─ ......
│  └─ ......
└─ val
   ├─ airfield
   │  ├─ Places365_val_00000435.jpg
   │  └─ ......
   └─ ......

ImageNet:

Path/To/Dataset
├─ train
│  ├─ n01440764
│  │  ├─ n01440764_18.JPEG
│  │  └─ ......
│  └─ ......
└─ val
   ├─ n01440764
   │  ├─ ILSVRC2012_val_00000293.JPEG
   │  └─ ......
   └─ ......

iNaturalist 2018:

Path/To/Dataset
└─ train_val2018
   ├─ Actinopterygii
   │  ├─ 2229
   │  │  ├─ 2c5596da5091695e44b5604c2a53c477.jpg
   │  │  └─ ......
   │  └─ ......
   └─ ......
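
Before training, it may help to verify that each dataset matches the layouts above (a quick sanity check; the Path/To/... roots are placeholders for wherever you stored each dataset):

ls Path/To/Places     # expect: train/  val/
ls Path/To/ImageNet   # expect: train/  val/
ls Path/To/iNat2018   # expect: train_val2018/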

Reproduction

To reproduce the main results in the paper, please run

# run LIFT on ImageNet-LT
python main.py -d imagenet_lt -m clip_vit_b16 adaptformer True

# run LIFT on Places-LT
python main.py -d places_lt -m clip_vit_b16 adaptformer True

# run LIFT on iNaturalist 2018
python main.py -d inat2018 -m clip_vit_b16 adaptformer True num_epochs 20

For other experiments, please refer to scripts for reproduction commands.
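
To queue the three large-scale runs back to back, a simple shell loop over the commands above also works (a convenience sketch, not a script shipped in the repo):

for d in imagenet_lt places_lt; do
    python main.py -d $d -m clip_vit_b16 adaptformer True
done
python main.py -d inat2018 -m clip_vit_b16 adaptformer True num_epochs 20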

Detailed Usage

To train and test the proposed method under other settings, run

python main.py -d [data] -m [model] [options]

The [data] can be the name of a .yaml file in configs/data, including imagenet_lt, places_lt, inat2018, cifar100_ir100, cifar100_ir50, cifar100_ir10, etc.

The [model] can be the name of a .yaml file in configs/model, including clip_rn50, clip_vit_b16, in21k_vit_b16, etc.

Note that using only the -d and -m options fine-tunes the classifier alone; use additional [options] to enable other settings.

Moreover, [options] can be used to modify the configuration options defined in utils/config.py, as in the sketch below.
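
For example, combining the options that already appear in the commands above (adaptformer and num_epochs), a full command looks like this (a sketch; other keys in utils/config.py can be overridden the same way):

# fine-tune CLIP ViT-B/16 with AdaptFormer for 20 epochs on CIFAR-100-LT (imbalance ratio 100)
python main.py -d cifar100_ir100 -m clip_vit_b16 adaptformer True num_epochs 20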

You can also refer to scripts for example commands.

Acknowledgment

We thank the authors of the following repositories, whose code we referenced: [OLTR], [Classifier-Balancing], [Dassl], [CoOp].

Citation

If you find this repo useful for your work, please cite as:

@inproceedings{shi2024longtail,
  title={Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts},
  author={Jiang-Xin Shi and Tong Wei and Zhi Zhou and Jie-Jing Shao and Xin-Yan Han and Yu-Feng Li},
  booktitle={Proceedings of the 41st International Conference on Machine Learning},
  year={2024}
}