Generative Multi-modal Models are Good Class Incremental Learners
This is the official code for our CVPR 2024 paper: <a href='https://arxiv.org/abs/2403.18383'><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a>
Getting Started
Installation
Clone our repository, create a Python environment, and activate it via the following commands:
git clone https://github.com/DoubleClass/GMM
cd GMM
conda env create -f env_GMM.yaml
conda activate GMM
pip install git+https://github.com/openai/CLIP.git
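To confirm the environment works before downloading the larger weights, a quick sanity check (just an illustration; an import error means the setup above needs revisiting):

```bash
# Quick sanity check: the environment should expose PyTorch and CLIP.
python -c "import torch, clip; print('CUDA available:', torch.cuda.is_available())"
```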
Vicuna
You can get the Vicuna LLM from huggingface-vicuna-7b, or download it via baidunetdisk (code: s3pu) or google-drive.
Then set the downloaded Vicuna folder path here and the initial checkpoint here.
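For orientation, the relevant entries typically look like the sketch below; the key names and paths are placeholders, so match them to the actual config files linked above:

```yaml
# Hypothetical sketch of the relevant config entries; key names and paths
# are placeholders, so match them to the actual config files in this repo.
model:
  llama_model: "/path/to/vicuna-7b"         # downloaded Vicuna folder
  ckpt: "/path/to/initial_checkpoint.pth"   # initial checkpoint
```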
EVA_VIT_G
The code will automatically download eva_vit_g.pth. We also put it here and on huggingface; you can manually download it and put it in the cache directory: .cache/torch/hub/checkpoints
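If the automatic download fails, placing the file manually works as well; for example:

```bash
# Put the manually downloaded checkpoint where torch hub looks for it.
mkdir -p ~/.cache/torch/hub/checkpoints
mv eva_vit_g.pth ~/.cache/torch/hub/checkpoints/
```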
bert-base-uncased
The code will also download this automatically, but in case you don't have access to huggingface, we also put it here; you can manually download it and put it in the cache directory: .cache/huggingface/hub/models--bert-base-uncased
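If you do have huggingface access, a one-off warm-up run also populates that cache for later offline use, e.g.:

```bash
# Downloads bert-base-uncased once into ~/.cache/huggingface/hub/models--bert-base-uncased.
python -c "from transformers import AutoTokenizer, AutoModel; \
AutoTokenizer.from_pretrained('bert-base-uncased'); \
AutoModel.from_pretrained('bert-base-uncased')"
```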
Datasets
ImageNet-R
You can download it here.
Then set the dataset folder path here.
Besides, you need to customize the dataset for the GPT fine-tuning process. We prepare an example here that you can follow: download or google-drive.
After downloading the customized dataset, you can set the data root path here.
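For orientation, ImageNet-R unpacks into one folder per class, so the data root should look roughly like this (class IDs and file names are illustrative):

```text
data_root/
└── imagenet-r/
    ├── n01443537/        # one folder per WordNet class ID
    │   ├── image_0.jpg
    │   └── ...
    └── n01484850/
        └── ...
```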
Training
After setting all model and dataset configs, you can run the following command to start fine-tuning:
python train.py --cfg-path train_configs/minigpt4_stage2_finetune.yaml
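Given the MiniGPT-4-style config, a multi-GPU launch via torchrun may also work; the GPU count below is only an example, not a tested configuration:

```bash
# Untested sketch: distributed fine-tuning on 4 GPUs, assuming the
# MiniGPT-4-style torchrun entry point.
torchrun --nproc_per_node=4 train.py --cfg-path train_configs/minigpt4_stage2_finetune.yaml
```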
Testing
After training, you will get a model checkpoint from the last continual learning stage. Put its path into the eval_all.sh script and specify a results directory.
Then set the results path in get_score_all.py.
Run the script:
bash eval_all.sh
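Once eval_all.sh finishes, aggregate the per-stage results into final scores (assuming get_score_all.py needs no arguments beyond the results path set above):

```bash
# Compute final accuracy from the results directory configured above.
python get_score_all.py
```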
Citation
If you find our repo useful for your research, please consider citing our paper:
@inproceedings{cao2024GMM,
  title={Generative Multi-modal Models are Good Class Incremental Learners},
  author={Cao, Xusheng and Lu, Haori and Huang, Linlan and Liu, Xialei and Cheng, Ming-Ming},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}
License
This code is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license for non-commercial use only. Any commercial use of this code requires formal permission prior to use.
Contact
For technical questions, please contact <a href="mailto:caoxusheng@mail.nankai.edu.cn">caoxusheng@mail.nankai.edu.cn</a>.