# Boosting Multi-modal Model Performance with Adaptive Gradient Modulation
This is the official PyTorch implementation of AGM, proposed in "Boosting Multi-modal Model Performance with Adaptive Gradient Modulation".
**Paper Title:** Boosting Multi-modal Model Performance with Adaptive Gradient Modulation

**Authors:** Hong Li<sup>*</sup>, Xingyu Li<sup>*</sup>, Pengbo Hu, Yinuo Lei, Chunxiao Li, Yi Zhou

**Accepted by:** ICCV 2023
## Dataset

### 1. AV-MNIST
This dataset can be downloaded from here.

### 2. CREMA-D
This dataset can be downloaded from here. For data preprocessing, refer to here.

### 3. UR-Funny
The raw dataset can be downloaded from here; the processed data can be obtained from here.

### 4. AVE
This dataset can be downloaded from here.

### 5. CMU-MOSEI
This dataset can be downloaded from here.
## Training

### Environment config

- Python: 3.9.13
- CUDA Version: 11.3
- PyTorch: 1.12.1
- Torchvision: 0.13.1
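The versions above can be set up, for example, with conda and pip. This is a minimal sketch, not an official install script from the repo; the environment name `agm` is an assumption:

```shell
# Hypothetical environment setup matching the versions listed above.
conda create -n agm python=3.9.13 -y
conda activate agm
# PyTorch 1.12.1 / torchvision 0.13.1 wheels built against CUDA 11.3
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 \
    --extra-index-url https://download.pytorch.org/whl/cu113
```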
### Train

To train the model, run the following command:

```shell
python main.py --data_root '' --device cuda:0 --methods Normal --modality Multimodal --fusion_type late_fusion --random_seed 999 --expt_dir checkpoint --expt_name test --batch_size 64 --EPOCHS 100 --learning_rate 0.0001 --dataset AV-MNIST --alpha 2.5 --SHAPE_contribution False
```
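For intuition only: AGM adaptively modulates each modality's gradients during training. The sketch below is NOT the paper's implementation; it only illustrates the generic idea of scaling per-modality encoder gradients before the optimizer step, on a hypothetical tiny late-fusion model with made-up coefficients:

```python
import torch
import torch.nn as nn

# Hypothetical two-modality late-fusion model (not the repo's architecture).
class TinyLateFusion(nn.Module):
    def __init__(self):
        super().__init__()
        self.audio_enc = nn.Linear(8, 4)
        self.visual_enc = nn.Linear(8, 4)
        self.head = nn.Linear(8, 2)

    def forward(self, a, v):
        fused = torch.cat([self.audio_enc(a), self.visual_enc(v)], dim=-1)
        return self.head(fused)

model = TinyLateFusion()
# Made-up modulation coefficients; AGM derives its coefficients adaptively
# from per-modality contributions, which is not reproduced here.
coeffs = {"audio_enc": 0.5, "visual_enc": 1.5}

a, v = torch.randn(16, 8), torch.randn(16, 8)
labels = torch.randint(0, 2, (16,))
loss = nn.functional.cross_entropy(model(a, v), labels)
loss.backward()

# Scale each modality encoder's gradients in place before optimizer.step().
for name, module in [("audio_enc", model.audio_enc),
                     ("visual_enc", model.visual_enc)]:
    for p in module.parameters():
        p.grad.mul_(coeffs[name])
```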
## Citation

```bibtex
@inproceedings{li2023boosting,
  title={Boosting Multi-modal Model Performance with Adaptive Gradient Modulation},
  author={Li, Hong and Li, Xingyu and Hu, Pengbo and Lei, Yinuo and Li, Chunxiao and Zhou, Yi},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={22214--22224},
  year={2023}
}
```