Boosting Multi-modal Model Performance with Adaptive Gradient Modulation

This is the official PyTorch implementation of AGM, proposed in "Boosting Multi-modal Model Performance with Adaptive Gradient Modulation".

Paper Title: Boosting Multi-modal Model Performance with Adaptive Gradient Modulation

Authors: Hong Li<sup>*</sup>, Xingyu Li<sup>*</sup>, Pengbo Hu, Yinuo Lei, Chunxiao Li, Yi Zhou

Accepted by: ICCV 2023

[arXiv] [ICCV Proceedings]

Dataset

1. AV-MNIST

This dataset can be downloaded from here.

2. CREMA-D

This dataset can be downloaded from here. For data preprocessing, refer to here.

3. UR-Funny

The raw dataset can be downloaded from here; the processed data can be obtained from here.

4. AVE

This dataset can be downloaded from here.

5. CMU-MOSEI

This dataset can be downloaded from here.

Training

Environment config

  1. Python: 3.9.13
  2. CUDA Version: 11.3
  3. PyTorch: 1.12.1
  4. Torchvision: 0.13.1
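A matching environment can be set up with pip, for example (the CUDA 11.3 wheel index URL below is the standard PyTorch download index; the exact install source is an assumption, not specified by the authors):

```shell
# Install PyTorch 1.12.1 and Torchvision 0.13.1 built against CUDA 11.3,
# matching the versions listed above.
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 \
    --extra-index-url https://download.pytorch.org/whl/cu113
```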

Train

Train the model with the following command:

python main.py --data_root '' --device cuda:0 --methods Normal --modality Multimodal --fusion_type late_fusion --random_seed 999 --expt_dir checkpoint --expt_name test --batch_size 64 --EPOCHS 100 --learning_rate 0.0001 --dataset AV-MNIST --alpha 2.5 --SHAPE_contribution False
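The core idea behind AGM is to rebalance multi-modal training by scaling each modality's gradients according to how much that modality currently contributes. The sketch below illustrates only this general gradient-modulation mechanism on a toy two-branch late-fusion model; the model, the fixed coefficients, and the `modulate_gradients` helper are hypothetical and do not reproduce the paper's actual (Shapley-based, adaptive) formulation:

```python
# Minimal sketch of per-modality gradient modulation (illustrative only;
# not the authors' implementation).
import torch
import torch.nn as nn

class LateFusionModel(nn.Module):
    """Toy two-modality model that fuses by summing per-branch logits."""
    def __init__(self, dim_a=16, dim_b=16, hidden=32, num_classes=10):
        super().__init__()
        self.enc_a = nn.Linear(dim_a, hidden)    # branch for modality A
        self.enc_b = nn.Linear(dim_b, hidden)    # branch for modality B
        self.head_a = nn.Linear(hidden, num_classes)
        self.head_b = nn.Linear(hidden, num_classes)

    def forward(self, xa, xb):
        logits_a = self.head_a(torch.relu(self.enc_a(xa)))
        logits_b = self.head_b(torch.relu(self.enc_b(xb)))
        return logits_a + logits_b  # late fusion

def modulate_gradients(module, coeff):
    """Scale every gradient in one modality branch by a coefficient."""
    for p in module.parameters():
        if p.grad is not None:
            p.grad.mul_(coeff)

model = LateFusionModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

xa, xb = torch.randn(8, 16), torch.randn(8, 16)   # dummy batch
y = torch.randint(0, 10, (8,))

opt.zero_grad()
loss = criterion(model(xa, xb), y)
loss.backward()

# Suppose modality A currently dominates: damp its gradients and boost
# modality B's before the optimizer step. In AGM these coefficients are
# computed adaptively from modality contributions; 0.5 / 1.5 are
# arbitrary placeholders.
modulate_gradients(model.enc_a, 0.5)
modulate_gradients(model.enc_b, 1.5)
opt.step()
```

Applying the scaling between `backward()` and `step()` is what lets the modulation steer each branch's effective learning rate without touching the loss itself.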

Citation

@inproceedings{li2023boosting,
  title={Boosting Multi-modal Model Performance with Adaptive Gradient Modulation},
  author={Li, Hong and Li, Xingyu and Hu, Pengbo and Lei, Yinuo and Li, Chunxiao and Zhou, Yi},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={22214--22224},
  year={2023}
}