Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception

Junyu Gao, Mengyuan Chen, Changsheng Xu

Code for CVPR 2023 paper Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception

Paper Overview

Weakly-supervised Audio-Visual Video Parsing

<img src="./graph/task.png" style="width:70%; display: block; margin: auto">

Overview of CMPAE

<img src="./graph/framework_corrected.png" style="width:80%; display: block; margin: auto"> **Typo**: In the framework figure of the paper, we mislabeled the "Absence/Presence Evidence Collector". The figure above is the corrected version. We apologize for the typo.

Get Started

Dependencies

The requirements and dependencies we used are listed below.

Prepare data

  1. Please download the preprocessed audio and visual features from https://github.com/YapengTian/AVVP-ECCV20.
  2. Put the downloaded features into `data/feats/`, and put the annotation files into `data/annotations/`.
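The steps above can be sketched as a few shell commands. This is a minimal sketch assuming the features and annotations were downloaded into a local folder; the `downloads/` path is a placeholder, not part of the repo:

```shell
# Create the directory layout expected by the training/testing scripts.
mkdir -p data/feats data/annotations

# Move the downloaded files into place (source paths are placeholders):
#   mv downloads/feats/* data/feats/
#   mv downloads/annotations/* data/annotations/

# Sanity check: both directories should exist before running train.sh.
ls -d data/feats data/annotations
```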

Train your own models

Run `./train.sh`.

Test the pre-trained model

Download the checkpoint file from Google Drive and put it into `save/pretrained/`. Then run `./test.sh`.

Citation

If you find the code useful in your research, please consider citing our paper:

```bibtex
@inproceedings{junyu2023CVPR_CMPAE,
  author = {Gao, Junyu and Chen, Mengyuan and Xu, Changsheng},
  title = {Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2023}
}
```

License

This project is released under the MIT License.

Acknowledgement

This repo contains modified code from the following repositories:

We sincerely thank the owners of these great repos!