Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing
Haoyue Cheng, Zhaoyang Liu, Hang Zhou, Chen Qian, Wayne Wu and Limin Wang
Code for the ECCV 2022 paper "Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing".
Paper Overview
Modality-specific label noise
<img src="./static/intro.png" style="width:70%; display: block; margin: auto">
The procedure of modality-specific label denoising
<img src="./static/method.png" style="width:70%; display: block; margin: auto">
The results on LLP dataset
<img src="./static/results.png" style="width:70%; display: block; margin: auto">
Get Started
Prepare data
- Please download the preprocessed audio and visual features from https://github.com/YapengTian/AVVP-ECCV20.
- Put the downloaded features into data/feats/.
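Before training, it can help to confirm that the features landed where the scripts expect them. The following is a minimal sketch, not part of the repository: the helper name `summarize_feature_dir` is hypothetical, and it makes no assumption about which subdirectories the download contains. It only counts the files found under each top-level entry of data/feats/.

```python
from pathlib import Path


def summarize_feature_dir(root="data/feats"):
    """Return {top-level entry name: number of files beneath it}.

    Hypothetical helper for illustration; the expected subdirectory
    names are not specified here, so none are checked.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        raise FileNotFoundError(f"feature directory not found: {root_path}")

    summary = {}
    for entry in sorted(root_path.iterdir()):
        if entry.is_dir():
            # Count every regular file below this subdirectory.
            summary[entry.name] = sum(1 for p in entry.rglob("*") if p.is_file())
        else:
            # A file placed directly in the root counts as one.
            summary[entry.name] = 1
    return summary
```

Calling `summarize_feature_dir()` from the repository root returns a dictionary that can be printed and compared by eye against the downloaded archive. An empty dictionary suggests the features were extracted somewhere else.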
Train the model
1. Train the noise estimator:
python main.py --mode train_noise_estimator --save_model true --model_save_dir ckpt --checkpoint noise_estimater.pt
2. Calculate noise ratios:
python main.py --mode calculate_noise_ratio --model_save_dir ckpt --checkpoint noise_estimater.pt --noise_ratio_file noise_ratios.npz
3. Train the model with label denoising:
python main.py --mode train_label_denoising --save_model true --model_save_dir ckpt --checkpoint JoMoLD.pt --noise_ratio_file noise_ratios.npz
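The three steps above depend on each other: step 2 reads the checkpoint written by step 1, and step 3 reads the noise ratio file written by step 2. The sketch below chains them in order and stops at the first failure. It is an illustration only: the function names `build_commands` and `run_pipeline` are hypothetical and not part of the repository, while the flags and file names are copied from the commands above. It uses `sys.executable` in place of `python` so the same interpreter runs every step.

```python
import subprocess
import sys


def build_commands(ckpt_dir="ckpt",
                   estimator_ckpt="noise_estimater.pt",
                   model_ckpt="JoMoLD.pt",
                   ratio_file="noise_ratios.npz"):
    """Return the three training commands as argument lists, in run order."""
    base = [sys.executable, "main.py"]
    return [
        # Step 1: train the noise estimator and save its checkpoint.
        base + ["--mode", "train_noise_estimator", "--save_model", "true",
                "--model_save_dir", ckpt_dir, "--checkpoint", estimator_ckpt],
        # Step 2: compute noise ratios from the estimator checkpoint.
        base + ["--mode", "calculate_noise_ratio",
                "--model_save_dir", ckpt_dir, "--checkpoint", estimator_ckpt,
                "--noise_ratio_file", ratio_file],
        # Step 3: train the final model using the saved noise ratios.
        base + ["--mode", "train_label_denoising", "--save_model", "true",
                "--model_save_dir", ckpt_dir, "--checkpoint", model_ckpt,
                "--noise_ratio_file", ratio_file],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it too unless dry_run is True."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if a step fails,
            # so later steps never run on missing outputs.
            subprocess.run(cmd, check=True)


# Dry run: only prints the commands that would be executed.
run_pipeline(build_commands())
```

To actually train, call `run_pipeline(build_commands(), dry_run=False)` from the repository root, where main.py lives.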
Test
We provide the pre-trained JoMoLD checkpoint for evaluation. Download the checkpoint, place it in the "./ckpt" directory, and run the following command to test:
python main.py --mode test_JoMoLD --model_save_dir ckpt --checkpoint JoMoLD.pt
Citation
If you find this work useful, please consider citing it.
<pre><code>@article{cheng2022joint,
  title={Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing},
  author={Cheng, Haoyue and Liu, Zhaoyang and Zhou, Hang and Qian, Chen and Wu, Wayne and Wang, Limin},
  journal={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2022}
}
</code></pre>