Annotation-free Audio-Visual Segmentation

Official implementation of Annotation-free Audio-Visual Segmentation.

This paper has been accepted at WACV 2024; the project page is https://jinxiang-liu.github.io/anno-free-AVS/.



Requirements

Installation

Create a conda environment and install dependencies:

conda create -n sama python=3.10.11
conda activate sama

pip install -r requirements.txt
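
A quick sanity check of the environment (assuming PyTorch is among the dependencies pinned in requirements.txt, since the SAM image backbone requires it):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"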

Dataset

1. Download the datasets

2. Configure the dataset locations

After downloading the datasets with annotations, please specify the dataset directories and annotation file locations in the configs/sam_avs_adapter.yaml file.
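
The exact keys in configs/sam_avs_adapter.yaml depend on the release, so the snippet below is only an illustrative sketch: the key names and paths are assumptions and should be mapped onto the fields actually present in the file.

# Hypothetical sketch -- replace the key names with the real fields in
# configs/sam_avs_adapter.yaml and the paths with your local copies.
dataset:
  s4_root: /path/to/s4_data                # single-source (S4) subset
  ms3_root: /path/to/ms3_data              # multi-source (MS3) subset
  synthetic_root: /path/to/synthetic_data  # synthetic training data
  annotation_file: /path/to/annotations.json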


Get Started

Evaluation

Model weights: all required weights, including the SAM image backbone, the VGGish audio backbone, and our pretrained models, can be downloaded via the OneDrive link.
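
As an illustration only, one possible way to organize the downloaded checkpoints is shown below; the directory and file names are assumptions rather than a required layout, so point the paths in configs/sam_avs_adapter.yaml (or the scripts) at wherever you actually store the files.

# Hypothetical layout -- adjust the names and paths to your setup.
mkdir -p pretrained
mv /path/to/downloaded/*.pth pretrained/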

Test

bash scripts/synthetic_test.sh
bash scripts/s4_test.sh
bash scripts/ms3_test.sh

Training

bash scripts/synthetic_train.sh
bash scripts/s4_train.sh
bash scripts/ms3_train.sh

Citation

@inproceedings{liu2024annotation,
  title={Annotation-free audio-visual segmentation},
  author={Liu, Jinxiang and Wang, Yu and Ju, Chen and Ma, Chaofan and Zhang, Ya and Xie, Weidi},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={5604--5614},
  year={2024}
}

Contact

If you have any questions, feel free to contact jinxliu#sjtu.edu.cn (replace # with @).