Audio-Visual Floorplan Reconstruction

AV-Map Model

This is the code accompanying the work:

Audio-Visual Floorplan Reconstruction
Senthil Purushwalkam, Sebastian Vicenc Amengual Gari, Vamsi Krishna Ithapu, Carl Schissler, Philip Robinson, Abhinav Gupta, Kristen Grauman
arXiv preprint arXiv:2012.15470 (2020)
Webpage | Paper

Cite

If you find this repository useful in your own research, please consider citing both papers:

@article{purushwalkam2020audio,
  title={Audio-visual floorplan reconstruction},
  author={Purushwalkam, Senthil and Gari, Sebastian Vicenc Amengual and Ithapu, Vamsi Krishna and Schissler, Carl and Robinson, Philip and Gupta, Abhinav and Grauman, Kristen},
  journal={arXiv preprint arXiv:2012.15470},
  year={2020}
}

@inproceedings{chen2020soundspaces,
  title={SoundSpaces: Audio-Visual Navigation in 3D Environments},
  author={Chen, Changan and Jain, Unnat and Schissler, Carl and Gari, Sebastia Vicenc Amengual and Al-Halah, Ziad and Ithapu, Vamsi Krishna and Robinson, Philip and Grauman, Kristen},
  year={2020},
  booktitle={ECCV},
}

Prerequisites

The code has been tested with Python v3.7.9, PyTorch v1.7.1, and Hydra v1.0.4.
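A minimal environment sketch matching these versions is shown below; the conda environment name and the exact PyTorch build (CPU vs. a specific CUDA version) are assumptions, so adjust them to your machine.

# Minimal environment sketch (environment name and PyTorch build are assumptions)
conda create -n avmap python=3.7.9
conda activate avmap
pip install torch==1.7.1 hydra-core==1.0.4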

Choose a project directory ($PROJ_DIR) where you plan to store all the data.
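For example (the directory path below is only a placeholder):

# Any location with enough free disk space works; the path is a placeholder.
export PROJ_DIR=/path/to/avmap_data
mkdir -p $PROJ_DIR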

Downloading the data

We release the set of rendered images and audio clips that were used to train and test our models. This data can be downloaded from here. Untar this data in $PROJ_DIR.
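A typical extraction step looks like the following; the archive filename is a placeholder for whatever file the download provides.

# Extract the rendered images and audio clips into the project directory.
# The archive name is a placeholder; use the filename you actually downloaded.
tar -xvf avmap_data.tar -C $PROJ_DIR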

Download the SoundSpaces dataset

We use the ambisonic impulse response data for Matterport3D released in the SoundSpaces dataset. Follow this link and download the dataset to the same $PROJ_DIR directory chosen above.
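Assuming the impulse responses are distributed as a tar archive (the filename below is a placeholder), extracting them into the same directory would look like:

# Extract the SoundSpaces Matterport3D ambisonic impulse responses alongside the
# data above. The archive name is a placeholder for the file you download.
tar -xvf soundspaces_mp3d_ambisonic_rirs.tar -C $PROJ_DIR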

Setup

Edit the project_dir entry in configs/avmap/environment/default.yaml to point to the $PROJ_DIR containing the above downloaded datasets.
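After editing, the relevant entry should look roughly like this; only the project_dir key is shown, other keys in the file stay as they are, and the path is a placeholder.

# configs/avmap/environment/default.yaml (excerpt; the path is a placeholder)
project_dir: /path/to/avmap_data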

Training a model

Device Generated Audio Setting

# RGB + Audio Model 
PYTHONPATH=. python main.py logging.name=devgen_rgba  model.rgb_model.use_model=True model.audio_model.use_model=True data/audio_clip=freq_sweep_signal data.source_at_receiver=True
# RGB Only Model 
PYTHONPATH=. python main.py logging.name=devgen_rgb  model.rgb_model.use_model=True model.audio_model.use_model=False data/audio_clip=freq_sweep_signal data.source_at_receiver=True
# Audio Only Model 
PYTHONPATH=. python main.py logging.name=devgen_a  model.rgb_model.use_model=False model.audio_model.use_model=True data/audio_clip=freq_sweep_signal data.source_at_receiver=True

Environment Generated Audio Setting

# Set data.n_sources > 20 for the all-rooms setting
# RGB + Audio Model 
PYTHONPATH=. python main.py logging.name=envgen_rgba  model.rgb_model.use_model=True model.audio_model.use_model=True data/audio_clip=env_gen data.source_at_receiver=False data.n_sources=100
# Audio Only Model 
PYTHONPATH=. python main.py logging.name=envgen_a  model.rgb_model.use_model=False model.audio_model.use_model=True data/audio_clip=env_gen data.source_at_receiver=False  data.n_sources=100

Testing a model

Models can be tested with the same commands as above by appending environment.evaluate_path=<path_to_checkpoint> model.pool_steps=True to the end of the corresponding training command.
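For example, to evaluate a trained device-generated RGB + Audio model (the checkpoint path is a placeholder):

# Evaluate the device-generated RGB + Audio model; the checkpoint path is a placeholder.
PYTHONPATH=. python main.py logging.name=devgen_rgba model.rgb_model.use_model=True model.audio_model.use_model=True data/audio_clip=freq_sweep_signal data.source_at_receiver=True environment.evaluate_path=/path/to/checkpoint.pth model.pool_steps=True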