Finding Fallen Objects

Official implementation of CVPR 2022 paper "Finding Fallen Objects Via Asynchronous Audio-Visual Integration".

Usage

Data

Download the dataset from here, and extract it in the project root.

The dataset sub-directory contains the information needed to load each case into our environment. The .wav files in it are the recorded audio of the object falling in each case.
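As a quick sanity check after extraction, the sketch below reads the basic properties of one recording using only Python's standard `wave` module. It assumes the files are uncompressed PCM WAV, which this README does not state explicitly; the helper name `wav_summary` is illustrative and not part of this repository.

```python
import wave


def wav_summary(path):
    """Return (channels, sample_rate_hz, duration_s) for a PCM .wav file."""
    with wave.open(path, "rb") as wav_file:
        channels = wav_file.getnchannels()
        sample_rate = wav_file.getframerate()
        # Duration is the number of frames divided by frames per second.
        duration = wav_file.getnframes() / float(sample_rate)
    return channels, sample_rate, duration
```

Pass it the path to any .wav file under the dataset sub-directory; a file that fails to open here is likely compressed or truncated.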

The perception sub-directory contains additional information that is helpful when using our environment. Each .json file contains several fields describing the corresponding case.
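To see which fields a given case provides, a minimal sketch like the following lists the top-level keys of one file. It assumes each .json file holds a single JSON object; `list_fields` is a hypothetical helper, not a function from this repository.

```python
import json


def list_fields(json_path):
    """Return the sorted top-level field names of one .json file."""
    with open(json_path, "r", encoding="utf-8") as handle:
        record = json.load(handle)
    # Assumption: each file holds a single JSON object; anything else
    # has no named fields to report.
    if not isinstance(record, dict):
        return []
    return sorted(record)
```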

Prerequisite

The environment is based on TDW. We tested it on version 1.8.29; you can download TDW_Linux.tar.gz for that version from here.

Follow this to install the NVIDIA driver and X on your Linux server. If you need to run the environment in Docker, you also need to install nvidia-docker following this.

After downloading TDW_Linux.tar.gz, extract it into the docker directory. The executable TDW should be located at docker/TDW/TDW.x86_64.
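To confirm that the archive ended up in the expected place, a small check along these lines can be run from the project root. It only tests that the path given above exists and is executable; `tdw_build_ready` is an illustrative name and not part of this repository.

```python
import os
from pathlib import Path


def tdw_build_ready(project_root="."):
    """Return True if docker/TDW/TDW.x86_64 exists and is executable."""
    executable = Path(project_root) / "docker" / "TDW" / "TDW.x86_64"
    return executable.is_file() and os.access(executable, os.X_OK)
```

If this returns False after extraction, the archive may have been unpacked one directory level too deep, or the executable bit may have been lost.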

tdw environment setup:

conda create -n tdw
conda activate tdw
pip install gym pyastar magnebot==1.3.2 tdw==1.8.29

planner environment setup:

conda create -n planner
conda activate planner
pip install librosa scikit-image pyastar2d docker-compose tdw==1.8.29
pip install 'git+https://github.com/facebookresearch/detectron2.git'
cd env/openai_baselines
pip install -e .

Launch the environment

Launch

You can then launch the environment via

conda activate tdw
python interface.py --display=<display> --split=<split> --port=<port>

Validate

You can use the docker/test.py script to validate the installation in either case. The script expects port 2590, so either launch the environment with --port=2590 or edit the port in the test script.
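Before launching, it can help to confirm that the port the test script expects is not already taken by another process. The sketch below tries to bind the port with the standard `socket` module; `port_is_free` is a hypothetical helper and not part of this repository.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # Binding fails with OSError when the port is already in use.
            sock.bind((host, port))
        except OSError:
            return False
    return True
```

For example, `port_is_free(2590)` returning False suggests that an earlier environment instance is still running.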

The environment will output some information in env_log/ after each case.

obs contains the following entries:

info contains the following entries:

Use the following numbers for actions:

To run multiple environments in parallel, e.g. for training, you can use the code we borrowed (with slight modifications) from openai/baselines:

from env.envs import make_vec_envs
envs = make_vec_envs('find_fallen-v0', num_processes, log_dir, device, True, spaces=(observation_space, action_space), port=<port>, displays=<displays>, split='train')
obs, info = envs.reset()
obs, reward, done, info = envs.step([5 for _ in range(num_processes)])

Note: in this vectorized setting, when a case is done, the obs and info returned by step are the initial state of the next case.

make_vec_envs uses the port numbers in [port, port + num_processes) and the X displays listed in displays, which should be a list of strings such as [":4", ":5"]. A single X display can serve multiple instances simultaneously, so displays can be shorter than num_processes.
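The sketch below illustrates how a port range and a shorter display list could be spread over the worker processes. The README only states the port range and that displays may be shared; the per-worker mapping shown here (port + rank, displays reused round-robin) is an assumption for illustration, and `assign_ports_and_displays` is not a function from this repository.

```python
def assign_ports_and_displays(port, num_processes, displays):
    """Pair each worker index with a port and an X display."""
    if not displays:
        raise ValueError("at least one X display is required")
    assignments = []
    for rank in range(num_processes):
        assignments.append({
            "rank": rank,
            # Ports cover the half-open range [port, port + num_processes).
            "port": port + rank,
            # Assumption: displays are reused round-robin when there are
            # fewer displays than processes.
            "display": displays[rank % len(displays)],
        })
    return assignments
```

With `port=2590`, `num_processes=3` and `displays=[":4", ":5"]`, this yields ports 2590 to 2592, with the first and third worker sharing display ":4".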

Baseline

We provide the code of our modular planner in baseline/planner. Download the pretrained modular models here and place them in <project root>/pretrained, then run the planner with the following command (replace :4 :5 with your available X displays).

conda activate planner
python baseline/planner/main_planner.py --displays :4 :5 --num-processes=1

Evaluation

You can evaluate the results (SR, SPL, SNA) by placing the eval.py script in the env_log folder and running

python eval.py

You can replace "non_distractor" with "distractor".
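For readers unfamiliar with the metrics, the sketch below computes SR (success rate) and SPL using the commonly used definition of success weighted by path length. It is not the repository's eval.py: the record keys (`success`, `shortest_path`, `path_length`) are hypothetical, the actual log format in env_log/ may differ, and SNA is left out because this README does not define it.

```python
def success_rate(episodes):
    """Fraction of episodes that ended in success."""
    if not episodes:
        raise ValueError("no episodes to evaluate")
    return sum(1 for ep in episodes if ep["success"]) / len(episodes)


def spl(episodes):
    """Success weighted by path length, averaged over all episodes."""
    if not episodes:
        raise ValueError("no episodes to evaluate")
    total = 0.0
    for ep in episodes:
        if not ep["success"]:
            continue  # failed episodes contribute zero
        shortest = ep["shortest_path"]
        denominator = max(ep["path_length"], shortest)
        # Guard against a zero-length path when the agent starts at the goal.
        total += shortest / denominator if denominator > 0 else 1.0
    return total / len(episodes)
```

A successful episode that takes twice the shortest path contributes 0.5 to SPL, so the metric rewards both finding the object and doing so efficiently.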