# semantic_nav

**Visual Representations for Semantic Target Driven Navigation**

Arsalan Mousavian, Alexander Toshev, Marek Fiser, Jana Kosecka, James Davidson

arXiv 2018
<div align="center"> <table style="width:100%" border="0"> <tr> <td align="center"><img src='https://cs.gmu.edu/~amousavi/gifs/smaller_fridge_2.gif'></td> <td align="center"><img src='https://cs.gmu.edu/~amousavi/gifs/smaller_tv_1.gif'></td> </tr> <tr> <td align="center">Target: Fridge</td> <td align="center">Target: Television</td> </tr> <tr> <td align="center"><img src='https://cs.gmu.edu/~amousavi/gifs/smaller_microwave_1.gif'></td> <td align="center"><img src='https://cs.gmu.edu/~amousavi/gifs/smaller_couch_1.gif'></td> </tr> <tr> <td align="center">Target: Microwave</td> <td align="center">Target: Couch</td> </tr> </table> </div>

ArXiv: https://arxiv.org/abs/1805.06066
## 1. Installation
### Requirements

#### Python Packages

- networkx
- gin-config
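Both packages are available on PyPI; a minimal install sketch, assuming TensorFlow and the other prerequisites of the models repository are already set up:

```bash
pip install networkx gin-config
```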
### Download semantic_nav

```bash
git clone --depth 1 https://github.com/tensorflow/models.git
```
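The code for this project does not live at the repository root. Assuming it sits under `research/cognitive_planning` in the cloned tree (verify against your checkout), run the commands in the rest of this README from that directory:

```bash
cd models/research/cognitive_planning
```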
## 2. Datasets

### Download Active Vision Dataset
We used the Active Vision Dataset (AVD), which can be downloaded from <a href="http://cs.unc.edu/~ammirato/active_vision_dataset_website/">here</a>. To make our code faster and reduce its memory footprint, we created the AVD Minimal dataset, which consists of low-resolution images from the original AVD. In addition, we added annotations for target views, object detections predicted by a detector pre-trained on the MS-COCO dataset, and semantic segmentations predicted by a model pre-trained on the NYU-v2 dataset. AVD Minimal can be downloaded from <a href="https://console.cloud.google.com/storage/browser/active-vision-dataset">here</a>. Set `$AVD_DIR` to the path of the downloaded AVD Minimal.
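For example, assuming the dataset was extracted to `/data/AVD_Minimal` (the path is illustrative):

```bash
export AVD_DIR=/data/AVD_Minimal
```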
### ActiveVisionDataset Demo

To navigate the environment interactively and see what AVD looks like, run:
```bash
python viz_active_vision_dataset_main.py \
  --mode=human \
  --gin_config=envs/configs/active_vision_config.gin \
  --gin_params="ActiveVisionDatasetEnv.dataset_root=$AVD_DIR"
```
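The `--gin_params` flag overrides bindings from the file passed via `--gin_config`, so you can alternatively set the dataset root inside `envs/configs/active_vision_config.gin`. An illustrative excerpt showing the gin binding syntax (the real config contains more settings):

```
# Bind the dataset_root parameter of ActiveVisionDatasetEnv (path is illustrative).
ActiveVisionDatasetEnv.dataset_root = '/data/AVD_Minimal'
```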
## 3. Training
The released version currently supports training and inference only on real data from the Active Vision Dataset.
### Run Training
Use the following command for training:
```bash
# Train
python train_supervised_active_vision.py \
  --mode='train' \
  --logdir=$CHECKPOINT_DIR \
  --modality_types='det' \
  --batch_size=8 \
  --train_iters=200000 \
  --lstm_cell_size=2048 \
  --policy_fc_size=2048 \
  --sequence_length=20 \
  --max_eval_episode_length=100 \
  --test_iters=194 \
  --gin_config=envs/configs/active_vision_config.gin \
  --gin_params="ActiveVisionDatasetEnv.dataset_root=$AVD_DIR" \
  --logtostderr
```
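`$CHECKPOINT_DIR` can be any writable directory; checkpoints and event files are written there during training. For example (path is illustrative):

```bash
export CHECKPOINT_DIR=$HOME/semantic_nav_checkpoints
mkdir -p "$CHECKPOINT_DIR"
```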
### Run Evaluation

Use the following command to unroll the policy on the eval environments. The inference code periodically checks the checkpoint folder for new checkpoints and uses each one to unroll the policy on the eval environments. After each evaluation, it creates a folder `$CHECKPOINT_DIR/evals/$ITER`, where `$ITER` is the iteration number at which the checkpoint was stored.
```bash
# Eval
python train_supervised_active_vision.py \
  --mode='eval' \
  --logdir=$CHECKPOINT_DIR \
  --modality_types='det' \
  --batch_size=8 \
  --train_iters=200000 \
  --lstm_cell_size=2048 \
  --policy_fc_size=2048 \
  --sequence_length=20 \
  --max_eval_episode_length=100 \
  --test_iters=194 \
  --gin_config=envs/configs/active_vision_config.gin \
  --gin_params="ActiveVisionDatasetEnv.dataset_root=$AVD_DIR" \
  --logtostderr
```
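After a few evaluation rounds, the checkpoint directory contains one subfolder per evaluated checkpoint, roughly like the sketch below (iteration numbers and file names are illustrative):

```
$CHECKPOINT_DIR/
├── model.ckpt-*            # training checkpoints
├── events.out.tfevents.*   # event files
└── evals/
    ├── 20000/              # evaluation outputs for the checkpoint at iteration 20000
    └── 40000/
```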
At any point, you can run the following command to compute statistics, such as the success rate, over all evaluations so far. It also generates GIF images of the best policy's rollouts.
```bash
# Visualize and Compute Stats
python viz_active_vision_dataset_main.py \
  --mode=eval \
  --eval_folder=$CHECKPOINT_DIR/evals/ \
  --output_folder=$OUTPUT_GIFS_FOLDER \
  --gin_config=envs/configs/active_vision_config.gin \
  --gin_params="ActiveVisionDatasetEnv.dataset_root=$AVD_DIR"
```
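`$OUTPUT_GIFS_FOLDER` is likewise any writable directory of your choice; for example (path is illustrative):

```bash
export OUTPUT_GIFS_FOLDER=$HOME/semantic_nav_gifs
mkdir -p "$OUTPUT_GIFS_FOLDER"
```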
## Reference

If you find our work useful in your research, please consider citing our paper:
```bibtex
@inproceedings{MousavianArxiv18,
  author    = {A. Mousavian and A. Toshev and M. Fiser and J. Kosecka and J. Davidson},
  title     = {Visual Representations for Semantic Target Driven Navigation},
  booktitle = {arXiv:1805.06066},
  year      = {2018},
}
```