# HANNA: Visual Navigation with Natural Multimodal Assistance
<img src="teaser/pytorch-logo-dark.png" width="10%">
EMNLP'19 Paper: Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning
Authors: Khanh Nguyen, Hal Daumé III
UPDATE Oct 15, 2019: fixed a bug in the validation code that prevented the code from reproducing the results in the paper.
## What is HANNA?
HANNA is an interactive, photo-realistic simulator in which an agent fulfills object-finding tasks by leveraging natural language-and-vision assistance.
<p align="center"> <img src="teaser/example.png" alt="IMAGE ALT TEXT HERE" width="1000" border="10"/> An example HANNA task. </p>How is HANNA different from other visual navigation tasks?
<p align="center"> <img src="teaser/difference.png" alt="IMAGE ALT TEXT HERE" width="500" border="10"/>Comparing HANNA with VLN <a href="https://arxiv.org/abs/1711.07280">(Anderson et al., 2018b)</a>, EQA <a href="https://arxiv.org/abs/1904.03461">(Wijmans et al., 2019)</a>, VNLA <a href="https://arxiv.org/abs/1812.04155">(Nguyen et al., 2019)</a>, CVDN <a href="https://arxiv.org/abs/1907.04957">(Thomason et al., 2019)</a>.
</p>Let's play with HANNA!
Clone this repo with `git clone --recursive https://github.com/khanhptnk/hanna.git` (don't forget the `--recursive` flag!), then follow the steps below (a rough end-to-end sketch comes after the list):

- Download data.
- Setup simulator.
- Run experiments.
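
Below is a minimal, hypothetical sketch of what the end-to-end workflow looks like. The script names (`download_data.sh`, `scripts/train_main.sh`) and build commands are illustrative placeholders, not the repo's actual entry points; follow the steps above for the real commands.

```bash
# Hypothetical end-to-end sketch; script names are placeholders,
# not the repo's actual entry points.

# Clone the repo together with its simulator submodule.
git clone --recursive https://github.com/khanhptnk/hanna.git
cd hanna

# Download the dataset and pre-computed image features (placeholder script).
bash download_data.sh

# Build the photo-realistic simulator submodule (placeholder commands).
mkdir -p build && cd build && cmake .. && make && cd ..

# Train and evaluate an agent (placeholder script).
bash scripts/train_main.sh
```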
## Citation
If you use the code or data in this repo, please cite our paper with the following BibTeX entry:
```bibtex
@inproceedings{nguyen2019hanna,
  author    = {Nguyen, Khanh and Daum{\'e} III, Hal},
  title     = {Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning},
  booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  month     = {November},
  year      = {2019},
}
```