HOP-VLN-finetune
This repository contains the fine-tuning code for HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation. The code is based on Recurrent-VLN-BERT. Thanks to Yicong Hong for releasing the Recurrent-VLN-BERT code.
Prerequisites
Installation
- Install Docker. Please check here for instructions on installing Docker.
- Create container
To pull the image:
docker pull starrychiao/vlnbert-2022-3090:1.0
or, if your CUDA version is 11.3:
docker pull starrychiao/hop-recurrent:v1
To create the container (--gpus all exposes all GPUs; --volume mounts your working directory into the container):
docker run -it --ipc host --shm-size=1024m --gpus all --name your_name --volume "your_directory":/root/mount/Matterport3DSimulator starrychiao/vlnbert-2022-3090:1.0
or, if you pulled the CUDA 11.3 image:
docker run -it --ipc host --shm-size=1024m --gpus all --name your_name --volume "your_directory":/root/mount/Matterport3DSimulator starrychiao/hop-recurrent:v1
- Set up (a quick GPU sanity check is sketched after this list):
docker start "your container id or name"
docker exec -it "your container id or name" /bin/bash
cd /root/mount/Matterport3DSimulator
- Download the trained models.
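Before preparing data, it can help to confirm that the container actually sees your GPUs. A minimal sanity check to run inside the container, assuming PyTorch is installed in the image:

```bash
# Inside the container: list the GPUs visible to the container
nvidia-smi
# Confirm PyTorch (assumed to ship with the image) can reach CUDA
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"
```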
R2R
cd finetune_r2r
Data Preparation
Please follow the instructions below to prepare the data in directories:
- MP3D navigability graphs:
connectivity
- Download the connectivity maps.
- MP3D image features:
img_features
- Download the scene features (ResNet-152-Places365).
- R2R data:
data
- Download the R2R data [5.8MB].
- Augmented data:
data/prevalent
- Download the collected triplets in PREVALENT [1.5GB] (pre-processed for easy use).
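Once everything is downloaded, finetune_r2r should look roughly like the sketch below. This tree is an assumption pieced together from the directories named above; exact file names inside each folder depend on the downloads.

```
finetune_r2r/
├── connectivity/        # MP3D navigability graphs
├── img_features/        # scene features (ResNet-152-Places365)
├── data/                # R2R data
│   └── prevalent/       # PREVALENT augmented triplets
└── load_model/
    └── checkpoint/      # pre-trained HOP weights (next step)
```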
Initial HOP weights
- Pre-trained HOP weights:
load_model/checkpoint
- Download the pytorch_model.bin from here.
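A minimal way to put the weights in place, assuming pytorch_model.bin was downloaded into finetune_r2r:

```bash
# Create the checkpoint directory the training scripts expect,
# then move the downloaded HOP weights into it
mkdir -p load_model/checkpoint
mv pytorch_model.bin load_model/checkpoint/
```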
Training
bash run/train_agent.bash
Evaluating
bash run/test_agent.bash
NDH
cd finetune_ndh
Data Preparation
Please follow the instructions below to prepare the data in directories:
- MP3D navigability graphs:
connectivity
- Download the connectivity maps.
- MP3D image features:
img_features
- Download the scene features (ResNet-152-Places365).
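As with R2R, finetune_ndh should end up looking roughly like the sketch below (an assumption based on the directories named above):

```
finetune_ndh/
├── connectivity/     # MP3D navigability graphs
├── img_features/     # scene features (ResNet-152-Places365)
└── load/
    └── model/        # pre-trained HOP weights for NDH (next step)
```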
Initial HOP weights
- Pre-trained HOP weights for NDH:
load/model
- Download the pytorch_model.bin from here.
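Again, a minimal sketch for placing the weights, assuming pytorch_model.bin was downloaded into finetune_ndh:

```bash
# The NDH scripts look for the weights under load/model
mkdir -p load/model
mv pytorch_model.bin load/model/
```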
Training
bash run/train.bash
Evaluating
bash run/test.bash
Citation
If you use or discuss HOP, please cite our paper:
@InProceedings{Qiao2022HOP,
author    = {Qiao, Yanyuan and Qi, Yuankai and Hong, Yicong and Yu, Zheng and Wang, Peng and Wu, Qi},
title = {HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2022},
pages = {15418-15427}
}