Awesome
PREVALENT_R2R
Apply PREVALENT pretrained model on R2R task. I am clearing the redundant code and classes. But you should still be able to run the code now.
Requirements
OS: Ubuntu docker image: vlnres/mattersim:v5
1 install docker and set nvidia-docker
-
To install docker please check here
-
To setup docker to use GPU run:
sh nvidia-container-runtime-script.sh
2 create container
- To pull the image:
docker pull vlnres/mattersim:v5
- To create the container:
docker run -it --gpus 1 --volume "your_work_directory":/root/mount/Matterport3DSimulator vlnres/mattersim:v5
3 Set up (for some missing dependencies)
docker start “your container id or container name”
docker exec -it “your container id or container name” /bin/bash
cd /root/mount/Matterport3DSimulator
pip install --user pytorch-transformers==1.2.0
pip install --user tensorboardX
Train Agent for R2R
- 1 We follow the same training schedule as here. You can train your own speaker and initial back translation agent. Alternatively, you can use provided speaker and initial agent.
- 2 Make sure you already put pretrained_model under
./pretrained_hug_models/dicadd
, initial agent under./previous_btbert_agent
and trained speaker undersnap/speaker/
- 3 Run the following example command (change the directory name accordingly)
CUDA_VISIBLE_DEVICES=0 python r2r_src/train.py --attn soft --train auglistener --selfTrain --aug tasks/R2R/data/aug_paths.json --speaker snap/speaker/state_dict/best_val_unseen_bleu --load previous_btbert_agent/temp/best_val_unseen --pretrain_model_name ./pretrained_hug_models/dicadd/checkpoint-12864 --angleFeatSize 128 --accumulateGrad --featdropout 0.4 --feedback sample --subout max --optim rms --lr 0.00002 --iters 100000 --maxAction 35 --encoderType Dic --batchSize 20 --include_vision True --use_dropout_vision True --d_enc_hidden_size 1024 --critic_dim 1024 --name cvpr_agent
You can also start fine-tuning based on previous snapshot by following command. Based on our observation, continue training on previous snapshot and reduce learning rate correspondingly would be helpful.
CUDA_VISIBLE_DEVICES=0 python r2r_src/train.py --attn soft --train auglistener --selfTrain --aug tasks/R2R/data/aug_paths.json --speaker snap/speaker/state_dict/best_val_unseen_bleu --load previous_btbert_agent/temp/best_val_unseen --pretrain_model_name ./pretrained_hug_models/dicadd/checkpoint-12864 --angleFeatSize 128 --accumulateGrad --featdropout 0.4 --feedback sample --subout max --optim rms --lr 0.000002 --iters 100000 --maxAction 35 --encoderType Dic --batchSize 20 --include_vision True --use_dropout_vision True --d_enc_hidden_size 1024 --critic_dim 1024 --d_update_add_layer True --name finetune_cvpr_agent
Note: if you come with the cudnn error: CUDNN_STATUS_EXECUTION_FAILED, uninstall torchvision and reinstall torchvision=0.3.0 , eg:
conda uninstall torchvision
conda install torchvision=0.3.0
Train Agent for NDH
python tasks/NDH/ndhtrain.py --path_type player_path --history all --feedback 'sample' --encoder_type 'vlbert’ --eval_type 'val' --batch_size 5 --pretrain_model_name ./pretrained_hug_models/dicadd/checkpoint-12864 --learning_rate 0.0005 --n_iters 20000 --vl_layers 4 --la_layers 9