Pose-based Modular Network for Human-Object Interaction Detection
Official PyTorch implementation of "Pose-based Modular Network for Human-Object Interaction Detection".
Code Overview
In this project, we implement our method on top of VS-GATs, and the code structure closely follows it. Please refer to the VS-GATs repository for a description of each file.
Getting Started
Prerequisites
This codebase was tested with Python 3.6, PyTorch 1.1.0, torchvision 0.3, and CUDA 10.0 on Ubuntu 16.04.
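To confirm your environment matches these versions, a quick check like the following can help (a convenience sketch only, not part of the codebase):

```python
# Environment sanity check against the tested versions listed above.
import sys

import torch
import torchvision

print('Python      :', sys.version.split()[0])   # expected 3.6.x
print('PyTorch     :', torch.__version__)        # expected 1.1.0
print('torchvision :', torchvision.__version__)  # expected 0.3.x
print('CUDA build  :', torch.version.cuda)       # expected 10.0
print('GPU visible :', torch.cuda.is_available())
```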
Installation
- Clone this repository:
git clone https://github.com/birlrobotics/PMN.git
- Install Python dependencies:
pip install -r requirements.txt
Prepare Data
Download Original Data (Optional)
- Download the original HICO-DET dataset and put it into datasets/hico.
- Follow here to prepare the original data of the V-COCO dataset in the datasets/vcoco folder.
- (For VS-GATs) Download the word2vec model pretrained on GoogleNews and put it into ./datasets/word2vec (a loading sketch follows this list).
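To sanity-check the downloaded embeddings, they can be loaded with gensim roughly as below. This is only a sketch: the file name GoogleNews-vectors-negative300.bin is the usual name of the public release, and the gensim dependency is an assumption, not a requirement of this repo.

```python
# Hedged sketch: load the GoogleNews word2vec embeddings and look up a word.
from gensim.models import KeyedVectors

w2v_path = 'datasets/word2vec/GoogleNews-vectors-negative300.bin'  # assumed file name
word2vec = KeyedVectors.load_word2vec_format(w2v_path, binary=True)

vec = word2vec['person']  # 300-dimensional embedding for the word "person"
print(vec.shape)          # (300,)
```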
Download the Processed Data
- Download our processed data for HICO-DET and V-COCO and put them into datasets/processed with the original file names.
Download the Pretrained Model of VS-GATs
- In our method, we build our module on top of VS-GATs, which is kept fixed during training. Download the pretrained VS-GATs models for HICO-DET and V-COCO and put them into ./checkpoints with the original file names.
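To verify that a downloaded checkpoint is readable, it can be inspected with plain PyTorch. The file name below is a placeholder, not the actual name; keep the original file name from the download.

```python
# Hedged sketch: peek inside a downloaded checkpoint file.
import torch

ckpt_path = 'checkpoints/vsgats_hico.pth'  # placeholder; use the original file name
checkpoint = torch.load(ckpt_path, map_location='cpu')

# A PyTorch checkpoint is typically a dict (e.g. a state_dict or a dict containing one).
if isinstance(checkpoint, dict):
    for key in list(checkpoint.keys())[:10]:
        print(key)
```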
Training
- On the HICO-DET dataset:
python hico_train.py --exp_ver='hico_pmn' --b_s=32 --d_p=0.2 --bn='true' --n_layers=1 --b_l 0 3 --lr=3e-5
- Similarly, for the V-COCO dataset:
python vcoco_train.py --exp_ver='vcoco_pmn' --b_s=32 --d_p=0.2 --bn='true' --n_layers=1 --b_l 0 3 --o_c_l 64 64 64 64 --lr=3e-5
- You can visualize the training process through TensorBoard (see the logging sketch after this list):
tensorboard --logdir='log/'
- Checkpoints will be saved in the checkpoints/ folder.
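For reference, the curves TensorBoard displays come from event files written under log/ during training. The sketch below shows the usual logging pattern with torch.utils.tensorboard; the tag names and the writer backend are assumptions about the training scripts, which may log differently (e.g. via tensorboardX).

```python
# Hedged sketch of how training scalars typically end up under log/ for TensorBoard.
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir='log/hico_pmn')  # hypothetical experiment sub-folder
for step in range(100):
    loss = 1.0 / (step + 1)                     # dummy value standing in for the training loss
    writer.add_scalar('train/loss', loss, step)
writer.close()
# Then launch: tensorboard --logdir='log/'
```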
Testing
- Run the following script, where 'final_ver' is the experiment name and 'path_to_the_checkpoint_file' is the path to the checkpoint file. (You can use the provided HICO-DET and V-COCO checkpoints to reproduce the detection results in our paper.)
bash hico_eval.sh 'final_ver' 'path_to_the_checkpoint_file'
- For the V-COCO dataset, you first need to overwrite the original ./datasets/vcoco/vsrl_eval.py with the new one in ./result/vsrl_eval.py, because we add some code there to save the detection results (a copy snippet follows this list). Then run:
python vcoco_eval.py -p='path_to_the_checkpoint_file'
- Results will be saved in the result/ folder.
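The vsrl_eval.py replacement mentioned above can also be done with a one-off snippet, equivalent to copying the file by hand:

```python
# Overwrite the original V-COCO evaluation script with the modified one from result/,
# as described in the testing steps above.
import shutil

shutil.copyfile('./result/vsrl_eval.py', './datasets/vcoco/vsrl_eval.py')
print('vsrl_eval.py replaced')
```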
Results
- Please check the paper for the quantitative results; several qualitative detection results are shown below.
Acknowledgement
In this project, some of the code for processing the data and evaluating the model is built upon VS-GATs: Visual-Semantic Graph Attention Networks for Human-Object Interaction Detection, ECCV 2018 - Learning Human-Object Interactions by Graph Parsing Neural Networks, and ICCV 2019 - No-Frills Human-Object Interaction Detection: Factorization, Layout Encodings, and Training Techniques. We thank the authors for their great work.