Exploiting Spatial-temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks
This is the PyTorch code for the ICCV 2019 paper "Exploiting Spatial-temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks".
Dependencies
- CUDA 9.0
- Python 3.6
- PyTorch 0.4.1
Dataset setup
CPN 2D detections for the Human3.6M dataset are provided by VideoPose3D (Pavllo et al.) and can be downloaded with:
cd data
wget https://dl.fbaipublicfiles.com/video-pose-3d/data_2d_h36m_cpn_ft_h36m_dbb.npz
wget https://dl.fbaipublicfiles.com/video-pose-3d/data_2d_h36m_detectron_ft_h36m.npz
cd ..
3D ground-truth labels can be downloaded from the 3d gt labels link and placed in the data/ folder.
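As a quick sanity check of the downloaded 2D detections, you can inspect the .npz file. This is a minimal sketch assuming the VideoPose3D file layout (keys 'positions_2d' and 'metadata'); adjust if the format differs:

import numpy as np

# Load the CPN detections; allow_pickle is needed because the file stores a dict
# wrapped in an object array.
data = np.load('data/data_2d_h36m_cpn_ft_h36m_dbb.npz', allow_pickle=True)
print(data.files)  # expected: ['positions_2d', 'metadata']

# positions_2d maps subject -> action -> list of per-camera arrays
# of shape (frames, joints, 2).
keypoints = data['positions_2d'].item()
for subject, actions in keypoints.items():
    print(subject, len(actions), 'actions')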
Download pretrained models
Pretrained models can be found at pretrained models; please download them and put them into the ckpt/ directory (create it if it does not exist).
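To verify a downloaded checkpoint, you can peek at it with standard PyTorch loading. A minimal sketch, assuming each .pth file is a plain state_dict saved with torch.save (the actual model classes live in this repo's code):

import torch

# Load one of the downloaded checkpoints on the CPU and list a few parameter names.
ckpt_path = 'ckpt/1_frame/cpn/model_st_gcn_36_eva_post_5062.pth'
state_dict = torch.load(ckpt_path, map_location='cpu')
print(len(state_dict), 'tensors')
print(sorted(state_dict.keys())[:5])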
Test the Model
To test on Human3.6M with a single input frame, run:
python main_graph.py --pad 0 --show_protocol2 --post_refine --stgcn_reload 1 --post_refine_reload 1 --previous_dir '/ckpt/1_frame/cpn/' --stgcn_model 'model_st_gcn_36_eva_post_5062.pth' --post_refine_model 'model_post_refine_36_eva_post_5062.pth'
To test on Human3.6M with 3 input frames, run:
python main_graph.py --pad 1 --show_protocol2 --post_refine --stgcn_reload 1 --post_refine_reload 1 --previous_dir '/ckpt/3_frame/cpn/' --stgcn_model 'model_st_gcn_58_eva_post_4903.pth' --post_refine_model 'model_post_refine_58_eva_post_4903.pth'
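Note that the two test commands differ mainly in --pad. Judging from the usage above, --pad P appears to control the temporal window, giving 2P + 1 consecutive input frames; this mapping is inferred from the examples, not a documented contract:

# Hypothetical illustration of the --pad argument (inferred: window = 2*pad + 1).
def receptive_frames(pad: int) -> int:
    return 2 * pad + 1

assert receptive_frames(0) == 1  # --pad 0: single-frame testing
assert receptive_frames(1) == 3  # --pad 1: 3-frame testing/training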
Train the Model
To train on Human3.6M with 3 frames, run:
python main_graph.py --pad 1 --pro_train 1 --save_model 1
After training for several epochs, add the post-refine part:
python main_graph.py --pad 1 --pro_train 1 --post_refine --save_model 1 --learning_rate 1e-5 --sym_penalty 1 --co_diff 1 --stgcn_reload 1 --previous_dir [your model saved path] --stgcn_model [your pretrained model]
Citation
@inproceedings{cai2019exploiting,
  title={Exploiting spatial-temporal relationships for 3d pose estimation via graph convolutional networks},
  author={Cai, Yujun and Ge, Liuhao and Liu, Jun and Cai, Jianfei and Cham, Tat-Jen and Yuan, Junsong and Thalmann, Nadia Magnenat},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={2272--2281},
  year={2019}
}
Acknowledgements
Some of our implementation code and preprocessed data was adapted from VideoPose3D by Pavllo et al., st-gcn by Yan et al., the simple-yet-effective baseline by Martinez et al., and Non-local Neural Networks. Thanks for their help!
License
MIT