Image-Play
This repo, together with skeleton2d3d and pose-hg-train (branch image-play), holds the code for reproducing the results in the following paper:
Forecasting Human Dynamics from Static Images
Yu-Wei Chao, Jimei Yang, Brian Price, Scott Cohen, Jia Deng
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
Check out the project site for more details.
Role
- The main content of this repo is for implementing training step 3 (Sec. 3.3), i.e. training the full 3D-PFNet (hourglass + RNNs + 3D skeleton converter). A conceptual sketch of this pipeline follows the list below.
- For the implementation of training step 1, please refer to the submodule pose-hg-train (branch image-play).
- For the implementation of training step 2, please refer to the submodule skeleton2d3d.
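For orientation, the full pipeline can be read as a single data flow: the hourglass encodes the input frame, the RNNs unroll it into per-joint heatmaps for future time steps, and the skeleton converter lifts each forecasted 2D pose to 3D. The sketch below is only a conceptual illustration of that flow, not the repo's Torch7 code; hourglass, rnn_step, skel2d3d, and num_steps are placeholder names and an illustrative horizon.

```python
# Conceptual sketch of the 3D-PFNet data flow (illustration only; the actual
# model is defined by this repo's Torch7 code). The three callables stand in
# for the modules produced by training steps 1-3; num_steps is illustrative.
def forecast_3d(image, hourglass, rnn_step, skel2d3d, num_steps=16):
    feats = hourglass(image)                      # module from training step 1: encode the input frame
    state, poses_3d = None, []
    for _ in range(num_steps):
        heatmaps, state = rnn_step(feats, state)  # module from training step 3: recurrent 2D forecast
        poses_3d.append(skel2d3d(heatmaps))       # module from training step 2: lift the 2D pose to 3D
    return poses_3d
```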
Citing Image-Play
Please cite Image-Play if it helps your research:
@INPROCEEDINGS{chao:cvpr2017,
author = {Yu-Wei Chao and Jimei Yang and Brian Price and Scott Cohen and Jia Deng},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
title = {Forecasting Human Dynamics from Static Images},
year = {2017},
}
Clone the Repository
This repo contains three submodules (pose-hg-train, skeleton2d3d, and Deep3DPose), so make sure you clone with --recursive:
git clone --recursive https://github.com/ywchao/image-play.git
Contents
- Download Pre-Computed Models and Prediction
- Dependencies
- Setting Up Penn Action
- Training to Forecast 2D Pose
- Training to Forecast 3D Pose
- Comparison with NN Baselines
- Evaluation
- Human Character Rendering
Download Pre-Computed Models and Prediction
If you just want to run prediction or evaluation, you can simply download the pre-computed models and prediction (2.4G) and skip the training sections.
./scripts/fetch_implay_models_prediction.sh
./scripts/setup_symlinks_models.sh
This will populate the exp folder with precomputed_implay_models_prediction and set up a set of symlinks.
You can now set up Penn Action and run the evaluation demo with the downloaded prediction. This will ensure exact reproduction of the paper's results.
Dependencies
Before proceeding with the remaining sections, make sure the following are installed.
- Torch7
- We used commit bd5e664 (2016-10-17) with CUDA 8.0.27 RC and cuDNN v5.1 (cudnn-8.0-linux-x64-v5.1).
- All our models were trained on a GeForce GTX TITAN X GPU.
- matio-ffi
- torch-hdf5
- MATLAB
- Blender
- This is only required for human character rendering.
- We used release blender-2.78a-linux-glibc211-x86_64 (2016-10-26).
Setting Up Penn Action
The Penn Action dataset is used for training and evaluation.
- Download the Penn Action dataset to external. external should contain Penn_Action.tar.gz. Extract the files:

      tar zxvf external/Penn_Action.tar.gz -C external

  This will populate the external folder with a folder Penn_Action containing frames, labels, tools, and README.

- Preprocess Penn Action by cropping the images:

      matlab -r "prepare_penn_crop; quit"

  This will populate the data/penn-crop folder with frames and labels.

- Generate the validation set and preprocess the annotations:

      matlab -r "generate_valid_penn; quit"
      python tools/preprocess.py

  This will populate the data/penn-crop folder with valid_ind.txt, train.h5, val.h5, and test.h5.

- Optional: Visualize statistics:

      matlab -r "vis_data_stats; quit"

  The output will be saved in output/vis_dataset.

- Optional: Visualize annotations:

      matlab -r "vis_data_anno; quit"

  The output will be saved in output/vis_dataset.

- Optional: Visualize frame skipping. As mentioned in the paper (Sec. 4.1), we generated training and evaluation sequences by skipping frames; a rough sketch of the idea follows this list. The following MATLAB script visualizes a subset of the generated sequences after frame skipping:

      matlab -r "vis_action_phase; quit"

  The output will be saved in output/vis_action_phase.
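The exact sampling rules live in the preprocessing and data-loading code; purely as an illustration of the frame-skipping idea (the sequence length and skip below are assumed example values, not settings taken from the paper), one way to enumerate such sequences is:

```python
# Illustration of generating fixed-length sequences by skipping frames.
# seq_len and skip are assumed example values, not the repo's settings.
def skipped_sequences(num_frames, seq_len=16, skip=3):
    """Yield lists of frame indices; each sequence keeps every `skip`-th
    frame and contains `seq_len` frames."""
    span = (seq_len - 1) * skip                 # frames covered by one sequence
    for start in range(max(num_frames - span, 0)):
        yield [start + i * skip for i in range(seq_len)]

# Example: a 20-frame clip with seq_len=5, skip=3 yields [0, 3, 6, 9, 12], ...
for seq in skipped_sequences(20, seq_len=5, skip=3):
    print(seq)
```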
Training to Forecast 2D Pose
We begin by training a minimal model (hourglass + RNNs) that does just 2D pose forecasting.
- Before starting, make sure to remove the symlinks from the download section, if any:

      find exp -type l -delete

- Obtain a trained hourglass model. This is done with the submodule pose-hg-train.

  Option 1 (recommended): Download the pre-computed hourglass models (50M):

      cd pose-hg-train
      ./scripts/fetch_hg_models.sh
      ./scripts/setup_symlinks_models.sh
      cd ..

  This will populate the pose-hg-train/exp folder with precomputed_hg_models and set up a set of symlinks.

  Option 2: Train your own models.

- Start training:

      ./scripts/penn-crop/hg-256-res-clstm.sh $GPU_ID

  The output will be saved in exp/penn-crop/hg-256-res-clstm.

- Optional: Visualize training loss and accuracy:

      matlab -r "exp_name = 'hg-256-res-clstm'; plot_loss_err_acc; quit"

  The output will be saved to output/plot_hg-256-res-clstm.pdf.

- Optional: Visualize prediction on a subset of the test set (the 2D joints are read off the predicted heatmaps; see the sketch after this list):

      matlab -r "vis_preds_2d; quit"

  The output will be saved in output/vis_hg-256-res-clstm.
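The hourglass and, in this model, the recurrent forecaster output per-joint heatmaps, and the visualized 2D joints are typically read off as the peak of each map. The snippet below is a generic NumPy version of that decoding step, shown only for intuition; the repo's own decoding is done in its Torch7/MATLAB code.

```python
# Generic heatmap-to-keypoint decoding, for intuition only (the repo's own
# decoding lives in its Torch7/MATLAB code).
import numpy as np

def heatmaps_to_joints(heatmaps):
    """heatmaps: (K, H, W) array of per-joint score maps.
    Returns a (K, 2) array of (x, y) peak locations."""
    K, H, W = heatmaps.shape
    flat = heatmaps.reshape(K, -1).argmax(axis=1)   # index of the peak per joint
    ys, xs = np.unravel_index(flat, (H, W))
    return np.stack([xs, ys], axis=1)
```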
Training to Forecast 3D Pose
Now we train the full 3D-PFNet (hourglass + RNNs + 3D skeleton converter), which also converts each 2D pose into 3D.
- Obtain a trained hourglass model if you have not already (see the section above).

- Obtain a trained 3D skeleton converter. This is done with the submodule skeleton2d3d.

  Option 1 (recommended): Download the pre-computed s2d3d models (108M):

      cd skeleton2d3d
      ./scripts/fetch_s2d3d_models_prediction.sh
      ./scripts/setup_symlinks_models.sh
      cd ..

  This will populate the skeleton2d3d/exp folder with precomputed_s2d3d_models_prediction and set up a set of symlinks.

- Start training:

      ./scripts/penn-crop/hg-256-res-clstm-res-64.sh $GPU_ID

  The output will be saved in exp/penn-crop/hg-256-res-clstm-res-64.

- Optional: Visualize training loss and accuracy:

      matlab -r "exp_name = 'hg-256-res-clstm-res-64'; plot_loss_err_acc; quit"

  The output will be saved to output/plot_hg-256-res-clstm-res-64.pdf.

- Optional: Visualize prediction on a subset of the test set. Here we leverage Human3.6M's 3D pose visualization routine.

  First, download the Human3.6M dataset code:

      cd skeleton2d3d
      ./h36m_utils/fetch_h36m_code.sh
      cd ..

  This will populate the skeleton2d3d/h36m_utils folder with Release-v1.1.

  Then run the visualization script:

      matlab -r "vis_preds_3d; quit"

  If you run this for the first time, the script will ask you to set two paths. Set the data path to skeleton2d3d/external/Human3.6M and the config file directory to skeleton2d3d/h36m_utils/Release-v1.1. This will create a new file H36M.conf under image-play.

  The output will be saved in output/vis_hg-256-res-clstm-res-64.
Comparison with NN Baselines
This demo reproduces the nearest neighbor (NN) baselines reported in the paper (Sec. 4.1).
- Obtain a trained hourglass model if you have not already (see the section above).

- Run pose estimation on the input images:

      ./scripts/penn-crop/hg-256.sh $GPU_ID

  The output will be saved in exp/penn-crop/hg-256.

- Run the NN baselines (a conceptual sketch of the pose-transfer idea follows this list):

      matlab -r "nn_run; quit"

  The output will be saved in exp/penn-crop/nn-all-th09 and exp/penn-crop/nn-oracle-th09.

- Optional: Visualize prediction on a subset of the test set:

      matlab -r "nn_vis; quit"

  The output will be saved in output/vis_nn-all-th09 and output/vis_nn-oracle-th09.
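Conceptually, the NN baselines estimate the 2D pose of the input image (previous step), retrieve the training sequence whose initial pose is most similar, and transfer that sequence's future poses as the forecast. The sketch below shows only this basic retrieval idea in NumPy; pose normalization, the similarity threshold reflected in the th09 folder names, and the oracle variant are all handled by the repo's MATLAB code (nn_run), not here.

```python
# Conceptual sketch of nearest-neighbour pose transfer (illustration only;
# normalization, thresholding, and the oracle variant live in nn_run.m).
import numpy as np

def nn_forecast(query_pose, train_first_poses, train_sequences):
    """query_pose:        (K, 2) estimated pose of the input image
    train_first_poses: (N, K, 2) first pose of each training sequence
    train_sequences:   (N, T, K, 2) full training pose sequences
    Returns the (T, K, 2) sequence whose first pose is closest to the query."""
    dists = np.linalg.norm(train_first_poses - query_pose, axis=2).mean(axis=1)
    return train_sequences[np.argmin(dists)]

# Toy usage with random data (13 joints, as in Penn Action).
rng = np.random.default_rng(0)
pred = nn_forecast(rng.random((13, 2)),
                   rng.random((100, 13, 2)),
                   rng.random((100, 16, 13, 2)))
print(pred.shape)  # (16, 13, 2)
```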
Evaluation
This demo runs the MATLAB evaluation script and reproduces our results in the paper (Tab. 1 and Fig. 7). If you are using the pre-computed prediction and also want to evaluate the NN baselines, make sure to first run the NN baseline step (nn_run) in the last section.
Compute Percentage of Correct Keypoints (PCK):
matlab -r "eval_run; quit"
This will print out the PCK values with threshold 0.05 (PCK@0.05) and also show the PCK curves.
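For reference, PCK counts a predicted keypoint as correct when it lies within a fraction (here 0.05) of a reference scale from the ground-truth location, computed over visible joints. The NumPy sketch below illustrates the metric under the assumption that the reference scale is the person bounding-box size; the exact normalization behind the paper's numbers is defined by the MATLAB evaluation code (eval_run), not this sketch.

```python
# Illustrative PCK computation (the paper's numbers come from the MATLAB
# evaluation code; the bounding-box reference scale here is an assumption).
import numpy as np

def pck(pred, gt, visible, ref_scale, alpha=0.05):
    """Fraction of visible joints whose prediction is within
    alpha * ref_scale pixels of the ground truth."""
    dist = np.linalg.norm(pred - gt, axis=1)    # per-joint pixel error, shape (K,)
    correct = dist[visible] <= alpha * ref_scale
    return correct.mean() if correct.size else np.nan

# Toy example: two of three joints visible, reference scale of 200 px.
pred = np.array([[10.0, 12.0], [50.0, 48.0], [0.0, 0.0]])
gt   = np.array([[11.0, 13.0], [70.0, 40.0], [5.0, 5.0]])
print(pck(pred, gt, np.array([True, True, False]), ref_scale=200.0))  # 0.5
```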
Human Character Rendering
Finally, we show how we rendered human characters from the forecasted 3D skeletal poses using the method developed by Chen et al. [4]. This relies on the submodule Deep3DPose.
- Obtain forecasted 3D poses by either downloading the pre-computed prediction or generating your own.

- Set the Blender path. Edit the following line in tools/render_scape.m (an optional check that the binary is callable follows this list):

      blender_path = '$BLENDER_PATH/blender-2.78a-linux-glibc211-x86_64/blender';

- Run rendering. We provide demos for rendering both without and with textures.

  Render body shape without textures:

      matlab -r "texture = 0; render_scape; quit"

  Render body shape with textures:

      matlab -r "texture = 1; render_scape; quit"

  The output will be saved in output/render_hg-256-res-clstm-res-64.
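Optionally, before launching the rendering, you can confirm that the Blender binary configured above is actually callable. This small check is not part of the repo; it simply invokes the same path set in tools/render_scape.m with Blender's --version flag.

```python
# Optional sanity check (not part of the repo): verify the configured
# Blender binary runs. Replace the path with the one set in render_scape.m.
import subprocess

blender = "$BLENDER_PATH/blender-2.78a-linux-glibc211-x86_64/blender"
out = subprocess.run([blender, "--version"], capture_output=True, text=True)
print(out.stdout or out.stderr)
```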