[CVPR 2021] PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers

<img src="src/imgs/Overview.jpg" style="zoom:20%;" />

[Paper]

PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers <br/> Frank Yu, Mathieu Salzmann, Pascal Fua, and Helge Rhodin <br/> In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

PyTorch implementation for removing perspective distortions from images or 2D poses using Perspective Crop Layers (PCLs) to improve the accuracy of 3D human pose estimation techniques. Shown below are examples of this perspective distortion and its correction using PCL, on images from the Human3.6M (left) and MPI-INF-3DHP (right) datasets.

Setup

Prerequisites

To get started, please run the following commands:

conda create -n pcl python=3.6.2
conda activate pcl
conda install --file requirements.txt
conda install -c anaconda ipykernel
python -m ipykernel install --user --name=pcl
conda install pytorch=1.6.0 torchvision=0.7.0 cudatoolkit=10.2 -c pytorch

Running the Demos and Using PCL

We have included two Jupyter notebook demos for trying out PCLs: a general demo, pcl_demo.ipynb (RECOMMENDED), which requires no pretrained models and walks step by step through how to use PCL, and a human pose estimation demo, humanPose-demo.ipynb, which runs on extracted samples from Human3.6M and MPI-INF-3DHP and requires a pretrained model and additional data.

Using PCLs

Image Input

P_virt2orig, R_virt2orig, K_virt = pcl.pcl_transforms({Crop Position [px; Nx2]}, {Crop Scale [px; Nx2]}, {Camera Intrinsic Matrix [px; Nx3x3]})
grid_perspective = pcl.perspective_grid(P_virt2orig, {Input IMG Dim.}, {Output IMG Dim.}, transform_to_pytorch=True)
PCL_cropped_img = F.grid_sample({Original IMG}, grid_perspective)

NOTE: All input tensors to PCL MUST be in pixel coordinates (including the camera matrix). For example, for a 512x512 image, the coordinates should lie in the range [0, 512).

NOTE: If the output of the network is normalized, you must first denormalize it before running the following line.

# Using the same R_virt2orig from the above command
pose_3d = pcl.virtPose2CameraPose(virt_3d_pose, R_virt2orig, batch_size, num_joints)
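
For concreteness, here is a minimal end-to-end sketch of the image pipeline above. The batch size, image and crop dimensions, intrinsics, joint count, and the exact format of the dimension arguments are illustrative assumptions; only pcl.pcl_transforms, pcl.perspective_grid, pcl.virtPose2CameraPose, and F.grid_sample come from this README.

import torch
import torch.nn.functional as F
import pcl  # the PCL module from this repository

N = 8                                         # assumed batch size
img = torch.rand(N, 3, 512, 512)              # original 512x512 images
K = torch.tensor([[500.0, 0.0, 256.0],        # assumed intrinsics in PIXEL coordinates
                  [0.0, 500.0, 256.0],
                  [0.0, 0.0, 1.0]]).expand(N, 3, 3)
crop_pos = torch.full((N, 2), 256.0)          # crop center [px]
crop_scale = torch.full((N, 2), 128.0)        # crop extent [px]

P_virt2orig, R_virt2orig, K_virt = pcl.pcl_transforms(crop_pos, crop_scale, K)
grid_perspective = pcl.perspective_grid(P_virt2orig, (512, 512), (256, 256),
                                        transform_to_pytorch=True)
PCL_cropped_img = F.grid_sample(img, grid_perspective)

# Run your 3D pose network on PCL_cropped_img and denormalize its output;
# a random tensor stands in for that output here (17 joints assumed).
virt_3d_pose = torch.rand(N, 17, 3)
pose_3d = pcl.virtPose2CameraPose(virt_3d_pose, R_virt2orig, N, 17)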

2D Pose Input

NOTE: The input 2D pose must not be root-centered yet (i.e., the hip joint's 2D coordinate should NOT be (0,0)); root-centering should be done afterwards (possibly during the normalization step).

virt_2d_pose, R_virt2orig, P_virt2orig = pcl.pcl_transforms_2d(pose_2d, {Crop Position [px; Nx2]}, {Crop Scale [px; Nx2]}, {Camera Intrinsic Matrix [px; Nx3x3]})
# Using the same R_virt2orig from the above command
pose_3d = pcl.virtPose2CameraPose(virt_3d_pose, R_virt2orig, batch_size, num_joints)
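
Analogously, a minimal sketch of the 2D-pose pipeline, with the same caveats: shapes, intrinsics, and joint count are assumptions, and the random tensors stand in for real data and network output.

import torch
import pcl  # the PCL module from this repository

N, J = 8, 17                                  # assumed batch size and joint count
pose_2d = torch.rand(N, J, 2) * 512.0         # 2D joints in PIXEL coordinates, NOT root-centered
K = torch.tensor([[500.0, 0.0, 256.0],        # assumed intrinsics in PIXEL coordinates
                  [0.0, 500.0, 256.0],
                  [0.0, 0.0, 1.0]]).expand(N, 3, 3)
crop_pos = torch.full((N, 2), 256.0)          # crop center [px]
crop_scale = torch.full((N, 2), 128.0)        # crop extent [px]

virt_2d_pose, R_virt2orig, P_virt2orig = pcl.pcl_transforms_2d(pose_2d, crop_pos, crop_scale, K)

# Root-center and normalize virt_2d_pose here, lift it to 3D with your network,
# and denormalize the result; a random tensor stands in for that output.
virt_3d_pose = torch.rand(N, J, 3)
pose_3d = pcl.virtPose2CameraPose(virt_3d_pose, R_virt2orig, N, J)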

Coordinate Systems Used:

Image coordinates: the first coordinate is the horizontal axis, the second coordinate is the vertical axis, and the origin is in the top left.

3D coordinates (left-handed coordinate system): the first coordinate is the horizontal axis (left to right), the second coordinate is the vertical axis (up), and the third is the depth axis (positive values in front of the camera).
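
To make the pixel-coordinate requirement concrete, here is a small sketch converting 2D joints from normalized coordinates back to pixel coordinates under these conventions; the image size and the [-1, 1] normalization scheme are assumptions, not part of this repository's API.

import torch

W = H = 512                                   # assumed image size
pose_norm = torch.rand(8, 17, 2) * 2 - 1      # joints in assumed [-1, 1] normalized coords

# Map x from [-1, 1] to [0, W] and y from [-1, 1] to [0, H],
# with the origin at the top-left corner of the image.
pose_px = torch.empty_like(pose_norm)
pose_px[..., 0] = (pose_norm[..., 0] + 1) * 0.5 * W
pose_px[..., 1] = (pose_norm[..., 1] + 1) * 0.5 * H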

Training and Evaluation

Preprocessing the Dataset:

Please follow the instructions from Margipose for downloading, pre-processing, and storing the data.

Pretrained Models:

Included in this repository are four sets of pretrained models that are used in humanPose-demo.ipynb.

Train and Evaluation Code:

We have also included training and evaluation code.

License

This work is licensed under MIT License. See LICENSE for details.

If you find our code helpful, please consider citing the following paper:

@article{yu2020pcls,
  title={PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers},
  author={Yu, Frank and Salzmann, Mathieu and Fua, Pascal and Rhodin, Helge},
  journal={arXiv preprint arXiv:2011.13607},
  year={2020}
}

Acknowledgements