Human View Synthesis
Human Pose Manipulation and Novel View Synthesis using Differentiable Rendering
Guillaume Rochette, Chris Russell, Richard Bowden
https://arxiv.org/abs/2111.12731
Abstract
We present a new approach for synthesizing novel views of people in new poses. Our novel differentiable renderer enables the synthesis of highly realistic images from any viewpoint. Rather than operating over mesh-based structures, our renderer makes use of diffuse Gaussian primitives that directly represent the underlying skeletal structure of a human. Rendering these primitives results in a high-dimensional latent image, which is then transformed into an RGB image by a decoder network. The formulation gives rise to a fully differentiable framework that can be trained end-to-end. We demonstrate the effectiveness of our approach for image reconstruction on both the Human3.6M and Panoptic Studio datasets. We show how our approach can be used for motion transfer between individuals, for novel view synthesis of individuals captured from just a single camera, to synthesize individuals from any virtual viewpoint, and to re-render people in novel poses.
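The core idea is easiest to see in miniature. The sketch below is not the repository's renderer: the paper renders 3D diffuse Gaussian primitives differentiably, whereas this small NumPy illustration only shows how splatting one Gaussian per skeleton joint produces a multi-channel "latent image" that a decoder network would then map to RGB. All names, shapes, and the camera model here are illustrative assumptions.

# Minimal illustration (not the repository's renderer): splat one isotropic
# Gaussian per projected joint into a multi-channel latent image.
# Shapes, the camera model, and the sigma parameter are illustrative assumptions.
import numpy as np

def project_pinhole(joints_3d, K):
    """Project Nx3 camera-space joints to Nx2 pixel coordinates with intrinsics K."""
    uvw = joints_3d @ K.T                   # (N, 3)
    return uvw[:, :2] / uvw[:, 2:3]         # (N, 2)

def splat_gaussians(joints_2d, height, width, sigma=8.0):
    """Render one Gaussian per joint; returns a (num_joints, H, W) latent image."""
    ys, xs = np.mgrid[0:height, 0:width]    # pixel grid
    latent = np.empty((joints_2d.shape[0], height, width), dtype=np.float32)
    for j, (u, v) in enumerate(joints_2d):
        d2 = (xs - u) ** 2 + (ys - v) ** 2  # squared distance to the joint centre
        latent[j] = np.exp(-d2 / (2.0 * sigma ** 2))
    return latent

# Toy example: three "joints" in front of a 256x256 camera.
K = np.array([[256.0, 0.0, 128.0],
              [0.0, 256.0, 128.0],
              [0.0, 0.0, 1.0]])
joints_3d = np.array([[0.0, -0.2, 2.0],
                      [0.0, 0.0, 2.0],
                      [0.1, 0.3, 2.0]])
latent = splat_gaussians(project_pinhole(joints_3d, K), 256, 256)
print(latent.shape)  # (3, 256, 256) -- fed to a decoder network in the full model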
Videos
- View Synthesis from an Existing Viewpoint: https://youtu.be/vVCQM4cNwz8
- View Synthesis from a Virtual Viewpoint: https://youtu.be/gI9I8iLGVr4
- Motion Transfer between Actors: https://youtu.be/sbY34nnDy2M
- Paper Presentation (12'): https://youtu.be/Sxwpsi9TDOs
Table of Contents
- Requirements
- Preparing Datasets
- Training
- Inference
- Pre-trained Models
- Visualise the Renderer's Output
- Citation
Requirements
- Install Miniconda: https://docs.conda.io/en/latest/miniconda.html
- Set up the following environment:
conda env create -f environment.yml -n my-env
conda activate my-env
export PYTHONPATH=.
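A quick sanity check after activating the environment can save a failed run later. The exact package list lives in environment.yml; the imports below assume it provides PyTorch and PyTorch Lightning, which the lightning_logs directories used during training suggest.

# Sanity check (assumes environment.yml provides PyTorch and PyTorch Lightning,
# as suggested by the lightning_logs directories used during training).
import torch
import pytorch_lightning as pl

print("torch", torch.__version__, "| lightning", pl.__version__)
print("CUDA available:", torch.cuda.is_available())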
Preparing Datasets
Please follow the instructions detailed in each of the following repositories:
- Panoptic Studio: https://github.com/GuillaumeRochette/PanopticProcessing
- Human3.6M: https://github.com/GuillaumeRochette/Human36MProcessing
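Once a dataset has been processed, it can help to sanity-check its layout before editing the configs. The snippet below only assumes the structure implied by the command-line arguments used later (a root directory containing sequences such as 171026_pose1/Subsequences/0); treat that layout as an assumption, not a specification.

# Rough sanity check of a processed dataset root. The layout (root / sequence /
# Subsequences / index) is inferred from the --root and --sequence arguments
# used in the training and inference commands below; adjust if yours differs.
from pathlib import Path

root = Path("/path/to/datasets/Panoptic")
for sequence in sorted(p for p in root.iterdir() if p.is_dir()):
    subsequences = sorted((sequence / "Subsequences").glob("*")) \
        if (sequence / "Subsequences").is_dir() else []
    print(f"{sequence.name}: {len(subsequences)} subsequence(s)")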
Training
If you want to train the model on Panoptic, at a resolution of 256x256, using the LPIPS loss, and then fine-tune it with the Adaptive adversarial framework:
- Ensure you update the various config.json files, as the root is set to /path/to/datasets/Panoptic (a bulk-update sketch follows the pre-training command below).
- First, pre-train the 2D-to-3D pose regression model:
python training/pose_2d_to_pose_3d.py \
--hparams experiments/Pose2DToPose3D/Panoptic/hparams.json \
--config experiments/Pose2DToPose3D/Panoptic/config.json
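If you prefer not to edit each config.json by hand, a small script can rewrite the dataset path in bulk. This assumes the path is stored under a top-level "root" key, as the note above suggests; inspect one file first and adapt if the key is nested.

# Bulk-update the dataset path in the experiment configs. Assumes each
# config.json stores the dataset location under a top-level "root" key.
import json
from pathlib import Path

new_root = "/path/to/datasets/Panoptic"  # your actual dataset location

for config_path in Path("experiments").rglob("config.json"):
    config = json.loads(config_path.read_text())
    if "root" in config:
        config["root"] = new_root
        config_path.write_text(json.dumps(config, indent=4) + "\n")
        print("updated", config_path)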
- Pick the best model from experiments/Pose2DToPose3D/Panoptic/lightning_logs/version_0/checkpoints and save it under experiments/Pose2DToPose3D/Panoptic/model.ckpt.
- Transfer the pre-trained weights to the novel view synthesis model:
python training/prepare_novel_view_synthesis.py \
--config_pose_2d_to_pose_3d experiments/Pose2DToPose3D/Panoptic/config.json \
--ckpt_pose_2d_to_pose_3d experiments/Pose2DToPose3D/Panoptic/model.ckpt \
--hparams_novel_view_synthesis experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS/hparams.json \
--ckpt_novel_view_synthesis experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS/initial.ckpt
- Train the novel view synthesis model with the LPIPS loss:
python training/novel_view_synthesis.py \
--hparams experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS/hparams.json \
--config experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS/config.json \
--ckpt experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS/initial.ckpt
- Pick the best model from experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS/lightning_logs/version_0/checkpoints and save it under experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS/model.ckpt.
- Fine-tune the model with an adversarial loss:
python training/novel_view_synthesis.py \
--hparams experiments/NovelViewSynthesis/Panoptic/256x256/Adaptive/hparams.json \
--config experiments/NovelViewSynthesis/Panoptic/256x256/Adaptive/config.json \
--ckpt experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS/model.ckpt
- Pick the best model from experiments/NovelViewSynthesis/Panoptic/256x256/Adaptive/lightning_logs/version_0/checkpoints and save it under experiments/NovelViewSynthesis/Panoptic/256x256/Adaptive/model.ckpt (a small copy helper is sketched below).
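The "pick the best model" steps above amount to copying a single checkpoint file once you have identified it, e.g. from the validation metrics logged under lightning_logs. A minimal helper; choosing which checkpoint is "best" is left to you.

# Copy a chosen Lightning checkpoint to the location expected by later steps.
# Identify the best checkpoint yourself (e.g. from the validation metrics
# logged under lightning_logs) before running this.
import shutil
from pathlib import Path

experiment = Path("experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS")
checkpoints = sorted((experiment / "lightning_logs/version_0/checkpoints").glob("*.ckpt"))
print("available checkpoints:", [c.name for c in checkpoints])

best = checkpoints[0]  # replace with the checkpoint you judged best
shutil.copy(best, experiment / "model.ckpt")
print("saved", experiment / "model.ckpt")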
Inference
- View Synthesis from an Existing Viewpoint:
python inference/infer_real.py \
--ckpt experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS/model.ckpt \
--root /path/to/datasets/Panoptic \
--sequence 171026_pose1/Subsequences/0 \
--in_view 00 \
--out_view 24 \
--interval 2922 3751 \
--output_dir experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS/InferReal
- View Synthesis from a Virtual Viewpoint:
python inference/infer_virtual.py \
--ckpt experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS/model.ckpt \
--root /path/to/datasets/Panoptic \
--sequence 171026_pose1/Subsequences/0 \
--view 00 \
--interval 2922 3751 \
--output_dir experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS/InferVirtual
- Motion Transfer between Actors:
python inference/motion_transfer.py \
--ckpt experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS/model.ckpt \
--root /path/to/datasets/Panoptic \
--sequence_motion 171204_pose1/Subsequences/0 \
--in_view_motion 00 \
--out_view_motion 24 \
--interval_motion 3903 4237 \
--sequence_appearance 171026_pose1/Subsequences/1 \
--in_view_appearance 18 \
--interval_appearance 5445 5445 \
--output_dir experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS/MotionTransfer
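The inference scripts write their results to --output_dir. If, as is likely, those results are individual image frames, something like the following assembles them into a video for viewing; the frame format and frame rate are assumptions, not something documented here, and running ffmpeg directly on the output directory works just as well.

# Assemble inference results into a video for quick viewing. Assumes the
# scripts write individual image frames (PNG/JPEG) into --output_dir, which is
# an assumption about the output format. Requires imageio with its ffmpeg
# plugin (pip install imageio[ffmpeg]).
from pathlib import Path
import imageio

output_dir = Path("experiments/NovelViewSynthesis/Panoptic/256x256/LPIPS/InferReal")
frames = sorted(p for p in output_dir.iterdir()
                if p.suffix.lower() in {".png", ".jpg", ".jpeg"})

with imageio.get_writer(output_dir / "result.mp4", fps=30) as writer:
    for frame in frames:
        writer.append_data(imageio.imread(frame))
print(f"wrote {len(frames)} frames to {output_dir / 'result.mp4'}")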
Pre-trained Models
Download the following models and place them in their corresponding directories:
- Human36M.256x256.LPIPS.ckpt in experiments/NovelViewSynthesis/Human36M/256x256/LPIPS.
- Panoptic.512x512.LPIPS.ckpt in experiments/NovelViewSynthesis/Panoptic/512x512/LPIPS.
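A downloaded checkpoint can be inspected before use to confirm it is intact. The snippet assumes these are standard PyTorch Lightning checkpoints, i.e. torch.load-able dictionaries containing a state_dict; that is an assumption based on the lightning_logs workflow above, not a documented fact.

# Quick inspection of a downloaded checkpoint. Assumes a standard PyTorch
# Lightning checkpoint (a torch.load-able dict with a "state_dict" entry).
import torch

ckpt_path = "experiments/NovelViewSynthesis/Panoptic/512x512/LPIPS/Panoptic.512x512.LPIPS.ckpt"
ckpt = torch.load(ckpt_path, map_location="cpu")
print("top-level keys:", sorted(ckpt.keys()))
state_dict = ckpt.get("state_dict", {})
print("parameters:", len(state_dict))
for name in list(state_dict)[:5]:
    print(" ", name, tuple(state_dict[name].shape))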
Visualise the Renderer's Output
- Simple shapes (one at a time):
python rendering/visualize_shapes.py \
--one_sphere \
# --one_ellipsoid \
# --two_spheres \
# --many_spheres 64 \
# --many_ellipsoids 64 \
--height 270 \
--width 480
- Human Poses:
python rendering/visualize_poses.py \
--root /path/to/datasets/Panoptic \
--sequence 171026_pose1/Subsequences/0 \
--view 00 \
--frame 420 \
--height 256 \
--width 256
Citation
@inproceedings{Rochette2021,
  title={Human Pose Manipulation and Novel View Synthesis using Differentiable Rendering},
  author={Rochette, Guillaume and Russell, Chris and Bowden, Richard},
  booktitle={IEEE International Conference on Automatic Face and Gesture Recognition (FG)},
  year={2021}
}