High-fidelity and High-efficiency Talking Portrait Synthesis with Detail-aware Neural Radiance Fields
This repository provides a PyTorch implementation for the paper: High-fidelity and High-efficiency Talking Portrait Synthesis with Detail-aware Neural Radiance Fields.
A self-driven video generated by our method: here
A cross-driven video generated by our method: here
Installation
Tested on Ubuntu 22.04 with PyTorch 2.0.1 and CUDA 11.6.
git clone https://github.com/muyuWang/HHNeRF.git
cd HHNeRF
Install the dependencies:
pip install -r requirements.txt
Data pre-processing
Our data preprocessing follows previous work, namely AD-NeRF, SSP-NeRF and RAD-NeRF. We provide some HR videos at 900 x 900 resolution. During preprocessing, first downsample them to 450 x 450, then run the RAD-NeRF preprocessing steps on the downsampled frames (extract images, detect landmarks, face parsing, extract the background, estimate head poses, ...). With the extracted landmarks, crop patches from the eye region and use a ResNet model to extract their features; a minimal sketch of this last step is shown below.
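A minimal, unofficial sketch of the eye-feature step, assuming 68-point landmarks stored as text in the .lms files and a torchvision ResNet-18 backbone; the landmark indices, patch padding and 64 x 64 patch size are placeholders rather than the exact values used in the paper:

# Hypothetical sketch: crop eye patches from the detected 2D landmarks and
# extract per-frame features with a torchvision ResNet-18 backbone.
import os
import numpy as np
import torch
import torchvision.transforms as T
from torchvision.models import resnet18
from PIL import Image

data_dir = "data/Sunak"  # example subject; replace with your own <ID>
os.makedirs(os.path.join(data_dir, "eye_features"), exist_ok=True)

# ResNet-18 trunk without the classification head (outputs a 512-d vector per patch).
backbone = resnet18(weights="IMAGENET1K_V1")
backbone.fc = torch.nn.Identity()
backbone.eval()

to_tensor = T.Compose([T.Resize((64, 64)), T.ToTensor()])

def crop_eye(img, lms, idxs, pad=8):
    """Crop a rectangular patch around the landmarks with the given indices."""
    pts = lms[list(idxs)]
    x0, y0 = pts.min(axis=0) - pad
    x1, y1 = pts.max(axis=0) + pad
    return img.crop((int(x0), int(y0), int(x1), int(y1)))

with torch.no_grad():
    for name in sorted(os.listdir(os.path.join(data_dir, "ori_imgs"))):
        if not name.endswith(".png"):
            continue
        idx = os.path.splitext(name)[0]
        img = Image.open(os.path.join(data_dir, "ori_imgs", name)).convert("RGB")
        lms = np.loadtxt(os.path.join(data_dir, "ori_imgs", f"{idx}.lms"))  # (68, 2) text landmarks
        # In the 68-point convention, indices 36-41 and 42-47 cover the two eyes (assumption).
        left = crop_eye(img, lms, range(36, 42))
        right = crop_eye(img, lms, range(42, 48))
        left.save(os.path.join(data_dir, "eye_features", f"{idx}_l.png"))
        right.save(os.path.join(data_dir, "eye_features", f"{idx}_r.png"))
        batch = torch.stack([to_tensor(left), to_tensor(right)])
        feat = backbone(batch)  # (2, 512) feature, one row per eye
        torch.save(feat, os.path.join(data_dir, "eye_features", f"{idx}.pt"))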
Finally, the file structure after finishing all steps:
./data/<ID>
├──<ID>.mp4               # original video
├──ori_imgs               # original images from video
│  ├──0.png
│  ├──0.lms               # 2D landmarks
│  ├──...
├──hr_imgs                # HR ground truth frames (static background)
│  ├──0.jpg
│  ├──...
├──eye_features           # eye patches and features
│  ├──0_l.png             # left eye
│  ├──0_r.png             # right eye
│  ├──0.pt                # eye feature
│  ├──...
├──gt_imgs                # ground truth images (static background)
│  ├──0.jpg
│  ├──...
├──parsing                # semantic segmentation
│  ├──0.png
│  ├──...
├──torso_imgs             # inpainted torso images
│  ├──0.png
│  ├──...
├──aud.wav                # original audio
├──aud.npy                # audio features (deepspeech)
├──bc.jpg                 # default background
├──track_params.pt        # raw head tracking results
├──transforms_train.json  # head poses (train split)
├──transforms_val.json    # head poses (test split)
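Once all files are in place, a quick unofficial sanity check such as the following can confirm the layout; the subject name is an example, and the assumption that the transforms JSON contains a NeRF-style "frames" list is ours:

import json
import numpy as np
import torch

data_dir = "data/Sunak"  # example subject; replace with your own <ID>

aud = np.load(f"{data_dir}/aud.npy")                    # per-frame audio features (deepspeech)
with open(f"{data_dir}/transforms_train.json") as f:
    train_split = json.load(f)                          # assumes a NeRF-style "frames" list
eye_feat = torch.load(f"{data_dir}/eye_features/0.pt")  # eye feature of frame 0

print("audio features:", aud.shape)
print("train frames:", len(train_split["frames"]))
print("eye feature for frame 0:", tuple(eye_feat.shape))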
Some HR talking videos and processed data can be downloaded from Baidu Disk.
Usage
The training script is train.sh. Here is an example.
Training the DaNeRF module:
python main.py data/Sunak/ --workspace trial/Sunak/ -O --iters 70000 --data_range 0 -1 --dim_eye 6 --lr 0.005 --lr_net 0.0005 --num_rays 65536 --patch_size 32
Training DaNeRF and ECSR jointly:
python main_sr.py data/Sunak/ --workspace trial/Sunak/ -O --iters 150000 --data_range 0 -1 --dim_eye 6 --patch_size 32 --srtask --num_rays 16384 --lr 0.005 --lr_net 0.0005 --weight_pcp 0.05 --weight_style 0.01 --weight_gan 0.01 --test_tile 450
To start from an existing checkpoint, add --ftsr_path 'trial/Sunak/modelsr_ckpt/sresrnet_17.pth'.
Acknowledgement
This project is developed based on RAD-NeRF by Tang et al. and 4K-NeRF by Wang et al. Thanks for these great works.