High-fidelity and High-efficiency Talking Portrait Synthesis with Detail-aware Neural Radiance Fields
This repository provides a PyTorch implementation for the paper: High-fidelity and High-efficiency Talking Portrait Synthesis with Detail-aware Neural Radiance Fields.
A self-driven video generated by our method: here
A cross-driven video generated by our method: here
Installation
Tested on Ubuntu 22.04 with PyTorch 2.0.1 and CUDA 11.6.
git clone https://github.com/muyuWang/HHNeRF.git
cd HHNeRF
Install the dependencies:
pip install -r requirements.txt
Data pre-processing
Our data preprocessing follows previous work, namely AD-NeRF, SSP-NeRF and RAD-NeRF. We provide some HR videos at 900 x 900 resolution. During preprocessing, first downsample them to 450 x 450, then run the RAD-NeRF preprocessing steps on the downsampled frames (extract images, detect landmarks, face parsing, extract the background, estimate head poses, ...). With the extracted landmarks, crop patches from the eye region and use a ResNet model to extract their features; a minimal sketch of this last step is shown below.
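A minimal, unofficial sketch of the eye-feature step, assuming 68-point landmarks stored as text in the .lms files and a torchvision ResNet-18 backbone; the landmark indices, patch padding and 64 x 64 patch size are placeholders rather than the exact values used in the paper:

# Hypothetical sketch: crop eye patches from the detected 2D landmarks and
# extract per-frame features with a torchvision ResNet-18 backbone.
import os
import numpy as np
import torch
import torchvision.transforms as T
from torchvision.models import resnet18
from PIL import Image

data_dir = "data/Sunak"  # example subject; replace with your own <ID>
os.makedirs(os.path.join(data_dir, "eye_features"), exist_ok=True)

# ResNet-18 trunk without the classification head (outputs a 512-d vector per patch).
backbone = resnet18(weights="IMAGENET1K_V1")
backbone.fc = torch.nn.Identity()
backbone.eval()

to_tensor = T.Compose([T.Resize((64, 64)), T.ToTensor()])

def crop_eye(img, lms, idxs, pad=8):
    """Crop a rectangular patch around the landmarks with the given indices."""
    pts = lms[list(idxs)]
    x0, y0 = pts.min(axis=0) - pad
    x1, y1 = pts.max(axis=0) + pad
    return img.crop((int(x0), int(y0), int(x1), int(y1)))

with torch.no_grad():
    for name in sorted(os.listdir(os.path.join(data_dir, "ori_imgs"))):
        if not name.endswith(".png"):
            continue
        idx = os.path.splitext(name)[0]
        img = Image.open(os.path.join(data_dir, "ori_imgs", name)).convert("RGB")
        lms = np.loadtxt(os.path.join(data_dir, "ori_imgs", f"{idx}.lms"))  # (68, 2) text landmarks
        # In the 68-point convention, indices 36-41 and 42-47 cover the two eyes (assumption).
        left = crop_eye(img, lms, range(36, 42))
        right = crop_eye(img, lms, range(42, 48))
        left.save(os.path.join(data_dir, "eye_features", f"{idx}_l.png"))
        right.save(os.path.join(data_dir, "eye_features", f"{idx}_r.png"))
        batch = torch.stack([to_tensor(left), to_tensor(right)])
        feat = backbone(batch)  # (2, 512) feature, one row per eye
        torch.save(feat, os.path.join(data_dir, "eye_features", f"{idx}.pt"))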
Finally, the file structure after finishing all steps:
./data/<ID>
├──<ID>.mp4               # original video
├──ori_imgs               # original images from video
│  ├──0.png
│  ├──0.lms               # 2D landmarks
│  ├──...
├──hr_imgs                # HR ground truth frames (static background)
│  ├──0.jpg
│  ├──...
├──eye_features           # eye patches and features
│  ├──0_l.png             # left eye
│  ├──0_r.png             # right eye
│  ├──0.pt                # eye feature
│  ├──...
├──gt_imgs                # ground truth images (static background)
│  ├──0.jpg
│  ├──...
├──parsing                # semantic segmentation
│  ├──0.png
│  ├──...
├──torso_imgs             # inpainted torso images
│  ├──0.png
│  ├──...
├──aud.wav                # original audio
├──aud.npy                # audio features (deepspeech)
├──bc.jpg                 # default background
├──track_params.pt        # raw head tracking results
├──transforms_train.json  # head poses (train split)
├──transforms_val.json    # head poses (test split)
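Once all files are in place, a quick unofficial sanity check such as the following can confirm the layout; the subject name is an example, and the assumption that the transforms JSON contains a NeRF-style "frames" list is ours:

import json
import numpy as np
import torch

data_dir = "data/Sunak"  # example subject; replace with your own <ID>

aud = np.load(f"{data_dir}/aud.npy")                    # per-frame audio features (deepspeech)
with open(f"{data_dir}/transforms_train.json") as f:
    train_split = json.load(f)                          # assumes a NeRF-style "frames" list
eye_feat = torch.load(f"{data_dir}/eye_features/0.pt")  # eye feature of frame 0

print("audio features:", aud.shape)
print("train frames:", len(train_split["frames"]))
print("eye feature for frame 0:", tuple(eye_feat.shape))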
Some HR talking videos and processed data can be downloaded from Baidu Disk.
Usage
The training script is train.sh. Here is an example.
Training the DaNeRF module:
python main.py data/Sunak/ --workspace trial/Sunak/ -O --iters 70000 --data_range 0 -1 --dim_eye 6 --lr 0.005 --lr_net 0.0005 --num_rays 65536 --patch_size 32
Training DaNeRF and ECSR jointly:
python main_sr.py data/Sunak/ --workspace trial/Sunak/ -O --iters 150000 --data_range 0 -1 --dim_eye 6 --patch_size 32 --srtask --num_rays 16384 --lr 0.005 --lr_net 0.0005 --weight_pcp 0.05 --weight_style 0.01 --weight_gan 0.01 --test_tile 450
To start from an existing checkpoint, add --ftsr_path 'trial/Sunak/modelsr_ckpt/sresrnet_17.pth'.
Acknowledgement
This project is developed based on RAD-NeRF by Tang et al. and 4K-NeRF by Wang et al. Thanks for these great works.