

High-fidelity and High-efficiency Talking Portrait Synthesis with Detail-aware Neural Radiance Fields

This repository provides a PyTorch implementation for the paper: High-fidelity and High-efficiency Talking Portrait Synthesis with Detail-aware Neural Radiance Fields.

A self-driven generated video of our method: here

A cross-driven generated video of our method: here


Tested on Ubuntu 22.04, PyTorch 2.0.1 and CUDA 11.6.

git clone https://github.com/muyuWang/HHNeRF.git

Install dependencies

pip install -r requirements.txt

Data pre-processing

Our data preprocessing follows the previous works AD-NeRF, SSP-NeRF and RAD-NeRF. We provide some HR videos at 900 × 900 resolution. Before preprocessing, please downsample them to 450 × 450. Then run the RAD-NeRF preprocessing pipeline on the downsampled frames (extract images, detect landmarks, face parsing, extract background, estimate head pose ...). With the extracted landmarks, extract patches from the eye region and then use a ResNet model to extract their features.
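The downsampling and eye-patch steps above can be sketched as follows. This is a minimal illustration, not the repository's preprocessing code: the helper names `build_downsample_cmd` and `eye_patch_box`, the file names, and the 64-pixel patch size used in the usage note are assumptions made for the example. The first helper only assembles a standard ffmpeg scaling command, and the second computes a square crop box centred on a set of eye landmarks, clamped to the frame.

```python
from typing import List, Sequence, Tuple


def build_downsample_cmd(src: str, dst: str, size: int = 450) -> List[str]:
    """Return an ffmpeg command that rescales a video to size x size.

    Hypothetical helper: it only builds the argument list and does not run ffmpeg.
    """
    # "-vf scale=W:H" is ffmpeg's standard video scaling filter.
    return ["ffmpeg", "-i", src, "-vf", f"scale={size}:{size}", dst]


def eye_patch_box(
    points: Sequence[Tuple[float, float]],
    patch: int,
    image_size: int,
) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a patch x patch square.

    The square is centred on the mean of the given eye landmarks and shifted
    where needed so that it stays inside an image_size x image_size frame.
    Which landmark indices belong to the eye depends on the detector and is
    left to the caller (an assumption of this sketch).
    """
    if not points:
        raise ValueError("at least one landmark is required")
    if patch > image_size:
        raise ValueError("patch must not be larger than the image")

    # Centre of the eye region: mean of the landmark coordinates.
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)

    # Top-left corner of the square, clamped to the valid range.
    left = max(0, min(int(round(cx - patch / 2)), image_size - patch))
    top = max(0, min(int(round(cy - patch / 2)), image_size - patch))
    return left, top, left + patch, top + patch
```

For example, `build_downsample_cmd("Sunak_hr.mp4", "Sunak.mp4")` yields an argument list that can be passed to `subprocess.run`, and `eye_patch_box(eye_points, 64, 450)` gives a crop box for a 450 × 450 frame. The cropped patches would then be fed to the ResNet feature extractor mentioned above.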

Some HR talking videos and processed data can be downloaded at baidudisk.


The training script is train.sh. Here is an example.

Training DaNeRF module:

python main.py data/Sunak/ --workspace trial/Sunak/ -O --iters 70000 --data_range 0 -1 --dim_eye 6 --lr 0.005 --lr_net 0.0005 --num_rays 65536 --patch_size 32

Training DaNeRF and ECSR jointly:

python main_sr.py data/Sunak/ --workspace trial/Sunak/ -O --iters 150000 --data_range 0 -1 --dim_eye 6 --patch_size 32 --srtask --num_rays 16384 --lr 0.005 --lr_net 0.0005 --weight_pcp 0.05 --weight_style 0.01 --weight_gan 0.01 --test_tile 450

To use a checkpoint, add --ftsr_path 'trial/Sunak/modelsr_ckpt/sresrnet_17.pth'.
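When training several subjects, the two commands above can be assembled programmatically. The sketch below is only a convenience wrapper around the flags shown in this README. The function names `danerf_cmd` and `joint_cmd` are hypothetical, and it assumes that other subjects follow the same `data/<subject>/` and `trial/<subject>/` layout as the Sunak example.

```python
from typing import List, Optional


def danerf_cmd(subject: str) -> List[str]:
    """Build the DaNeRF training command for one subject (stage 1)."""
    return [
        "python", "main.py", f"data/{subject}/",
        "--workspace", f"trial/{subject}/", "-O",
        "--iters", "70000", "--data_range", "0", "-1",
        "--dim_eye", "6", "--lr", "0.005", "--lr_net", "0.0005",
        "--num_rays", "65536", "--patch_size", "32",
    ]


def joint_cmd(subject: str, ftsr_path: Optional[str] = None) -> List[str]:
    """Build the joint DaNeRF + ECSR training command (stage 2).

    ftsr_path is optional; when given, it is appended as --ftsr_path.
    """
    cmd = [
        "python", "main_sr.py", f"data/{subject}/",
        "--workspace", f"trial/{subject}/", "-O",
        "--iters", "150000", "--data_range", "0", "-1",
        "--dim_eye", "6", "--patch_size", "32", "--srtask",
        "--num_rays", "16384", "--lr", "0.005", "--lr_net", "0.0005",
        "--weight_pcp", "0.05", "--weight_style", "0.01",
        "--weight_gan", "0.01", "--test_tile", "450",
    ]
    if ftsr_path is not None:
        cmd += ["--ftsr_path", ftsr_path]
    return cmd
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, running stage 1 before stage 2.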


This project is developed based on RAD-NeRF by Tang et al. and 4K-NeRF by Wang et al. Thanks for these great works.