Home

Awesome

TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting

This is the official repository for our ECCV 2024 paper TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting.

Paper | Project | Video

image

Installation

Tested on Ubuntu 18.04, CUDA 11.3, PyTorch 1.12.1

git clone git@github.com:Fictionarry/TalkingGaussian.git --recursive

conda env create --file environment.yml
conda activate talking_gaussian
pip install "git+https://github.com/facebookresearch/pytorch3d.git"
pip install tensorflow-gpu==2.8.0

If encounter installation problem from the diff-gaussian-rasterization or gridencoder, please refer to gaussian-splatting and torch-ngp.

Preparation

Usage

Important Notice

Video Dataset

Here we provide two video clips used in our experiments, which are captured from YouTube. Please respect the original content creators' rights and comply with YouTube’s copyright policies in the usage.

Other used videos can be found from GeneFace and AD-NeRF.

Pre-processing Training Video

Audio Pre-process

In our paper, we use DeepSpeech features for evaluation.

Train

# If resources are sufficient, partially parallel is available to speed up the training. See the script.
bash scripts/train_xx.sh data/<ID> output/<project_name> <GPU_ID>

Test

# saved to output/<project_name>/test/ours_None/renders
python synthesize_fuse.py -S data/<ID> -M output/<project_name> --eval  

Inference with target audio

python synthesize_fuse.py -S data/<ID> -M output/<project_name> --use_train --audio <preprocessed_audio_feature>.npy

Citation

Consider citing as below if you find this repository helpful to your project:

@article{li2024talkinggaussian,
    title={TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting}, 
    author={Jiahe Li and Jiawei Zhang and Xiao Bai and Jin Zheng and Xin Ning and Jun Zhou and Lin Gu},
    journal={arXiv preprint arXiv:2404.15264},
    year={2024}
}

Acknowledgement

This code is developed on gaussian-splatting with simple-knn, and a modified diff-gaussian-rasterization. Partial codes are from RAD-NeRF, DFRF, GeneFace, and AD-NeRF. Teeth mask is from EasyPortrait. Thanks for these great projects!