# Generalizable One-shot Neural Head Avatar
[project page] [paper]
This repository is the official PyTorch implementation of the following paper:
**Generalizable One-shot Neural Head Avatar**, Xueting Li, Shalini De Mello, Sifei Liu, Koki Nagano, Umar Iqbal, Jan Kautz.
## Citation
If you find our work useful in your research, please cite:
```
@article{li2023goha,
  title={Generalizable One-shot Neural Head Avatar},
  author={Li, Xueting and De Mello, Shalini and Liu, Sifei and Nagano, Koki and Iqbal, Umar and Kautz, Jan},
  journal={NeurIPS},
  year={2023}
}
```
## Environment Setup
Our training is carried out on 8 V100 32GB GPUs, while testing can be run on a single V100 16GB GPU (inference needs about 9 GB of GPU memory). We developed our code on Ubuntu 18.04.5 with GPU driver version 535.54.03 and CUDA 11.3.
<details>
<summary> Package Installation </summary>

Install all packages by running

```
sh install.sh
```

Please see here for instructions.
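As an optional sanity check after installation (this snippet is not part of `install.sh`), you can verify that PyTorch sees a GPU and reports the expected CUDA build:

```python
# Optional sketch: confirm PyTorch and CUDA are usable after running install.sh.
import torch

print("torch:", torch.__version__)
print("built with CUDA:", torch.version.cuda)   # should report a CUDA 11.x build
print("GPU available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```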
</details>

## Demo
<details>
<summary> Demo data download </summary>

We provide pre-processed demo data, including a single-view portrait image from CelebA and a drive video from HDTF. It can be downloaded here. The `celeba` folder includes the source portrait image, while the `HDTF` folder contains the drive video. To test on your own images, please preprocess the data following the dataset preprocessing instructions.
Download the pre-trained model from Google Drive and put the folder in `src/logs/`. The pre-trained model is subject to the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license terms.
Assuming that the path of the downloaded demo data is `/raid/goha_demo_data`, the one-shot animation demo can be run by:

```
cd src
python demo.py --config configs/s2.yml --checkpoint logs/s3/checkpoint825000.ckpt --savedir /raid/test --source_path /raid/goha_demo_data/celeba/ --target_path /raid/goha_demo_data/HDTF/
```
`--source_path` points to the source image and `--target_path` points to the drive video; please change them according to where you saved the downloaded demo data. The animated video will be saved in `--savedir`. Adding `--frame_limit 100` to the command enables a fast test on the first 100 frames.
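If you want to animate several of your own portraits with the same drive video, a minimal sketch that loops over source folders and calls `demo.py` through `subprocess` is shown below; the folder layout under `/raid/my_portraits` and the per-source output directories are assumptions, while the flags are exactly those documented above.

```python
# Hypothetical batch driver: animate every source folder with the same drive video.
# Run from the src/ directory; the paths below are placeholders.
import subprocess
from pathlib import Path

CONFIG = "configs/s2.yml"
CHECKPOINT = "logs/s3/checkpoint825000.ckpt"
DRIVE_VIDEO = "/raid/goha_demo_data/HDTF/"      # drive video folder from the demo data
SOURCE_ROOT = Path("/raid/my_portraits")        # assumption: one sub-folder per source portrait
OUT_ROOT = Path("/raid/test")

for source_dir in sorted(p for p in SOURCE_ROOT.iterdir() if p.is_dir()):
    subprocess.run(
        [
            "python", "demo.py",
            "--config", CONFIG,
            "--checkpoint", CHECKPOINT,
            "--savedir", str(OUT_ROOT / source_dir.name),
            "--source_path", str(source_dir),
            "--target_path", DRIVE_VIDEO,
            "--frame_limit", "100",             # quick test on the first 100 frames
        ],
        check=True,
    )
```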
</details>

## Testing on the CelebA dataset
<details>
<summary> CelebA pre-processing </summary>

Follow these instructions to process the CelebA dataset. The processed dataset has the structure below, where `images` contains the cropped portrait images, `matting` contains the foreground masks predicted by MODNet, and `dataset.json` contains the camera views for each portrait.
```
- celeba
  - celeba
    - images
    - matting
    - dataset.json
```
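A small sanity-check sketch (not part of the released code) that confirms the processed folder matches the layout above; the exact schema of `dataset.json` is an assumption beyond the fact that it stores per-portrait camera views, so the script only reports counts:

```python
# Sketch: verify the processed CelebA folder layout described above.
import json
from pathlib import Path

root = Path("/raid/celeba/celeba")   # adjust to your processed dataset path

n_images = len(list((root / "images").glob("*")))
n_mattes = len(list((root / "matting").glob("*")))
print(f"{n_images} images, {n_mattes} matting masks")

with open(root / "dataset.json") as f:
    meta = json.load(f)
print(f"dataset.json entries: {len(meta)}")      # camera views, one per portrait
```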
</details>
<details>
<summary> Testing on CelebA </summary>
To carry out cross-identity animation, run:

```
python test_celeba_cross.py --config configs/s2.yml --celeba_path /raid/celeba --checkpoint logs/s3/checkpoint825000.ckpt --savedir /raid/results/celeba_cross
```
`--celeba_path` is the path of the processed CelebA dataset, and `--test_sample_number` sets the number of test images; the default value runs on all image pairs in the CelebA dataset.
- We use torch-fidelity for FID score computation:

  ```
  pip install torch-fidelity
  fidelity --gpu 0 --input1 /raid/results/celeba_cross/source/ --input2 /raid/results/celeba_cross/low_res/ --fid
  ```
- We use this script from NeRFace for LPIPS, PSNR, SSIM and L1 metrics (a rough Python sketch is shown after this list):

  ```
  python metrics.py --gt_path /raid/results/celeba_cross/source/ --images_path /raid/results/celeba_cross/low_res/
  ```
- We use ArcFace to evaluate CSIM, and Deep3DFaceRecon_pytorch for AED, APD, and AKD.
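For reference, here is a rough Python sketch of the paired-image metrics (LPIPS, PSNR, SSIM, L1); it is not the NeRFace script itself, assumes the two folders contain same-named images of identical resolution, and uses the `lpips` and `scikit-image` packages:

```python
# Sketch: paired LPIPS / PSNR / SSIM / L1 over two folders of same-named images.
# Approximates the referenced metrics script; adjust the glob pattern to your file type.
from pathlib import Path

import lpips                      # pip install lpips
import numpy as np
import torch
from PIL import Image
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

gt_dir = Path("/raid/results/celeba_cross/source")
pred_dir = Path("/raid/results/celeba_cross/low_res")
loss_fn = lpips.LPIPS(net="alex")

def to_lpips_tensor(img):         # HWC uint8 -> 1x3xHxW float in [-1, 1]
    return (torch.from_numpy(img).float().permute(2, 0, 1) / 127.5 - 1.0).unsqueeze(0)

scores = {"lpips": [], "psnr": [], "ssim": [], "l1": []}
for gt_path in sorted(gt_dir.glob("*.png")):
    gt = np.array(Image.open(gt_path).convert("RGB"))
    pred = np.array(Image.open(pred_dir / gt_path.name).convert("RGB"))
    with torch.no_grad():
        scores["lpips"].append(loss_fn(to_lpips_tensor(gt), to_lpips_tensor(pred)).item())
    scores["psnr"].append(peak_signal_noise_ratio(gt, pred, data_range=255))
    scores["ssim"].append(structural_similarity(gt, pred, channel_axis=-1, data_range=255))
    scores["l1"].append(np.abs(gt.astype(np.float32) - pred.astype(np.float32)).mean() / 255.0)

for name, vals in scores.items():
    print(f"{name}: {np.mean(vals):.4f}")
```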
</details>

## Training
<details>
<summary> Training dataset preparation </summary>

Please follow these instructions.
</details>

<details>
<summary> Model training </summary>

To train the model, run:

```
sh train.sh
```
Training logs are saved in `logs/s1`, `logs/s2`, or `logs/s3` depending on the training stage, and can be visualized with TensorBoard. The overall training takes about one week on 8 V100 32GB GPUs.
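If you prefer to inspect the logged scalars programmatically instead of through the TensorBoard UI, a small sketch using TensorBoard's event reader is shown below; the scalar tag names are whatever the training code logs, so the script simply enumerates them:

```python
# Sketch: list the scalar tags recorded in a training log directory.
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

acc = EventAccumulator("logs/s1")   # or logs/s2 / logs/s3, depending on the stage
acc.Reload()

for tag in acc.Tags()["scalars"]:
    events = acc.Scalars(tag)
    print(f"{tag}: {len(events)} points, last value {events[-1].value:.4f} at step {events[-1].step}")
```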
</details>

## Acknowledgement
This work is built on top of EG3D, GFPGAN, Deep3DRecon, and face-parsing.PyTorch. We also use MODNet to remove portrait backgrounds.
## Contact
If you have any questions or comments, please feel free to contact xuetingl@nvidia.com.