
:book: The Face Depth Network of "Depth-Aware Generative Adversarial Network for Talking Head Video Generation" (CVPR 2022)

:fire: If DaGAN is helpful for your photos or projects, please :star: it or recommend it to your friends. Thanks! :fire:

[Paper] [Project Page] [Demo] [Poster Video]

Fa-Ting Hong, Longhao Zhang, Li Shen, Dan Xu
The Hong Kong University of Science and Technology

Cartoon Sample

https://user-images.githubusercontent.com/19970321/162151632-0195292f-30b8-4122-8afd-9b1698f1e4fe.mp4

Human Sample

https://user-images.githubusercontent.com/19970321/162151327-f2930231-42e3-40f2-bfca-a88529599f0f.mp4

Image Dataset

:wrench: Dependencies and Installation

Python >= 3.7 (Anaconda or Miniconda recommended)
PyTorch >= 1.7
Optional: NVIDIA GPU + CUDA
Optional: Linux

⚙️ Setup

Clone repo

git clone https://github.com/harlanhong/DaGAN-Head.git
cd DaGAN-Head

Install dependent packages

conda install pytorch=0.4.1 torchvision=0.2.1 -c pytorch
pip install tensorboardX==1.4
conda install opencv=3.3.1   # just needed for evaluation

Alternatively, you can reuse the DaGAN environment directly.

:zap: Quick Inference

Pre-trained checkpoint

The pre-trained checkpoint of the face depth network and our DaGAN checkpoints can be found at the following link: OneDrive.

Inference! To run a demo, download the checkpoint and run the following command to predict scaled disparity for a single image:

python test_simple.py --image_path assets/test_image.jpg --model_name tmp/You_Model/models/weights_19
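
As a rough illustration of what "scaled disparity" means: the face depth network builds on Monodepth2, where the network's sigmoid output in [0, 1] is mapped to a disparity range and then inverted to obtain depth. The sketch below assumes that convention carries over unchanged to this repository. The `min_depth=0.1` and `max_depth=100.0` values are Monodepth2's defaults and are an assumption here, not values confirmed for the face depth network.

```python
def disp_to_depth(disp, min_depth=0.1, max_depth=100.0):
    """Map a sigmoid disparity in [0, 1] to (scaled_disp, depth).

    Assumes the Monodepth2 convention; the default depth range is an
    assumption and may differ from what this repository uses.
    """
    min_disp = 1.0 / max_depth  # smallest disparity <-> farthest point
    max_disp = 1.0 / min_depth  # largest disparity <-> nearest point
    scaled_disp = min_disp + (max_disp - min_disp) * disp
    depth = 1.0 / scaled_disp
    return scaled_disp, depth


if __name__ == "__main__":
    # A raw network output of 0.5 lies halfway through the disparity range.
    scaled, depth = disp_to_depth(0.5)
    print("scaled disparity:", scaled, "depth:", depth)
```

If the model is trained on monocular video only, the resulting depth is relative (defined up to an unknown scale), so it is better suited to visualization or as an input feature than as a metric measurement.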

⏳ Training

Datasets

Splits. The train/test/validation splits are uploaded to OneDrive.

Train on VoxCeleb

To train a model on a specific dataset, run:

CUDA_VISIBLE_DEVICES=0 python train.py --batch_size 32 --height 256 --width 256 --dataset vox --sample_num 100000 --model_name taking_head_10w --data_path vox2

Training on your own dataset

You can train on a custom monocular or stereo dataset by writing a new dataloader class which inherits from MonoDataset – see the CELEBDataset class in datasets/celeb_dataset.py for an example.
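
The subclassing pattern described above can be sketched as follows. To keep the example self-contained, it uses a hypothetical stand-in base class (`MonoDatasetStandIn`) instead of importing the repository's `MonoDataset`. The hook names `check_depth` and `get_image_path`, the split-line format (`"<folder> <frame_index>"`), and the zero-padded file naming are all assumptions made for illustration. Check `datasets/celeb_dataset.py` for the hooks the real base class expects.

```python
import os


class MonoDatasetStandIn:
    """Hypothetical stand-in for the repo's MonoDataset (NOT the real class)."""

    def __init__(self, data_path, filenames, height, width, img_ext=".jpg"):
        self.data_path = data_path
        self.filenames = filenames  # one "<folder> <frame_index>" string per sample
        self.height = height
        self.width = width
        self.img_ext = img_ext

    def __len__(self):
        return len(self.filenames)

    def __getitem__(self, index):
        # Parse the split line, then delegate path construction to the subclass.
        folder, frame_index = self.filenames[index].split()[:2]
        return {
            "color_path": self.get_image_path(folder, int(frame_index)),
            "has_depth": self.check_depth(),
        }

    def check_depth(self):
        raise NotImplementedError

    def get_image_path(self, folder, frame_index):
        raise NotImplementedError


class MyFaceDataset(MonoDatasetStandIn):
    """Example custom dataset: only the dataset-specific hooks are overridden."""

    def check_depth(self):
        # Assumed: no ground-truth depth is available for face videos.
        return False

    def get_image_path(self, folder, frame_index):
        # Assumed layout: <data_path>/<folder>/<7-digit frame index><ext>
        name = "{:07d}{}".format(frame_index, self.img_ext)
        return os.path.join(self.data_path, folder, name)


if __name__ == "__main__":
    dataset = MyFaceDataset("vox2", ["id001 3", "id001 4"], height=256, width=256)
    print(len(dataset), dataset[0]["color_path"])
```

The base class owns the generic logic (indexing, parsing split lines), and the subclass supplies only what differs per dataset, which is why a new dataloader usually needs just a handful of overridden methods.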

:scroll: Acknowledgement

Our Face-Depth-Network implementation is borrowed from Monodepth2. We thank the authors of Monodepth2 for making their code publicly available.

:scroll: BibTeX

@inproceedings{hong2022depth,
  title={Depth-Aware Generative Adversarial Network for Talking Head Video Generation},
  author={Hong, Fa-Ting and Zhang, Longhao and Shen, Li and Xu, Dan},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2022}
}

:e-mail: Contact

If you have any questions, please email fhongac@cse.ust.hk.