:book: The Face Depth Network of "Depth-Aware Generative Adversarial Network for Talking Head Video Generation" (CVPR 2022)
<p align="center"> <small>:fire: If DaGAN is helpful in your photos/projects, please :star: it or recommend it to your friends. Thanks! :fire:</small> </p>
<!-- > [Fa-Ting Hong](https://harlanhong.github.io), [Longhao Zhang](https://dblp.org/pid/236/7382.html), [Li Shen](https://scholar.google.co.uk/citations?user=ABbCaxsAAAAJ&hl=en), [Dan Xu](https://www.danxurgb.net) <br> -->
<!-- > The Hong Kong University of Science and Technology, Alibaba Cloud -->
[Paper] [Project Page] [Demo] [Poster Video]<br>
Fa-Ting Hong, Longhao Zhang, Li Shen, Dan Xu <br> The Hong Kong University of Science and Technology
Cartoon Sample
Human Sample
Image Dataset
<p align="center"> <img src="assets/pointcloud.jpg"> </p>
:wrench: Dependencies and Installation
- Python >= 3.7 (Anaconda or Miniconda recommended)
- PyTorch >= 1.7
- Optional: NVIDIA GPU + CUDA
- Optional: Linux
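As a quick sanity check of the requirements listed above, the following minimal sketch reports the Python version, whether PyTorch is importable, and whether CUDA is visible. The helper name `check_environment` is hypothetical and is not part of this repository.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 7)):
    """Return a small report on the requirements listed in this README.

    Hypothetical helper for illustration; not shipped with the repository.
    """
    report = {
        "python_version": tuple(sys.version_info[:3]),
        "python_ok": tuple(sys.version_info[:2]) >= tuple(min_python),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_available": None,
    }
    if report["torch_installed"]:
        # Import lazily so the check still runs when PyTorch is missing.
        import torch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print("{}: {}".format(key, value))
```

A `cuda_available` value of `False` is not an error here, since the GPU is listed as optional.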
⚙️ Setup
- Clone repo
git clone https://github.com/harlanhong/DaGAN-Head.git
cd DaGAN-Head
- Install dependent packages
conda install pytorch=0.4.1 torchvision=0.2.1 -c pytorch
pip install tensorboardX==1.4
conda install opencv=3.3.1   # only needed for evaluation
Alternatively, you can reuse the DaGAN environment directly.
:zap: Quick Inference
Pre-trained checkpoint
The pre-trained checkpoints of the face depth network and of DaGAN are available at the following link: OneDrive.
Inference!
To run a demo, download the checkpoint and run the following command to predict scaled disparity for a single image:
python test_simple.py --image_path assets/test_image.jpg --model_name tmp/You_Model/models/weights_19
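The command above predicts scaled disparity rather than metric depth. The sketch below shows how a network output in [0, 1] is mapped to scaled disparity and depth under the Monodepth2 convention. It assumes this repository keeps that convention and the Monodepth2 defaults of `min_depth=0.1` and `max_depth=100.0`. The function is written here for plain floats, and the resulting depth is only defined up to an unknown scale.

```python
def disp_to_depth(disp, min_depth=0.1, max_depth=100.0):
    """Map a sigmoid output in [0, 1] to (scaled disparity, depth).

    Follows the Monodepth2 convention; the default depth range is an
    assumption carried over from Monodepth2, not confirmed for this repo.
    """
    min_disp = 1.0 / max_depth  # disparity of the farthest point
    max_disp = 1.0 / min_depth  # disparity of the nearest point
    scaled_disp = min_disp + (max_disp - min_disp) * disp
    depth = 1.0 / scaled_disp
    return scaled_disp, depth


if __name__ == "__main__":
    for raw in (0.0, 0.5, 1.0):
        scaled, depth = disp_to_depth(raw)
        print("raw={:.1f} scaled_disp={:.4f} depth={:.4f}".format(raw, scaled, depth))
```

A raw output of 0 corresponds to the farthest representable depth and a raw output of 1 to the nearest.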
⏳ Training
Datasets
- Splits. The train/test/validation splits are uploaded to OneDrive.
Train on VoxCeleb
To train a model on a specific dataset, run:
CUDA_VISIBLE_DEVICES=0 python train.py --batch_size 32 --height 256 --width 256 --dataset vox --sample_num 100000 --model_name taking_head_10w --data_path vox2
Training on your own dataset
You can train on a custom monocular or stereo dataset by writing a new dataloader class that inherits from `MonoDataset`; see the `CELEBDataset` class in `datasets/celeb_dataset.py` for an example.
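The subclassing pattern can be sketched as follows. So that the example runs on its own, `MonoDataset` below is a reduced stand-in, not the repository's class. The method names `check_depth` and `get_image_path` follow Monodepth2 and are assumptions about this codebase. `MyFaceDataset` and its folder layout are hypothetical. In practice, import the real `MonoDataset` from the repository and check `datasets/celeb_dataset.py` for the methods it actually requires.

```python
import os


class MonoDataset:
    """Reduced stand-in for the repository's MonoDataset (illustration only)."""

    def __init__(self, data_path, filenames, img_ext=".jpg"):
        self.data_path = data_path
        self.filenames = filenames
        self.img_ext = img_ext

    def __len__(self):
        return len(self.filenames)

    def check_depth(self):
        raise NotImplementedError

    def get_image_path(self, folder, frame_index):
        raise NotImplementedError


class MyFaceDataset(MonoDataset):
    """Hypothetical dataset laid out as <data_path>/<folder>/<7-digit frame>.jpg."""

    def check_depth(self):
        # This sketch assumes no ground-truth depth maps are available.
        return False

    def get_image_path(self, folder, frame_index):
        name = "{:07d}{}".format(frame_index, self.img_ext)
        return os.path.join(self.data_path, folder, name)


if __name__ == "__main__":
    dataset = MyFaceDataset("vox2", ["id0001 12"])
    print(dataset.get_image_path("id0001", 12))
```

Only the path resolution and the depth check are overridden here; image loading and augmentation are left to the base class.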
:scroll: Acknowledgement
Our Face-Depth-Network implementation is borrowed from Monodepth2. We thank the authors of Monodepth2 for making their code publicly available.
:scroll: BibTeX
@inproceedings{hong2022depth,
title={Depth-Aware Generative Adversarial Network for Talking Head Video Generation},
author={Hong, Fa-Ting and Zhang, Longhao and Shen, Li and Xu, Dan},
booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2022}
}
:e-mail: Contact
If you have any questions, please email fhongac@cse.ust.hk.