VariTex: Variational Neural Face Textures

License: CC BY-NC-SA 4.0 | Python: 3.6

[Teaser image]

This is the official repository of the paper:

VariTex: Variational Neural Face Textures

Marcel C. Bühler, Abhimitra Meka, Gengyan Li, Thabo Beeler, and Otmar Hilliges.

Abstract: Deep generative models have recently demonstrated the ability to synthesize photorealistic images of human faces with novel identities. A key challenge to the wide applicability of such techniques is to provide independent control over semantically meaningful parameters: appearance, head pose, face shape, and facial expressions. In this paper, we propose VariTex - to the best of our knowledge the first method that learns a variational latent feature space of neural face textures, which allows sampling of novel identities. We combine this generative model with a parametric face model and gain explicit control over head pose and facial expressions. To generate images of complete human heads, we propose an additive decoder that generates plausible additional details such as hair. A novel training scheme enforces a pose independent latent space and in consequence, allows learning of a one-to-many mapping between latent codes and pose-conditioned exterior regions. The resulting method can generate geometrically consistent images of novel identities allowing fine-grained control over head pose, face shape, and facial expressions, facilitating a broad range of downstream tasks, like sampling novel identities, re-posing, expression transfer, and more.

Code and Models

Code, Environment

Data

We train on the FFHQ dataset and we use the Basel Face Model 2017 (BFM). Please download the following:

Environment variables should point to your data, facemodel, and (optional) output folder: export DP=<YOUR_DATA_FOLDER>; export FP=<YOUR_FACEMODEL_FOLDER>; export OP=<YOUR_OUTPUT_FOLDER>. We assume the following folder structure.
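
Before launching anything, it can help to verify that these variables are set as expected. The following is a minimal sketch, not part of the repository: the helper name `resolve_data_folders` is hypothetical, and it only assumes that DP and FP must point to existing folders while OP is optional.

```python
import os
from pathlib import Path


def resolve_data_folders(env=None):
    """Read DP, FP and (optionally) OP and return them as Path objects.

    Hypothetical helper for illustration; not part of the VariTex code base.
    """
    env = os.environ if env is None else env
    folders = {}
    for name in ("DP", "FP"):  # required: data folder and face model folder
        value = env.get(name)
        if not value:
            raise KeyError(f"Environment variable {name} is not set")
        path = Path(value)
        if not path.is_dir():
            raise NotADirectoryError(f"{name} points to a missing folder: {path}")
        folders[name] = path
    if env.get("OP"):  # optional: this helper creates the output folder if needed
        out = Path(env["OP"])
        out.mkdir(parents=True, exist_ok=True)
        folders["OP"] = out
    return folders
```

Calling `resolve_data_folders()` without arguments reads from the current process environment and raises early if DP or FP is missing.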

Using the Pretrained Model

Make sure you have downloaded the pretrained model (link above). Define the checkpoint file: export CP=<PATH_TO_CHECKPOINT>.ckpt
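
A quick sanity check on CP can catch a wrong path before the model is loaded. This is a sketch under the assumption that the checkpoint is a single file ending in `.ckpt`, as in the export above; `resolve_checkpoint` is a hypothetical helper, not a function from this repository.

```python
import os
from pathlib import Path


def resolve_checkpoint(env=None):
    """Return the checkpoint path stored in CP after basic validation.

    Hypothetical helper for illustration; not part of the VariTex code base.
    """
    env = os.environ if env is None else env
    value = env.get("CP")
    if not value:
        raise KeyError("Environment variable CP is not set")
    path = Path(value)
    if path.suffix != ".ckpt":
        raise ValueError(f"Expected a .ckpt file, got: {path}")
    if not path.is_file():
        raise FileNotFoundError(f"Checkpoint not found: {path}")
    return path
```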

Demo Notebook

Start Jupyter with CUDA_VISIBLE_DEVICES=0 jupyter notebook and open demo.ipynb.

Inference Script

The inference script runs three different modes on the FFHQ dataset:

  1. Inference on the extracted geometries and original pose (inference.inference_ffhq)
  2. Inference with extracted geometries and multiple poses (inference.inference_posed_ffhq)
  3. Inference with random geometries and poses (inference.inference_posed)

You can adjust the number of samples with the parameter n.

CUDA_VISIBLE_DEVICES=0 python varitex/inference.py --checkpoint $CP --dataset_split val
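
If you prefer to launch the inference script from Python, for example to loop over several checkpoints, the command can be wrapped as follows. This is only a sketch: `build_inference_command` and `run_inference` are hypothetical helpers, and they use only the two flags shown in the command (`--checkpoint` and `--dataset_split`). It assumes the working directory is the repository root.

```python
import os
import subprocess
import sys


def build_inference_command(checkpoint, dataset_split="val"):
    """Build the argument list for varitex/inference.py."""
    return [
        sys.executable,
        "varitex/inference.py",
        "--checkpoint", str(checkpoint),
        "--dataset_split", dataset_split,
    ]


def run_inference(checkpoint, gpu="0", dataset_split="val"):
    """Run the inference script on one GPU; raises if the script fails."""
    # Copy the current environment and restrict the visible GPUs.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
    command = build_inference_command(checkpoint, dataset_split)
    return subprocess.run(command, env=env, check=True)
```

For example, `run_inference(os.environ["CP"])` mirrors the shell command, with GPU 0 selected by default.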

Training

Run CUDA_VISIBLE_DEVICES=0 python varitex/train.py.

If you wish, you can set a variety of input parameters. Please see varitex.options.

A GPU with 24 GB of memory should support a batch size of 7. If your GPU has only 12 GB, please use a smaller batch size.
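
As a rough starting point for other GPUs, the reference figure above (24 GB for batch size 7) can be scaled linearly with the available memory. This is an assumption for illustration, not a measured rule: model weights and optimizer state take a fixed share of memory, so the real limit on smaller GPUs may be lower. The helper `suggest_batch_size` is hypothetical.

```python
def suggest_batch_size(gpu_memory_gb, reference_memory_gb=24, reference_batch_size=7):
    """Scale the reference batch size linearly with GPU memory (heuristic)."""
    if gpu_memory_gb <= 0:
        raise ValueError("gpu_memory_gb must be positive")
    scaled = reference_batch_size * gpu_memory_gb / reference_memory_gb
    # Round down, but never suggest less than one sample per batch.
    return max(1, int(scaled))


for memory in (12, 24):
    print(f"{memory} GB -> batch size {suggest_batch_size(memory)}")
```

Treat the result as an upper starting point and reduce it if training runs out of memory.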

Training should converge after 44 epochs, which takes roughly 72 hours on an NVIDIA Quadro RTX 6000/8000 GPU.

Implementation Details

The VariTex architecture consists of several components (in varitex/modules). We pass on a dictionary from one component to the next. The following table lists the classes / methods with their corresponding added tensors.

| Class / Method | Adds... |
| --- | --- |
| varitex.data.hdf_dataset.NPYDataset | IMAGE_IN, IMAGE_IN_ENCODE, SEGMENTATION_MASK, UV_RENDERED |
| varitex.modules.encoder.Encoder | IMAGE_ENCODED |
| varitex.modules.generator.Generator.forward_encoded2latent_distribution | STYLE_LATENT_MU, STYLE_LATENT_STD |
| varitex.modules.generator.Generator.forward_sample_style | STYLE_LATENT |
| varitex.modules.generator.Generator.forward_latent2featureimage | LATENT_INTERIOR, LATENT_EXTERIOR |
| varitex.modules.decoder.Decoder | TEXTURE_PERSON |
| varitex.modules.generator.Generator.sample_texture | FACE_FEATUREIMAGE |
| varitex.modules.decoder.AdditiveDecoder | ADDITIVE_FEATUREIMAGE |
| varitex.modules.generator.Generator.forward_merge_textures | FULL_FEATUREIMAGE |
| varitex.modules.feature2image.Feature2ImageRenderer | IMAGE_OUT, SEGMENTATION_PREDICTED |
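
The dictionary-passing pattern can be illustrated with a toy version of the first three stages. This is a sketch only: the functions below are stand-ins that reuse the key names from the table but operate on plain Python lists instead of tensors, and the sampling step assumes the standard VAE reparameterization (mean plus standard deviation times Gaussian noise), which may differ from the actual implementation in varitex/modules.

```python
import random


def encoder(batch):
    # Stand-in for varitex.modules.encoder.Encoder: adds IMAGE_ENCODED.
    batch["IMAGE_ENCODED"] = [0.5 * v for v in batch["IMAGE_IN_ENCODE"]]
    return batch


def forward_encoded2latent_distribution(batch):
    # Adds the parameters of a diagonal Gaussian over the style code.
    batch["STYLE_LATENT_MU"] = list(batch["IMAGE_ENCODED"])
    batch["STYLE_LATENT_STD"] = [1.0] * len(batch["IMAGE_ENCODED"])
    return batch


def forward_sample_style(batch, rng=random):
    # Assumed reparameterization: z = mu + std * eps with eps ~ N(0, 1).
    batch["STYLE_LATENT"] = [
        mu + std * rng.gauss(0.0, 1.0)
        for mu, std in zip(batch["STYLE_LATENT_MU"], batch["STYLE_LATENT_STD"])
    ]
    return batch


def run_pipeline(batch, steps):
    """Apply each step in order and record which keys it added."""
    trace = []
    for step in steps:
        before = set(batch)
        batch = step(batch)
        trace.append((step.__name__, sorted(set(batch) - before)))
    return batch, trace


steps = [encoder, forward_encoded2latent_distribution, forward_sample_style]
batch, trace = run_pipeline({"IMAGE_IN_ENCODE": [0.2, 0.4, 0.6]}, steps)
for name, added in trace:
    print(f"{name} adds {added}")
```

Because every stage only adds keys, later components can read any earlier result without changing function signatures, at the cost of an implicit contract about which keys exist at each point.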

Acknowledgements

We implement our pipeline in Lightning and use the SPADE discriminator. The neural rendering is inspired by Neural Voice Puppetry. We found the pytorch3d renderer very helpful.

License

Copyright belongs to the authors. All rights reserved. Licensed under CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International).