Semantic Occupancy Field

This repository contains the code for training/generating SOF (semantic occupancy field) as part of the TOG submission: SofGAN: A Portrait Image Generator with Dynamic Styling.

Installation

Clone the main SofGAN repo with git clone --recursive https://github.com/apchenstu/softgan_test.git. This repo will be included automatically in softgan_test/modules.

Data preparation

Create a root directory (e.g. data), and for each instance (e.g. 00000) create a folder with seg images and calibrated camera poses. The folder structure looks like:

└── data                        # root directory
    └── 00000                   # instance id
    │   ├── cam2world.npy       # camera extrinsics
    │   ├── cameras.npy
    │   ├── intrinsic.npy       # camera intrinsics
    │   ├── zRange.npy          # optional, used only when training with depth
    │   ├── 00000.png           # multi-view seg maps
    │   ...
    │   └── 00029.png
    ├── 00001
    │   └── ...
    ...
    └── xxxxx
        └── ...

Download the example data from here. We provide a notebook for data preprocessing.
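Before training, it is worth sanity-checking the layout. Below is a minimal sketch, assuming the file names from the tree above and that cam2world.npy stores one 4x4 pose per view (that shape is an assumption, not confirmed by the repo):

    import os
    import numpy as np

    REQUIRED = ["cam2world.npy", "intrinsic.npy"]  # zRange.npy is only needed for depth training

    def check_instance(folder):
        """Verify that one instance folder has calibrated poses and seg images."""
        for name in REQUIRED:
            assert os.path.isfile(os.path.join(folder, name)), f"missing {name} in {folder}"
        poses = np.load(os.path.join(folder, "cam2world.npy"))
        views = [f for f in os.listdir(folder) if f.endswith(".png")]
        # assumption: one 4x4 camera-to-world pose per view
        assert len(poses) == len(views), f"pose/view count mismatch in {folder}"
        return len(views)

    root = "data"  # root directory from the tree above
    for inst in sorted(os.listdir(root)):
        print(inst, check_instance(os.path.join(root, inst)), "views")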

Ideally, SOF can be trained on your own dataset of multi-view face segmentation maps. Similar to SRNs, we use an "OpenCV"-style camera coordinate system, where the Y-axis points downwards (the up-vector points in the negative Y-direction), the X-axis points right, and the Z-axis points into the image plane. Camera poses are assumed to be in "camera2world" format, i.e., they denote the matrix that transforms camera coordinates to world coordinates. Please specify --orthogonal during training if your own data uses orthogonal projection. Please also note that you might need to change the sample_instances_* and sample_observations_* parameters according to the number of instances and views in your own dataset.
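As a concrete illustration of this convention, here is a minimal sketch of loading a pose and mapping a camera-space point to world space (the (num_views, 4, 4) shape of cam2world.npy is an assumption):

    import numpy as np

    # OpenCV-style camera frame: X right, Y down, Z into the image plane.
    # cam2world maps camera-space coordinates to world-space coordinates.
    poses = np.load("data/00000/cam2world.npy")   # assumed shape: (num_views, 4, 4)
    cam2world = poses[0]

    point_cam = np.array([0.0, 0.0, 1.0, 1.0])    # 1 unit in front of the camera (homogeneous)
    point_world = cam2world @ point_cam

    cam_center = cam2world[:3, 3]                 # camera position in world coordinates
    up_vector = -cam2world[:3, 1]                 # the up-vector is the negative Y-direction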

As the accuracy of the camera parameters can significantly affect training, you can specify --opt_cam during training to optimize the camera parameters automatically.
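For example, combining the flag with the training config used in the next section:

    python train.py --config_filepath=./configs/face_seg_real.yml --opt_cam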

Training

STEP 1: Training network parameters

Training proceeds in two phases. First, train the network parameters with multi-view segmaps:

python train.py --config_filepath=./configs/face_seg_real.yml 

Training might take 1 to 3 days, depending on the dataset size and quality.

STEP 2 (optional): Inverse rendering

We use inverse rendering to expand the trained geometric sampling space with single-view segmaps collected from CelebAMaskHQ. An example config file is provided in ./configs/face_seg_single_view.yml; note that we set --overwrite_embeddings and --freeze_networks to True, and specify --checkpoint_path as the checkpoint trained in STEP 1. After training, you can access the corresponding latent code for each portrait by loading the checkpoint.

python train.py --config_filepath=./configs/face_seg_single_view.yml 
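After this run finishes, the per-portrait latent codes live inside the saved checkpoint. Here is a minimal sketch of reading them back, assuming an SRN-style layout (the file name and key names below are assumptions, not confirmed by this repo):

    import torch

    ckpt = torch.load("./checkpoints/model.pth", map_location="cpu")  # hypothetical file name

    # SRN-style models keep one latent code per training instance in an
    # embedding layer; "latent_codes.weight" is an assumed key name
    state = ckpt["model"] if isinstance(ckpt, dict) and "model" in ckpt else ckpt
    latent_codes = state["latent_codes.weight"]   # assumed shape: (num_instances, latent_dim)
    print(latent_codes.shape)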

A similar process can be used to back-project in-the-wild portrait images into latent vectors in the SOF geometric sampling space, which can then be used for multi-view portrait generation.

Pretrained Checkpoints

Please download the pre-trained checkpoint from either GoogleDrive or BaiduDisk (password: k0b8) and save it to ./checkpoints.

Inference

Please follow renderer.ipynb in the SofGAN repo for free-view portrait generation.

Once trained, SOF can generate free-view segmentation maps for arbitrary instances in its geometric space. The inference code is provided as notebooks in the scripts folder.
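Free-view generation mainly needs a set of cam2world poses around the subject. Below is a minimal sketch of building a circular orbit under the coordinate convention from the data section; the look-at construction, radius, and view count are illustrative, not taken from the notebooks:

    import numpy as np

    def look_at_cam2world(cam_pos, target, world_up=np.array([0.0, 1.0, 0.0])):
        """Build an OpenCV-style cam2world pose: X right, Y down, Z toward the target."""
        z = target - cam_pos
        z = z / np.linalg.norm(z)      # camera Z: into the image plane
        x = np.cross(z, world_up)      # camera X: points right
        x = x / np.linalg.norm(x)
        y = np.cross(z, x)             # camera Y: points down
        pose = np.eye(4)
        pose[:3, :3] = np.stack([x, y, z], axis=1)  # camera axes as columns
        pose[:3, 3] = cam_pos                       # camera center in world coordinates
        return pose

    # 30 views on a circle of radius 2 around a head at the origin (illustrative values)
    target = np.zeros(3)
    angles = np.linspace(0.0, 2.0 * np.pi, 30, endpoint=False)
    poses = np.stack([look_at_cam2world(2.0 * np.array([np.sin(t), 0.0, np.cos(t)]), target)
                      for t in angles])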

Acknowledgment

Thanks to vsitzmann for sharing the awesome idea of SRNs, which greatly inspired our design of SOF.