# Semantic Occupancy Field
This repository contains the code for training/generating the SOF (semantic occupancy field), as part of the TOG submission *SofGAN: A Portrait Image Generator with Dynamic Styling*.
## Installation
Clone the main SofGAN repo with `git clone --recursive https://github.com/apchenstu/softgan_test.git`. This repo will be automatically included in `softgan_test/modules`.
## Data preparation
Create a root directory (e.g. `data`), and for each instance (e.g. `00000`) create a folder with seg images and calibrated camera poses. The folder structure looks like:
```
└── data
    ├── 00000                # instance id
    │   ├── cam2world.npy    # camera extrinsics
    │   ├── cameras.npy
    │   ├── intrinsic.npy    # camera intrinsics
    │   ├── zRange.npy       # optional, only needed when using depth for training
    │   ├── 00000.png
    │   ├── ...
    │   └── 00029.png
    ├── 00001
    │   └── ...
    ├── ...
    └── xxxxx
        └── ...
```
Download the example data from here. We provide a notebook for data preprocessing.
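As a quick sanity check of the layout, here is a minimal sketch that loads one instance folder (file names follow the tree above; the expected shapes are assumptions to verify against your data):

```python
import numpy as np
from pathlib import Path

# Minimal sanity check for one instance folder. File names follow the
# tree above; the expected shapes are assumptions to verify by eye.
inst = Path("data/00000")

cam2world = np.load(inst / "cam2world.npy")  # camera extrinsics
intrinsic = np.load(inst / "intrinsic.npy")  # camera intrinsics
segmaps = sorted(inst.glob("*.png"))         # multi-view segmentation maps

print("views:", len(segmaps))
print("cam2world:", cam2world.shape)  # expect one 4x4 pose per view
print("intrinsic:", intrinsic.shape)
```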
Ideally, SOF can be trained on your own dataset of multi-view face segmentation maps. Similar to SRNs, we use an "OpenCV"-style camera coordinate system, where the Y-axis points downwards (the up-vector points in the negative Y-direction), the X-axis points right, and the Z-axis points into the image plane. Camera poses are assumed to be in "camera2world" format, i.e., they denote the matrix transform from camera coordinates to world coordinates. Please specify `--orthogonal` during training if you're using orthogonal projection for your own data. Please also note that you might need to change the `sample_instances_*` and `sample_observations_*` parameters according to the number of instances and views in your dataset.

Since the accuracy of the camera parameters can significantly affect training, you can specify `--opt_cam` during training to optimize the camera parameters automatically.
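To make the pose convention concrete, here is a minimal sketch of applying a `cam2world` matrix, assuming `cam2world.npy` stores 4x4 camera-to-world matrices (the exact array layout is an assumption; verify against your data):

```python
import numpy as np

cam2world = np.load("data/00000/cam2world.npy")
pose = cam2world.reshape(-1, 4, 4)[0]  # first view; (num_views, 4, 4) layout is assumed

# OpenCV-style camera frame: +X right, +Y down, +Z into the image plane.
point_cam = np.array([0.0, 0.0, 1.0, 1.0])  # homogeneous point 1 unit in front of the camera
point_world = pose @ point_cam              # camera coordinates -> world coordinates

cam_origin = pose[:3, 3]  # camera position in world coordinates
up_vector = -pose[:3, 1]  # up is the camera frame's negative Y-axis
print(point_world[:3], cam_origin, up_vector)
```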
## Training
### STEP 1: Training network parameters
Training is done in two phases. First, train the network parameters with multi-view segmaps:
```bash
python train.py --config_filepath=./configs/face_seg_real.yml
```
Training might take 1 to 3 days, depending on dataset size and quality.
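For reference, here is a hypothetical excerpt of such a config, wiring together the options mentioned in the data-preparation section (field names beyond the documented flags are assumptions; compare with the actual `./configs/face_seg_real.yml`):

```yaml
# Hypothetical excerpt; check ./configs/face_seg_real.yml for the real field names.
data_root: ./data
orthogonal: False              # set True if your data uses orthogonal projection
opt_cam: True                  # jointly optimize camera parameters during training
sample_instances_train: 100    # match the number of instances in your dataset
sample_observations_train: 30  # match the number of views per instance
```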
### STEP 2 (optional): Inverse rendering
We use inverse rendering to expand the trained geometric sampling space with single-view segmaps collected from CelebAMask-HQ. An example config file is provided in `./configs/face_seg_single_view.yml`; note that we set `--overwrite_embeddings` and `--freeze_networks` to `True`, and specify `--checkpoint_path` as the checkpoint trained in STEP 1. After training, you can access the corresponding latent code for each portrait by loading the checkpoint.
```bash
python train.py --config_filepath=./configs/face_seg_single_view.yml
```
A similar process can be used to back-project in-the-wild portrait images into a latent vector in the SOF geometric sampling space, which can then be used for multi-view portrait generation.
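As a minimal sketch of pulling those latent codes out of a checkpoint, assuming a standard PyTorch state dict (the checkpoint path and key names below are hypothetical; inspect your own checkpoint's keys first):

```python
import torch

# Hypothetical checkpoint path from STEP 2; adjust to your run.
ckpt = torch.load("./checkpoints/face_seg_single_view.pth", map_location="cpu")
state = ckpt.get("model", ckpt)  # some checkpoints nest the state dict

for key in state:  # discover where the per-portrait embeddings live
    print(key)

# "latent_codes.weight" is a hypothetical key: one row per portrait.
latents = state["latent_codes.weight"]
print(latents.shape)  # (num_instances, latent_dim)
```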
## Pretrained Checkpoints
Please download the pre-trained checkpoint from either GoogleDrive or BaiduDisk (password: k0b8) and save it to `./checkpoints`.
## Inference
Please follow `renderer.ipynb` in the SofGAN repo for free-view portrait generation.
Once trained, SOF can be used to generate free-view segmentation maps for arbitrary instances in the geometric space. The inference code is provided as notebooks in `scripts`:
- Most testing code is included in `scripts/TestAll.ipynb`, e.g. generating multi-view images, modifying attributes, visualizing depth layers, and building a depth prior with marching cubes.
- To generate free-view portrait segmentations sampled from the geometry space, please refer to `scripts/Test_MV_Inference.ipynb`.
- To visualize a trained SOF volume as in Fig. 5, please use `scripts/Test_Slicing.ipynb`.
- To calculate mIoU during SOF training (Fig. 9), please modify the model checkpoint directory and run `scripts/Test_mIoU.ipynb`.
- We also provide `scripts/Test_GMM.ipynb` for miscellaneous tasks like fitting a GMM to the geometric space; a minimal sketch of this idea follows below.
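As an illustration of the GMM idea, here is a minimal sketch using scikit-learn, assuming the instance latent codes have been exported to an `(N, D)` array (the file name and component count are placeholders):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

latents = np.load("latents.npy")  # hypothetical export of the instance latent codes

# Fit a GMM over the geometric sampling space spanned by the latents.
gmm = GaussianMixture(n_components=8, random_state=0)
gmm.fit(latents)

# Draw new latent samples from the fitted space.
new_latents, _ = gmm.sample(4)
print(new_latents.shape)  # (4, D)
```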
## Acknowledgment
Thanks to vsitzmann for sharing the awesome idea of SRNs, which greatly inspired our design of SOF.