HandsOff: Labeled Dataset Generation With No Additional Human Annotations (CVPR 2023 Highlight)

Recent work leverages the expressive power of generative adversarial networks (GANs) to generate labeled synthetic datasets. These dataset generation methods often require new annotations of synthetic images, which forces practitioners to seek out annotators, curate a set of synthetic images, and ensure the quality of generated labels. We introduce the HandsOff framework, a technique capable of producing an unlimited number of synthetic images and corresponding labels after being trained on fewer than 50 pre-existing labeled images. Our framework avoids the practical drawbacks of prior work by unifying the field of GAN inversion with dataset generation. We generate datasets with rich pixel-wise labels in multiple challenging domains such as faces, cars, full-body human poses, and urban driving scenes. Our method achieves state-of-the-art performance in semantic segmentation, keypoint detection, and depth estimation compared to prior dataset generation approaches and transfer learning baselines. We additionally showcase its ability to address broad challenges in model development that stem from fixed, hand-annotated datasets, such as the long-tail problem in semantic segmentation.

<a href="https://arxiv.org/abs/2212.12645"><img src="https://img.shields.io/badge/arXiv-2212.12645-b31b1b.svg" height=22.5></a>

The code is based on EditGAN.

Updates

Requirements

conda env create --name handsoff_env --file requirements.yml
conda activate handsoff_env

Datasets

Data splits

Pretrained models

Pretrained GAN checkpoints

We use the following pretrained GAN checkpoints:

Pretrained ReStyle checkpoints

We use the following pretrained ReStyle checkpoints:

Pretrained Label Generators

Coming soon!

GAN inversion latent codes

Training

:warning: Training HandsOff is RAM-intensive because all hypercolumn representations are kept in memory.

:warning: Training HandsOff is also GPU-memory-intensive. All experiments were run on Nvidia Tesla V100 GPUs with 32GB of memory.
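
To give a feel for why hypercolumns are costly to keep in RAM, here is a minimal sketch (not the repository's code) that builds a hypercolumn by upsampling feature maps from several layers to a common resolution and concatenating them along the channel axis. The function names, the nearest-neighbour upsampling, and all sizes are assumptions chosen for illustration.

```python
import numpy as np


def upsample_nearest(feat, size):
    """Nearest-neighbour upsample a (C, h, w) feature map to (C, size, size)."""
    _, h, w = feat.shape
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return feat[:, rows[:, None], cols[None, :]]


def build_hypercolumn(feature_maps, size):
    """Concatenate upsampled feature maps along the channel axis."""
    return np.concatenate([upsample_nearest(f, size) for f in feature_maps], axis=0)


def hypercolumn_bytes(num_images, channels, size, bytes_per_value=4):
    """Memory needed to hold one (channels, size, size) hypercolumn per image."""
    return num_images * channels * size * size * bytes_per_value


if __name__ == "__main__":
    # Toy feature maps standing in for generator activations (hypothetical shapes).
    maps = [
        np.zeros((8, 4, 4), dtype=np.float32),
        np.zeros((4, 8, 8), dtype=np.float32),
        np.zeros((2, 16, 16), dtype=np.float32),
    ]
    hc = build_hypercolumn(maps, size=16)
    print(hc.shape, hc.nbytes)

    # Hypothetical full-scale numbers, only to show how quickly memory grows.
    gib = hypercolumn_bytes(num_images=50, channels=5000, size=512) / 2**30
    print(f"{gib:.1f} GiB")
```

Every pixel carries one value per channel from every layer, so memory grows with the product of image count, total channel count, and resolution squared.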

Example experiment configuration files for face and car segmentation are available in /experiments/. More examples to come soon!

Run GAN inversion

cd restyle-encoder

# --exp_dir: output directory of ReStyle
# --checkpoint_path: pretrained ReStyle checkpoint
# --data_path: images to invert
python scripts/inference_iterative.py \
--exp_dir=/path/to/experiment \
--checkpoint_path=experiment/checkpoints/best_model.pt \
--data_path=/path/to/test_data \
--test_batch_size=4 \
--test_workers=4 \
--n_iters_per_batch=5

cd ..
# --latents_dir: `exp_dir` from inference_iterative.py (should contain `latents.npy`)
# --latents_save_dir: directory in which to save the formatted latents
# --latents_save_name: name of the saved file (e.g., `latents_formatted.npy`)
python format_latents.py \
--latents_dir=/exp_dir/from/restyle \
--latents_save_dir=/path/to/save/folder \
--latents_save_name=name_of_saved_latents.npy
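
As a rough illustration of what formatting the latents can involve, the sketch below stacks the final-iteration latent of each image into a single array. It assumes that `latents.npy` holds a pickled dict mapping image filenames to a list of per-iteration latent codes; that layout, the function name `stack_final_latents`, and the output shape are assumptions, and `format_latents.py` may do something different.

```python
import numpy as np


def stack_final_latents(latents_by_image):
    """Stack the last-iteration latent of every image, ordered by filename.

    latents_by_image: dict of {filename: [latent_iter_0, ..., latent_iter_k]}
    (an assumed layout, not confirmed by the repository).
    """
    names = sorted(latents_by_image)
    stacked = np.stack([np.asarray(latents_by_image[name][-1]) for name in names])
    return names, stacked


if __name__ == "__main__":
    # Synthetic stand-in for the ReStyle output: 2 images, 5 iterations each.
    fake = {
        "img_b.jpg": [np.zeros((18, 512)) for _ in range(5)],
        "img_a.jpg": [np.zeros((18, 512)) for _ in range(5)],
    }
    names, stacked = stack_final_latents(fake)
    print(names, stacked.shape)
    # A real pickled dict would be loaded with:
    #   np.load("latents.npy", allow_pickle=True).item()
```

Sorting by filename keeps the row order of the array aligned with a sorted listing of the image directory.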
# --exp: exp.json for HandsOff (e.g., /experiments/face_seg.json)
# --latents_path: formatted latents written by format_latents.py
# --latents_save_dir: directory in which to save the refined latents
# --latents_save_name: name of the saved file
# --images_dir: images that were inverted
python optimize_latents.py \
--exp /path/to/handsoff/experiment/exp.json \
--latents_path /path/to/initial/latents.npy \
--latents_save_dir /path/to/save/folder \
--latents_save_name name_of_saved_latents.npy \
--images_dir /path/to/images/to/refine
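
The refinement step starts from the encoder's latents and adjusts them so the generated image better matches the original. The toy sketch below shows that idea with plain gradient descent on a squared reconstruction error. A small linear map stands in for the GAN generator, and the loss, learning rate, and step count are all assumptions; the loss and optimizer actually used by `optimize_latents.py` are not shown here.

```python
import numpy as np


def reconstruction_loss(generator_matrix, latent, target):
    """Squared error between the 'generated' vector and the target."""
    residual = generator_matrix @ latent - target
    return float(residual @ residual)


def refine_latent(generator_matrix, target, latent_init, lr=0.1, steps=200):
    """Gradient descent on ||G w - x||^2, starting from an initial latent."""
    latent = latent_init.astype(np.float64).copy()
    for _ in range(steps):
        residual = generator_matrix @ latent - target
        grad = 2.0 * generator_matrix.T @ residual  # gradient of the squared error
        latent -= lr * grad
    return latent


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy linear "generator" with orthonormal columns (hypothetical stand-in).
    G = np.linalg.qr(rng.standard_normal((16, 4)))[0]
    w_true = rng.standard_normal(4)
    x = G @ w_true
    w_init = w_true + 0.5 * rng.standard_normal(4)  # imperfect encoder estimate
    w_ref = refine_latent(G, x, w_init)
    print(reconstruction_loss(G, w_init, x), reconstruction_loss(G, w_ref, x))
```

Starting from the encoder's estimate instead of a random latent means the optimizer only has to close a small gap.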

Train the label generator

python train_label_generator.py --exp experiments/exp.json 

Generating Synthetic Datasets and Evaluation

Generate data

# --exp: same config file as train_label_generator.py
# --start_step: int, random state at which to start dataset generation
# --resume: directory with label generator checkpoints
# --num_sample: number of image-label pairs to generate
# --save_vis: whether to save colored images of generated labels
python generate_data.py \
--exp experiments/exp.json \
--start_step start_step \
--resume path/to/dir/with/trained/label/generators \
--num_sample 10000 \
--save_vis False
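
One way to read `--start_step` as a random state is that each generated sample gets its own seed, counted up from the starting value. The sketch below illustrates that reading. The per-sample seeding scheme, the function name `sample_latents`, and the latent dimension of 512 are assumptions, not confirmed details of `generate_data.py`.

```python
import numpy as np


def sample_latents(start_step, num_sample, latent_dim=512):
    """Yield (step, latent) pairs, seeding each sample with its step index."""
    for step in range(start_step, start_step + num_sample):
        rng = np.random.RandomState(step)  # same step always gives the same latent
        yield step, rng.randn(1, latent_dim)


if __name__ == "__main__":
    first_run = dict(sample_latents(start_step=0, num_sample=4))
    resumed = dict(sample_latents(start_step=2, num_sample=2))
    # Steps 2 and 3 are reproduced exactly when generation restarts there.
    print(np.array_equal(first_run[2], resumed[2]))
```

Under this scheme, a second run started at the step where the first one ended extends the dataset without repeating samples.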

Train DeepLabV3 on generated data

# --exp: same config file as train_label_generator.py
# --data_path: generate_data.py saves the dataset to <--resume>/samples (if save_vis = False)
python train_deeplab.py \
--exp experiments/exp.json \
--data_path path/to/dir/with/trained/label/generators/samples

Evaluate DeepLabV3 on real test set

# --exp: same config file as train_label_generator.py
# --resume: directory with trained DeepLabV3 checkpoints
# --validation_number: number of images used for validation (the first val_number images are taken)
python test_deeplab.py \
--exp experiments/exp.json \
--resume path/to/dir/with/trained/deeplab/checkpoints \
--validation_number val_number
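
The `--validation_number` behaviour described above can be sketched as a simple split: the first `val_number` images form the validation set. The sorted ordering, the use of the remaining images for testing, and the function name `split_validation` are assumptions made for illustration.

```python
def split_validation(image_paths, validation_number):
    """Return (validation, remaining) using the first N paths in sorted order."""
    if not 0 <= validation_number <= len(image_paths):
        raise ValueError("validation_number must be between 0 and the number of images")
    ordered = sorted(image_paths)  # assumed ordering; the script may order differently
    return ordered[:validation_number], ordered[validation_number:]


if __name__ == "__main__":
    paths = ["img_3.png", "img_1.png", "img_2.png", "img_4.png"]
    val, rest = split_validation(paths, 1)
    print(val, rest)
```

Because the split depends on ordering, the same `val_number` should be paired with the same file listing across runs to keep results comparable.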