E2Style: Improve the Efficiency and Effectiveness of StyleGAN Inversion

This repository hosts the official PyTorch implementation of the paper "E2Style: Improve the Efficiency and Effectiveness of StyleGAN Inversion" (accepted by TIP 2022), which was initially titled "A Simple Baseline for StyleGAN Inversion".

<img src='imgs/teaser.png'>

Tianyi Wei<sup>1</sup>, Dongdong Chen<sup>2</sup>, Wenbo Zhou<sup>1</sup>, Jing Liao<sup>3</sup>, Weiming Zhang<sup>1</sup>, Lu Yuan<sup>2</sup>, Gang Hua<sup>4</sup>, Nenghai Yu<sup>1</sup> <br> <sup>1</sup>University of Science and Technology of China, <sup>2</sup>Microsoft Cloud AI <br> <sup>3</sup>City University of Hong Kong, <sup>4</sup>Wormpex AI Research

News

2022.03.26: The paper has been accepted by IEEE Transactions on Image Processing [TIP]! 🎉
2022.02.19: Initial code release

Getting Started

Prerequisites

$ conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=11.0
$ pip install matplotlib scipy opencv-python pillow scikit-image tqdm tensorflow-io

If you want to run the secure deep hiding script, you also need to install the MATLAB Engine API for Python.
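
As a quick sanity check that the engine is available from Python (a minimal sketch, not part of the E2Style scripts; the sqrt call is just an arbitrary probe):

# verify the MATLAB Engine API for Python is installed and can start a session
import matlab.engine

eng = matlab.engine.start_matlab()  # launches a headless MATLAB session
print(eng.sqrt(4.0))                # prints 2.0 if the engine is wired up correctly
eng.quit()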

Pretrained Models

Please download the pre-trained models from the following links. Each E2Style model contains the entire E2Style architecture, including the encoder and decoder weights.

| Path | Description |
| --- | --- |
| StyleGAN Inversion | E2Style trained with the FFHQ dataset for StyleGAN inversion. |
| Colorization | E2Style trained with the FFHQ dataset for colorization. |
| Denoise | E2Style trained with the FFHQ dataset for denoising. |
| Inpainting | E2Style trained with the FFHQ dataset for inpainting. |
| Super Resolution | E2Style trained with the CelebA-HQ dataset for super resolution (up to x32 down-sampling). |
| Sketch to Image | E2Style trained with the CelebA-HQ dataset for image synthesis from sketches. |
| Segmentation to Image | E2Style trained with the CelebAMask-HQ dataset for image synthesis from segmentation maps. |

If you wish to use one of the pretrained models for training or inference, you may do so using the flag --checkpoint_path. In addition, we provide various auxiliary models needed for training your own E2Style model from scratch.

| Path | Description |
| --- | --- |
| FFHQ StyleGAN | StyleGAN model pretrained on FFHQ, taken from rosinality, with 1024x1024 output resolution. |
| IR-SE50 Model | Pretrained IR-SE50 model taken from TreB1eN, used in our multi-layer ID loss during E2Style training. |

By default, we assume that all auxiliary models are downloaded and saved to the directory pretrained_models. However, you may use your own paths by changing the necessary values in configs/path_configs.py.
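
For illustration, such path overrides typically look like a plain Python dictionary. The key and file names below are assumptions; check configs/path_configs.py for the exact entries used by the training code:

# configs/path_configs.py (illustrative sketch; key and file names are assumptions)
model_paths = {
    'stylegan_ffhq': 'pretrained_models/stylegan2-ffhq-config-f.pt',  # rosinality FFHQ StyleGAN
    'ir_se50': 'pretrained_models/model_ir_se50.pth',                 # TreB1eN IR-SE50 for the ID loss
}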

Training

Preparing your Data

Training E2Style

The main training script can be found in scripts/train.py.
Intermediate training results are saved to opts.exp_dir. This includes checkpoints, train outputs, and test outputs.
Additionally, if you have tensorboard installed, you can visualize tensorboard logs in opts.exp_dir/logs.

Training the E2Style Encoder

The inversion encoder is trained progressively in three stages; each later stage resumes from the previous stage's checkpoint via --checkpoint_path.

python scripts/train.py \
--dataset_type=ffhq_encode \
--exp_dir=/path/to/experiment \
--workers=4 \
--batch_size=4 \
--test_batch_size=4 \
--test_workers=4 \
--val_interval=5000 \
--save_interval=5000 \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=0.5 \
--parse_lambda=1 \
--training_stage=1

python scripts/train.py \
--dataset_type=ffhq_encode \
--exp_dir=/path/to/experiment \
--checkpoint_path=/path/to/1-stage-inversion.pt \
--workers=4 \
--batch_size=4 \
--test_batch_size=4 \
--test_workers=4 \
--val_interval=5000 \
--save_interval=5000 \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=0.5 \
--parse_lambda=1 \
--training_stage=2

python scripts/train.py \
--dataset_type=ffhq_encode \
--exp_dir=/path/to/experiment \
--checkpoint_path=/path/to/2-stage-inversion.pt \
--workers=4 \
--batch_size=4 \
--test_batch_size=4 \
--test_workers=4 \
--val_interval=5000 \
--save_interval=5000 \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=0.5 \
--parse_lambda=1 \
--training_stage=3

Colorization

python scripts/train.py \
--dataset_type=ffhq_colorization \
--exp_dir=/path/to/experiment \
--workers=4 \
--batch_size=4 \
--test_batch_size=4 \
--test_workers=4 \
--val_interval=5000 \
--save_interval=5000 \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=0.5 \
--parse_lambda=1

Denoise

python scripts/train.py \
--dataset_type=ffhq_denoise \
--exp_dir=/path/to/experiment \
--workers=4 \
--batch_size=4 \
--test_batch_size=4 \
--test_workers=4 \
--val_interval=5000 \
--save_interval=5000 \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=0.5 \
--parse_lambda=1

Inpainting

python scripts/train.py \
--dataset_type=ffhq_inpainting \
--exp_dir=/path/to/experiment \
--workers=4 \
--batch_size=4 \
--test_batch_size=4 \
--test_workers=4 \
--val_interval=5000 \
--save_interval=5000 \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=0.5 \
--parse_lambda=1

Sketch to Face

python scripts/train.py \
--dataset_type=celebs_sketch_to_face \
--exp_dir=/path/to/experiment \
--workers=4 \
--batch_size=4 \
--test_batch_size=4 \
--test_workers=4 \
--val_interval=5000 \
--save_interval=5000 \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=0 \
--parse_lambda=1 \
--w_norm_lambda=0.005 \
--label_nc=1 \
--input_nc=1

Segmentation Map to Face

python scripts/train.py \
--dataset_type=celebs_seg_to_face \
--exp_dir=/path/to/experiment \
--workers=4 \
--batch_size=4 \
--test_batch_size=4 \
--test_workers=4 \
--val_interval=5000 \
--save_interval=5000 \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=0 \
--parse_lambda=1 \
--w_norm_lambda=0.005 \
--label_nc=19 \
--input_nc=19

Super Resolution

python scripts/train.py \
--dataset_type=celebs_super_resolution \
--exp_dir=/path/to/experiment \
--workers=4 \
--batch_size=4 \
--test_batch_size=4 \
--test_workers=4 \
--val_interval=5000 \
--save_interval=5000 \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=0.5 \
--parse_lambda=1 \
--w_norm_lambda=0.005 \
--resize_factors=1,2,4,8,16,32

Additional Notes

Testing

Inference

Having trained your model, you can use scripts/inference.py to apply the model on a set of images.
For example,

python scripts/inference.py \
--exp_dir=/path/to/experiment \
--checkpoint_path=experiment/checkpoints/best_model.pt \
--data_path=/path/to/test_data \
--test_batch_size=1 \
--test_workers=4 \
--stage=1 \
--save_inverted_codes \
--couple_outputs \
--resize_outputs

Additional notes to consider:

Secure Deep Hiding

python scripts/secure_deep_hiding.py \
--exp_dir=/path/to/experiment \
--checkpoint_path=pretrained_models/inversion.pt \
--secret_dir=/path/to/secret_dir \
--cover_dir=/path/to/cover_dir

Semantic Editing

python scripts/manipulate.py \
--exp_dir=/path/to/experiment \
--checkpoint_path=pretrained_models/inversion.pt \
--deriction_name=age \
--edited_dir=/path/to/edited_dir
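
Under the hood, this kind of semantic editing is usually a linear move in W+ space along a pre-computed attribute direction (e.g. an InterFaceGAN-style age vector). The sketch below illustrates only that operation; the file names, tensor shapes, and scaling factor are assumptions, so consult manipulate.py for the actual I/O format:

# illustrative sketch of a linear latent-space edit (file names and shapes are assumptions)
import torch

w = torch.load('inverted_code.pt')          # inverted latent code, e.g. shape [1, 18, 512] in W+
direction = torch.load('age_direction.pt')  # attribute direction, broadcastable to w
alpha = 3.0                                 # edit strength; the sign flips the direction of change

w_edited = w + alpha * direction            # move the code along the semantic direction
torch.save(w_edited, 'edited_code.pt')      # decode with the StyleGAN generator to get the edited face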

Style Mixing

python scripts/stylemixing.py \
--exp_dir=/path/to/experiment \
--checkpoint_path=pretrained_models/inversion.pt \
--style_dir=/path/to/style_dir \
--content_dir=/path/to/content_dir
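
Style mixing in W+ space is typically done by keeping the coarse (early) layers of the content code and swapping in the fine (late) layers of the style code. A minimal sketch, with the crossover index and file names as assumptions:

# illustrative W+ style mixing (crossover index and file names are assumptions)
import torch

w_content = torch.load('content_code.pt')  # e.g. shape [1, 18, 512]
w_style = torch.load('style_code.pt')      # e.g. shape [1, 18, 512]

crossover = 9                               # layers [0, crossover) keep content, the rest take style
w_mixed = w_content.clone()
w_mixed[:, crossover:, :] = w_style[:, crossover:, :]
torch.save(w_mixed, 'mixed_code.pt')        # decode with the generator to get the mixed result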

Interpolation

python scripts/interpolate.py \
--exp_dir=/path/to/experiment \
--checkpoint_path=pretrained_models/inversion.pt \
--source_dir=/path/to/source_dir \
--target_dir=/path/to/target_dir
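
Interpolation between two inversions commonly amounts to a linear blend of their latent codes, decoded frame by frame. A minimal linear sketch, with file names and the number of steps as assumptions:

# illustrative linear interpolation between two inverted codes (names are assumptions)
import torch

w_source = torch.load('source_code.pt')
w_target = torch.load('target_code.pt')

steps = 5
for i in range(steps + 1):
    t = i / steps                                   # 0.0 gives the source, 1.0 gives the target
    w_interp = (1.0 - t) * w_source + t * w_target  # blend the codes; decode each with the generator
    torch.save(w_interp, f'interp_{i:02d}.pt')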

Additional notes to consider:

Acknowledgements

This code is heavily based on pSp and idinvert.

Citation

If you find our work useful for your research, please consider citing the following paper :)

@article{wei2022e2style,
  title={E2Style: Improve the Efficiency and Effectiveness of StyleGAN Inversion},
  author={Wei, Tianyi and Chen, Dongdong and Zhou, Wenbo and Liao, Jing and Zhang, Weiming and Yuan, Lu and Hua, Gang and Yu, Nenghai},
  journal={IEEE Transactions on Image Processing},
  year={2022}
}