E4S: Fine-grained Face Swapping via Regional GAN Inversion, CVPR 2023
<a href='https://arxiv.org/abs/2211.14068'><img src='https://img.shields.io/badge/ArXiv-2211.14068-red'></a> <a href='https://e4s2022.github.io'><img src='https://img.shields.io/badge/Project-Page-Green'></a>
<a href="#">Zhian Liu<sup>1*</sup></a> <a href="#">Maomao Li<sup>2*</sup></a> <a href="https://yzhang2016.github.io">Yong Zhang<sup>2*</sup></a> <a href="#">Cairong Wang<sup>3</sup></a> <a href="https://qzhang-cv.github.io/">Qi Zhang<sup>2</sup></a> <a href="https://juewang725.github.io/">Jue Wang<sup>2</sup></a> <a href="https://nieyongwei.net/">Yongwei Nie<sup>1✉️</sup></a>
<sup>1</sup>South China University of Technology <sup>2</sup>Tencent AI Lab <sup>3</sup>Tsinghua Shenzhen International Graduate School<br> *: equal contributions, ✉️: corresponding author
<b>TL;DR: A face swapping method from a fine-grained face editing perspective, realized by extracting and swapping the texture and shape of each facial region.</b>
🧑💻 Changelog
- [2023.05.11]: Add pre-trained SegNeXt-FaceParser.
- [2023.05.10]: Add multi-band blending (#5).
- [2023.04.19]: Add training and optimization scripts.
- [2023.04.16]: Add face editing inference demo.
- [2023.04.11]: Add face swapping inference demo (continually updated).
- [2023.03.29]: E4S repository initialized.
- [2023.02.28]: E4S has been accepted by CVPR 2023!
Usage
1. Installation
Please see the installation doc for guidance.
2. Inference Demo
2.1 face swapping
Face swapping with the default settings:
python scripts/face_swap.py --source=example/input/faceswap/source.jpg --target=example/input/faceswap/target.jpg
The results will be saved to the example/output/faceswap folder. Left to right: source, target, swapped face.
<img src="./example/input/faceswap/source.jpg" width="256" height="256"><img src="./example/input/faceswap/target.jpg" width="256" height="256"><img src="./example/output/faceswap/swap_res.png" width="256" height="256">
You can optionally provide the face parsing result of the target image via the --target_mask arg, and turn on --verbose=True for detailed visualization. The results will be saved in the --output_dir folder (defaults to example/output/faceswap).
python scripts/face_swap.py \
--source=./example/input/faceswap/source.jpg \
--target=./example/input/faceswap/target.jpg \
--target_mask=./example/input/faceswap/target_mask.png \
--verbose=True
It's recommended to turn on --lap_bld for better results around the face boundary. Feel free to use a different pre-trained face parser via the --faceParser_name option; [default | segnext] is currently supported. Don't forget to fetch the corresponding ckpts before use. For more information and the supported args, run python scripts/face_swap.py -h for help.
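For example, a run that enables both options might look like this (a sketch: the flags themselves are documented above, but whether --lap_bld takes an explicit value depends on the argparse setup):
# assumption: boolean flags accept "=True", matching --verbose=True above
python scripts/face_swap.py \
    --source=./example/input/faceswap/source.jpg \
    --target=./example/input/faceswap/target.jpg \
    --lap_bld=True \
    --faceParser_name=segnext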
2.2 face editing
For texture-related editing or interpolation, run
python scripts/face_edit.py \
--source=./example/input/faceedit/source.jpg \
--reference=./example/input/faceedit/reference.jpg \
--region hair eyes \
--alpha=1
The results will be saved to the example/output/faceedit folder.
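The --alpha arg presumably controls the interpolation strength toward the reference texture; for example, a half-way blend of the hair region only (a sketch, with 0.5 as an arbitrary illustrative value):
# assumption: alpha in [0, 1] interpolates from source toward reference texture
python scripts/face_edit.py \
    --source=./example/input/faceedit/source.jpg \
    --reference=./example/input/faceedit/reference.jpg \
    --region hair \
    --alpha=0.5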
For shape-related editing, we provide an interactive editing demo built upon Gradio; just run
python demo/gradio_demo.py
TODO:
- Share the gradio demo on Huggingface.
- Provide the optimization script for better results.
3. Train
If you plan to train the model from scratch, a bit more setup is needed. A machine with multiple GPUs is recommended for training.
3.1 dataset
Please download the CelebAMask-HQ and FFHQ datasets accordingly. For the FFHQ dataset, we only use images1024x1024 (~90 GB of disk space). We assume the datasets are linked to the ./data folder.
- CelebAMask-HQ
Make a soft link via ln -s <downloaded_CelebAMaskHQ_path> ./data/CelebAMask-HQ. The RGB images and corresponding facial segmentations are already provided; make sure the folders ./data/CelebAMask-HQ/CelebA-HQ-img and ./data/CelebAMask-HQ/CelebA-HQ-mask exist.
- FFHQ
Make a soft link via ln -s <downloaded_FFHQ_path> ./data/FFHQ. Since the facial segmentations are not provided, run sh scripts/prepare_FFHQ.sh to estimate them (this will take some time). After processing, the directory should look like the tree below (a quick sanity check is sketched after it):
data/FFHQ
├── ffhq_list.txt
├── images1024
│   ├── 00000
│   │   ├── 00000.png
│   │   ├── 00001.png
│   │   └── XXXXX.png
│   ├── 01000
│   │   ├── 01000.png
│   │   ├── 01001.png
│   │   └── XXXXX.png
│   └── ...
└── BiSeNet_mask
    ├── 00000
    │   ├── 00000.png
    │   ├── 00001.png
    │   └── XXXXX.png
    ├── 01000
    │   ├── 01000.png
    │   ├── 01001.png
    │   └── XXXXX.png
    └── ...
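To verify that every image got a matching mask, a quick check (a sketch, assuming images and masks share relative paths and filenames as in the tree above):
# an empty diff means every image has a same-named mask
diff <(cd data/FFHQ/images1024 && find . -name '*.png' | sort) \
     <(cd data/FFHQ/BiSeNet_mask && find . -name '*.png' | sort) \
  && echo "images and masks match"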
3.2 pre-trained models
Please download the pre-trained ckpt (364M) here, and put it in the pretrained_ckpts/stylegan2 folder.
- Auxiliary models
We utilize a pre-trained IR-SE50 model during training to compute the identity loss, which is taken from the TreB1eN repo. Please download it from the following table accordingly and put it in the pretrained_ckpts/auxiliary folder.
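For reference, the expected checkpoint layout can be prepared up front (a sketch; the exact filenames depend on the downloaded files):
mkdir -p pretrained_ckpts/stylegan2 pretrained_ckpts/auxiliary
# put the 364M StyleGAN2 ckpt under pretrained_ckpts/stylegan2/
# put the IR-SE50 identity model under pretrained_ckpts/auxiliary/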
3.3 training script
Training on FFHQ with the default setting (8 A100 GPUs):
python -m torch.distributed.launch \
--nproc_per_node=8 \
--nnodes=1 \
--node_rank=0 \
--master_addr=localhost \
--master_port=22222 \
scripts/train.py
It takes around 2 days to finish training for 300K iterations with a batch size of 2 per GPU. For more information and the supported args, run python scripts/train.py -h for help.
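On recent PyTorch releases, where torch.distributed.launch is deprecated, the equivalent torchrun invocation should be (an untested sketch; if scripts/train.py parses a --local_rank arg, it may need to read LOCAL_RANK from the environment instead):
# untested equivalent of the torch.distributed.launch command above
torchrun \
    --nproc_per_node=8 \
    --nnodes=1 \
    --node_rank=0 \
    --master_addr=localhost \
    --master_port=22222 \
    scripts/train.py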
4. Optimization
For a specific face, an extra optimization stage usually produces a better texture code; run:
python scripts/optimization.py --save_intermediate --verbose
The optimized texture code and the intermediate visualization results will be saved to /work_dir/optim, i.e., the --output_dir option. Feel free to change the number of optimization steps via the --W_steps option. You can also specify your own pre-trained RGI model via the --checkpoint_path option, which is set to ./pretrained_ckpts/e4s/iteration_30000.pt by default.
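Putting these options together (a sketch; the step count of 500 is an arbitrary illustrative value, not a recommendation):
# all flags are documented above; 500 steps is illustrative only
python scripts/optimization.py \
    --save_intermediate \
    --verbose \
    --W_steps=500 \
    --output_dir=/work_dir/optim \
    --checkpoint_path=./pretrained_ckpts/e4s/iteration_30000.pt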
🔗 Citation
If you find our work useful in your research, please consider citing:
@article{liu2022fine,
title={Fine-Grained Face Swapping via Regional GAN Inversion},
author={Liu, Zhian and Li, Maomao and Zhang, Yong and Wang, Cairong and Zhang, Qi and Wang, Jue and Nie, Yongwei},
journal={arXiv preprint arXiv:2211.14068},
year={2022}
}
🌟 Acknowledgements
Code borrows heavily from PSP and SEAN. We thank the authors for sharing their wonderful codebases.