

PixelFace+: Towards Controllable Face Generation and Manipulation with Text Descriptions and Segmentation Masks


By Xiaoxiong Du, Jun Peng, Yiyi Zhou, Jinlu Zhang, Siting Chen, Guannan Jiang, Xiaoshuai Sun, Rongrong Ji.

MM '23: Proceedings of the 31st ACM International Conference on Multimedia


Demo Video


This repository is the PyTorch implementation of PixelFace+. PixelFace+ uses both mask and text features for highly controllable face generation and manipulation. We propose the GCMF module to achieve better decoupling. Additionally, to improve the alignment between generated images and text, we introduce a regularization loss based on CLIP. The framework of PixelFace+ is shown below:

The Framework of PixelFace+


Requirements

Python 3.6
PyTorch 1.10.0
pytorch-fid 0.2.1
torchvision 0.11.1
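As a quick sanity check before training, the versions listed above can be compared against what is installed. The sketch below is only an illustration: `find_mismatches` is a hypothetical helper that is not part of this repository, and the pip distribution names (`torch` for PyTorch) are assumed. It takes a plain dictionary of installed versions, which you could fill in by hand from `pip list` or from attributes such as `torch.__version__`.

```python
# Versions pinned in the list above (pip distribution names are assumed).
REQUIRED = {
    "torch": "1.10.0",
    "pytorch-fid": "0.2.1",
    "torchvision": "0.11.1",
}


def find_mismatches(installed, required=REQUIRED):
    """Return {package: (required, found)} for every pin that is not met."""
    mismatches = {}
    for name, wanted in required.items():
        found = installed.get(name)
        # Drop local build tags such as "+cu113" before comparing.
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Hypothetical environment: pytorch-fid has not been installed yet.
    example = {"torch": "1.10.0+cu113", "torchvision": "0.11.1"}
    print(find_mismatches(example))  # {'pytorch-fid': ('0.2.1', None)}
```

An empty result means every pinned package was found at the listed version.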

Data preparation

Multi-Modal-CelebA-HQ Dataset [Link]

Before training, please download dataset2.json (provided as a zip archive) and place the file in the MMceleba dataset directory.


  1. Preparing your settings. To train a model, edit code/cfg/mmceleba.yml to set the options you want. The default configuration trains on MMceleba with an input and output image resolution of 256×256 and a batch size of 4. Increasing the batch size may reduce semantic alignment after training, because a larger batch size weakens the constraint imposed by the CLIP regularization loss.

  2. Training the model. Run main.py from the code folder to start training:

cd /PixelFace+/code
CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node 2 --master_port 10011 main.py --cfg cfg/mmceleba.yml

  3. Testing the model. Once training has passed 70 epochs, the model is evaluated automatically every ten epochs. To change the evaluation frequency, edit line 675 of code/trainer.py.
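The evaluation schedule described above can be sketched as a small predicate. This is an assumed reading of the text, not the actual condition in trainer.py: `should_evaluate`, `warmup_epochs`, and `interval` are hypothetical names, and the check is taken to be "epoch greater than 70 and a multiple of 10".

```python
def should_evaluate(epoch, warmup_epochs=70, interval=10):
    """Return True when evaluation would run at the end of `epoch`.

    Assumed reading of the schedule: no evaluation until the warm-up is
    over, then one evaluation on every epoch that is a multiple of
    `interval`.
    """
    return epoch > warmup_epochs and epoch % interval == 0


if __name__ == "__main__":
    # With MAX_EPOCH set to 100, this reading evaluates three times.
    print([e for e in range(1, 101) if should_evaluate(e)])  # [80, 90, 100]
```

Under this reading, lowering `interval` to 5 would evaluate at epochs 75, 80, 85, and so on, which is the kind of change the line in trainer.py controls.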


You can use the eval1 method (at line 732 of code/trainer.py) to generate images.

To generate an image from your own description, you can try copying the code from sample.py into code/trainer.py.

Pretrained Model

  1. Download the pretrained model from https://pan.baidu.com/s/1ARSjz6IXCO2-8qf1Tf9p-A?pwd=qwer (extraction code: qwer).

  2. Modify the config file code/cfg/mmceleba.yml to use the pretrained model:

  FLAG: True

  ##### Modify This Line #####

  B_NET_D: True
  MAX_EPOCH: 100


Thanks to PixelFolder and PixelFace, from which much of this code is borrowed.