CelebA-Dialog Dataset

Talk-to-Edit: Fine-Grained Facial Editing via Dialog <br> Yuming Jiang*, Ziqi Huang*, Xingang Pan, Chen Change Loy and Ziwei Liu <br> In IEEE International Conference on Computer Vision (ICCV), 2021.

From MMLab@NTU, affiliated with S-Lab, Nanyang Technological University.

<img src="./assets/celeba_dialog.png" width="80%">

[Project Page] | [Paper] | [Code] | [Video] | [Web Page]

CelebA-Dialog is a large-scale visual-language face dataset with the following features:

<img src="./assets/dataset.png" width="100%">

The dataset can be used to train and test models for the following computer vision tasks: fine-grained facial attribute recognition, fine-grained facial manipulation, text-based facial generation and manipulation, face image captioning, natural-language-based facial recognition and manipulation, and broader multi-modality learning tasks. The dataset was proposed in Talk-to-Edit.

Download Links

You can download the dataset using the following links:

| Link (HQ) | Size | Files | Format | Description |
| :--- | :--- | :--- | :--- | :--- |
| CelebA-Dialog (HQ) | ~4.4 GB | | | 30,000 high-resolution images and corresponding annotations |
| ├  image (HQ) | ~2.7 GB | 30,000 | JPG | images from CelebA-HQ |
| ├  fine-grained label (HQ) | ~600 KB | 1 | TXT | fine-grained labels for 5 attributes |
| ├  binary label (HQ) | ~3.5 MB | 1 | TXT | binary labels for 40 attributes |
| ├  text (HQ) | ~27 MB | 4 | TXT and JSON | natural language captions and editing requests |
| ├  mask (HQ) | ~1.8 GB | | PNG | segmentation masks (1) binary (2) colorized |
| ├  identity (HQ) | ~400 KB | 1 | TXT | identity label of each image |

| Link (standard) | Size | Files | Format | Description |
| :--- | :--- | :--- | :--- | :--- |
| CelebA-Dialog (standard) | | | | 202,599 original CelebA images and corresponding annotations |
| ├  image (standard) | | | | images from CelebA |
| ├  fine-grained label (standard) | ~4 MB | 1 | TXT | fine-grained labels for 5 attributes |
| ├  binary label (standard) | ~25 MB | 1 | TXT | binary labels for 40 attributes |
| ├  text (standard) | ~14 MB | | TXT and JSON | natural language captions and editing requests |
| ├  identity (standard) | ~3.3 MB | 1 | TXT | identity label of each image |

| Link (mapping) | Size | Files | Format | Description |
| :--- | :--- | :--- | :--- | :--- |
| HQ-to-standard mapping | ~1 MB | 1 | TXT | the mapping between the 30,000 CelebA-HQ images and the 202,599 CelebA images |

Details

Image

Fine-Grained Label

Binary Label

Text

Mask

We preprocess the facial segmentation masks of CelebAMask-HQ to ease future research.

Below is the color-to-label parsing information:

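| Label list | | | | |
| :--- | :--- | :--- | :--- | :--- |
| 0: 'background' | 1: 'skin' | 2: 'nose' | 3: 'eye_g' | 4: 'l_eye' |
| 5: 'r_eye' | 6: 'l_brow' | 7: 'r_brow' | 8: 'l_ear' | 9: 'r_ear' |
| 10: 'mouth' | 11: 'u_lip' | 12: 'l_lip' | 13: 'hair' | 14: 'hat' |
| 15: 'ear_r' | 16: 'neck_l' | 17: 'neck' | 18: 'cloth' | |
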
```python
from PIL import Image
import numpy as np

# f: path to one of the mask PNG files
segm = Image.open(f)
segm = np.array(segm)  # shape: [512, 512]
```
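
Once a mask is loaded as an array, the label list can be used to pick out individual facial regions. The following is a minimal sketch, assuming each pixel of the single-channel mask stores one of the label indices 0 to 18 listed above. The helper names `region_mask` and `label_pixel_counts` are hypothetical and not part of the dataset tooling, and a tiny synthetic array stands in for a real 512x512 mask.

```python
import numpy as np

# Label names copied from the label list above (list position = label index).
LABELS = [
    'background', 'skin', 'nose', 'eye_g', 'l_eye',
    'r_eye', 'l_brow', 'r_brow', 'l_ear', 'r_ear',
    'mouth', 'u_lip', 'l_lip', 'hair', 'hat',
    'ear_r', 'neck_l', 'neck', 'cloth',
]

def region_mask(segm, name):
    """Return a boolean array that is True where `segm` equals the index of `name`."""
    return segm == LABELS.index(name)

def label_pixel_counts(segm):
    """Return {label name: pixel count} for every label present in `segm`."""
    counts = np.bincount(segm.ravel(), minlength=len(LABELS))
    return {LABELS[i]: int(c) for i, c in enumerate(counts[:len(LABELS)]) if c > 0}

# Synthetic 3x3 mask used only for illustration (assumed index encoding).
demo = np.array([[0, 1, 1],
                 [13, 13, 1],
                 [0, 2, 1]], dtype=np.uint8)

hair = region_mask(demo, 'hair')    # True at the two 'hair' pixels
counts = label_pixel_counts(demo)   # per-label pixel counts
```

With a real mask, pass the `segm` array from the snippet above in place of `demo`.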

Identity

Some images show the same person. There are 10,177 identities in total in the dataset. On average, there are:
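
Per-identity statistics can be recomputed from the identity label file. The following is a minimal sketch, assuming each line of that TXT file has the form `<image_name> <identity_id>` separated by whitespace. This layout is an assumption that should be checked against the downloaded file, and the function name `images_per_identity` and the sample lines are hypothetical.

```python
from collections import Counter

def images_per_identity(lines):
    """Count images per identity from lines shaped like '<image_name> <identity_id>'."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        _, identity = parts
        counts[identity] += 1
    return counts

# Made-up sample lines standing in for the real identity file.
sample = ["000001.jpg 2880", "000002.jpg 2937", "000003.jpg 2880", ""]
counts = images_per_identity(sample)

# Mean number of images per identity in the sample.
mean_images = sum(counts.values()) / len(counts)
```

For the real data, an open file handle can be passed directly, since the function only iterates over lines.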

Agreement

Citation

If you find this dataset useful for your research and use it in your work, please consider citing the following papers:

@InProceedings{CelebA-Dialog,
  title = {Talk-to-Edit: Fine-Grained Facial Editing via Dialog},
  author = {Jiang, Yuming and Huang, Ziqi and Pan, Xingang and Loy, Chen Change and Liu, Ziwei},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}

@inproceedings{CelebAMask-HQ,
  title = {MaskGAN: Towards Diverse and Interactive Facial Image Manipulation},
  author = {Lee, Cheng-Han and Liu, Ziwei and Wu, Lingyun and Luo, Ping},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2020}
}

@inproceedings{CelebA-HQ,
  title={Progressive Growing of {GAN}s for Improved Quality, Stability, and Variation},
  author={Tero Karras and Timo Aila and Samuli Laine and Jaakko Lehtinen},
  booktitle={International Conference on Learning Representations},
  year={2018},
}

@inproceedings{CelebA,
  title = {Deep Learning Face Attributes in the Wild},
  author = {Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou},
  booktitle = {Proceedings of International Conference on Computer Vision (ICCV)},
  month = {December},
  year = {2015}
}