FFHQ-Text👸

Facial Attribute Textual Descriptions 📃 for Flickr-Faces-HQ Dataset (FFHQ) 👸.

<div align="center"><img src=./Pic/Overview.png></div>

📚Text-to-Image Datasets

Text-to-X Datasets

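| Dataset | Public | Categories | Images (Resolution) | Annotations | Attributes | Other Details |
| --- | --- | --- | --- | --- | --- | --- |
| CUB-200-2011 | ✓ | 200 | 11,788 (Unfixed) | 10 | Uncounted | BBox, Segmentation... |
| Oxford-102 Flowers | ✓ | 102 | 8,189 (Unfixed) | 10 | Uncounted | - |
| MS-COCO | ✓ | 91 | 120k (Unfixed) | 5 | Uncounted | BBox, Segmentation... |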

Text-to-Face Datasets

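| Dataset | Public | Categories | Images (Resolution) | Annotations | Attributes | Other Details |
| --- | --- | --- | --- | --- | --- | --- |
| SCU-Text2face | × | 1 (Mixed) | 1,000 (256×256) | 5 | Uncounted | - |
| Text2FaceGAN | × | 1 (Mixed) | 10,000 (178×218) | 6 | 40 | - |
| Faces a la Carte | × | 1 (Mixed) | 202,599 (178×218) | up to 10 | 40 | - |
| Multi-Modal-CelebA-HQ | ✓ | 1 (Mixed) | 30,000 (512×512) | 10 | 38 | Mask, Sketches |
| FFHQ-Text | ✓ | 1 (Female) | 760 (1024×1024) | 9 | 162 | BBox |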

🍀 Overview

FFHQ-Text is a small-scale face image dataset with large-scale facial attributes, designed for text-to-face generation and manipulation, text-guided facial image manipulation, and other vision-related tasks.

This dataset is an extension of the NVIDIA Flickr-Faces-HQ Dataset (FFHQ). It consists of 760 female images selected from FFHQ, each containing exactly one complete human face.

In this study, we draw on terminology for human facial features to manually annotate the FFHQ-Text dataset. The annotations are broken down into 13 multi-valued facial element groups, ordered from coarse to fine.

🎁 Download

| Content | Size | Files | Format | Details |
| --- | --- | --- | --- | --- |
| FFHQ-Text | - | 1,524 | | Main Folder |
| ├ Image | 0.97 GB | 760 | PNG | Female images from FFHQ of size 1024×1024 |
| ├ Text | 766 KB | 760 | TXT | 9 descriptions for each selected facial image in FFHQ |
| ├ Train | 12 KB | 1 | PKL | Filenames of training images |
| ├ Test | 6 KB | 1 | PKL | Filenames of testing images |
| ├ bounding_boxes | 21 KB | 1 | TXT | Determine the location and orientation of each face |
| ├ images | 19 KB | 1 | TXT | Counts, paths and filenames of all facial images |
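
As a rough illustration of how the split files and descriptions listed above might be read, here is a minimal Python sketch. It is not part of the official release: the function names `load_split` and `load_descriptions` are hypothetical, and it assumes that each PKL file holds a pickled list of filename strings, that each TXT file in the Text folder is named after its image, and that each description sits on its own line. Please check these assumptions against the files you receive.

```python
import pickle
from pathlib import Path


def load_split(pkl_path):
    """Return the filenames stored in a split file.

    Assumption: the PKL file is a pickled iterable of filename strings.
    """
    with open(pkl_path, "rb") as f:
        return list(pickle.load(f))


def load_descriptions(text_dir, stem):
    """Return the descriptions for one image.

    Assumptions: the text file is named "<stem>.txt" and holds one
    description per non-empty line.
    """
    path = Path(text_dir) / f"{stem}.txt"
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]
```

Only unpickle files obtained from a source you trust, since `pickle.load` can execute arbitrary code. If the one-description-per-line assumption holds, `load_descriptions` should return 9 strings for each image.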

✒ Bounding boxes for each face were extracted using the VGG Image Annotator (VIA) platform.
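
The exact layout of the bounding-box file is not documented here, so the following Python sketch is only a guess at how it could be parsed. It assumes a CUB-style layout in which every line reads `<image index> <x> <y> <width> <height>`, separated by whitespace; the helper names `parse_bounding_boxes` and `to_corners` are hypothetical. Lines that do not have five fields are skipped.

```python
from typing import Dict, Iterable, Tuple

Box = Tuple[float, float, float, float]


def parse_bounding_boxes(lines: Iterable[str]) -> Dict[int, Box]:
    """Map image index -> (x, y, width, height).

    Assumption: each line is "<index> <x> <y> <width> <height>".
    """
    boxes: Dict[int, Box] = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 5:
            continue  # not in the assumed layout, skip it
        index = int(parts[0])
        x, y, w, h = (float(v) for v in parts[1:])
        boxes[index] = (x, y, w, h)
    return boxes


def to_corners(box: Box) -> Box:
    """Convert (x, y, width, height) to (left, top, right, bottom)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)
```

The corner form returned by `to_corners` is the one most image libraries expect when cropping, which is why the conversion is kept separate from parsing.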

Please fill out the FFHQ-Text Dataset Request Form.

If you cannot access Google, please contact me📧 directly with your real name, institution, and institutional/organizational email address. We will email you the FFHQ-Text dataset within one week.

🎉 Awesome Repo

This is a survey of text-to-image generation and manipulation and other related works.

I hope it gives you a basic understanding of this topic, or some information that helps spark ideas in your research~

📚 Feedback

Please fill out the FFHQ-Text Dataset Feedback Form.

I would greatly value your thoughts, suggestions, concerns or problems.

📌License & Privacy

The dataset is made available under the Creative Commons BY-NC-SA 4.0 license by Interaction Laboratory, Ritsumeikan University. You can use, redistribute, and adapt it for non-commercial purposes, as long as you (a) give appropriate credit by citing our paper, (b) indicate any changes that you've made, and (c) distribute any derivative works under the same license.

The individual images were published on Flickr by their respective authors under either the Creative Commons BY 2.0, Creative Commons BY-NC 2.0, Public Domain Mark 1.0, Public Domain CC0 1.0, or U.S. Government Works license. All of these licenses allow free use, redistribution, and adaptation for non-commercial purposes. However, some of them require giving appropriate credit to the original author, as well as indicating any changes that were made to the images. The license and original author of each image are indicated in the metadata.

For further details, please see the privacy section of the original FFHQ dataset.

🎯 Terms of Use

By using the provided FFHQ-Text dataset, you are deemed to agree to and accept the following Terms of Use:

⭐Citation

<p align=center>“A picture🖼 is worth a thousand words📜~ ”</p>

If you find this dataset helpful for your research, please cite it as follows:


@inproceedings{zhou2021generative,
  title={Generative Adversarial Network for Text-to-Face Synthesis and Manipulation with Pretrained BERT Model},
  author={Zhou, Yutong and Shimada, Nobutaka},
  booktitle={2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021)},
  pages={01--08},
  year={2021}
}

@inproceedings{zhou2021generative_mm,
  title={Generative Adversarial Network for Text-to-Face Synthesis and Manipulation},
  author={Zhou, Yutong},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
  pages={2940--2944},
  year={2021}
}

@inproceedings{karras2019style,
  title={A style-based generator architecture for generative adversarial networks},
  author={Karras, Tero and Laine, Samuli and Aila, Timo},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4401--4410},
  year={2019}
}