Awesome
MetFaces Dataset
MetFaces is an image dataset of human faces extracted from works of art, originally created as part of our work on:
Training Generative Adversarial Networks with Limited Data<br> Tero Karras (NVIDIA), Miika Aittala (NVIDIA), Janne Hellsten (NVIDIA), Samuli Laine (NVIDIA), Jaakko Lehtinen (NVIDIA and Aalto University), Timo Aila (NVIDIA)<br> https://arxiv.org/abs/2006.06676
The dataset consists of 1336 high-quality PNG images at 1024×1024 resolution. The images were downloaded via the Metropolitan Museum of Art Collection API, and automatically aligned and cropped using dlib. Various automatic filters were used to prune the set.
For business inquiries, please contact researchinquiries@nvidia.com
For press and other inquiries, please contact Hector Marinez at hmarinez@nvidia.com
Licenses
The source images are made available under the Creative Commons Zero (CC0) license by the Metropolitan Museum of Art. Please read here for more information about their Open Access policy.
The dataset itself (including JSON metadata, processed images, and documentation) is made available under Creative Commons BY-NC 2.0 license by NVIDIA Corporation. You can use, redistribute, and adapt it for non-commercial purposes, as long as you (a) give appropriate credit by citing our paper, and (b) indicate any changes that you've made.
Overview
All data is hosted on Google Drive:
Path | Size | Files | Format | Description |
---|---|---|---|---|
metfaces-dataset | 14.5 GB | 2621 | Main folder | |
├ metfaces.json | 1.8 MB | 1 | JSON | Image metadata including original download URL. |
├ images | 1.6 GB | 1336 | PNG | Aligned and cropped images at 1024×1024 |
└ unprocessed | 13 GB | 1284 | PNG | Original images |
Reproducing the dataset
MetFaces 1024x1024 images can be reproduced with the metfaces.py
script as follows:
- Download the contents of the metfaces-dataset Google Drive folder. Retain the original folder structure (e.g., you should have
local/path/metfaces.json
,local/path/unprocessed
.) - Run
metfaces.py --json data/metfaces-dataset.json --source-images data --output-dir out
To reproduce the MetFaces-U dataset ("unaligned MetFaces"), use the following command:
python metfaces.py --json data/metfaces-dataset.json --source-images data \
--random-shift=0.2 --retry-crops --no-rotation \
--output-dir out-unaligned
Metadata
The metfaces.json
file contains the following information for each image:
[
{
"obj_id": "11713", # Metmuseum object ID
"meta_url": "https://collectionapi.metmuseum.org/public/collection/v1/objects/11713",
"source_url": "https://images.metmuseum.org/CRDImages/ad/original/ap26.129.1.jpg",
"source_path": "unprocessed/image-11713.png", # Original raw image file under local dataset copy
"source_md5": "c1e4c5a42de6a4d6909d3820c16f9eb5", # MD5 checksum of the raw image file
"image_path": "images/11713-00.png", # Processed 1024x1024 image
"image_md5": "605a90ab744bdbc9737da5620f2777ab", # MD5 checksum of the processed image
"title": "Portrait of a Gentleman", # Metmuseum object's title
"artist_display_name": "Charles Willson Peale", # Metmuseum object's artist's display name
"face_spec": { # Info about the raw image:
"rect": [404, 238, 775, 610], # - Axis-aligned rectangle of the face region
"landmarks": [...], # - 68 face landmarks reported by dlib
"shrink": 2
},
"face_idx": 0
},
...
]
For full Metmuseum metadata, you can access the meta_url
contents by e.g., curl https://collectionapi.metmuseum.org/public/collection/v1/objects/11713
.