CIFAR-10H

CIFAR-10H is a dataset of soft labels reflecting human perceptual uncertainty for the 10,000-image CIFAR-10 test set, first introduced in the paper:

Joshua C. Peterson*, Ruairidh M. Battleday*, Thomas L. Griffiths, & Olga Russakovsky (2019). Human uncertainty makes classification more robust. In Proceedings of the IEEE International Conference on Computer Vision. (preprint)

And more recently in:

Ruairidh M. Battleday*, Joshua C. Peterson*, & Thomas L. Griffiths (2020). Capturing human categorization of natural images by combining deep networks and cognitive models. Nature Communications, 11(1), 1-14. (paper)

And:

Pulkit Singh, Joshua C. Peterson, Ruairidh M. Battleday, & Thomas L. Griffiths (2020). End-to-end deep prototype and exemplar models for predicting human behavior. Proceedings of the 42nd Annual Conference of the Cognitive Science Society. (preprint)

Repository Contents

data/cifar10h-counts.npy - 10,000 x 10 numpy matrix containing human classification counts (out of roughly 50 judgments per image) for each image and class.

data/cifar10h-probs.npy - 10,000 x 10 numpy matrix containing normalized human classification counts (probabilities) for each image and class. These are the soft labels used for training and evaluation in the first paper above (Peterson et al., 2019).

The order of the 10,000 rows in both files matches the original CIFAR-10 test set order.
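
As an illustrative sketch (not part of the repository) of how the two matrices might be loaded and checked, assuming the files sit in the data/ directory; the soft-label cross-entropy helper at the end is only one example of how the probabilities could be used as evaluation targets:

```python
import numpy as np

# Paths are assumptions; adjust to wherever the files were downloaded.
counts = np.load("data/cifar10h-counts.npy")  # shape (10000, 10), human judgment counts
probs = np.load("data/cifar10h-probs.npy")    # shape (10000, 10), rows sum to 1

assert counts.shape == (10000, 10)
assert probs.shape == (10000, 10)

# The probabilities are the per-image counts normalized by the total number
# of human judgments collected for that image (up to rounding).
np.testing.assert_allclose(probs, counts / counts.sum(axis=1, keepdims=True), atol=1e-6)

def soft_cross_entropy(human_probs, model_probs, eps=1e-12):
    """Mean cross-entropy H(p_human, p_model) over images.

    `model_probs` is a hypothetical (10000, 10) array of a model's predicted
    class probabilities, in the same CIFAR-10 test set order.
    """
    return -np.mean(np.sum(human_probs * np.log(model_probs + eps), axis=1))
```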

data/cifar10h-raw.zip - Zip archive containing cifar10h-raw.csv, the raw, annotator-level data. The columns are as follows:

The mapping from category names to labels is: "airplane": 0, "automobile": 1, "bird": 2, "cat": 3, "deer": 4, "dog": 5, "frog": 6, "horse": 7, "ship": 8, "truck": 9, which matches the original CIFAR-10 dataset.
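
A minimal sketch for working with the label mapping and the raw annotator-level data; the archive member name cifar10h-raw.csv follows from the description above, and the column names are printed rather than assumed:

```python
import zipfile

import pandas as pd

# Category-name-to-label mapping shared with the original CIFAR-10 dataset.
LABELS = {
    "airplane": 0, "automobile": 1, "bird": 2, "cat": 3, "deer": 4,
    "dog": 5, "frog": 6, "horse": 7, "ship": 8, "truck": 9,
}

# Read the raw annotator-level CSV directly from the zip archive.
with zipfile.ZipFile("data/cifar10h-raw.zip") as zf:
    with zf.open("cifar10h-raw.csv") as f:
        raw = pd.read_csv(f)

# Inspect the available columns rather than hard-coding them here.
print(raw.columns.tolist())
print(len(raw), "individual human judgments")
```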

TODO

References

Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images (Vol. 1, No. 4, p. 7). Technical report, University of Toronto. (website)