Collages dataset

This repository contains the dataset described in the following paper by Teney et al.: Evading the Simplicity Bias: Training a Diverse Set of Models Discovers Solutions with Superior OOD Generalization.

The task is binary classification. Each image is a tiling of four blocks, and each block contains an image from one of two pre-selected classes of a well-known dataset: MNIST, CIFAR-10, Fashion-MNIST, or SVHN.
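Since each collage is a 2x2 tiling, a single image can be split back into its four constituent blocks. Below is a minimal NumPy sketch of that layout; the 16x16 collage size used in the example and the position names are illustrative (the block-to-dataset assignment depends on the ordered/shuffled variant, see the Downloads section).

```python
import numpy as np

def split_into_blocks(collage: np.ndarray) -> dict:
    """Split a square 4-block collage (2x2 tiling) into its four blocks."""
    h, w = collage.shape[:2]
    bh, bw = h // 2, w // 2
    return {
        "top_left":     collage[:bh, :bw],
        "top_right":    collage[:bh, bw:],
        "bottom_left":  collage[bh:, :bw],
        "bottom_right": collage[bh:, bw:],
    }

# Example with a dummy 16x16 grayscale collage (blocks of 8x8 pixels).
dummy = np.zeros((16, 16), dtype=np.float32)
blocks = split_into_blocks(dummy)
print({name: block.shape for name, block in blocks.items()})
```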

<p align="center"> <img src="training-preview.png" width="500"><br> <i>Sample training images of 4-block collages.</i> </p>

Because of the simplicity bias (see Shah et al.), a neural network naively trained on this dataset systematically focuses on the MNIST digit while ignoring the other blocks, which are more difficult to classify. As a result, the accuracy on three of the four test sets does not rise above chance (50%).

The dataset can be used to measure the propensity of a learning algorithm to focus on only part of an image, its resilience to (potentially) spurious patterns, etc. It can replace the popular Colored-MNIST toy dataset in some use cases.

<p align="center"> <img src="testing-ood.png" width="500"><br> <i>Example use case: OOD testing with 2-block collages.</i> </p>
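As an illustration of this use case, the PyTorch sketch below evaluates a trained binary classifier on each of the four single-block test sets. The `model` and the `test_sets` dictionary are hypothetical; the assumed data layout (tensors per test set) is for illustration only and is not the distributed file format.

```python
import torch

@torch.no_grad()
def per_block_accuracy(model, test_sets, device="cpu"):
    """Accuracy of a trained binary classifier on each single-block test set.

    `test_sets` is assumed to map a block name (e.g. "mnist", "cifar",
    "fashion", "svhn") to a pair (images, labels) of tensors.
    """
    model.eval()
    results = {}
    for name, (x, y) in test_sets.items():
        logits = model(x.to(device))
        preds = (logits.squeeze(1) > 0).long()  # single-logit binary head
        results[name] = (preds.cpu() == y).float().mean().item()
    return results

# A naively trained network typically scores well above chance only on the
# MNIST test set; the other entries stay close to 0.5.
# print(per_block_accuracy(model, test_sets))
```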

Downloads

We provide 4-block and 2-block (MNIST and CIFAR-10 only) versions of the dataset, each in an ordered and a shuffled variant (blocks appearing in random order). The shuffled variant can be used to demonstrate that a given method does not rely on a known or constant image structure. The collages were generated at 1/4th the resolution of the original datasets (i.e. collages of 16x16 pixels) to enable very fast experimentation. Other versions can be generated with the script provided.
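The idea behind the shuffled variant can be sketched as a random permutation of the four blocks of each collage. The NumPy snippet below illustrates this on a single 16x16 collage; the distributed files already contain shuffled collages, so this is only an illustration of the concept, not the generation script.

```python
import numpy as np

def shuffle_blocks(collage: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Return a copy of a 2x2 collage with its four blocks in random order."""
    h, w = collage.shape[:2]
    bh, bw = h // 2, w // 2
    blocks = [collage[:bh, :bw], collage[:bh, bw:],
              collage[bh:, :bw], collage[bh:, bw:]]
    order = rng.permutation(4)
    top = np.concatenate([blocks[order[0]], blocks[order[1]]], axis=1)
    bottom = np.concatenate([blocks[order[2]], blocks[order[3]]], axis=1)
    return np.concatenate([top, bottom], axis=0)

rng = np.random.default_rng(0)
shuffled = shuffle_blocks(np.arange(256, dtype=np.float32).reshape(16, 16), rng)
print(shuffled.shape)  # (16, 16)
```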

Generation of the dataset

We provide a Matlab script to generate versions of the dataset other than those provided. The script proceeds as follows. We first load images from MNIST, Fashion-MNIST, CIFAR-10, and SVHN and convert them to grayscale. The images from MNIST and Fashion-MNIST are padded to 32x32 pixels. We pre-select two classes from each dataset to be associated with the collage labels 0 and 1, respectively. Following Shah et al., we choose 0/1 for MNIST and automobile/truck for CIFAR-10; we additionally choose 0/1 for SVHN and pullover/coat for Fashion-MNIST. We generate a training set of 51,200 collages (= 50x1024) and several test sets of 10,240 collages (= 10x1024) each. Each collage is formed by tiling four blocks, each containing an image chosen at random from the corresponding source dataset. The images in the training/evaluation sets come from the original training/test sets of the source datasets.
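The provided script is written in Matlab; as a rough illustration of the same procedure, here is a Python/NumPy sketch that assembles one training collage from pre-selected class pools. The `pools` layout, the block ordering, and the striding-based downsampling are assumptions for illustration, not the script's exact implementation.

```python
import numpy as np

def make_training_collage(pools, label, block_size, rng):
    """Assemble one 4-block training collage for a given collage label.

    `pools[dataset][label]` is assumed to hold grayscale source images of the
    class associated with that label (e.g. MNIST digit 0 or 1), already
    converted and padded to 32x32. In the training set every block carries
    the class matching `label`.
    """
    blocks = []
    for dataset in ("mnist", "cifar", "fashion", "svhn"):  # ordered variant
        pool = pools[dataset][label]
        img = pool[rng.integers(len(pool))]
        # Downsample to the target block size (e.g. 8x8 for 16x16 collages)
        # by simple striding; the provided script may use a different filter.
        step = img.shape[0] // block_size
        blocks.append(img[::step, ::step][:block_size, :block_size])
    top = np.concatenate(blocks[:2], axis=1)
    bottom = np.concatenate(blocks[2:], axis=1)
    return np.concatenate([top, bottom], axis=0)
```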

In the training set, the class in each block is perfectly correlated with the collage label. In each of the four test sets, the class in only one block is correlated with the collage label; the other blocks are randomized to either of the two possible classes. We also generate four training sets in this manner, to be used solely to obtain upper bounds on the highest accuracy achievable on each block with a given model/architecture.
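Continuing the sketch above, a test collage differs only in that a single designated block carries the class matching the label, while the other blocks are drawn from a class chosen uniformly at random. Same assumed `pools` layout and hypothetical helpers as before.

```python
import numpy as np

def make_test_collage(pools, label, predictive_dataset, block_size, rng):
    """Assemble one test collage where only one block predicts the label.

    The block drawn from `predictive_dataset` uses the class matching `label`;
    every other block uses a class chosen uniformly at random, so it carries
    no information about the collage label.
    """
    blocks = []
    for dataset in ("mnist", "cifar", "fashion", "svhn"):
        block_label = label if dataset == predictive_dataset else int(rng.integers(2))
        pool = pools[dataset][block_label]
        img = pool[rng.integers(len(pool))]
        step = img.shape[0] // block_size
        blocks.append(img[::step, ::step][:block_size, :block_size])
    top = np.concatenate(blocks[:2], axis=1)
    bottom = np.concatenate(blocks[2:], axis=1)
    return np.concatenate([top, bottom], axis=0)
```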

Citation

Please cite the dataset as follows:

@article{teney2021evading,
  title={Evading the Simplicity Bias: Training a Diverse Set of Models Discovers Solutions with Superior OOD Generalization},
  author={Teney, Damien and Abbasnejad, Ehsan and Lucey, Simon and van den Hengel, Anton},
  year={2021},
  journal={arXiv preprint arXiv:2105.05612}
}

Also check out the paper by Shah et al. that first proposed 2-block collages of MNIST and CIFAR-10: The Pitfalls of Simplicity Bias in Neural Networks.

Please report any issues to contact@damienteney.info.