confidentlearning-reproduce

Experimental data for reproducing the CIFAR-10 results in the confident learning paper.

The code to generate these Confident Learning CIFAR-10 benchmarking results is available in the cleanlab Python package, specifically in examples/cifar10. We used cleanlab v0.1.0 for the original paper.

Because GitHub limits file sizes to 100MB, I cannot upload the trained ResNet-50 models (180MB each). Instead, for every setting, I upload an out log file with the accuracy at every batch and the test accuracy at every epoch. The file naming conventions are as follows

A PyTorch-ready version of the CIFAR-10 dataset

The dataset is available here for download: cifar10/dataset

Need out-of-sample predicted probabilities for CIFAR-10 train set?

You can obtain standard (no noise added to the labels) predicted probabilities here.

These are computed using four-fold cross-validation with a ResNet-50 architecture. You can download the out-of-sample predicted probabilities for all training examples in CIFAR-10 for various noise and sparsity settings here:
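As a rough illustration of what "out-of-sample" means here, the sketch below computes held-out predicted probabilities with four-fold cross-validation on toy data. It is not the code used for the paper: a nearest-centroid classifier stands in for ResNet-50, and the function name `out_of_sample_probs` and the toy data are assumptions made for this example only.

```python
import numpy as np

def out_of_sample_probs(X, y, n_classes, n_folds=4, seed=0):
    """Return an (n_examples, n_classes) array of held-out predicted probabilities."""
    rng = np.random.default_rng(seed)
    # Randomly split the example indices into n_folds disjoint folds.
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    psx = np.zeros((len(X), n_classes))
    for k in range(n_folds):
        held_out = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Stand-in model (NOT ResNet-50): class centroids fitted on the training folds only.
        centroids = np.stack(
            [X[train][y[train] == c].mean(axis=0) for c in range(n_classes)]
        )
        # Softmax over negative squared distances to each centroid.
        sq_dist = ((X[held_out][:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        logits = -sq_dist
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        # Each example is predicted by a model that never saw it during fitting.
        psx[held_out] = probs / probs.sum(axis=1, keepdims=True)
    return psx

# Toy data: three well-separated Gaussian blobs in 2-D, 40 examples per class.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 40)
centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
X = centers[y] + rng.normal(size=(120, 2))

psx = out_of_sample_probs(X, y, n_classes=3)
```

Every row of `psx` comes from a model that was fitted without that example, which is the property the downloadable probabilities are described as having.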

Precomputed label errors for CIFAR-10 train set

Using the psx predicted probabilities above as input, I used cleanlab, the Python package that implements confident learning, to compute the label errors for every confident learning method in the CL paper, for every noise and sparsity setting. The outputs are boolean numpy arrays, ordered in the same order as the examples when loaded using torch.utils.data.DataLoader. The PyTorch-prepared CIFAR dataset is available here for download: cifar10/dataset. If you load this dataset in PyTorch, indices will match exactly with the label error masks below.
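To show how a boolean label error mask lines up with example indices, here is a minimal sketch on toy arrays. The values of `psx` and `given_labels` are made up, and the argmax-disagreement rule is only a simple baseline chosen for illustration; it is not cleanlab's implementation of any method in the table.

```python
import numpy as np

# Toy stand-ins for the real arrays: 6 examples, 3 classes (values are invented).
psx = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
    [0.3, 0.5, 0.2],
])
given_labels = np.array([0, 1, 2, 1, 2, 0])

# Illustrative baseline: flag an example when the most probable class
# disagrees with its given label. True marks a suspected label error.
label_error_mask = psx.argmax(axis=1) != given_labels

# Indices of the examples that were NOT flagged, in dataset order.
clean_indices = np.flatnonzero(~label_error_mask)

print(label_error_mask.tolist())  # [False, False, False, True, False, True]
print(clean_indices.tolist())     # [0, 1, 2, 4]
```

Because the mask is index-aligned with the dataset, `clean_indices` could then be passed to something like `torch.utils.data.Subset(dataset, clean_indices)` to train on the unflagged examples only, assuming the dataset is loaded in its original order.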

Column headers are formatted as: <sparsity * 10>_<noise * 10>. For example, 4_2 denotes sparsity 0.4 and noise 0.2.

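| METHOD | 0_2 | 2_2 | 4_2 | 6_2 | 0_4 | 2_4 | 4_4 | 6_4 | 0_7 | 2_7 | 4_7 | 6_7 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| C_confusion | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK |
| formula | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK |
| CL: PBC | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK |
| CL: PBNR | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK |
| CL: C+NR | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK |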
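The column headers in the table above can be decoded back into sparsity and noise values with a few lines of Python. This is a small sketch of the <sparsity * 10>_<noise * 10> convention; the helper name `parse_setting` is hypothetical and not part of cleanlab.

```python
def parse_setting(header):
    """Decode a column header such as '4_2' into (sparsity, noise)."""
    sparsity_x10, noise_x10 = header.split("_")
    return int(sparsity_x10) / 10, int(noise_x10) / 10

# The twelve column headers from the table, in order.
headers = ["0_2", "2_2", "4_2", "6_2",
           "0_4", "2_4", "4_4", "6_4",
           "0_7", "2_7", "4_7", "6_7"]

settings = {h: parse_setting(h) for h in headers}

print(settings["4_2"])  # (0.4, 0.2)
print(settings["0_7"])  # (0.0, 0.7)
```

Read this way, the table covers four sparsity levels (0, 0.2, 0.4, 0.6) at each of three noise levels (0.2, 0.4, 0.7).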

License

Copyright (c) 2017-2020 Curtis Northcutt. Released under the MIT License. See LICENSE for details.