Learning to Augment Distributions for Out-of-Distribution Detection (NeurIPS 2023)

Qizhou Wang, Zhen Fang, Yonggang Zhang, Feng Liu, Yixuan Li, and Bo Han.

Keywords: Out-of-distribution Detection, Reliable Machine Learning

Abstract: Open-world classification systems should discern out-of-distribution (OOD) data whose labels deviate from those of in-distribution (ID) cases, motivating recent studies in OOD detection. Advanced works, despite their promising progress, may still fail in the open world, owing to the lack of knowledge about unseen OOD data in advance. Although one can access auxiliary OOD data (distinct from unseen ones) for model training, it remains to analyze how such auxiliary data will work in the open world. To this end, we delve into such a problem from a learning theory perspective, finding that the distribution discrepancy between the auxiliary and the unseen real OOD data is the key to affecting the open-world detection performance. Accordingly, we propose Distributional-Augmented OOD Learning (DAL), alleviating the OOD distribution discrepancy by crafting an OOD distribution set that contains all distributions in a Wasserstein ball centered on the auxiliary OOD distribution. We justify that the predictor trained over the worst OOD data in the ball can shrink the OOD distribution discrepancy, thus improving the open-world detection performance given only the auxiliary OOD data. We conduct extensive evaluations across representative OOD detection setups, demonstrating the superiority of our DAL over its advanced counterparts.
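
One way to write the idea above, in notation assumed for this README rather than copied from the paper, is a distributionally robust objective over a Wasserstein ball of radius \rho centered on the auxiliary OOD distribution D_aux:

\min_{\theta} \ \mathbb{E}_{(x,y)\sim D_{\mathrm{ID}}}\big[\ell_{\mathrm{cls}}(f_\theta(x), y)\big] \;+\; \lambda \max_{D:\, W(D,\, D_{\mathrm{aux}}) \le \rho} \ \mathbb{E}_{x\sim D}\big[\ell_{\mathrm{OOD}}(f_\theta(x))\big]

Here W is the Wasserstein distance, \ell_{\mathrm{cls}} an ID classification loss, \ell_{\mathrm{OOD}} an outlier-exposure-style detection loss, and \lambda a trade-off weight; these symbols are illustrative placeholders, not the paper's exact notation.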

@inproceedings{wang2023dal,
  title={Learning to Augment Distributions for Out-of-distribution Detection},
  author={Wang, Qizhou and Fang, Zhen and Zhang, Yonggang and Liu, Feng and Li, Yixuan and Han, Bo},
  booktitle={NeurIPS},
  year={2023},
  url={https://openreview.net/forum?id=OtU6VvXJue}
}

Get Started

Environment
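
The exact dependency versions are not listed in this snapshot. As an assumed baseline (the training script is expected to need PyTorch, torchvision, and NumPy), something like the following should suffice:

pip install torch torchvision numpy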

Pretrained Models and Datasets

Pretrained models are provided in the folder ./models/.

Please download the datasets into the folder ./data/ (an assumed layout is sketched after the dataset list below):

Surrogate OOD Dataset

Test OOD Datasets
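
The exact dataset names and download locations are defined by the repo itself; purely as a hypothetical illustration of where the files are expected to end up:

./data/
    surrogate_ood/            (auxiliary/surrogate OOD training set; folder name assumed)
    test_ood/<dataset_name>/  (one folder per test OOD dataset; names assumed)
./models/
    <pretrained checkpoints>  (provided with the repo)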

Training

To train the DAL model on the CIFAR benchmarks, simply run:

python main.py cifar10 --gamma=10 --beta=.01 --rho=10 --iter=10 --learning_rate=0.07 --strength=1
python main.py cifar100 --gamma=10 --beta=.005 --rho=10 --iter=10 --learning_rate=0.07 --strength=1
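
As a rough, generic sketch of the worst-case OOD augmentation idea behind these commands (all function names, losses, and the perturbation budget below are illustrative assumptions, not the exact implementation in main.py):

import torch
import torch.nn.functional as F

def worst_case_ood_step(model, optimizer, id_x, id_y, ood_x,
                        budget=8.0 / 255.0, inner_iters=10,
                        inner_lr=2.0 / 255.0, beta=0.01):
    # One illustrative training step: craft "hard" OOD inputs from the auxiliary
    # OOD batch (inner maximization), then update the model on ID classification
    # plus the OOD loss over those worst-case inputs (outer minimization).
    # All names, losses, and the budget are assumptions made for this sketch.

    # Inner maximization: perturb the auxiliary OOD batch to make it harder to flag.
    delta = torch.zeros_like(ood_x, requires_grad=True)
    for _ in range(inner_iters):
        logits = model(ood_x + delta)
        # Outlier-exposure-style loss: cross-entropy against the uniform distribution.
        ood_loss = -logits.log_softmax(dim=1).mean()
        grad = torch.autograd.grad(ood_loss, delta)[0]
        with torch.no_grad():
            delta += inner_lr * grad.sign()   # ascend on the OOD loss (harder outliers)
            delta.clamp_(-budget, budget)     # crude stand-in for a transport budget

    worst_ood_x = (ood_x + delta).detach()

    # Outer minimization: standard classification loss + OOD loss on worst-case inputs.
    cls_loss = F.cross_entropy(model(id_x), id_y)
    oe_loss = -model(worst_ood_x).log_softmax(dim=1).mean()
    loss = cls_loss + beta * oe_loss

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

Note that the actual method augments at the distribution level (a Wasserstein ball around the auxiliary OOD distribution), so the per-sample pixel perturbation above only conveys the min-max structure, not DAL's concrete augmentation mechanism.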

Results

The key results on CIFAR benchmarks are listed in the following table.

Method   CIFAR-10 FPR95   CIFAR-10 AUROC   CIFAR-100 FPR95   CIFAR-100 AUROC
MSP      50.15            91.02            78.61             75.95
OE       4.67             98.88            43.14             90.27
DAL      2.68             99.01            29.68             93.92

All values are in percent; lower FPR95 and higher AUROC indicate better OOD detection.