Home

Official code and data repository of ADBench: Anomaly Detection Benchmark (NeurIPS 2022). Please star, watch, and fork ADBench for the active updates!

Citing ADBench:

Our ADBench benchmark paper is now available on arXiv and in the NeurIPS Proceedings. If you find this work useful or use some of our released datasets, we would appreciate citations to the following paper:

@inproceedings{han2022adbench,
      title={ADBench: Anomaly Detection Benchmark},
      author={Songqiao Han and Xiyang Hu and Hailiang Huang and Minqi Jiang and Yue Zhao},
      booktitle={Neural Information Processing Systems (NeurIPS)},
      year={2022},
}

Who Are We? ✨

ADBench is a collaborative effort between researchers at Shanghai University of Finance and Economics (SUFE) and Carnegie Mellon University (CMU). The project is designed and conducted by Minqi Jiang (SUFE), Yue Zhao (CMU), and Xiyang Hu (CMU), authors of important anomaly detection libraries for tabular (PyOD), time-series (TODS), and graph data (PyGOD). The project is also maintained by Chaochuan Hou (SUFE).

<a href="https://github.com/Minqi824/ADBench/graphs/contributors"> <img src="https://contrib.rocks/image?repo=Minqi824/ADBench" /> </a>

Why Do You Need ADBench?

ADBench is (to the best of our knowledge) the most comprehensive tabular anomaly detection benchmark. We analyze the performance of 30 anomaly detection algorithms on 57 datasets, 10 of which are newly introduced. Motivated by both research needs and industrial deployment requirements, ADBench conducts 98,436 experiments from three major angles:

  1. the effect of supervision (e.g., ground truth labels) by including 14 unsupervised, 7 semi-supervised, and 9 supervised methods;
  2. algorithm performance under different types of anomalies by simulating the environments with 4 types of anomalies; and
  3. algorithm robustness and stability under 3 settings of data corruptions.

Key Takeaways in 1 Minute:

  1. :bangbang: surprisingly, none of the benchmarked unsupervised algorithms is statistically better than the others, emphasizing the importance of algorithm selection;
  2. :bangbang: with merely 1% labeled anomalies, most semi-supervised methods can outperform the best unsupervised method, justifying the importance of supervision;
  3. in controlled environments, we observe that the best unsupervised methods for specific types of anomalies are even better than semi- and fully-supervised methods, revealing the necessity of understanding data characteristics;
  4. semi-supervised methods show potential in achieving robustness in noisy and corrupted data, possibly due to their efficiency in using labels and feature selection;
  5. :interrobang: and many more can be found in our paper (Section 4).

The Figure below provides an overview of our proposed ADBench (see our paper for details).

[Figure: overview of ADBench]


How to use ADBench?

We envision three primary usages of ADBench:

We provide full guidance of ADBench in the notebook.

Installation

pip install adbench
pip install --upgrade adbench

Prerequisite: Downloading datasets in ADBench from the GitHub repo

from adbench.myutils import Utils
utils = Utils() # utility function
# download datasets from the remote github repo
# we recommend jihulab for users in mainland China and github otherwise
utils.download_datasets(repo='jihulab')

Quickly implement ADBench for benchmarking AD algorithms.

The following example shows how to quickly run ADBench along the three angles illustrated in the paper. Currently, 57 datasets can be used for evaluating 30 algorithms in ADBench, and we encourage you to test your customized datasets/algorithms in our ADBench testbed.

Run Entire Experiments of ADBench

from adbench.run import RunPipeline

'''
Params:
suffix: file name suffix;

parallel: running either 'unsupervise', 'semi-supervise', or 'supervise' (AD) algorithms,
corresponding to the Angle I: Availability of Ground Truth Labels (Supervision);

realistic_synthetic_mode: testing on 'local', 'global', 'dependency', and 'cluster' anomalies, 
corresponding to the Angle II: Types of Anomalies;

noise_type: evaluating algorithms on 'duplicated_anomalies', 'irrelevant_features' and 'label_contamination',
corresponding to the Angle III: Model Robustness with Noisy and Corrupted Data.
'''

# return the results including [params, model_name, metrics, time_fit, time_inference]
# results are also saved automatically as a dataframe and written to a csv file in the adbench/result folder
pipeline = RunPipeline(suffix='ADBench', parallel='semi-supervise', realistic_synthetic_mode=None, noise_type=None)
results = pipeline.run()

pipeline = RunPipeline(suffix='ADBench', parallel='unsupervise', realistic_synthetic_mode='cluster', noise_type=None)
results = pipeline.run()

pipeline = RunPipeline(suffix='ADBench', parallel='supervise', realistic_synthetic_mode=None, noise_type='irrelevant_features')
results = pipeline.run()
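
The three calls above each pick one combination of the documented options. As a rough sketch, the full set of combinations could be enumerated up front with the standard library. The helper `angle_configs` is hypothetical (not part of ADBench), and it assumes that `realistic_synthetic_mode` and `noise_type` are never set at the same time, which mirrors the examples above but is not confirmed by them.

```python
from itertools import product

# Option values copied from the parameter description above.
SUPERVISION = ['unsupervise', 'semi-supervise', 'supervise']
ANOMALY_TYPES = ['local', 'global', 'dependency', 'cluster']
NOISE_TYPES = ['duplicated_anomalies', 'irrelevant_features', 'label_contamination']


def angle_configs(suffix='ADBench'):
    """Return one keyword-argument dict per (supervision, setting) pair.

    Assumption: realistic_synthetic_mode and noise_type are never set together.
    """
    # (realistic_synthetic_mode, noise_type) pairs: plain run, Angle II, Angle III
    settings = [(None, None)]
    settings += [(mode, None) for mode in ANOMALY_TYPES]
    settings += [(None, noise) for noise in NOISE_TYPES]

    return [
        dict(suffix=suffix, parallel=parallel,
             realistic_synthetic_mode=mode, noise_type=noise)
        for parallel, (mode, noise) in product(SUPERVISION, settings)
    ]


configs = angle_configs()
print(len(configs))  # 3 supervision levels x (1 + 4 + 3) settings = 24

# With adbench installed, each dict could be passed on unchanged:
# from adbench.run import RunPipeline
# for kwargs in configs:
#     results = RunPipeline(**kwargs).run()
```

Keeping the configurations as plain dicts makes it easy to filter them (for example, only the Angle III runs) before starting any long-running experiment.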

Run Your Customized Algorithms on either ADBench Datasets or Your Customized Dataset

# customized model on ADBench's datasets
from adbench.run import RunPipeline
from adbench.baseline.Customized.run import Customized

# note that you should specify the category that your customized AD algorithm belongs to
# for example, here we use Logistic Regression as the customized clf, which is a supervised algorithm
# for your own algorithm, you can get the same usage as the other baselines by modifying the fit.py, model.py, and run.py files in adbench/baseline/Customized
pipeline = RunPipeline(suffix='ADBench', parallel='supervise', realistic_synthetic_mode=None, noise_type=None)
results = pipeline.run(clf=Customized)

# customized model on customized dataset
import numpy as np
dataset = {}
dataset['X'] = np.random.randn(1000, 20)
dataset['y'] = np.random.choice([0, 1], 1000)
results = pipeline.run(dataset=dataset, clf=Customized)
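
The random labels above mark roughly half of the samples as anomalies, which is far from a typical anomaly detection setting. A minimal sketch of a more realistic toy dataset follows, assuming that `y == 1` marks anomalies and that the `{'X', 'y'}` dict layout shown above is all the pipeline needs. The helper `make_toy_dataset` and its parameters are illustrative and not part of ADBench.

```python
import numpy as np


def make_toy_dataset(n_samples=1000, n_features=20, anomaly_ratio=0.05, seed=42):
    """Build a {'X', 'y'} dict with a fixed share of labeled anomalies (y == 1)."""
    rng = np.random.default_rng(seed)
    n_anomalies = int(round(n_samples * anomaly_ratio))
    n_normal = n_samples - n_anomalies

    # Normal points: standard Gaussian. Anomalies: same spread, shifted mean.
    X_normal = rng.normal(loc=0.0, scale=1.0, size=(n_normal, n_features))
    X_anomaly = rng.normal(loc=3.0, scale=1.0, size=(n_anomalies, n_features))

    X = np.vstack([X_normal, X_anomaly])
    y = np.concatenate([np.zeros(n_normal, dtype=int),
                        np.ones(n_anomalies, dtype=int)])

    # Shuffle so anomalies are not grouped at the end of the arrays.
    order = rng.permutation(n_samples)
    return {'X': X[order], 'y': y[order]}


dataset = make_toy_dataset()
print(dataset['X'].shape, int(dataset['y'].sum()))  # (1000, 20) 50
```

The resulting dict has the same keys as the one built above, so it could be passed as `dataset=dataset` to `pipeline.run` in the same way.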

See detailed guidance of ADBench in the notebook.

Datasets

ADBench includes 57 datasets, as shown in the following Table.

We have unified all the datasets in .npz format, and you can directly access a dataset with the following script:

import numpy as np
data = np.load('adbench/datasets/Classical/6_cardio.npz', allow_pickle=True)
X, y = data['X'], data['y']
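
| Number | Data | # Samples | # Features | # Anomaly | % Anomaly | Category |
|---|---|---|---|---|---|---|
| 1 | ALOI | 49534 | 27 | 1508 | 3.04 | Image |
| 2 | annthyroid | 7200 | 6 | 534 | 7.42 | Healthcare |
| 3 | backdoor | 95329 | 196 | 2329 | 2.44 | Network |
| 4 | breastw | 683 | 9 | 239 | 34.99 | Healthcare |
| 5 | campaign | 41188 | 62 | 4640 | 11.27 | Finance |
| 6 | cardio | 1831 | 21 | 176 | 9.61 | Healthcare |
| 7 | Cardiotocography | 2114 | 21 | 466 | 22.04 | Healthcare |
| 8 | celeba | 202599 | 39 | 4547 | 2.24 | Image |
| 9 | census | 299285 | 500 | 18568 | 6.20 | Sociology |
| 10 | cover | 286048 | 10 | 2747 | 0.96 | Botany |
| 11 | donors | 619326 | 10 | 36710 | 5.93 | Sociology |
| 12 | fault | 1941 | 27 | 673 | 34.67 | Physical |
| 13 | fraud | 284807 | 29 | 492 | 0.17 | Finance |
| 14 | glass | 214 | 7 | 9 | 4.21 | Forensic |
| 15 | Hepatitis | 80 | 19 | 13 | 16.25 | Healthcare |
| 16 | http | 567498 | 3 | 2211 | 0.39 | Web |
| 17 | InternetAds | 1966 | 1555 | 368 | 18.72 | Image |
| 18 | Ionosphere | 351 | 32 | 126 | 35.90 | Oryctognosy |
| 19 | landsat | 6435 | 36 | 1333 | 20.71 | Astronautics |
| 20 | letter | 1600 | 32 | 100 | 6.25 | Image |
| 21 | Lymphography | 148 | 18 | 6 | 4.05 | Healthcare |
| 22 | magic.gamma | 19020 | 10 | 6688 | 35.16 | Physical |
| 23 | mammography | 11183 | 6 | 260 | 2.32 | Healthcare |
| 24 | mnist | 7603 | 100 | 700 | 9.21 | Image |
| 25 | musk | 3062 | 166 | 97 | 3.17 | Chemistry |
| 26 | optdigits | 5216 | 64 | 150 | 2.88 | Image |
| 27 | PageBlocks | 5393 | 10 | 510 | 9.46 | Document |
| 28 | pendigits | 6870 | 16 | 156 | 2.27 | Image |
| 29 | Pima | 768 | 8 | 268 | 34.90 | Healthcare |
| 30 | satellite | 6435 | 36 | 2036 | 31.64 | Astronautics |
| 31 | satimage-2 | 5803 | 36 | 71 | 1.22 | Astronautics |
| 32 | shuttle | 49097 | 9 | 3511 | 7.15 | Astronautics |
| 33 | skin | 245057 | 3 | 50859 | 20.75 | Image |
| 34 | smtp | 95156 | 3 | 30 | 0.03 | Web |
| 35 | SpamBase | 4207 | 57 | 1679 | 39.91 | Document |
| 36 | speech | 3686 | 400 | 61 | 1.65 | Linguistics |
| 37 | Stamps | 340 | 9 | 31 | 9.12 | Document |
| 38 | thyroid | 3772 | 6 | 93 | 2.47 | Healthcare |
| 39 | vertebral | 240 | 6 | 30 | 12.50 | Biology |
| 40 | vowels | 1456 | 12 | 50 | 3.43 | Linguistics |
| 41 | Waveform | 3443 | 21 | 100 | 2.90 | Physics |
| 42 | WBC | 223 | 9 | 10 | 4.48 | Healthcare |
| 43 | WDBC | 367 | 30 | 10 | 2.72 | Healthcare |
| 44 | Wilt | 4819 | 5 | 257 | 5.33 | Botany |
| 45 | wine | 129 | 13 | 10 | 7.75 | Chemistry |
| 46 | WPBC | 198 | 33 | 47 | 23.74 | Healthcare |
| 47 | yeast | 1484 | 8 | 507 | 34.16 | Biology |
| 48 | CIFAR10 | 5263 | 512 | 263 | 5.00 | Image |
| 49 | FashionMNIST | 6315 | 512 | 315 | 5.00 | Image |
| 50 | MNIST-C | 10000 | 512 | 500 | 5.00 | Image |
| 51 | MVTec-AD | See Table B2. | | | | Image |
| 52 | SVHN | 5208 | 512 | 260 | 5.00 | Image |
| 53 | Agnews | 10000 | 768 | 500 | 5.00 | NLP |
| 54 | Amazon | 10000 | 768 | 500 | 5.00 | NLP |
| 55 | Imdb | 10000 | 768 | 500 | 5.00 | NLP |
| 56 | Yelp | 10000 | 768 | 500 | 5.00 | NLP |
| 57 | 20newsgroups | See Table B3. | | | | NLP |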
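
The numeric columns of the dataset table can be recomputed from any loaded `X` and `y`. The sketch below assumes that `y == 1` marks anomalies. The helper `summarize` is illustrative and not part of ADBench. The arrays are stand-ins with the same shape and anomaly count as the cardio row, so the snippet runs without the dataset files.

```python
import numpy as np


def summarize(X, y):
    """Recompute the table columns for one dataset, assuming y == 1 marks anomalies."""
    n_samples, n_features = X.shape
    n_anomalies = int(np.sum(y == 1))
    return {
        '# Samples': n_samples,
        '# Features': n_features,
        '# Anomaly': n_anomalies,
        '% Anomaly': round(100.0 * n_anomalies / n_samples, 2),
    }


# Stand-in arrays; with the real files, X and y would come from np.load(...)
# as in the loading script earlier in this section.
X = np.zeros((1831, 21))
y = np.zeros(1831, dtype=int)
y[:176] = 1

print(summarize(X, y))
# {'# Samples': 1831, '# Features': 21, '# Anomaly': 176, '% Anomaly': 9.61}
```

Running the same helper on a downloaded `.npz` file is a quick sanity check that the file matches its row in the table.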

Algorithms

ADBench can serve as a great complement to the PyOD toolkit, providing additional APIs for deep learning anomaly detection algorithms. Compared to previous benchmark studies, we have a larger algorithm collection with

  1. latest unsupervised AD algorithms like DeepSVDD and ECOD;
  2. SOTA semi-supervised algorithms, including DeepSAD and DevNet;
  3. latest network architectures like ResNet in computer vision (CV) and Transformer in the natural language processing (NLP) domain (we adapt the ResNet and FTTransformer models for tabular AD in the proposed ADBench); and
  4. ensemble learning methods like LightGBM, XGBoost, and CatBoost.

The Figure below shows the algorithms (14 unsupervised, 7 semi-supervised, and 9 supervised algorithms) in ADBench.

[Figure: algorithms included in ADBench]

For each algorithm, we also list its specific implementation in the following Table. Note that the model name should be specified (especially for models provided by an external package, e.g., PyOD). The following code shows how to import AD models. Please see the Table for the complete list of AD models included in ADBench and their import methods.

# Directly import AD algorithms from the existing toolkits like PyOD
from adbench.baseline.PyOD import PYOD
model = PYOD(seed=42, model_name='XGBOD')  # initialization
model.fit(X_train, y_train)  # fit
score = model.predict_score(X_test)  # predict

# Import deep learning AD algorithms from our ADBench
from adbench.baseline.PReNet.run import PReNet
model = PReNet(seed=42)
model.fit(X_train, y_train)  # fit
score = model.predict_score(X_test)  # predict
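
The scores returned by `predict_score` are usually compared against ground-truth labels with a ranking metric. As a rough sketch, assuming that higher scores mean "more anomalous" and that labels use 1 for anomalies, ROC AUC can be computed from ranks with the standard library alone. The function `roc_auc` is illustrative and not part of ADBench, and this section does not state which metrics ADBench itself reports.

```python
def roc_auc(y_true, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) formulation, with tie handling."""
    pairs = sorted(zip(scores, y_true))  # ascending by score
    n = len(pairs)
    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError('ROC AUC needs at least one normal and one anomalous sample')

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the block of tied scores that starts at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j

    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


y_test = [0, 0, 1, 1]
score = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_test, score))  # 0.75
```

In practice a library routine such as scikit-learn's `roc_auc_score` would be the usual choice; the hand-written version only makes the computation explicit.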

| Model | Year | Type | DL | Import | Source |
|---|---|---|---|---|---|
| PCA | Before 2017 | Unsup | | `from adbench.baseline.PyOD import PYOD` | Link |
| OCSVM | Before 2017 | Unsup | | `from adbench.baseline.PyOD import PYOD` | Link |
| LOF | Before 2017 | Unsup | | `from adbench.baseline.PyOD import PYOD` | Link |
| CBLOF | Before 2017 | Unsup | | `from adbench.baseline.PyOD import PYOD` | Link |
| COF | Before 2017 | Unsup | | `from adbench.baseline.PyOD import PYOD` | Link |
| HBOS | Before 2017 | Unsup | | `from adbench.baseline.PyOD import PYOD` | Link |
| KNN | Before 2017 | Unsup | | `from adbench.baseline.PyOD import PYOD` | Link |
| SOD | Before 2017 | Unsup | | `from adbench.baseline.PyOD import PYOD` | Link |
| COPOD | 2020 | Unsup | | `from adbench.baseline.PyOD import PYOD` | Link |
| ECOD | 2022 | Unsup | | `from adbench.baseline.PyOD import PYOD` | Link |
| IForest† | Before 2017 | Unsup | | `from adbench.baseline.PyOD import PYOD` | Link |
| LODA† | Before 2017 | Unsup | | `from adbench.baseline.PyOD import PYOD` | Link |
| DeepSVDD | 2018 | Unsup | | `from adbench.baseline.PyOD import PYOD` | Link |
| DAGMM | 2018 | Unsup | | `from adbench.baseline.DAGMM.run import DAGMM` | Link |
| GANomaly | 2018 | Semi | | `from adbench.baseline.GANomaly.run import GANomaly` | Link |
| XGBOD† | 2018 | Semi | | `from adbench.baseline.PyOD import PYOD` | Link |
| DeepSAD | 2019 | Semi | | `from adbench.baseline.DeepSAD.src.run import DeepSAD` | Link |
| REPEN | 2018 | Semi | | `from adbench.baseline.REPEN.run import REPEN` | Link |
| DevNet | 2019 | Semi | | `from adbench.baseline.DevNet.run import DevNet` | Link |
| PReNet | 2020 | Semi | | `from adbench.baseline.PReNet.run import PReNet` | / |
| FEAWAD | 2021 | Semi | | `from adbench.baseline.FEAWAD.run import FEAWAD` | Link |
| NB | Before 2017 | Sup | | `from adbench.baseline.Supervised import supervised` | Link |
| SVM | Before 2017 | Sup | | `from adbench.baseline.Supervised import supervised` | Link |
| MLP | Before 2017 | Sup | | `from adbench.baseline.Supervised import supervised` | Link |
| RF† | Before 2017 | Sup | | `from adbench.baseline.Supervised import supervised` | Link |
| LGB† | 2017 | Sup | | `from adbench.baseline.Supervised import supervised` | Link |
| XGB† | Before 2017 | Sup | | `from adbench.baseline.Supervised import supervised` | Link |
| CatB† | 2019 | Sup | | `from adbench.baseline.Supervised import supervised` | Link |
| ResNet | 2019 | Sup | | `from adbench.baseline.FTTransformer.run import FTTransformer` | Link |
| FTTransformer | 2019 | Sup | | `from adbench.baseline.FTTransformer.run import FTTransformer` | Link |
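
The models listed above are used through the same `fit(X_train, y_train)` and `predict_score(X_test)` calls shown in the import examples. Below is a minimal sketch of a detector that follows that calling convention, which may help when preparing a customized algorithm. `MeanDistanceDetector` is a hypothetical toy class, not part of ADBench, and the `seed` and `model_name` constructor arguments are assumptions modeled on the examples above rather than a documented requirement.

```python
import numpy as np


class MeanDistanceDetector:
    """Toy unsupervised detector: score = standardized distance from the training mean.

    Hypothetical example class, not part of ADBench. The constructor
    arguments are assumptions modeled on the import examples.
    """

    def __init__(self, seed=42, model_name='MeanDistance'):
        self.seed = seed
        self.model_name = model_name

    def fit(self, X_train, y_train=None):
        # Labels are accepted for interface compatibility but ignored here.
        self.mean_ = X_train.mean(axis=0)
        self.std_ = X_train.std(axis=0) + 1e-12  # avoid division by zero
        return self

    def predict_score(self, X_test):
        # Higher score = farther from the training mean = more anomalous.
        z = (X_test - self.mean_) / self.std_
        return np.linalg.norm(z, axis=1)


rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 5))
# One test point at the origin (typical) and one far away (anomalous).
X_test = np.vstack([np.zeros((1, 5)), np.full((1, 5), 10.0)])

model = MeanDistanceDetector(seed=42).fit(X_train)
score = model.predict_score(X_test)
print(score[1] > score[0])  # True
```

A real customized model would replace the body of `fit` and `predict_score` while keeping the same method names and return shapes.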