Improving Generalizability of Graph Anomaly Detection Models via Data Augmentation (AugAN)

1. Introduction

This repository contains the code for the paper "Improving Generalizability of Graph Anomaly Detection Models via Data Augmentation" (TKDE 2023).

Update - Graph Datasets with Distribution Shifts !!!

We created graph datasets with distribution shifts using the graph split code in this repository. The partitioned subgraphs (e.g., AD_dblp_sub0, AD_dblp_sub1, AD_dblp_sub2, AD_dblp_sub3) share similar characteristics yet follow different distributions.

Although our paper is about graph anomaly detection (i.e., detecting rare categories on graphs), the released datasets can also be used for node classification tasks.

We believe the released datasets are also helpful for graph out-of-distribution generalization (i.e., domain generalization on graphs) and graph domain adaptation tasks.

Please refer to our paper for the details of building graph datasets with distribution shifts.

2. Usage

Requirements:

Datasets:

Users can create their own datasets with the provided code.

Please check the data statistics and control the number of overlapping nodes across the sub-graphs. The processed datasets are stored in the ./sub_G_datasets/ folder.
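One way to check node overlap between sub-graphs is sketched below. It is only an illustration, not part of the released code: the function name pairwise_overlap is hypothetical, and it assumes you have kept, for each sub-graph, the list of original node IDs it was sampled from (the '.mat' files themselves are not assumed to store these IDs).

```python
from itertools import combinations


def pairwise_overlap(subgraph_nodes):
    """Return {(name_a, name_b): (shared_count, jaccard)} for every pair of sub-graphs.

    subgraph_nodes is a hypothetical mapping from a sub-graph name to an
    iterable of original node IDs; it is an assumption of this sketch.
    """
    node_sets = {name: set(ids) for name, ids in subgraph_nodes.items()}
    report = {}
    for a, b in combinations(sorted(node_sets), 2):
        shared = node_sets[a] & node_sets[b]
        union = node_sets[a] | node_sets[b]
        # Jaccard index: shared nodes divided by all distinct nodes in the pair.
        jaccard = len(shared) / len(union) if union else 0.0
        report[(a, b)] = (len(shared), jaccard)
    return report
```

A low shared count and Jaccard index for every pair indicates that the sub-graphs are largely disjoint.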

Data Format:

The input data for AugAN is a '.mat' file with three fields: 'gnd' (ground-truth labels), 'Attributes' (node attributes), and 'Network' (graph structure).
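The format above can be read with SciPy, as in the minimal sketch below. The helper names load_augan_mat and describe are hypothetical and not part of this repository. The sketch also assumes that 'gnd' holds binary labels with 1 marking an anomalous node, which should be verified against the actual files.

```python
import numpy as np
import scipy.io as sio
import scipy.sparse as sp


def load_augan_mat(path):
    """Load a '.mat' graph file with 'Network', 'Attributes', and 'gnd' fields."""
    data = sio.loadmat(path)
    adj = sp.csr_matrix(data["Network"])       # graph structure (adjacency matrix)
    feats = sp.csr_matrix(data["Attributes"])  # one row of attributes per node
    labels = np.asarray(data["gnd"]).ravel().astype(int)  # flatten to shape (n,)
    return adj, feats, labels


def describe(adj, feats, labels):
    """Basic statistics; assumes labels are binary with 1 = anomaly."""
    return {
        "num_nodes": adj.shape[0],
        "nonzero_adjacency_entries": int(adj.nnz),
        "num_attributes": feats.shape[1],
        "num_anomalies": int(labels.sum()),
        "anomaly_ratio": float(labels.mean()),
    }


# Usage (the file path is an assumption based on the names mentioned in this README):
# adj, feats, labels = load_augan_mat("./sub_G_datasets/AD_dblp_sub0.mat")
# print(describe(adj, feats, labels))
```

Converting both matrices to CSR keeps the loader indifferent to whether the file stores them in dense or sparse form.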

Example:

3. Citation

Please cite the paper if you use the code or any resources in this repository:

@article{zhou2023improving,
  title={Improving generalizability of graph anomaly detection models via data augmentation},
  author={Zhou, Shuang and Huang, Xiao and Liu, Ninghao and Zhou, Huachi and Chung, Fu-Lai and Huang, Long-Kai},
  journal={IEEE Transactions on Knowledge and Data Engineering},
  year={2023},
  publisher={IEEE}
}