Anomaly Detection in Networks via Score-Based Generative Models

This is a repository for an ICML 2023 SPIGM Workshop paper and my Master's Thesis at Skoltech.

Authors: Dmitrii Gavrilev, Evgeny Burnaev (research advisor)

In this project, we use GDSS as a generative model.

Abstract

Node outlier detection in attributed graphs is a challenging problem for which no method works well across different datasets. Motivated by the state-of-the-art results of score-based models in graph generative modeling, we propose applying them to this problem. Our method achieves competitive results on small-scale graphs. We provide an empirical analysis of the Dirichlet energy and show that generative models might struggle to reconstruct it accurately.
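For readers unfamiliar with the Dirichlet energy mentioned in the abstract, here is a minimal sketch of the standard definition, tr(XᵀLX) with the unnormalized Laplacian L = D - A. The function name `dirichlet_energy` and the toy graph are illustrative assumptions; the repository may use a different normalization or implementation.

```python
import numpy as np


def dirichlet_energy(adj: np.ndarray, x: np.ndarray) -> float:
    """Return tr(X^T L X) for a symmetric adjacency matrix `adj`.

    Assumption: unnormalized graph Laplacian L = D - A. This equals
    0.5 * sum_ij A_ij * ||x_i - x_j||^2, so it grows when connected
    nodes have dissimilar features.
    """
    degree = np.diag(adj.sum(axis=1))
    laplacian = degree - adj
    return float(np.trace(x.T @ laplacian @ x))


# Toy path graph 0 - 1 - 2 with one scalar feature per node.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
x = np.array([[0.0], [1.0], [3.0]])

energy = dirichlet_energy(adj, x)  # (0 - 1)^2 + (1 - 3)^2 = 5
```

A low energy means the features vary smoothly over the edges; comparing the energy of an original graph with that of its reconstruction is one way to see how well smoothness is preserved.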

Prerequisites

Usage

python run_benchmark.py trains GDSS with random hyperparameters on a chosen dataset, runs inference with our methods, and repeats this pipeline 20 times. Inference produces a .npy file containing intermediate calculations.

Arguments:

We evaluate our methods in a notebook by processing intermediate calculations from .npy files. See an example of training, inference and evaluation in Colab:

(Matrix distance as a dissimilarity measure) <a target="_blank" href="https://colab.research.google.com/github/realfolkcode/GraphDiffusionAnomaly/blob/main/notebooks/gda_benchmark.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a>

(Shift in energy as a dissimilarity measure) <a target="_blank" href="https://colab.research.google.com/github/realfolkcode/GraphDiffusionAnomaly/blob/main/notebooks/gda_benchmark_energy.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a>
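Outside the notebooks, a .npy file can be inspected with NumPy before any further processing. The sketch below only loads an array and reports its shape and dtype; the helper name `load_intermediate` and the example path are hypothetical, and the layout of the stored array is not described here, so consult the notebooks for how it is interpreted.

```python
import numpy as np


def load_intermediate(path):
    """Load a .npy file and print basic information about its contents.

    Assumption: the file holds a plain numeric array. If it stores
    pickled Python objects, np.load needs allow_pickle=True instead.
    """
    arr = np.load(path)
    print(f"{path}: shape={arr.shape}, dtype={arr.dtype}")
    return arr


# Hypothetical usage; replace the path with a file written by the benchmark:
# results = load_intermediate("path/to/result.npy")
```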

Checkpoints

Optionally, you can download the model checkpoints here.

Unzip them into ./checkpoint/{dataset_name}/ and run the benchmark with the --skip_training True option.
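If you prefer to unpack the archive from Python, the following sketch extracts a zip file into the expected directory. The helper name `extract_checkpoints`, the archive filename, and the dataset name in the usage comment are placeholders, not names taken from this repository.

```python
import zipfile
from pathlib import Path


def extract_checkpoints(archive_path, dataset_name, root="./checkpoint"):
    """Extract a checkpoint archive into {root}/{dataset_name}/."""
    target = Path(root) / dataset_name
    # Create the target directory if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    return target


# Hypothetical usage; substitute the downloaded archive and your dataset:
# extract_checkpoints("checkpoints.zip", "my_dataset")
```

After extraction, run the benchmark with --skip_training True as described above.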