Awesome
Towards Fair Graph Anomaly Detection: Problem, New Datasets, and Evaluation
NEW! Accepted at CIKM 2024.
Our datasets (Reddit, Twitter) are publicly available through this link (alt link) as PyTorch Geometric datasets.
Environment
- python=3.8
- pytorch
- pyg
- networkx
- scipy
Please note that our work builds upon PyGOD. However, our work uses version 0.3 due to the different implementation of CoLA. In PyGOD v1.1 as of May 2024, their implementation of CoLA is different due to the use of neighbour sampling, instead of random walk sampling that is present in v0.3 and as described in the CoLA paper. We hope this clarifies why an older version of PyGOD is included in our repository. As a result, we have included the PyGOD v0.3 repo here.
Our implementations
We implemented the fairness regularisers (FairOD, HIN, correlation) in the utils.py
file that takes in the model’s raw anomaly score or loss and the sensitive attributes to calculate a fairness loss.
We also implemented the ADCG regulariser (to improve equality of odds) in the same utils.py
file.
We have modified the files for DOMINANT, CONAD, CoLA, DONE, AdONE, and included a new file for VGOD to include fairness regularizers. Please see the correspondig fit_with_fairness()
method that calls the above fairness regularisers while training the model. If the ADCG regulariser is used, the ideal DCG score is also calculated in this method.
Sample runs
Please see fairGAD/test_fair_fitting.py
. This file is our main driver file to obtain the results for all tests used in this paper.
User Account Inquiry and Removal
Please contact nnnk [at] gatech [dot] edu
with an email titled "FairGAD - Account Inquiry and Removal" including the your username and platform (Reddit/Twitter) to check if your account is used in our dataset and wish to be removed from it.
Data Use Agreement
If you would like to use our dataset, please contact the same email address above with an email titled "FairGAD - Data Use Agreement" and we will get in touch with you.
Potential Twitter bots
We have included a list of node indexes that correspond to Twitter accounts that have a Botometer "raw_overall" score of greater than 0.9 that may be possible Twitter bot accounts.
Citing FairGAD
Accepted at CIKM 2024. Citation to be updated.
We would appreciate a citation to the following paper if you have used our work:
@online{neoFairGraphAnomaly2024,
title = {Towards {{Fair Graph Anomaly Detection}}: {{Problem}}, {{New Datasets}}, and {{Evaluation}}},
author = {Neo, Neng Kai Nigel and Lee, Yeon-Chang and Jin, Yiqiao and Kim, Sang-Wook and Kumar, Srijan},
date = {2024-02-25},
eprint = {2402.15988},
eprinttype = {arxiv},
eprintclass = {cs},
url = {http://arxiv.org/abs/2402.15988},
}