<p align="center">
  <br>
  <a href="https://image.flaticon.com/icons/svg/1671/1671517.svg">
    <img src="https://github.com/safe-graph/DGFraud-TF2/blob/main/logo.png" width="550"/>
  </a>
  <br>
</p>
<p align="center">
  <a href="https://travis-ci.com/github/safe-graph/DGFraud-TF2"> <img alt="travis-ci" src="https://travis-ci.com/safe-graph/DGFraud-TF2.svg?token=wicswr4X2g4v8jddTpUv&branch=main"> </a>
  <a href="https://www.tensorflow.org/install"> <img alt="Tensorflow" src="https://img.shields.io/badge/tensorflow-2.X-orange"> </a>
  <a href="https://www.python.org/"> <img alt="Python" src="https://img.shields.io/badge/python-3.6%20%7C%203.7%20%7C%203.8%20%7C%203.9-blue"> </a>
  <a href="https://github.com/safe-graph/DGFraud-TF2/archive/main.zip"> <img alt="PRs" src="https://img.shields.io/badge/PRs-welcome-brightgreen.svg"> </a>
  <a href="https://github.com/safe-graph/DGFraud-TF2/pulls"> <img alt="GitHub release" src="https://img.shields.io/github/v/release/safe-graph/DGFraud-TF2?include_prereleases"> </a>
</p>
<h3 align="center">A Deep Graph-based Toolbox for Fraud Detection in TensorFlow 2.X</h3>

Introduction | Useful Resources | Installation | Datasets | User Guide | Implemented Models | How to Contribute

Introduction

DGFraud-TF2 is a Graph Neural Network (GNN) based toolbox for fraud detection. It is the TensorFlow 2.X version of DGFraud, which is implemented in TF 1.X. It integrates implementations and comparisons of state-of-the-art GNN-based fraud detection models. An introduction to the implemented models can be found here.

We welcome contributions to this repo, such as adding new fraud detectors and extending the features of the toolbox.

If you use the toolbox in your project, please cite the paper below and the algorithms you used:

CIKM'20 (PDF)

@inproceedings{dou2020enhancing,
  title={Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters},
  author={Dou, Yingtong and Liu, Zhiwei and Sun, Li and Deng, Yutong and Peng, Hao and Yu, Philip S},
  booktitle={Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM'20)},
  year={2020}
}

Useful Resources

Installation

git clone https://github.com/safe-graph/DGFraud-TF2.git
cd DGFraud-TF2
python setup.py install

Requirements

* python>=3.6
* tensorflow>=2.0
* numpy>=1.16.4
* scipy>=1.2.0
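
To confirm that an environment satisfies the minimum versions listed above, a small check along the following lines can help. This is a minimal sketch and not part of the toolbox: the helper names (`version_tuple`, `meets_minimum`, `check_environment`) are hypothetical, and the version parsing is deliberately simple (it reads the leading digits of the first three components and ignores suffixes such as `rc1`).

```python
import sys
from importlib import import_module
from itertools import takewhile

# Minimum versions copied from the Requirements list above.
MIN_VERSIONS = {"tensorflow": "2.0", "numpy": "1.16.4", "scipy": "1.2.0"}


def version_tuple(version, width=3):
    """Convert '1.16.4' or '2.4.0rc1' to a zero-padded tuple of ints."""
    parts = []
    for piece in version.split(".")[:width]:
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts + [0] * (width - len(parts)))


def meets_minimum(installed, required):
    """Compare versions numerically, so that '1.9.0' < '1.16.4'."""
    return version_tuple(installed) >= version_tuple(required)


def check_environment():
    """Return a list of human-readable problems (empty if all is well)."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("python>=3.6 is required")
    for name, minimum in MIN_VERSIONS.items():
        try:
            module = import_module(name)
        except ImportError:
            problems.append("{} is not installed".format(name))
            continue
        installed = getattr(module, "__version__", "0")
        if not meets_minimum(installed, minimum):
            problems.append(
                "{} {} found, >={} required".format(name, installed, minimum)
            )
    return problems
```

Calling `check_environment()` returns an empty list when every requirement is met. Note that importing TensorFlow can take a few seconds.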

Datasets

DBLP

We use the pre-processed DBLP dataset from Jhy1993/HAN. You can run FdGars, Player2Vec, GeniePath and GEM on the DBLP dataset. Unzip the archive before using the dataset:

cd dataset
unzip DBLP4057_GAT_with_idx_tra200_val_800.zip
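
If the `unzip` command is not available (for example on Windows), the same step can be done from Python with the standard `zipfile` module. The sketch below is an illustration only; `extract_dataset` is a hypothetical helper, not a function shipped with the toolbox.

```python
import zipfile
from pathlib import Path


def extract_dataset(archive_path, target_dir=None):
    """Extract a dataset archive and return the sorted names of its members.

    By default the files are extracted next to the archive, which mirrors
    running `unzip` inside the dataset directory.
    """
    archive_path = Path(archive_path)
    target_dir = Path(target_dir) if target_dir is not None else archive_path.parent
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
        return sorted(archive.namelist())
```

For example, `extract_dataset("dataset/DBLP4057_GAT_with_idx_tra200_val_800.zip")` extracts the DBLP archive into the `dataset` directory when run from the repository root.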

Example dataset

We implement example graphs for SemiGNN, GAS and GEM in data_loader.py, because those models require unique graph structures or node types that cannot be found in open-source datasets.

Yelp dataset

For GraphConsis and GraphSAGE, we preprocessed the Yelp Spam Review Dataset with reviews as nodes and three relations as edges.

The dataset, in .mat format, is located at /dataset/YelpChi.zip. The .mat file includes:

The YelpChi data preprocessing details can be found in our CIKM'20 paper. To get the complete metadata of the Yelp dataset, please email ytongdou@gmail.com.
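
To see what a .mat file contains after unzipping the archive, it can be inspected with `scipy.io.loadmat`. The following is a minimal sketch that makes no assumption about the variable names stored in the file; `list_mat_variables` is a hypothetical helper, and the path passed to it (for instance `dataset/YelpChi.mat`) is an assumption about where the archive was extracted.

```python
from scipy.io import loadmat


def list_mat_variables(mat_path):
    """Return {variable name: shape} for the variables in a MATLAB .mat file."""
    contents = loadmat(mat_path)
    summary = {}
    for name, value in contents.items():
        # loadmat adds metadata entries such as __header__ and __version__.
        if name.startswith("__"):
            continue
        # Dense arrays and sparse matrices both expose .shape;
        # fall back to the type name for anything else.
        summary[name] = getattr(value, "shape", type(value).__name__)
    return summary
```

Printing the returned dictionary shows each stored variable with its shape, which is a quick way to tell the relation matrices, features and labels apart.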

User Guide

Running the example code

You can find the implemented models in the algorithms directory. For example, you can run Player2Vec using:

python Player2Vec_main.py 

You can specify parameters for models when running the code.

Running on your datasets

Have a look at the load_data_dblp() function in utils/utils.py for an example.

In order to use your own data, you have to provide:

You can specify a dataset as follows:

python xx_main.py --dataset your_dataset 

or by editing xx_main.py
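
How a `--dataset` flag like the one above might be parsed is sketched below with `argparse`. This is an illustration only, not the code in xx_main.py: the `build_parser` helper and the default value `dblp` are assumptions made for the example.

```python
import argparse


def build_parser():
    """Build a command-line parser with a single --dataset option."""
    parser = argparse.ArgumentParser(
        description="Hypothetical entry point for a fraud detection model"
    )
    parser.add_argument(
        "--dataset",
        default="dblp",  # assumed default, chosen for this example
        help="name of the dataset to load",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print("Selected dataset:", args.dataset)
```

Saved as a script, running it with `--dataset your_dataset` prints the selected name. Editing the `default` value has the same effect as passing the flag, which corresponds to the second option mentioned above.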

The structure of code

The repository is organized as follows:

Implemented Models

Model Source

| Model | Paper | Venue | Reference |
| --- | --- | --- | --- |
| SemiGNN | A Semi-supervised Graph Attentive Network for Financial Fraud Detection | ICDM 2019 | BibTex |
| Player2Vec | Key Player Identification in Underground Forums over Attributed Heterogeneous Information Network Embedding Framework | CIKM 2019 | BibTex |
| GAS | Spam Review Detection with Graph Convolutional Networks | CIKM 2019 | BibTex |
| FdGars | FdGars: Fraudster Detection via Graph Convolutional Networks in Online App Review System | WWW 2019 | BibTex |
| GeniePath | GeniePath: Graph Neural Networks with Adaptive Receptive Paths | AAAI 2019 | BibTex |
| GEM | Heterogeneous Graph Neural Networks for Malicious Account Detection | CIKM 2018 | BibTex |
| GraphSAGE | Inductive Representation Learning on Large Graphs | NIPS 2017 | BibTex |
| GraphConsis | Alleviating the Inconsistency Problem of Applying Graph Neural Network to Fraud Detection | SIGIR 2020 | BibTex |
| HACUD | Cash-Out User Detection Based on Attributed Heterogeneous Information Network with a Hierarchical Attention Mechanism | AAAI 2019 | BibTex |

Model Comparison

| Model | Application | Graph Type | Base Model |
| --- | --- | --- | --- |
| SemiGNN | Financial Fraud | Heterogeneous | GAT, LINE, DeepWalk |
| Player2Vec | Cyber Criminal | Heterogeneous | GAT, GCN |
| GAS | Opinion Fraud | Heterogeneous | GCN, GAT |
| FdGars | Opinion Fraud | Homogeneous | GCN |
| GeniePath | Financial Fraud | Homogeneous | GAT |
| GEM | Financial Fraud | Heterogeneous | GCN |
| GraphSAGE | Opinion Fraud | Homogeneous | GraphSAGE |
| GraphConsis | Opinion Fraud | Heterogeneous | GraphSAGE |
| HACUD | Financial Fraud | Heterogeneous | GAT |
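
When choosing a model for a new problem, the comparison above can be treated as a small lookup table. The sketch below encodes the Model, Application and Graph Type columns and filters on them; `select_models` is a hypothetical helper written for this example and is not part of the toolbox.

```python
# (model, application, graph type), copied from the Model Comparison table.
MODELS = [
    ("SemiGNN", "Financial Fraud", "Heterogeneous"),
    ("Player2Vec", "Cyber Criminal", "Heterogeneous"),
    ("GAS", "Opinion Fraud", "Heterogeneous"),
    ("FdGars", "Opinion Fraud", "Homogeneous"),
    ("GeniePath", "Financial Fraud", "Homogeneous"),
    ("GEM", "Financial Fraud", "Heterogeneous"),
    ("GraphSAGE", "Opinion Fraud", "Homogeneous"),
    ("GraphConsis", "Opinion Fraud", "Heterogeneous"),
    ("HACUD", "Financial Fraud", "Heterogeneous"),
]


def select_models(application=None, graph_type=None):
    """Return model names matching the given criteria, in table order.

    A criterion left as None is not applied.
    """
    return [
        name
        for name, app, graph in MODELS
        if (application is None or app == application)
        and (graph_type is None or graph == graph_type)
    ]
```

For instance, `select_models(application="Opinion Fraud", graph_type="Homogeneous")` narrows the list to the models that work on a single-relation review graph.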

How to Contribute

You are welcome to contribute to this open-source toolbox. Currently, you can create a PR or email bdscsafegraph@gmail.com with inquiries.