<p align="center"> <br> <a href="https://image.flaticon.com/icons/svg/1671/1671517.svg"> <img src="https://github.com/safe-graph/UGFraud/blob/master/UGFraud_logo.png" width="400"/> </a> <br> <p> <p align="center"> <a href="https://travis-ci.org/github/safe-graph/UGFraud"> <img alt="Building" src="https://travis-ci.org/safe-graph/UGFraud.svg?branch=master"> </a> <a href="https://github.com/safe-graph/UGFraud/blob/master/LICENSE"> <img alt="GitHub" src="https://img.shields.io/github/license/safe-graph/UGFraud"> </a> <a href="https://pepy.tech/project/ugfraud"> <img alt="Downloads" src="https://pepy.tech/badge/ugfraud"> </a> <a href="https://pypi.org/project/UGFraud/"> <img alt="Pypi version" src="https://img.shields.io/pypi/v/ugfraud"> </a> </p> <h3 align="center"> <p>An Unsupervised Graph-based Toolbox for Fraud Detection </h3>

Introduction: UGFraud is an unsupervised graph-based fraud detection toolbox that integrates several state-of-the-art graph-based fraud detection algorithms. It can be applied to bipartite graphs (e.g., a user-product graph), and it can estimate the suspiciousness of both nodes and edges. The implemented models are listed under Implemented Models.

The toolbox incorporates Markov Random Field (MRF)-based, dense-block detection-based, and SVD-based algorithms. MRF-based algorithms need only the graph structure and the prior suspiciousness scores of the nodes as input. For the other algorithms, the graph structure is the only input.

We also provide a deep graph-based fraud detection toolbox, which implements state-of-the-art graph neural network-based fraud detectors.

We welcome contributions that add new fraud detectors or extend the features of the toolbox. Some of the planned features are listed in the TODO List.

If you use the toolbox in your project, please cite the paper below and the algorithms you used:

@inproceedings{dou2020robust,
  title={Robust Spammer Detection by Nash Reinforcement Learning},
  author={Dou, Yingtong and Ma, Guixiang and Yu, Philip S and Xie, Sihong},
  booktitle={Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
  year={2020}
}

Installation

You can install UGFraud from PyPI:

pip install UGFraud

or download and install it from GitHub:

git clone https://github.com/safe-graph/UGFraud.git
cd UGFraud
python setup.py install

Dataset

The demo data is not the complete dataset: the rating and date information are missing. The rating information is only used in the ZooBP demo. If you need the complete data to run the demos, please email bdscsafegraph@gmail.com to obtain it from the Yelp Spam Review Dataset. The metadata.gz file in /UGFraud/Yelp_Data/YelpChi includes:

User Guide

Running the example code

You can find the implemented models in the /UGFraud/Demo/ directory. For example, you can run fBox using:

python eval_fBox.py 

Running on your datasets

Check out the data_to_network_graph function in /UGFraud/Demo/demo_pre.py to convert your data into a networkx graph.

To use your own data, you have to provide at least the following information:

'user_id':{
        'product_id':
                {
                'label': 1
                }
        }

You can use the dict_to_networkx(graph_dict) function from the /Utils/helper.py file to convert your graph_dict into a networkx graph. For more details, please see data_to_network_graph.py.
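As a rough illustration of the nested layout above, the sketch below flattens such a dict into (user, product, attributes) triples using only the standard library. The function name graph_dict_to_edges, the type aliases, and the sample ids and label values are hypothetical and are not part of UGFraud; the toolbox's own conversion helper may behave differently.

```python
from typing import Any, Dict, List, Tuple

# Assumed layout: user_id -> product_id -> edge attributes (at least 'label').
GraphDict = Dict[str, Dict[str, Dict[str, Any]]]
EdgeTriple = Tuple[str, str, Dict[str, Any]]


def graph_dict_to_edges(graph_dict: GraphDict) -> List[EdgeTriple]:
    """Flatten the nested dict into (user, product, attrs) triples."""
    edges: List[EdgeTriple] = []
    for user_id, products in graph_dict.items():
        for product_id, attrs in products.items():
            if "label" not in attrs:
                raise ValueError(
                    f"edge ({user_id}, {product_id}) has no 'label'"
                )
            # Copy attrs so changes to the edge list do not touch the input.
            edges.append((user_id, product_id, dict(attrs)))
    return edges


# Made-up sample data, for illustration only.
example = {
    "u1": {"p1": {"label": 1}, "p2": {"label": 0}},
    "u2": {"p1": {"label": 0}},
}
edges = graph_dict_to_edges(example)
```

Triples of this shape are what networkx.Graph.add_edges_from accepts, so the resulting list can be handed to a networkx graph directly once networkx is imported.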

The structure of code

The /UGFraud repository is organized as follows:

Implemented Models

| Model | Paper | Venue | Reference |
|-------|-------|-------|-----------|
| SpEagle | Collective Opinion Spam Detection: Bridging Review Networks and Metadata | KDD 2015 | BibTex |
| GANG | GANG: Detecting Fraudulent Users in Online Social Networks via Guilt-by-Association on Directed Graph | ICDM 2017 | BibTex |
| fBox | Spotting Suspicious Link Behavior with fBox: An Adversarial Perspective | ICDM 2014 | BibTex |
| Fraudar | FRAUDAR: Bounding Graph Fraud in the Face of Camouflage | KDD 2016 | BibTex |
| ZooBP | ZooBP: Belief Propagation for Heterogeneous Networks | VLDB 2017 | BibTex |
| SVD | Singular value decomposition and least squares solutions | - | BibTex |
| Prior | Evaluating suspiciousness based on prior information | - | - |

Model Comparison

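| Model | Application | Graph Type | Model Type |
|-------|-------------|------------|------------|
| SpEagle | Review Spam | Tripartite | MRF |
| GANG | Social Sybil | Bipartite | MRF |
| fBox | Social Fraudster | Bipartite | SVD |
| Fraudar | Social Fraudster | Bipartite | Dense-block |
| ZooBP | E-commerce Fraud | Tripartite | MRF |
| SVD | Dimension Reduction | Bipartite | SVD |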
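
To give a feel for the dense-block model type in the comparison above, here is a minimal sketch of greedy peeling on an unweighted bipartite graph: repeatedly delete the lowest-degree node and remember the densest subgraph seen, where density is edges per node. This is a simplified illustration, not the toolbox's Fraudar implementation; the published method also re-weights edges to resist camouflage, which is omitted here. The function name greedy_dense_block and the sample ids are hypothetical, and user and product ids are assumed to be distinct strings.

```python
from typing import Dict, Iterable, Set, Tuple


def greedy_dense_block(
    edges: Iterable[Tuple[str, str]]
) -> Tuple[Set[str], float]:
    """Return the densest node set found by greedy peeling and its density."""
    adj: Dict[str, Set[str]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Each edge is counted once from each endpoint.
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    best_nodes: Set[str] = set(adj)
    best_density = n_edges / len(adj) if adj else 0.0

    while len(adj) > 1:
        # Peel the lowest-degree node; ties are broken by name.
        victim = min(adj, key=lambda n: (len(adj[n]), n))
        neighbours = adj.pop(victim)
        n_edges -= len(neighbours)
        for nbr in neighbours:
            adj[nbr].discard(victim)
        density = n_edges / len(adj)
        if density > best_density:
            best_density = density
            best_nodes = set(adj)
    return best_nodes, best_density


# Made-up data: accounts a1..a3 all review p1..p3; h1..h3 review once.
sample = [(a, p) for a in ("a1", "a2", "a3") for p in ("p1", "p2", "p3")]
sample += [("h1", "p1"), ("h2", "q1"), ("h3", "q2")]
block, density = greedy_dense_block(sample)
```

On this sample the peeling strips the sparsely connected nodes first, leaving the three accounts and three products that form the fully connected block.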

TODO List

How to Contribute

You are welcome to contribute to this open-source toolbox. Currently, you can create issues or send an email to bdscsafegraph@gmail.com with inquiries.