
Botspot++: A Hierarchical Deep Ensemble Model for Bots Install Fraud Detection in Mobile Advertising (https://doi.org/10.1145/3476107)

Datasets

To evaluate our proposed model more comprehensively, we built three datasets covering different time periods, which are available from https://drive.google.com/drive/folders/1CBIOxCtI5Ztx-E5Ua7nO0UjdEabJM2nC?usp=sharing. The statistics of the three offline datasets are detailed below.

Dataset	#Dev	#Chan-Camp	#Normal Install(Train, Test)	#Bots Install(Train, Test)
dataset-1	1676101	1347	1245650, 162960	270815, 20560
dataset-2	1313073	1190	1049610, 195792	139349, 9596
dataset-3	1299895	1139	1153705, 181437	77708, 12016
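
The share of bot installs in each split follows directly from the table. The snippet below is a small sketch for checking it, not part of the repository: `STATS` and `bot_ratio` are names made up for this illustration, and the counts are copied from the table above.

```python
# Install counts copied from the statistics table above, as (train, test) pairs.
STATS = {
    "dataset-1": {"normal": (1245650, 162960), "bots": (270815, 20560)},
    "dataset-2": {"normal": (1049610, 195792), "bots": (139349, 9596)},
    "dataset-3": {"normal": (1153705, 181437), "bots": (77708, 12016)},
}


def bot_ratio(name, split):
    """Fraction of installs labeled as bots in the given split ("train" or "test")."""
    idx = {"train": 0, "test": 1}[split]
    normal = STATS[name]["normal"][idx]
    bots = STATS[name]["bots"][idx]
    return bots / (normal + bots)


if __name__ == "__main__":
    for name in STATS:
        train = bot_ratio(name, "train")
        test = bot_ratio(name, "test")
        print(f"{name}: train={train:.4f} test={test:.4f}")
```

In every split, bot installs are the minority class.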

Requirements

PyTorch 1.6.0
LightGBM 3.0.0
Python 3.6
scikit-learn 0.23.2
NumPy 1.19.1

Usage

git clone https://github.com/mobvistaresearch/BotSpot-Plus.git
cd BotSpot-Plus
Download datasets
Download the datasets from this link (https://drive.google.com/drive/folders/1CBIOxCtI5Ztx-E5Ua7nO0UjdEabJM2nC?usp=sharing) and put the datasets folder in the root folder of the current project.
Model training
LightGBM:
    cd ML
    # set which dataset is used for training and the parameters of LightGBM
    python main.py --dataset dataset1 --num_trees 500 --max_depth 5
MLP:
    cd DL/MLP
    # set which dataset to use for training
    python main.py --dataset dataset1
GAT:
    cd DL/GAT
    # set which dataset and which gpu device to use for training
    python main.py --dataset dataset1 --device_num 0
GraphConsis:
    cd DL/GraphConsis
    # set which dataset and which gpu device to use for training
    python main.py --dataset dataset1 --device_num 0
GraphSAGE, BotSpot, BotSpot++:
    cd DL/BotSpot
    --dataset: the dataset to use, e.g., dataset1, dataset2, etc.
    --use_gbm: whether to use the GBM model for global context; takes True or False.
    --use_stratified: whether to use stratified message passing; takes True or False.
    --use_botspot_plus: whether to use botspot_plus
    --use_self_attn: whether to use self-attention for leaf embeddings
    --device_num: which GPU device to use for training
    GraphSAGE usage:
    python main.py --dataset dataset1 --use_gbm false --use_stratified false \
                                 --use_botspot_plus false --use_self_attn false --device_num 0
    BotSpot usage:
    python main.py --dataset dataset1 --use_gbm true --use_stratified true \
                                 --use_botspot_plus false --use_self_attn false --device_num 1
    BotSpot++ usage:
    python main.py --dataset dataset1 --use_gbm true --use_stratified true \
                                 --use_botspot_plus true --use_self_attn true --device_num 2
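
The three variants differ only in which flags are switched on, so the usage examples above can be summarized as presets. The following is a sketch only and is not shipped with the repository: `PRESETS` and `build_command` are hypothetical names, and the flag values are copied from the examples above. It assembles the argument list and does not run anything.

```python
# Flag presets copied from the GraphSAGE, BotSpot, and BotSpot++ usage examples.
PRESETS = {
    "GraphSAGE": {
        "use_gbm": "false",
        "use_stratified": "false",
        "use_botspot_plus": "false",
        "use_self_attn": "false",
    },
    "BotSpot": {
        "use_gbm": "true",
        "use_stratified": "true",
        "use_botspot_plus": "false",
        "use_self_attn": "false",
    },
    "BotSpot++": {
        "use_gbm": "true",
        "use_stratified": "true",
        "use_botspot_plus": "true",
        "use_self_attn": "true",
    },
}


def build_command(model, dataset="dataset1", device_num=0):
    """Return the argument list for running main.py with the given model variant."""
    cmd = ["python", "main.py", "--dataset", dataset]
    for name, value in PRESETS[model].items():
        cmd += ["--" + name, value]
    cmd += ["--device_num", str(device_num)]
    return cmd


if __name__ == "__main__":
    # Print one command line per variant; run them from DL/BotSpot.
    for model in PRESETS:
        print(model + ":", " ".join(build_command(model)))
```

Reading the presets side by side, GraphSAGE turns all four options off, BotSpot enables `--use_gbm` and `--use_stratified`, and BotSpot++ additionally enables `--use_botspot_plus` and `--use_self_attn`.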