Botspot++: A Hierarchical Deep Ensemble Model for Bots Install Fraud Detection in Mobile Advertising (https://doi.org/10.1145/3476107)

Datasets

To evaluate the proposed model more comprehensively, we built three datasets covering different time periods. They are available from https://drive.google.com/drive/folders/1CBIOxCtI5Ztx-E5Ua7nO0UjdEabJM2nC?usp=sharing. The statistics of these offline datasets are detailed below.

| Dataset | #Dev | #Chan-Camp | #Normal Install (Train, Test) | #Bots Install (Train, Test) |
|---|---|---|---|---|
| dataset-1 | 1676101 | 1347 | 1245650, 162960 | 270815, 20560 |
| dataset-2 | 1313073 | 1190 | 1049610, 195792 | 139349, 9596 |
| dataset-3 | 1299895 | 1139 | 1153705, 181437 | 77708, 12016 |
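As a reading aid for the statistics above, the share of bot installs in each split can be computed directly from the listed counts. The sketch below is plain arithmetic over the table values as read here (seven-digit device counts, four-digit channel-campaign counts); the names `STATS`, `bot_ratio`, and `bot_ratios` are illustrative and not part of the repository.

```python
# Install counts copied from the statistics table: (train, test) per class.
# The column split is an assumption about how the table is read.
STATS = {
    "dataset-1": {"normal": (1245650, 162960), "bots": (270815, 20560)},
    "dataset-2": {"normal": (1049610, 195792), "bots": (139349, 9596)},
    "dataset-3": {"normal": (1153705, 181437), "bots": (77708, 12016)},
}


def bot_ratio(normal: int, bots: int) -> float:
    """Return the fraction of installs that are labeled as bots."""
    return bots / (normal + bots)


def bot_ratios(stats: dict) -> dict:
    """Map each dataset name to its (train, test) bot-install fractions."""
    result = {}
    for name, counts in stats.items():
        train = bot_ratio(counts["normal"][0], counts["bots"][0])
        test = bot_ratio(counts["normal"][1], counts["bots"][1])
        result[name] = (train, test)
    return result


if __name__ == "__main__":
    for name, (train, test) in bot_ratios(STATS).items():
        print(f"{name}: train {train:.1%} bots, test {test:.1%} bots")
```

Under this reading, bot installs make up less than 18% of every split, so the classes are imbalanced in all three datasets.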

Requirements

Usage

  1. git clone https://github.com/mobvistaresearch/BotSpot-Plus.git
  2. cd BotSpot-Plus
  3. Download datasets
    Download them from this link (https://drive.google.com/drive/folders/1CBIOxCtI5Ztx-E5Ua7nO0UjdEabJM2nC?usp=sharing) and put the datasets folder in the root folder of the project.
  4. Model training
    LightGBM:
        cd ML
        # set which dataset is used for training and the parameters of LightGBM
        python main.py --dataset dataset1 --num_trees 500 --max_depth 5
    MLP:
        cd DL/MLP
        # set which dataset to use for training
        python main.py --dataset dataset1
    GAT:
        cd DL/GAT
        # set which dataset and which gpu device to use for training
        python main.py --dataset dataset1 --device_num 0
    GraphConsis:
        cd DL/GraphConsis
        # set which dataset and which gpu device to use for training
        python main.py --dataset dataset1 --device_num 0
    GraphSAGE, BotSpot, BotSpot++:
        cd DL/BotSpot
        --dataset: which dataset to use, e.g., dataset1, dataset2, etc.
        --use_gbm: whether to use the gbm model for global context; takes True or False.
        --use_stratified: whether to use stratified sampling during message passing; takes True or False.
        --use_botspot_plus: whether to use botspot_plus
        --use_self_attn: whether to use self attention for leaf embeddings
        --device_num: which gpu device to use for training
        GraphSAGE usage:
        python main.py --dataset dataset1 --use_gbm false --use_stratified false \
                                     --use_botspot_plus false --use_self_attn false --device_num 0
        BotSpot usage:
        python main.py --dataset dataset1 --use_gbm true --use_stratified true \
                                     --use_botspot_plus false --use_self_attn false --device_num 1
        BotSpot++ usage:
        python main.py --dataset dataset1 --use_gbm true --use_stratified true \
                                     --use_botspot_plus true --use_self_attn true --device_num 2
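
The GraphSAGE, BotSpot, and BotSpot++ commands above differ only in four boolean flags, so they can be generated from a single lookup table. The following is a minimal sketch, not part of the repository: `VARIANTS` and `build_command` are hypothetical names, the flag values are copied from the usage lines above, and the default dataset and device number are placeholders.

```python
import shlex

# Flag settings copied from the usage lines above. The dict and function
# names are hypothetical helpers, not part of the BotSpot-Plus repository.
VARIANTS = {
    "GraphSAGE": {
        "use_gbm": "false", "use_stratified": "false",
        "use_botspot_plus": "false", "use_self_attn": "false",
    },
    "BotSpot": {
        "use_gbm": "true", "use_stratified": "true",
        "use_botspot_plus": "false", "use_self_attn": "false",
    },
    "BotSpot++": {
        "use_gbm": "true", "use_stratified": "true",
        "use_botspot_plus": "true", "use_self_attn": "true",
    },
}


def build_command(variant: str, dataset: str = "dataset1", device_num: int = 0) -> list:
    """Build the argument list for one model variant of main.py."""
    argv = ["python", "main.py", "--dataset", dataset]
    for name, value in VARIANTS[variant].items():
        argv += [f"--{name}", value]
    argv += ["--device_num", str(device_num)]
    return argv


if __name__ == "__main__":
    for variant in VARIANTS:
        # shlex.join quotes each argument so the line can be pasted into a shell.
        print(f"# {variant}")
        print(shlex.join(build_command(variant)))
```

The returned list can also be passed to `subprocess.run(argv, cwd="DL/BotSpot", check=True)`, assuming the working directory is the repository root.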