Space4HGNN: A Novel, Modularized and Reproducible Platform to Evaluate Heterogeneous Graph Neural Network

Paper: Space4HGNN: A Novel, Modularized and Reproducible Platform to Evaluate Heterogeneous Graph Neural Network

Following GraphGym, we release a platform Space4HGNN for designing and evaluating Heterogeneous Graph Neural Networks (HGNN). It is implemented with PyTorch and DGL, using the OpenHGNN package.

The code has been integrated into OpenHGNN. This page introduces the space4hgnn part of OpenHGNN and explains how to run it.

The file tree of Space4HGNN in OpenHGNN

.
├── README.md
├── openhgnn
│   ├── __init__.py
│   ├── dataset
│   │   ├── LinkPredictionDataset.py
│   │   ├── NodeClassificationDataset.py
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── academic_graph.py
│   │   ├── base_dataset.py
│   │   ├── hgb_dataset.py
│   │   └── utils.py
│   ├── layers
│   │   ├── GeneralGNNLayer.py
│   │   ├── GeneralHGNNLayer.py
│   │   ├── HeteroGraphConv.py
│   │   ├── HeteroLinear.py
│   │   ├── MetapathConv.py
│   │   ├── SkipConnection.py
│   │   └── __init__.py
│   ├── models
│   │   ├── __init__.py
│   │   ├── base_model.py
│   │   ├── general_HGNN.py
│   │   └── homo_GNN.py
│   ├── tasks
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── base_task.py
│   │   ├── link_prediction.py
│   │   └── node_classification.py
│   ├── trainerflow
│   │   ├── README.md
│   │   ├── link_prediction.py
│   │   └── node_classification.py
│   └── utils
│       ├── __init__.py
│       ├── activation.py
│       ├── evaluator.py
│       └── utils.py
├── requirements.txt
├── setup.py
├── space4hgnn
│   ├── README.md
│   ├── __init__.py
│   ├── figure
│   │   ├── distribution.py
│   │   └── rank.py
│   ├── generate_yaml.py
│   ├── parallel.sh
│   ├── prediction
│   │   └── excel
│   │       └── gather_all_Csv.py
│   └── utils.py
└── space4hgnn.py

How to run

1 Install

The installation process is the same as in the OpenHGNN Get Started guide.

2 Run a single experiment

2.1 Generate designs randomly

Here we will generate a random design combination for each dataset and save it in a .yaml file. The candidate designs are listed in ./space4hgnn/generate_yaml.py.

python ./space4hgnn/generate_yaml.py --gnn_type gcnconv --times 1 --key has_bn --configfile test

--gnn_type -a, specify the GNN type; one of gcnconv, gatconv, sageconv, ginconv.

--times -t, the ID of the yaml file, used to distinguish different random samples.

--key -k, specify a design dimension.

--configfile -c, specify the name of the directory in which to store the configuration yaml files.

Note: The .yaml file is saved to yaml_file_path, which is determined by the four arguments above.

yaml_file_path = './space4hgnn/config/{}/{}/{}_{}.yaml'.format(configfile, key, gnn_type, times)
# Here yaml_file_path = './space4hgnn/config/test/has_bn/gcnconv_1.yaml' with the above example code
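
As a rough illustration of what this step produces, the sketch below samples one random design and computes the target path with the same format string. The candidate space CANDIDATES, the dimensions num_layers and dropout, and both function names are hypothetical; the real candidates are listed in ./space4hgnn/generate_yaml.py. We also assume that the dimension named by --key is left out of the sample (its value is given later through --value) and that --times acts as the random seed.

```python
import random

# Hypothetical candidate space, for illustration only. The real candidates
# are defined in ./space4hgnn/generate_yaml.py.
CANDIDATES = {
    "has_bn": [True, False],
    "num_layers": [1, 2, 3],
    "dropout": [0.0, 0.3, 0.6],
}


def yaml_file_path(configfile, key, gnn_type, times):
    """Build the config path from the four command-line arguments."""
    return "./space4hgnn/config/{}/{}/{}_{}.yaml".format(
        configfile, key, gnn_type, times
    )


def sample_design(candidates, key, times):
    """Sample one value per dimension, skipping the dimension under study.

    Assumption: `times` doubles as the random seed, so each yaml ID maps
    to a reproducible sample.
    """
    rng = random.Random(times)
    return {
        dim: rng.choice(choices)
        for dim, choices in candidates.items()
        if dim != key
    }


design = sample_design(CANDIDATES, key="has_bn", times=1)
path = yaml_file_path("test", "has_bn", "gcnconv", 1)
print(path)    # ./space4hgnn/config/test/has_bn/gcnconv_1.yaml
print(design)  # one sampled value for each remaining dimension
```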

2.2 Launch an experiment

python space4hgnn.py -m general_HGNN -u metapath -t node_classification -d HGBn-ACM -g 0 -r 5 -a gcnconv -s 1 -k has_bn -v True -c test -p HGB

--model -m name of models

--subgraph_extraction -u subgraph extraction methods

--task -t name of task

--dataset -d name of dataset

--gpu -g controls which gpu you will use. If you do not have gpu, set -g -1.

--repeat -r times to repeat, default 5

--gnn_type -a GNN type.

--times -s same as when generating random designs

--key -k a design dimension

--value -v the value of key design dimension

--configfile -c load the yaml file which is in the directory configfile

--predictfile -p The file path to store predict files.

e.g.:

We implement three model families in Space4HGNN: the Homogenization model family, the Relation model family, and the Meta-path model family.

For the Homogenization model family, the parameter --subgraph_extraction can be omitted:

python space4hgnn.py -m homo_GNN -t node_classification -d HGBn-ACM -g 0 -r 5 -a gcnconv -s 1 -k has_bn -v True -c test -p HGB

For the Relation model family, --model is general_HGNN and --subgraph_extraction is relation:

python space4hgnn.py -m general_HGNN -u relation -t node_classification -d HGBn-ACM -g 0 -r 5 -a gcnconv -s 1 -k has_bn -v True -c test -p HGB

For the Meta-path model family, --model is general_HGNN and --subgraph_extraction is metapath:

python space4hgnn.py -m general_HGNN -u metapath -t node_classification -d HGBn-ACM -g 0 -r 5 -a gcnconv -s 1 -k has_bn -v True -c test -p HGB
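
The three commands above differ only in -m and -u. A minimal sketch of that mapping, useful when launching the families from a script, is shown below. The dictionary keys and the helper name build_command are our own labels for illustration and are not part of space4hgnn.py.

```python
import shlex

# Flags shared by all three example commands in this section.
COMMON = [
    "-t", "node_classification", "-d", "HGBn-ACM", "-g", "0", "-r", "5",
    "-a", "gcnconv", "-s", "1", "-k", "has_bn", "-v", "True",
    "-c", "test", "-p", "HGB",
]

# Illustrative family label -> (--model, --subgraph_extraction or None).
FAMILIES = {
    "homogenization": ("homo_GNN", None),
    "relation": ("general_HGNN", "relation"),
    "metapath": ("general_HGNN", "metapath"),
}


def build_command(family):
    """Return the argument list for one model family."""
    model, subgraph = FAMILIES[family]
    cmd = ["python", "space4hgnn.py", "-m", model]
    if subgraph is not None:
        # Only the Relation and Meta-path families need -u.
        cmd += ["-u", subgraph]
    return cmd + COMMON


for family in FAMILIES:
    print(shlex.join(build_command(family)))
```

Each returned list can be passed to subprocess.run(cmd, check=True) to launch the experiment.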

**Note:**

As with generating the yaml file, the experiment loads the design configuration from yaml_file_path. It then saves the results to a .csv file at prediction_file_path.

yaml_file_path = './space4hgnn/config/{}/{}/{}_{}.yaml'.format(configfile, key, gnn_type, times)
# Here yaml_file_path = './space4hgnn/config/test/has_bn/gcnconv_1.yaml'
prediction_file_path = './space4hgnn/prediction/excel/{}/{}_{}/{}_{}_{}_{}.csv'.format(predictfile, key, value, model_family, gnn_type, times, dataset)
# Here prediction_file_path = './space4hgnn/prediction/excel/HGB/has_bn_True/metapath_gcnconv_1_HGBn-ACM.csv' with the above example command
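
When analyzing results later, it can help to go in the other direction and recover the settings from a result path. The sketch below builds the path with the format string above and parses it back. Both function names are hypothetical, and the parser assumes that model_family and gnn_type contain no underscores and that value contains none either.

```python
from pathlib import PurePosixPath


def prediction_file_path(predictfile, key, value, model_family, gnn_type,
                         times, dataset):
    """Build the result path with the same format string as above."""
    return "./space4hgnn/prediction/excel/{}/{}_{}/{}_{}_{}_{}.csv".format(
        predictfile, key, value, model_family, gnn_type, times, dataset
    )


def parse_prediction_path(path):
    """Recover the settings encoded in a result path (hypothetical helper)."""
    p = PurePosixPath(path)
    # Directory name is "<key>_<value>"; the key itself may contain "_".
    key, value = p.parent.name.rsplit("_", 1)
    # File stem is "<model_family>_<gnn_type>_<times>_<dataset>".
    model_family, gnn_type, times, dataset = p.stem.split("_", 3)
    return {
        "predictfile": p.parent.parent.name,
        "key": key,
        "value": value,
        "model_family": model_family,
        "gnn_type": gnn_type,
        "times": int(times),
        "dataset": dataset,
    }


path = prediction_file_path(
    "HGB", "has_bn", True, "metapath", "gcnconv", 1, "HGBn-ACM"
)
print(path)
print(parse_prediction_path(path))
```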

3 Run a batch of experiments

An example:

./space4hgnn/parallel.sh 0 5 has_bn True node_classification test_paral test_paral

It will generate configuration files for the batch of experiments and launch a batch of experiments.

The arguments are as follows:

  1. GPU to use. Here it is 0.
  2. Number of repeats. Here it is 5.
  3. Design dimension. Here it is has_bn (batch normalization).
  4. Choice for the design dimension. Here has_bn is set to True.
  5. Task name. Here it is node_classification.
  6. Configfile: the path in which to save the configuration files. Here it is test_paral.
  7. Predictfile: the path in which to save the prediction files. Here it is test_paral.
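
Because the script takes seven positional arguments, it is easy to pass them in the wrong order. One possible way to guard against that when launching from Python is to name the fields and let a small wrapper emit them in order. The BatchArgs class below is a hypothetical convenience of our own, not part of the repository.

```python
import shlex
from dataclasses import astuple, dataclass


@dataclass
class BatchArgs:
    """Named view of the seven positional arguments, in script order."""
    gpu: int
    repeat: int
    key: str
    value: str
    task: str
    configfile: str
    predictfile: str

    def command(self):
        # astuple() preserves the field declaration order.
        return ["./space4hgnn/parallel.sh", *map(str, astuple(self))]


args = BatchArgs(
    gpu=0,
    repeat=5,
    key="has_bn",
    value="True",
    task="node_classification",
    configfile="test_paral",
    predictfile="test_paral",
)
print(shlex.join(args.command()))
```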

Note:

If you encounter the error bash: ./space4hgnn/parallel.sh: Permission denied, make the script executable with chmod +x ./space4hgnn/parallel.sh and try again.

4 Analyze the results

4.1 Gather all results

To gather the results of all experiments into a single .csv file, run the following command:

python ./space4hgnn/prediction/excel/gather_all_Csv.py -p ./space4hgnn/prediction/excel/HGB
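
For readers who want to see what such a gathering step involves, here is a minimal standard-library sketch. It is not the code in gather_all_Csv.py: the function name gather_csv and the added source_file column are our own, and we assume every result file starts with a header row.

```python
import csv
from pathlib import Path


def gather_csv(root, output):
    """Concatenate every .csv under `root` into `output`.

    Columns are unioned across files; cells missing from a file are left
    empty. Returns the number of data rows written.
    """
    root, output = Path(root), Path(output)
    rows, fieldnames = [], ["source_file"]
    for path in sorted(root.rglob("*.csv")):
        if path.resolve() == output.resolve():
            continue  # do not re-read the result of a previous merge
        with path.open(newline="") as f:
            reader = csv.DictReader(f)
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                # Remember where each row came from.
                row["source_file"] = str(path.relative_to(root))
                rows.append(row)
    with output.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

A call such as gather_csv("./space4hgnn/prediction/excel/HGB", "all_results.csv") would then merge everything below that directory; the output file name here is only an example.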

4.2 Analyze with figures

We provide ./figure/result.csv, which records our experimental results.

4.2.1 Ranking analysis

We analyze the results with average ranking, following GraphGym. The corresponding code is in figure/rank.py.

[Figure: space4hgnn_rank]

4.2.2 Distribution estimates

We analyze the results with distribution estimates, following NDS. The corresponding code is in figure/distribution.py.

[Figure: space4hgnn_distribution]
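
To make the two analyses concrete, the sketch below computes an average rank per design choice and an empirical distribution function (EDF) of errors, the quantity NDS uses to compare design spaces. It is a simplified illustration rather than the code in figure/rank.py or figure/distribution.py: ties share the best rank, the function names are our own, and the numbers are made-up toy values, not experimental results.

```python
from collections import defaultdict


def average_rank(results):
    """Average rank of each choice over settings (1 = best).

    `results` is a list of (setting, choice, score) with higher scores
    better. Within a setting, a choice's rank is one plus the number of
    choices that score strictly higher, so ties share the best rank.
    """
    by_setting = defaultdict(dict)
    for setting, choice, score in results:
        by_setting[setting][choice] = score
    ranks = defaultdict(list)
    for scores in by_setting.values():
        for choice, score in scores.items():
            rank = 1 + sum(other > score for other in scores.values())
            ranks[choice].append(rank)
    return {choice: sum(r) / len(r) for choice, r in ranks.items()}


def edf(errors, threshold):
    """Fraction of runs whose error is strictly below `threshold`."""
    return sum(e < threshold for e in errors) / len(errors)


# Toy values for illustration only.
toy = [
    ("A", "True", 0.90), ("A", "False", 0.88),
    ("B", "True", 0.91), ("B", "False", 0.93),
    ("C", "True", 0.60), ("C", "False", 0.55),
]
avg = average_rank(toy)
print(avg)
print(edf([0.1, 0.2, 0.3, 0.4], threshold=0.25))
```

A lower average rank means a design choice wins more often across settings, while an EDF curve that rises earlier indicates a larger share of low-error configurations.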

Cite

Please kindly cite our paper if you use this code:

@inproceedings{zhao2022space4hgnn,
  title={Space4HGNN: A Novel, Modularized and Reproducible Platform to Evaluate Heterogeneous Graph Neural Network},
  author={Zhao, Tianyu and Yang, Cheng and Li, Yibo and Gan, Quan and Wang, Zhenyi and Liang, Fengqi and Zhao, Huan and Shao, Yingxia and Wang, Xiao and Shi, Chuan},
  booktitle={SIGIR},
  year={2022}
}

Acknowledgement

The code is built on GraphGym, which defines a design space for graph neural networks.