<p> <img src="imgs/fig.png" width="1000"> <br /> </p> <hr> <h1> GraphMAE: Self-Supervised Masked Graph Autoencoders </h1>

Implementation for the KDD'22 paper: GraphMAE: Self-Supervised Masked Graph Autoencoders.

We also have a Chinese blog about GraphMAE on Zhihu (知乎) and an English blog on Medium.

GraphMAE is a generative self-supervised graph learning method that achieves competitive or better performance than existing contrastive methods on tasks including node classification, graph classification, and molecular property prediction.
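The core recipe — mask node features, encode with a GNN, and reconstruct the masked features under a scaled cosine error — can be sketched as a conceptual toy in NumPy. This is not the released implementation: the mean-neighbor "encoder" and random linear "decoder" below are stand-ins for the GAT/GIN networks used in the paper, and the zero mask token stands in for the learnable [MASK] embedding.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: 6 nodes, adjacency with self-loops, random node features.
A = np.array([[1, 1, 0, 0, 0, 0],
              [1, 1, 1, 0, 0, 0],
              [0, 1, 1, 1, 0, 0],
              [0, 0, 1, 1, 1, 0],
              [0, 0, 0, 1, 1, 1],
              [0, 0, 0, 0, 1, 1]], dtype=float)
X = rng.normal(size=(6, 4))

# 1) Mask a random subset of nodes: replace their input features
#    with a shared mask token (zeros here; learnable in the paper).
masked = rng.choice(6, size=3, replace=False)
X_in = X.copy()
X_in[masked] = 0.0

# 2) "Encode": one round of mean aggregation over neighbors
#    (a stand-in for the GNN encoder).
H = (A @ X_in) / A.sum(axis=1, keepdims=True)

# 3) "Decode": map hidden states back to feature space
#    (a random linear map standing in for the GNN decoder).
W = rng.normal(size=(4, 4)) * 0.1
Z = H @ W

# 4) Scaled cosine error, computed on masked nodes only:
#    L = mean_i (1 - cos(x_i, z_i))^gamma
def scaled_cosine_error(x, z, gamma=2.0, eps=1e-8):
    cos = (x * z).sum(-1) / (
        np.linalg.norm(x, axis=-1) * np.linalg.norm(z, axis=-1) + eps)
    return float(np.mean((1.0 - cos) ** gamma))

loss = scaled_cosine_error(X[masked], Z[masked])
print(loss)
```

The scaled cosine error down-weights easy (already well-aligned) reconstructions via the exponent gamma, which is one of the design choices GraphMAE highlights over plain MSE.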

<p> <img src="imgs/compare.png" width="520"><img src="imgs/ablation.jpg" width="270"> <br /> </p> <h3> ❗ Update </h3>

[2023-04-12] GraphMAE2 has been published, and its code can be found here.

[2022-12-14] The PyG implementation of GraphMAE for node / graph classification is available at this branch.

<h2> Dependencies </h2>

<h2> Quick Start </h2>

To get started quickly, you can run the provided scripts:

Node classification

sh scripts/run_transductive.sh <dataset_name> <gpu_id> # for transductive node classification
# example: sh scripts/run_transductive.sh cora/citeseer/pubmed/ogbn-arxiv 0
sh scripts/run_inductive.sh <dataset_name> <gpu_id> # for inductive node classification
# example: sh scripts/run_inductive.sh reddit/ppi 0

# Or you could run the code manually:
# for transductive node classification
python main_transductive.py --dataset cora --encoder gat --decoder gat --seed 0 --device 0
# for inductive node classification
python main_inductive.py --dataset ppi --encoder gat --decoder gat --seed 0 --device 0

Supported datasets:

Run the provided scripts, or add --use_cfg to the command, to reproduce the reported results.

Graph classification

sh scripts/run_graph.sh <dataset_name> <gpu_id>
# example: sh scripts/run_graph.sh mutag/imdb-b/imdb-m/proteins/... 0 

# Or you could run the code manually:
python main_graph.py --dataset IMDB-BINARY --encoder gin --decoder gin --seed 0 --device 0

Supported datasets:

Run the provided scripts, or add --use_cfg to the command, to reproduce the reported results.

Molecular Property Prediction

Please refer to the code in ./chem for molecular property prediction.

<h2> Datasets </h2>

Datasets used for node classification and graph classification are downloaded automatically from https://www.dgl.ai/ when the code runs.

<h2> Experimental Results </h2>

Node classification (Micro-F1, %):

|                 | Cora     | Citeseer | PubMed   | Ogbn-arxiv | PPI        | Reddit     |
| --------------- | -------- | -------- | -------- | ---------- | ---------- | ---------- |
| DGI             | 82.3±0.6 | 71.8±0.7 | 76.8±0.6 | 70.34±0.16 | 63.80±0.20 | 94.0±0.10  |
| MVGRL           | 83.5±0.4 | 73.3±0.5 | 80.1±0.7 | -          | -          | -          |
| BGRL            | 82.7±0.6 | 71.1±0.8 | 79.6±0.5 | 71.64±0.12 | 73.63±0.16 | 94.22±0.03 |
| CCA-SSG         | 84.0±0.4 | 73.1±0.3 | 81.0±0.4 | 71.24±0.20 | 73.34±0.17 | 95.07±0.02 |
| GraphMAE (ours) | 84.2±0.4 | 73.4±0.4 | 81.1±0.4 | 71.75±0.17 | 74.50±0.29 | 96.01±0.08 |
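For reference, Micro-F1 pools true/false positives and false negatives across all labels before computing F1 (which makes it robust on multi-label benchmarks such as PPI). A minimal sketch with a hypothetical helper, not code from this repo:

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over multi-label predictions.

    y_true, y_pred: lists of label sets, one set per node.
    """
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))  # correct labels
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))  # spurious labels
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))  # missed labels
    return 2 * tp / (2 * tp + fp + fn)

# Example: 3 nodes, multi-label ground truth vs. predictions.
truth = [{0, 1}, {1}, {2}]
preds = [{0}, {1, 2}, {2}]
print(micro_f1(truth, preds))  # -> 0.75
```

For single-label multi-class tasks (e.g. Cora), micro-F1 reduces to plain accuracy, since every wrong prediction counts as exactly one false positive and one false negative.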

Graph classification (Accuracy, %)

|                 | IMDB-B     | IMDB-M     | PROTEINS   | COLLAB     | MUTAG      | REDDIT-B   | NCI1       |
| --------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
| InfoGraph       | 73.03±0.87 | 49.69±0.53 | 74.44±0.31 | 70.65±1.13 | 89.01±1.13 | 82.50±1.42 | 76.20±1.06 |
| GraphCL         | 71.14±0.44 | 48.58±0.67 | 74.39±0.45 | 71.36±1.15 | 86.80±1.34 | 89.53±0.84 | 77.87±0.41 |
| MVGRL           | 74.20±0.70 | 51.20±0.50 | -          | -          | 89.70±1.10 | 84.50±0.60 | -          |
| GraphMAE (ours) | 75.52±0.66 | 51.63±0.52 | 75.30±0.39 | 80.32±0.46 | 88.19±1.26 | 88.01±0.19 | 80.40±0.30 |

Transfer learning on molecular property prediction (ROC-AUC, %):

|                 | BBBP     | Tox21    | ToxCast  | SIDER    | ClinTox  | MUV      | HIV      | BACE     | Avg. |
| --------------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- | ---- |
| AttrMasking     | 64.3±2.8 | 76.7±0.4 | 64.2±0.5 | 61.0±0.7 | 71.8±4.1 | 74.7±1.4 | 77.2±1.1 | 79.3±1.6 | 71.1 |
| GraphCL         | 69.7±0.7 | 73.9±0.7 | 62.4±0.6 | 60.5±0.9 | 76.0±2.7 | 69.8±2.7 | 78.5±1.2 | 75.4±1.4 | 70.8 |
| GraphLoG        | 72.5±0.8 | 75.7±0.5 | 63.5±0.7 | 61.2±1.1 | 76.7±3.3 | 76.0±1.1 | 77.8±0.8 | 83.5±1.2 | 73.4 |
| GraphMAE (ours) | 72.0±0.6 | 75.5±0.6 | 64.1±0.3 | 60.3±1.1 | 82.3±1.2 | 76.3±2.4 | 77.2±1.0 | 83.1±0.9 | 73.8 |
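ROC-AUC, the metric above, equals the probability that a randomly chosen positive molecule is scored above a randomly chosen negative one (the Mann-Whitney U formulation). A minimal sketch with a hypothetical helper, not code from this repo:

```python
def roc_auc(labels, scores):
    """ROC-AUC via the rank (Mann-Whitney U) formulation.

    labels: 0/1 ground truth; scores: predicted scores/probabilities.
    A tied positive-negative pair contributes half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count positive-vs-negative pairwise wins (bool arithmetic: True == 1).
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # -> 0.75
```

Because only the ranking of scores matters, ROC-AUC is insensitive to monotone rescaling of the model's outputs, which is why it is the standard metric on the imbalanced molecular benchmarks above.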
<h1> Citing </h1>

If you find this work helpful to your research, please consider citing our paper:

@inproceedings{hou2022graphmae,
  title={GraphMAE: Self-Supervised Masked Graph Autoencoders},
  author={Hou, Zhenyu and Liu, Xiao and Cen, Yukuo and Dong, Yuxiao and Yang, Hongxia and Wang, Chunjie and Tang, Jie},
  booktitle={Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
  pages={594--604},
  year={2022}
}