
A Theory of Link Prediction via Relational Weisfeiler-Leman on Knowledge Graphs

This is the official code base of the NeurIPS 2023 paper A Theory of Link Prediction via Relational Weisfeiler-Leman on Knowledge Graphs (arXiv), built on PyTorch and TorchDrug. It implements the Conditional Message Passing Neural Network (C-MPNN) and is largely based on the NBFNet code base, with mild modifications to accommodate all models studied in the paper. It supports training and inference on multiple GPUs or multiple machines.

Installation

pip install torch
pip install torchdrug
pip install ogb easydict pyyaml

Reproduction

To reproduce the experiments in the paper, use the following command. Alternatively, pass --gpus null to run C-MPNN on a CPU. All datasets are downloaded automatically by the code.

python script/run.py -c config/inductive/wn18rr.yaml --gpus [0] --version v1

For experiments on inductive relation prediction, you need to additionally specify the split version with --version v1.

To run on CPU only, use the following command:

python script/run.py -c config/inductive/wn18rr.yaml --gpus null --version v1

To run C-MPNN with multiple GPUs or multiple machines, use the following commands:

python -m torch.distributed.launch --nproc_per_node=4 script/run.py -c config/inductive/wn18rr.yaml --gpus [0,1,2,3]
python -m torch.distributed.launch --nnodes=4 --nproc_per_node=4 script/run.py -c config/inductive/wn18rr.yaml --gpus [0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3]
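
The two commands above suggest that the --gpus list holds the local device ids 0..nproc_per_node-1, repeated once per node. A minimal sketch of that pattern follows; the helper name build_gpus_arg is hypothetical and not part of the code base, and the generalization beyond the two shown cases is an assumption.

```python
def build_gpus_arg(nnodes: int, nproc_per_node: int) -> str:
    """Return a --gpus value: local device ids repeated once per node.

    Assumption: the pattern seen in the README commands generalizes to
    any number of nodes and processes per node.
    """
    local_ids = list(range(nproc_per_node))  # e.g. [0, 1, 2, 3]
    all_ids = local_ids * nnodes             # one copy of the local ids per node
    return "[" + ",".join(str(i) for i in all_ids) + "]"


print(build_gpus_arg(1, 4))  # [0,1,2,3]
print(build_gpus_arg(4, 4))  # [0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3]
```

The printed strings match the --gpus values used in the single-machine and four-machine commands.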

Configuration

We provide the hyperparameters for each experiment in configuration files. All the configuration files can be found in config/*/*.yaml.
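
To get an overview of the available experiments, the configuration files can be enumerated with the same config/*/*.yaml pattern. This is only a sketch using the standard library; the helper name list_configs is hypothetical, and it assumes the script is run from the repository root.

```python
from pathlib import Path


def list_configs(root: str = "config") -> dict:
    """Group YAML files found at <root>/<group>/<name>.yaml by group directory."""
    groups = {}
    for path in sorted(Path(root).glob("*/*.yaml")):
        groups.setdefault(path.parent.name, []).append(path.name)
    return groups


if __name__ == "__main__":
    # Prints one line per sub-directory of config/, e.g. the inductive group.
    for group, files in list_configs().items():
        print(group, files)
```

Any file name printed this way can be passed to script/run.py with the -c option.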

Inductive Relation Prediction Experiments

The naming and the corresponding model variation are shown below.

| Model Architecture | Choice | Key | Value |
| --- | --- | --- | --- |
| Aggregate Function | Principal Neighborhood Aggregation (PNA) | aggregate_func | pna |
| Aggregate Function | Sum | aggregate_func | sum |
| Message Function | ${Mes}_r^{1}(\mathbf{h}_{w \mid u,q}^{(t)},\mathbf{z}_q) = \mathbf{h}_{w \mid u,q}^{(t)} * \mathbf{W}_{r}^{(t)} \mathbf{z}_q$ | (dependent, rgcn) | (yes, no) |
| Message Function | ${Mes}_r^{2}(\mathbf{h}_{w \mid u,q}^{(t)},\mathbf{z}_q) = \mathbf{h}_{w \mid u,q}^{(t)} * \mathbf{b}_r$ | (dependent, rgcn) | (no, no) |
| Message Function | ${Mes}_r^{3}(\mathbf{h}_{w \mid u,q}^{(t)},\mathbf{z}_q) = \mathbf{W}_{r}^{(t)}\mathbf{h}_{w \mid u,q}^{(t)}$ | (dependent, rgcn) | (_, yes) |
| History Function | $f(t) = t$ | set_boundary | no |
| History Function | $f(t) = 0$ | set_boundary | yes |

In addition, when using ${Mes}_r^3$, you can pass the additional parameter num_bases: k, where k is the number of bases used for basis decomposition.
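
The three message functions listed above can be illustrated in plain Python. This is a sketch and not the repository's implementation: it reads * as element-wise multiplication, and it assumes that basis decomposition takes the usual R-GCN form, where each relation matrix is a weighted sum of k shared basis matrices. All function names are hypothetical.

```python
def matvec(W, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def hadamard(a, b):
    """Element-wise product of two vectors of equal length."""
    return [x * y for x, y in zip(a, b)]


def mes1(h, W_r, z_q):
    """Mes^1: h * (W_r z_q), a relation vector that depends on the query."""
    return hadamard(h, matvec(W_r, z_q))


def mes2(h, b_r):
    """Mes^2: h * b_r, a relation vector independent of the query."""
    return hadamard(h, b_r)


def mes3(h, W_r):
    """Mes^3: W_r h, an R-GCN style linear map."""
    return matvec(W_r, h)


def basis_weight(coeffs, bases):
    """Assumed basis decomposition: W_r = sum_b coeffs[b] * bases[b]."""
    rows, cols = len(bases[0]), len(bases[0][0])
    return [[sum(c * V[i][j] for c, V in zip(coeffs, bases))
             for j in range(cols)] for i in range(rows)]


h, z_q = [1.0, 2.0], [1.0, 0.0]
W_r = [[2.0, 0.0], [0.0, 3.0]]
print(mes1(h, W_r, z_q))    # [2.0, 0.0]
print(mes2(h, [0.5, 0.5]))  # [0.5, 1.0]
print(mes3(h, W_r))         # [2.0, 6.0]

# Two shared bases (k = 2) combined with per-relation coefficients.
bases = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
print(basis_weight([2.0, 1.0], bases))  # [[2.0, 1.0], [1.0, 2.0]]
```

With a small k, all relations share the same k basis matrices and differ only in their coefficients, which keeps the parameter count of ${Mes}_r^3$ manageable on graphs with many relation types.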

Initialization Experiments

The naming in the config file and the corresponding model variation are shown below.

| Initialization | Equation |
| --- | --- |
| AllZero | ${Init}_0(u,v,q) = \mathbf{0}$ |
| Zero-One | ${Init}_1(u,v,q) = \mathbb{1}_{u = v} * \mathbf{1}$ |
| Query | ${Init}_2(u,v,q) = \mathbb{1}_{u = v} * \mathbf{z}_q$ |
| QueryWithNoise | ${Init}_3(u,v,q) = \mathbb{1}_{u = v} * (\mathbf{z}_q + \mathbf{\epsilon}_{u})$ |
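
The four initializations above can be written out directly. The following is an illustrative sketch in plain Python, not the repository's code: the function names are hypothetical, and the noise term is drawn from a Gaussian with an arbitrary scale, which is an assumption because the README does not state its distribution.

```python
import random


def init_all_zero(u, v, z_q):
    """Init_0: the zero vector for every pair (u, v)."""
    return [0.0] * len(z_q)


def init_zero_one(u, v, z_q):
    """Init_1: the all-ones vector if u == v, otherwise zeros."""
    return [1.0 if u == v else 0.0] * len(z_q)


def init_query(u, v, z_q):
    """Init_2: the query embedding z_q if u == v, otherwise zeros."""
    return list(z_q) if u == v else [0.0] * len(z_q)


def init_query_with_noise(u, v, z_q, rng, scale=0.1):
    """Init_3: z_q plus noise if u == v, otherwise zeros.

    Assumption: Gaussian noise with a hypothetical standard deviation `scale`.
    """
    if u != v:
        return [0.0] * len(z_q)
    return [z + rng.gauss(0.0, scale) for z in z_q]


z_q = [1.0, 2.0]
print(init_all_zero(0, 0, z_q))  # [0.0, 0.0]
print(init_zero_one(0, 0, z_q))  # [1.0, 1.0]
print(init_query(0, 1, z_q))     # [0.0, 0.0]
print(init_query_with_noise(0, 0, z_q, random.Random(0)))
```

In each case only the source node u receives a non-zero vector, so the initialization marks the node on which the representations are conditioned.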

Transductive Experiments

For experiments on transductive relation prediction:

python script/run.py -c config/knowledge_graph/wn18rr.yaml --gpus [0] 

Readout Experiments

The TRI-SQR dataset and the synthetic experiments are provided in the notebook TRI-SQR dataset.ipynb.

The key and acceptable values in the config file:

| Key | Value |
| --- | --- |
| has_readout | yes / no |
| readout_type | sum / mean |
| query_specific_readout | yes / no |

For further details please refer to the NBFNet code base.
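
To make the two readout_type values concrete, the sketch below pools a set of node representations with either a sum or a mean. It is a plain-Python illustration under the assumption that the readout is a permutation-invariant pooling over all node states; the function name is hypothetical, and the effect of query_specific_readout is not modeled here.

```python
def readout(node_states, readout_type="sum"):
    """Pool a non-empty list of equal-length node vectors into one vector."""
    if not node_states:
        raise ValueError("node_states must not be empty")
    dim = len(node_states[0])
    total = [sum(h[i] for h in node_states) for i in range(dim)]
    if readout_type == "sum":
        return total
    if readout_type == "mean":
        return [t / len(node_states) for t in total]
    raise ValueError(f"unknown readout_type: {readout_type}")


states = [[1.0, 2.0], [3.0, 4.0]]
print(readout(states, "sum"))   # [4.0, 6.0]
print(readout(states, "mean"))  # [2.0, 3.0]
```

A sum readout retains information about the number of nodes, while a mean readout normalizes it away.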

@inproceedings{
huang2023a,
title={A Theory of Link Prediction via Relational Weisfeiler-Leman on Knowledge Graphs},
author={Xingyue Huang and Miguel Romero Orth and İsmail İlkan Ceylan and Pablo Barceló},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=7hLlZNrkt5}
}