


This is the original implementation for the KDD 2022 paper Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries.


Example installation using conda:

# Use the cuda version that matches your nvidia driver and pytorch
conda install "pytorch>=1.7.1,<=1.9" cudatoolkit=11.3 pyg -c pyg -c pytorch -y

# Compiling fastmoe requires the CUDA `nvcc` toolchain.
# If it is not already available, it can be installed with conda:
conda install cudatoolkit-dev=11.3 "gxx_linux-64<=10" nccl -c conda-forge -y
# `nvcc` does not support gcc>10 as of 2022/06.

# Download fastmoe submodule if not already downloaded
git submodule update --init
cd fastmoe
pip install -e .


The parameters used in the paper are preloaded in configs/. Change the root_dir option to set the location where model checkpoints are saved.

The dataset can be downloaded from http://snap.stanford.edu/betae/KG_data.zip. Specify the location of the extracted dataset in the data_dir option of the config files. For example, if the FB15k-237-q2b dataset is in /data/FB15k-237-q2b, set data_dir to /data/FB15k-237-q2b.
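
The download and extraction step can be sketched in Python as follows. This is a minimal illustration rather than part of the repository: the helper names `download` and `extract` and the /data paths in the commented example are assumptions, and the directory layout inside the archive is not assumed, so check the printed entry names before setting data_dir.

```python
import urllib.request
import zipfile
from pathlib import Path

DATA_URL = "http://snap.stanford.edu/betae/KG_data.zip"  # URL given in this README


def download(url: str, dest: Path) -> Path:
    """Download `url` to `dest` unless the file is already present."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive: Path, target_dir: Path) -> list:
    """Extract `archive` into `target_dir` and return its sorted top-level entries."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
    return sorted(p.name for p in target_dir.iterdir())


# Example (not executed here because the archive is large; paths are hypothetical):
# archive = download(DATA_URL, Path("/data/KG_data.zip"))
# print(extract(archive, Path("/data")))
```

Whichever directory ends up containing the dataset files is the value to use for data_dir.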

Alternatively, pretrained models are available at OneDrive.

To reproduce all results for FB15k-237:

kgt="python main.py -c configs/fb15k-237.json"
# run pretrain1 & pretrain2
$kgt pretrain1 pretrain2
# Do multi-task finetuning for all tasks
$kgt reasoning_multi_1e5 reasoning_multi_1e6
# Do single-task finetuning for each task
$kgt reasoning_1p reasoning_2p reasoning_3p
$kgt reasoning_2i reasoning_3i
$kgt reasoning_ip reasoning_pi
$kgt reasoning_2u reasoning_up

For NELL995:

kgt="python main.py -c configs/nell995.json"
# run pretrain
$kgt pretrain
# Do multi-task finetuning
$kgt reasoning_multi
# Do finetuning for tasks
$kgt reasoning_1p reasoning_2p reasoning_3p
$kgt reasoning_2i reasoning_3i
$kgt reasoning_ip reasoning_pi
$kgt reasoning_2u reasoning_up



There are two main implementations of kgTransformer, both of which reside in model.py.

Our current implementation is based on D_KGTransformer.


Training-related utilities can be found in train.py. They accept iterators that yield batched data in the same form as the output of a torch.utils.data.DataLoader. The most useful functions are main_mp() and ft_test().

TrainClient scatters data across workers and performs multi-GPU training based on torch.nn.parallel.DistributedDataParallel. Example usage can be found in main_mp().

Config Files

Each config file is a JSON object that maps task names to task definitions. The tasks can be run directly from the command line:

python main.py <task_name> [<task_name>...]

Within a task, the base option names the task it inherits from, and the type option specifies the kind of operation the task performs. See main.py for a full list of available options.
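
One way to picture the base mechanism is the sketch below. It is not the logic in main.py: `resolve_task` is a hypothetical helper, the task entries are invented, and only the base and type keys come from the description above. It also assumes that a child's options simply override inherited ones, which may differ from the real behavior.

```python
import json

# Hypothetical config: only the "base" and "type" keys come from the README.
# The option names "lr" and "epochs" and all values are invented for illustration.
CONFIG_TEXT = """
{
  "pretrain1": {"type": "pretrain", "lr": 0.001, "epochs": 10},
  "pretrain2": {"base": "pretrain1", "epochs": 5}
}
"""


def resolve_task(config, name, _seen=()):
    """Return the options of task `name` merged over those of its base chain."""
    if name in _seen:
        raise ValueError(f"circular base chain at task {name!r}")
    task = dict(config[name])
    base = task.pop("base", None)
    if base is None:
        return task
    merged = resolve_task(config, base, _seen + (name,))
    merged.update(task)  # assumption: child options override inherited ones
    return merged


config = json.loads(CONFIG_TEXT)
resolved = resolve_task(config, "pretrain2")
print(resolved)
```

Under these assumptions, pretrain2 keeps the type and lr of pretrain1 and replaces only epochs.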


<details> <summary>NCCL Unhandled System Error</summary>

We observed that InfiniBand is not supported by fastmoe on some machines.

NCCL's InfiniBand transport can be disabled using an environment variable.
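
The variable is presumably NCCL_IB_DISABLE, which NCCL documents as the switch for its InfiniBand transport. The note above does not name it, so treat that as an assumption. A minimal sketch of launching a task with it set, where `env_without_infiniband` is a hypothetical helper:

```python
import os


def env_without_infiniband(base_env=None):
    """Return a copy of the environment with NCCL's InfiniBand transport disabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["NCCL_IB_DISABLE"] = "1"  # assumption: the variable this note refers to
    return env


# Example (not executed here): run a task with the modified environment.
# import subprocess
# subprocess.run(
#     ["python", "main.py", "-c", "configs/fb15k-237.json", "pretrain1"],
#     env=env_without_infiniband(),
#     check=True,
# )
```

Setting the variable in the shell before running main.py has the same effect. It must be in place before NCCL initializes.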

</details> <details> <summary>CUDA Out of Memory</summary>

Reduce the batch size and retry. If the issue persists, downgrade PyTorch to the earliest version that works for your setup (e.g. LTS 1.8.2 as of 2022/07). The problem is possibly due to memory issues in newer PyTorch versions. See https://github.com/pytorch/pytorch/issues/67680 for more information.

</details>



If you find our work useful and wish to cite it, you can use the following BibTeX:

  title={Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries},
  author={Liu, Xiao and Zhao, Shiyu and Su, Kai and Cen, Yukuo and Qiu, Jiezhong and Zhang, Mengdi and Wu, Wei and Dong, Yuxiao and Tang, Jie},
  booktitle={Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},