PyKEEN Benchmarking Results

This repository contains the results from the reproducibility and benchmarking studies described in

Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge Graph Embedding Models Under a Unified Framework. <br /> Ali, M., Berrendorf, M., Hoyt, C. T., Vermue, L., Galkin, M., Sharifzadeh, S., Fischer, A., Tresp, V., & Lehmann, J. (2020). <br /> arXiv, 2006.13365.

This repository itself is archived on Zenodo (DOI).

Reproducibility Study

In this study, we use the KGEMs reimplemented in PyKEEN together with the best hyper-parameters reported by the original authors to reproduce past experiments. The experimental artifacts from the reproducibility study can be found here.
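A reproduction run of this kind can be launched directly through PyKEEN's pipeline. The following is a minimal sketch; the dataset, model, and hyper-parameter values (embedding dimension, learning rate, epochs, etc.) are illustrative placeholders rather than the settings used in the study.

```python
from pykeen.pipeline import pipeline

# Re-run a single experiment with explicitly chosen hyper-parameters.
# The values below are placeholders; the study uses the best
# hyper-parameters reported by each model's original authors.
result = pipeline(
    dataset='FB15k-237',
    model='TransE',
    model_kwargs=dict(embedding_dim=50),
    optimizer='Adam',
    optimizer_kwargs=dict(lr=0.001),
    training_kwargs=dict(num_epochs=100, batch_size=128),
    random_seed=42,
)

# Persist the trained model and evaluation metrics for inspection.
result.save_to_directory('reproduction/transe_fb15k237')
print(result.metric_results.to_flat_dict())
```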

Benchmarking (Ablation) Study

In this study, we conduct a large number of hyper-parameter optimizations to investigate how individual model components and settings (training assumption, loss function, regularizer, optimizer, negative sampling strategy, HPO methodology, training strategy) affect performance. The experimental artifacts from the ablation study can be found here.
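Such a hyper-parameter optimization can be run through PyKEEN's HPO pipeline. Below is a minimal sketch in which the dataset, model, fixed components, number of trials, and output directory are illustrative choices, not the settings of the actual ablation study.

```python
from pykeen.hpo import hpo_pipeline

# Run a small hyper-parameter optimization for one model/dataset pair.
# n_trials and the fixed choices below are illustrative only; the
# ablation study sweeps many more combinations of components.
hpo_result = hpo_pipeline(
    dataset='FB15k-237',
    model='DistMult',
    loss='BCEWithLogits',
    training_loop='sLCWA',
    n_trials=30,
)

# Write the best configuration and all trial results to disk.
hpo_result.save_to_directory('ablation/distmult_fb15k237_hpo')
```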

We also provide a tool for searching through these configurations, ablation/search.py, which finds the configurations with the best validation H@10 for a number of different queries. The script can be run without a full installation as long as click and pandas are available. General usage information can be obtained with python3 ablation/search.py --help. Here are a few examples (a pandas sketch of the same kind of query follows the list):

```shell
python3 ablation/search.py
python3 ablation/search.py --dataset fb15k237
python3 ablation/search.py --dataset fb15k237 --model distmult
python3 ablation/search.py --dataset wn18rr --training-loop lcwa --at-most 3
```
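For illustration, the kind of query that search.py performs can be sketched with pandas alone. The file name results.csv and the column names dataset, model, and hits_at_10 below are assumptions made for the example, not the actual layout of the collated results.

```python
import pandas as pd

# Load a collated results table; the file name and column names here
# (results.csv, dataset, model, hits_at_10) are hypothetical
# stand-ins for the repository's actual artifacts.
df = pd.read_csv('results.csv')

# Restrict to one dataset/model combination, mirroring
# `search.py --dataset fb15k237 --model distmult`.
subset = df[(df['dataset'] == 'fb15k237') & (df['model'] == 'distmult')]

# Report the configurations with the best validation H@10,
# keeping at most three rows as with `--at-most 3`.
best = subset.sort_values('hits_at_10', ascending=False).head(3)
print(best)
```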

Regeneration of Charts

The configuration for installing the relevant code, collating the results, and generating the charts is included in tox.ini and can be run with:

```shell
$ pip install tox
$ tox
```