GraSH: Successive Halving for Knowledge Graphs

This is the code and configuration accompanying the paper "Start Small, Think Big: On Hyperparameter Optimization for Large-Scale Knowledge Graph Embeddings", presented at ECML-PKDD 2022. The code extends Dist-KGE, a knowledge graph embedding library for distributed training. For documentation on Dist-KGE, refer to the Dist-KGE repository. The hyperparameter settings for all searches and for the finally selected trials are provided in /examples/experiments/.

UPDATE: GraSH has since been merged into our main library LibKGE. All configs from this repository can be run with LibKGE, except the Freebase configs, which require distributed training. Please use LibKGE for your own experiments with GraSH.

Table of contents

  1. Quick start
  2. Configuration of GraSH Search
  3. Run a GraSH hyperparameter search
  4. Results and Configurations
  5. How to cite

Quick start

Setup

# retrieve and install project in development mode
git clone https://github.com/uma-pi1/grash.git
cd grash
pip install -e .

# download and preprocess datasets
cd data
sh download_all.sh
cd ..

Training

# train an example model on a toy dataset (you can omit '--job.device cpu' if you have a GPU)
python -m kge start examples/toy-complex-train.yaml --job.device cpu

This example trains a model on a toy dataset in a sequential (non-distributed) setup on CPU.
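
As with --job.device above, any configuration key can be overridden from the command line. For example, to also shorten training (the value here is purely illustrative):

# train on CPU and cap training at 20 epochs
python -m kge start examples/toy-complex-train.yaml --job.device cpu --train.max_epochs 20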

GraSH Hyperparameter Search

# perform a search with GraSH on a toy dataset (you can omit '--job.device cpu' if you have a GPU)
python -m kge start examples/toy-complex-search-grash.yaml --job.device cpu

This example performs a small GraSH search with 16 trials on a toy dataset in a sequential (non-distributed) setup on CPU.

Configuration of GraSH Search

The most important configuration options for a hyperparameter search with GraSH are:

dataset:
  name: yago3-10
grash_search:
  eta: 4             # reduction factor: the best 1/eta of trials survive each round
  num_trials: 64     # number of trials in the first round
  search_budget: 3   # overall budget, in equivalents of one full training run
  variant: combined  # low-fidelity variant: epoch, graph, or combined
  parameters:        # define your search space here
job:
  type: search
model: complex
train:
  max_epochs: 400    # epochs of one full training run
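
For illustration, a search space could be defined as follows. This is a hypothetical sketch: it assumes GraSH accepts the same parameter-definition format as LibKGE's other search jobs (e.g., ax_search), and the parameter names and value ranges below are examples, not recommendations.

grash_search:
  parameters:
    - name: train.batch_size
      type: choice
      values: [128, 256, 512, 1024]
    - name: train.optimizer_args.lr
      type: range
      bounds: [0.0003, 1.0]
      log_scale: True
    - name: lookup_embedder.dim
      type: choice
      values: [128, 256, 512]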

Run a GraSH hyperparameter search

Run the default search on yago3-10 with the following command:

python -m kge start examples/experiments/search_configs/yago3-10/search-complex-yago-combined.yaml

The k-core subgraphs are generated automatically and saved to data/yago3-10/subsets/k-core/. By default, each experiment creates a new folder local/experiments/<timestamp>-<config-name> that contains all results.
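
To see how the default settings translate into a successive-halving schedule, the short Python sketch below computes the number of rounds and the per-trial budgets. It is illustrative only, not the library implementation, and assumes that the search budget is measured in equivalents of one full training run and split evenly across rounds:

def grash_schedule(num_trials=64, eta=4, search_budget=3):
    # count reduction rounds: 64 -> 16 -> 4 -> 1 gives 3 rounds
    rounds, n = 0, num_trials
    while n > 1:
        n = max(1, n // eta)
        rounds += 1
    per_round = search_budget / rounds  # assumed even split of the budget across rounds
    trials = num_trials
    for r in range(1, rounds + 1):
        per_trial = per_round / trials  # equal share for each surviving trial
        print(f"round {r}: {trials:2d} trials, {per_trial:.4f} full-train equivalents each")
        trials = max(1, trials // eta)  # keep the best 1/eta of trials

grash_schedule()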

Results and Configurations

All results were obtained with the GraSH default settings (num_trials=64, eta=4, search_budget=3, variant=combined).

Yago3-10

| Model   | Variant  | MRR   | Hits@1 | Hits@10 | Hits@100 | Config |
|---------|----------|-------|--------|---------|----------|--------|
| ComplEx | Epoch    | 0.536 | 0.460  | 0.672   | 0.601    | config |
| ComplEx | Graph    | 0.463 | 0.375  | 0.634   | 0.800    | config |
| ComplEx | Combined | 0.528 | 0.455  | 0.660   | 0.772    | config |
| RotatE  | Epoch    | 0.432 | 0.337  | 0.619   | 0.768    | config |
| RotatE  | Graph    | 0.432 | 0.337  | 0.619   | 0.768    | config |
| RotatE  | Combined | 0.434 | 0.342  | 0.607   | 0.742    | config |
| TransE  | Epoch    | 0.499 | 0.406  | 0.661   | 0.794    | config |
| TransE  | Graph    | 0.422 | 0.311  | 0.628   | 0.802    | config |
| TransE  | Combined | 0.499 | 0.406  | 0.661   | 0.794    | config |

Wikidata5M

| Model   | Variant  | MRR   | Hits@1 | Hits@10 | Hits@100 | Config |
|---------|----------|-------|--------|---------|----------|--------|
| ComplEx | Epoch    | 0.300 | 0.247  | 0.390   | 0.506    | config |
| ComplEx | Graph    | 0.300 | 0.247  | 0.390   | 0.506    | config |
| ComplEx | Combined | 0.300 | 0.247  | 0.390   | 0.506    | config |
| RotatE  | Epoch    | 0.241 | 0.187  | 0.331   | 0.438    | config |
| RotatE  | Graph    | 0.232 | 0.169  | 0.326   | 0.432    | config |
| RotatE  | Combined | 0.241 | 0.187  | 0.331   | 0.438    | config |
| TransE  | Epoch    | 0.263 | 0.210  | 0.358   | 0.483    | config |
| TransE  | Graph    | 0.263 | 0.210  | 0.358   | 0.483    | config |
| TransE  | Combined | 0.268 | 0.213  | 0.363   | 0.480    | config |

Freebase

| Model   | Variant  | MRR   | Hits@1 | Hits@10 | Hits@100 | Config |
|---------|----------|-------|--------|---------|----------|--------|
| ComplEx | Epoch    | 0.572 | 0.486  | 0.714   | 0.762    | config |
| ComplEx | Graph    | 0.594 | 0.511  | 0.726   | 0.767    | config |
| ComplEx | Combined | 0.594 | 0.511  | 0.726   | 0.767    | config |
| RotatE  | Epoch    | 0.561 | 0.522  | 0.625   | 0.679    | config |
| RotatE  | Graph    | 0.613 | 0.578  | 0.669   | 0.719    | config |
| RotatE  | Combined | 0.613 | 0.578  | 0.669   | 0.719    | config |
| TransE  | Epoch    | 0.261 | 0.078  | 0.518   | 0.636    | config |
| TransE  | Graph    | 0.553 | 0.520  | 0.614   | 0.682    | config |
| TransE  | Combined | 0.553 | 0.520  | 0.614   | 0.682    | config |

How to cite

@inproceedings{kochsiek2022start,
  title={Start Small, Think Big: On Hyperparameter Optimization for Large-Scale Knowledge Graph Embeddings},
  author={Kochsiek, Adrian and Niesel, Fritz and Gemulla, Rainer},
  booktitle={Joint European Conference on Machine Learning and Knowledge Discovery in Databases},
  year={2022}
}