Gradient Rollback Source Code

This repository contains code for the paper Explaining Neural Matrix Factorization with Gradient Rollback (Lawrence, Sztyler & Niepert, AAAI 2021).

Gradient Rollback (GR) experiments consist of 3 steps:

  1. Train a matrix factorization model (DistMult or ComplEx) and write influence maps for GR.
  2. Given a trained model and the to-be-explained triples, use GR to identify all relevant explanations and, if desired, extract the top-k.
  3. Evaluate either GR or the baseline (NH).

The 3 steps are explained in more detail below (a conceptual sketch of the core idea follows this overview), followed by instructions on how to reproduce the results of the paper.
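
At its core, GR records during training how much each training triple changed the model parameters, and later approximates leave-one-out retraining by undoing ("rolling back") those recorded changes. The following minimal NumPy sketch illustrates the idea for a DistMult-style model; all names (distmult_score, influence_map, rollback_score) are invented for this illustration and do not match the repository's actual code.

```python
import numpy as np

def distmult_score(E, R, s, r, o):
    """DistMult scoring function <e_s, w_r, e_o> for a triple (s, r, o)."""
    return float(np.sum(E[s] * R[r] * E[o]))

# During training, every SGD step caused by a training triple is recorded:
# influence_map[triple] accumulates the parameter changes that triple was
# responsible for. Keys like ("entity", i) or ("relation", j) index rows.
influence_map = {}

def record_update(triple, key, delta):
    """Called once per SGD step; `delta` is the applied parameter change."""
    updates = influence_map.setdefault(triple, {})
    updates[key] = updates.get(key, 0.0) + delta

def rollback_score(E, R, target, removed):
    """Approximate the score `target` would get had `removed` never been
    trained on, by undoing the updates `removed` caused (no retraining)."""
    E2, R2 = E.copy(), R.copy()
    for (kind, row), delta in influence_map.get(removed, {}).items():
        (E2 if kind == "entity" else R2)[row] -= delta
    s, r, o = target
    return distmult_score(E2, R2, s, r, o)
```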

If you have any questions, feel free to reach out!

Datasets

The GR paper runs experiments on 3 datasets: Nations, FB15k-237 and MovieLens.

Dependencies

All dependencies are installed when python setup.py install or python setup.py develop is run. Please ensure that pip and setuptools are up to date by running pip install --upgrade pip setuptools.

Additionally, we modified the following TensorFlow 2 files:

Training

To run training and GR code, see steps 1 and 2 below.

Evaluation

Evaluation Metrics

Individual Steps

Step 1

This step trains a main model and tracks the influence of the training set; the resulting output file is used by GR in Step 2. The main entry point for this step is the file run_step_1_train_main_model.py. Example bash files to start this step can be found in bash/*/*step_1_train_main.sh, where the asterisks are placeholders for the dataset name. The output is stored in a newly generated folder that has the same name as the selected dataset. To switch from DistMult to ComplEx, edit bash/*/*step_1_train_main.sh and replace DistMult with ComplEx. A conceptual sketch of such a training step follows.
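
Conceptually, the training loop has to do two things per step: apply a gradient update and log, per training triple, the parameter change that triple caused. The TensorFlow 2 sketch below shows a single-triple version of this; it is illustrative only, the sizes are toy values, and none of the identifiers match the script's actual API.

```python
import tensorflow as tf

# Toy sizes; the repository configures these per dataset.
NUM_ENTITIES, NUM_RELATIONS, DIM, LR = 14, 55, 16, 0.1

entity_emb = tf.Variable(tf.random.normal([NUM_ENTITIES, DIM]))
relation_emb = tf.Variable(tf.random.normal([NUM_RELATIONS, DIM]))
influence_map = {}  # (s, r, o) -> accumulated embedding updates

def train_step(s, r, o, label):
    with tf.GradientTape() as tape:
        score = tf.reduce_sum(entity_emb[s] * relation_emb[r] * entity_emb[o])
        loss = tf.nn.sigmoid_cross_entropy_with_logits(
            labels=tf.constant(float(label)), logits=score)
    g_ent, g_rel = tape.gradient(loss, [entity_emb, relation_emb])
    g_ent = tf.convert_to_tensor(g_ent)  # densify possible IndexedSlices
    g_rel = tf.convert_to_tensor(g_rel)
    # Record the update this triple causes BEFORE applying it; these
    # records form the influence map that Step 2 reads.
    rec = influence_map.setdefault((s, r, o), {})
    for key, delta in ((("entity", s), -LR * g_ent[s]),
                       (("entity", o), -LR * g_ent[o]),
                       (("relation", r), -LR * g_rel[r])):
        rec[key] = rec.get(key, 0.0) + delta.numpy()
    # Plain SGD update of the embeddings.
    entity_emb.assign_sub(LR * g_ent)
    relation_emb.assign_sub(LR * g_rel)
```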

Step 2

This step lets GR explain predictions, given a model from Step 1, the influence file and the triples one wants to explain. The step consists of two sub-steps. The first extracts, for each triple to be explained, all training instances that serve as explanations, together with their influence scores. In the second, the file produced by the first sub-step can be further refined by extracting the explanations with the top-k highest influence scores, where k is a parameter (10 and 100 by default). The main entry point for the first sub-step is the file run_step_2_get_explanations.py; for the second sub-step it is run_step_2_extract_topk_explanations.py. Example bash files to start both sub-steps can be found in bash/*/*step_2_run_GR.sh, where the asterisks are placeholders for the dataset name. The output is stored in the same folder as in Step 1. In spirit, the two sub-steps look like the sketch below.
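
The following sketch shows the shape of the two sub-steps, reusing rollback_score from the sketch above; the helper names and signatures are hypothetical, and the actual scripts additionally read the influence file from disk and handle batching.

```python
import heapq

def get_explanations(target, candidates, score_fn, rollback_score_fn):
    """Sub-step 1: influence of every candidate training triple on `target`.

    influence = score(target) - score(target with candidate rolled back);
    a large positive value means the candidate supported the prediction.
    """
    base = score_fn(target)
    return {c: base - rollback_score_fn(target, c) for c in candidates}

def extract_topk(influences, k=10):
    """Sub-step 2: keep the k explanations with the highest influence."""
    return heapq.nlargest(k, influences.items(), key=lambda item: item[1])
```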

Step 3

This step evaluates either GR or a baseline (NH). It can be quite expensive, because a new model has to be trained for every triple-explanation pair one wants to evaluate. The main entry point for this step is the file run_step_3_xai_evaluation.py. Example bash files to start this step can be found in bash/*/*step_3_eval_GR.sh and bash/*/*step_3_eval_NH.sh for GR and NH, respectively, where the asterisks are placeholders for the dataset name. A sketch of the evaluation loop follows.
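
The evaluation idea, in a hedged sketch (train_model and score stand in for the repository's training and scoring routines, and the simple mean shown here is not necessarily the paper's exact metric): retrain the model without each triple's explanation set and measure how much the prediction score drops.

```python
def evaluate_explanations(train_set, to_explain, explanations,
                          train_model, score):
    """Deletion-style evaluation of explanation quality."""
    full_model = train_model(train_set)  # trained once on all data
    drops = []
    for triple in to_explain:
        removed = set(explanations[triple])
        reduced = [t for t in train_set if t not in removed]
        reduced_model = train_model(reduced)  # one retraining per triple
        # A good explanation should lower the score once it is removed.
        drops.append(score(full_model, triple) - score(reduced_model, triple))
    return sum(drops) / len(drops)
```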

Reproduce Results: Example for Nations