

<h1 align="center"> Making interpretations useful (CDEP) 🔨</h1> <p align="center"> Regularizes interpretations (computed via <a href="https://github.com/csinva/hierarchical-dnn-interpretations">contextual decomposition</a>) to improve neural networks. Official code for <i>Interpretations are useful: penalizing explanations to align neural networks with prior knowledge</i> (ICML 2020 <a href="https://arxiv.org/abs/1909.13584">pdf</a>). </p> <p align="center"> <img src="https://img.shields.io/badge/python-3.6--3.9-blue"> <img src="https://img.shields.io/badge/pytorch-1.0%2B-blue"> <img src="https://img.shields.io/github/checks-status/laura-rieger/deep-explanation-penalization/master"> <img src="https://img.shields.io/badge/license-mit-orange.svg"> </p> <p align="center"> <i>Note: this repo is actively maintained. For any questions please file an issue.</i> </p>




ISIC skin-cancer classification - using CDEP, we can learn to avoid spurious patches present in the training set, improving test performance!

<p align="center"> <img width="60%" src="isic-skin-cancer/results/gradCAM.png"></img> </p>

The segmentation maps of the patches can be downloaded here

ColorMNIST - penalizing the contributions of individual pixels allows us to teach a network to learn a digit's shape instead of its color, improving its test accuracy from 0.5% to 25.1%

<p align="center"> <img width="80%" src="mnist/results/ColorMNIST_examples.png"></img> </p>

Fixing text gender biases - CDEP can help a model avoid learning spurious biases in a dataset, such as gendered words

<p align="center"> <img width="50%" src="text/results/data_example.png"></img> </p>

using CDEP on your own data

Using CDEP requires two steps:

  1. run CD/ACD on your model. Specifically, 3 things must be altered:
  2. add CD scores to the loss function (see notebooks)
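
As a rough illustration of the second step, here is a minimal sketch of adding an explanation penalty to a standard classification loss. It does not use this repo's CD implementation: `linear_cd` is a hypothetical stand-in that decomposes a single `nn.Linear` layer (where the split into masked and unmasked contributions is exact), and `cdep_loss`, `mask`, and `lam` are illustrative names rather than the repo's API. The penalty shown (mean absolute contribution of the features to ignore) is a simplification; see the notebooks for the loss actually used.

```python
import torch
import torch.nn.functional as F


def linear_cd(model, x, mask):
    """Hypothetical stand-in for CD on a single nn.Linear layer.

    Splits the logits into the contribution of the masked features (rel)
    and everything else, including the bias (irrel). For a linear layer
    this split is exact: rel + irrel == model(x).
    """
    rel = F.linear(x * mask, model.weight)
    irrel = F.linear(x * (1 - mask), model.weight, model.bias)
    return rel, irrel


def cdep_loss(model, x, y, mask, lam=1.0):
    """Prediction loss plus a penalty on the contribution of masked features."""
    pred_loss = F.cross_entropy(model(x), y)
    rel, _ = linear_cd(model, x, mask)
    expl_loss = rel.abs().mean()  # simplified penalty (assumption)
    return pred_loss + lam * expl_loss, pred_loss, expl_loss


torch.manual_seed(0)
model = torch.nn.Linear(4, 2)
x = torch.randn(8, 4)
y = torch.randint(0, 2, (8,))
# Prior knowledge: feature 0 is spurious and should not drive predictions.
mask = torch.tensor([1.0, 0.0, 0.0, 0.0])

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
for _ in range(100):
    optimizer.zero_grad()
    total, pred_loss, expl_loss = cdep_loss(model, x, y, mask, lam=1.0)
    total.backward()  # gradients flow through both loss terms
    optimizer.step()
```

For a real network, `linear_cd` would be replaced by CD/ACD scores computed on your own model (step 1), and `lam` trades off predictive accuracy against agreement with the prior knowledge encoded in `mask`.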

related work


  title={Interpretations are useful: penalizing explanations to align neural networks with prior knowledge},
  author={Rieger, Laura and Singh, Chandan and Murdoch, William and Yu, Bin},
  booktitle={International Conference on Machine Learning},