Thermostat is a large collection of NLP model explanations and accompanying analysis tools.

This work is described in our paper accepted to EMNLP 2021 System Demonstrations:
Nils Feldhus, Robert Schwarzenberg, and Sebastian Möller.
Thermostat: A Large Collection of NLP Model Explanations and Analysis Tools. 2021.

arXiv pre-print available here: https://arxiv.org/abs/2108.13961

Installation

With pip

pip install thermostat-datasets

Explore on Hugging Face Spaces

The Spaces edition of Thermostat launched on October 26, 2021.

Usage

Downloading a dataset requires just two lines of code:

import thermostat
data = thermostat.load("imdb-bert-lig")

Thermostat datasets can be addressed and loaded with an identifier string that contains three basic coordinates: Dataset, Model, and Explainer. In this example, the dataset is IMDb (sentiment analysis of movie reviews), the model is a BERT model fine-tuned on the IMDb data, and the explanations are generated with a (Layer) Integrated Gradients explainer.
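
The same coordinates compose every other configuration listed under "Datasets + Models" below, for example:

import thermostat

# Two more valid identifiers taken from the example configurations below:
# MultiNLI + RoBERTa + LIME, and AG News + ALBERT + Shapley Value Sampling.
nli_data = thermostat.load("multi_nli-roberta-lime")
news_data = thermostat.load("ag_news-albert-svs")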

data then contains the following columns/features: attributions, idx, input_ids, label, and predictions. This is the raw content stored in each of the instances of data.
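
A minimal sketch of inspecting that raw content, assuming the five columns above are exposed as attributes of an instance (the attribute names are assumptions mirroring the column names):

import thermostat

data = thermostat.load("imdb-bert-lig")
unit = data[0]
# Assumption: each column is available as an attribute of the instance.
print(unit.idx)               # index of the example in the original dataset
print(unit.label)             # gold label
print(unit.input_ids[:10])    # first ten token ids
print(unit.attributions[:10]) # first ten attribution scores
print(unit.predictions)       # model output logits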

If we print data, we get more info such as the actual names of the dataset, the explainer and the model:

print(data)
> IMDb dataset, BERT model, Layer Integrated Gradients explanations
> Explainer: LayerIntegratedGradients
> Model: textattack/bert-base-uncased-imdb
> Dataset: imdb

Indexing an instance

We can simply index the loaded dataset like a list:

import thermostat
instance = thermostat.load("imdb-bert-lig")[429]
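
Since the dataset behaves like a list, standard access patterns should carry over; a small sketch (len() and negative indexing are assumed to work as for a Python list):

import thermostat

data = thermostat.load("imdb-bert-lig")
print(len(data))  # number of instances in the split (assumed list-like)
first = data[0]
last = data[-1]   # assumption: negative indices work as for a list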

Visualizing attributions as a heatmap

We can call .render() on any instance to display a heatmap visualization generated with the displaCy library.

instance.render()  # instance refers to the variable assigned in the last codebox

(displaCy heatmap visualization of the instance)

Get simple tuple-based heatmap

The explanation attribute stores a tuple-based heatmap with the token, the attribution, and the token index as elements.

print(instance.explanation)  # instance refers to the variable assigned in the second to last codebox

> [('[CLS]', 0.0, 0),
 ('amazing', 2.3141794204711914, 1),
 ('movie', 0.06655970215797424, 2),
 ('.', -0.47832658886909485, 3),
 ('some', 0.15708176791667938, 4),
 ('of', -0.02931656688451767, 5),
 ('the', -0.08834744244813919, 6),
 ('script', -0.2660972774028778, 7),
 ('writing', -0.4021594822406769, 8),
 ('could', -0.19280624389648438, 9),
 ('have', -0.015477157197892666, 10),
 ('been', -0.21898044645786285, 11),
 ('better', -0.4095713794231415, 12),
 ...]  # abbreviated
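
Because the explanation is a plain list of (token, attribution, index) tuples, it is easy to post-process, e.g. to rank tokens by attribution score:

# Print the five most positively attributed tokens of the instance above.
top = sorted(instance.explanation, key=lambda t: t[1], reverse=True)[:5]
for token, score, position in top:
    print(f"{position:>3}  {token:<12} {score:+.3f}")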

The heatmap attribute displays it as a pandas table:

print(instance.heatmap)

> token_index    0         1          2         3          4         5    \
token        [CLS]         i       went       and        saw      this   
attribution      0 -0.117371  0.0849944  0.165192  0.0362542 -0.029687   
text_field    text      text       text      text       text      text   

token_index       6         7         8          9          10         11   \
token           movie      last     night      after      being     coaxed   
attribution  0.533126  0.240222  0.171116 -0.0450005 -0.0103401  0.0166524   
text_field       text      text      text       text       text       text   

token_index        13         14          15         16         17   \
token               to         by           a        few    friends   
attribution  0.0269605 -0.0213463  0.00761083  0.0216749  0.0579834   
text_field        text       text        text       text       text   

# abbreviated
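
The printout suggests a wide pandas DataFrame (index rows token, attribution, and text_field; one column per token_index), so the usual pandas operations should apply; a sketch under that assumption:

# Assumption: instance.heatmap is a pandas DataFrame laid out as printed above.
df = instance.heatmap
scores = df.loc["attribution"].astype(float)
print(scores.idxmax())  # token_index of the most positively attributed token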

Modifying the load function

thermostat.load() is a wrapper around datasets.load_dataset(), so any keyword argument of load_dataset() can be passed to load() as well, except path, name, and split, which are reserved. For example, to use a different cache directory, pass the cache_dir argument to thermostat.load().
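
For example (the cache path is only a placeholder):

import thermostat

# cache_dir is forwarded to datasets.load_dataset(); the path is illustrative.
data = thermostat.load("imdb-bert-lig", cache_dir="/path/to/custom/cache")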


Explainers

| Name | captum implementation | Parameters |
|---|---|---|
| Layer Gradient x Activation (lgxa) | .attr.LayerGradientXActivation | |
| Layer Integrated Gradients (lig) | .attr.LayerIntegratedGradients | # samples = 25 |
| LIME (lime) | .attr.LimeBase | # samples = 25, mask prob = 0.3 |
| Occlusion (occ) | .attr.Occlusion | sliding window = 3 |
| Shapley Value Sampling (svs) | .attr.ShapleyValueSampling | # samples = 25 |
| Layer DeepLiftShap (lds) | .attr.LayerDeepLiftShap | |
| Layer GradientShap (lgs) | .attr.LayerGradientShap | # samples = 5 |

Datasets + Models

Overview

✅ = Dataset is downloadable
⏏️ = Dataset is finished, but not uploaded yet
🔄 = Currently running on cluster (x n = number of jobs/screens)
⚠️ = Issue

IMDb

imdb is a sentiment analysis dataset with 2 classes (pos and neg). The available split is the test subset containing 25k examples.
Example configuration: imdb-xlnet-lig

| Name | 🤗 | lgxa | lig | lime | occ | svs | lds | lgs |
|---|---|---|---|---|---|---|---|---|
| ALBERT (albert) | textattack/albert-base-v2-imdb | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BERT (bert) | textattack/bert-base-uncased-imdb | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| ELECTRA (electra) | monologg/electra-small-finetuned-imdb | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| RoBERTa (roberta) | textattack/roberta-base-imdb | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| XLNet (xlnet) | textattack/xlnet-base-cased-imdb | ✅ | ✅ | ✅ | ✅ | ✅ | ⚠️ | ⚠️ |

MultiNLI

multi_nli is a textual entailment dataset. The available split is the validation_matched subset containing 9815 examples.
Example configuration: multi_nli-roberta-lime

| Name | 🤗 | lgxa | lig | lime | occ | svs | lds | lgs |
|---|---|---|---|---|---|---|---|---|
| ALBERT (albert) | prajjwal1/albert-base-v2-mnli | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BERT (bert) | textattack/bert-base-uncased-MNLI | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| ELECTRA (electra) | howey/electra-base-mnli | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| RoBERTa (roberta) | textattack/roberta-base-MNLI | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| XLNet (xlnet) | textattack/xlnet-base-cased-MNLI | ✅ | ✅ | ✅ | ✅ | ✅ | ⚠️ | ⚠️ |

XNLI

xnli is a textual entailment dataset. It provides the test set of MultiNLI through the "en" configuration. The fine-tuned models used here are the same as the MultiNLI ones. The available split is the test subset containing 5010 examples.
Example configuration: xnli-roberta-lime

| Name | 🤗 | lgxa | lig | lime | occ | svs | lds | lgs |
|---|---|---|---|---|---|---|---|---|
| ALBERT (albert) | prajjwal1/albert-base-v2-mnli | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BERT (bert) | textattack/bert-base-uncased-MNLI | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| ELECTRA (electra) | howey/electra-base-mnli | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| RoBERTa (roberta) | textattack/roberta-base-MNLI | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| XLNet (xlnet) | textattack/xlnet-base-cased-MNLI | ✅ | ✅ | ✅ | ✅ | ✅ | ⚠️ | ⚠️ |

AG News

ag_news is a news topic classification dataset. The available split is the test subset containing 7600 examples.
Example configuration: ag_news-albert-svs

| Name | 🤗 | lgxa | lig | lime | occ | svs | lds | lgs |
|---|---|---|---|---|---|---|---|---|
| ALBERT (albert) | textattack/albert-base-v2-ag-news | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BERT (bert) | textattack/bert-base-uncased-ag-news | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| RoBERTa (roberta) | textattack/roberta-base-ag-news | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

Contribute a dataset

New explanation datasets must follow the JSONL format and include the five fields attributions, idx, input_ids, label, and predictions, as described above in "Usage".
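
A minimal sketch of a single JSONL record with the five required fields (all values are made up for illustration):

import json

# One illustrative record; in the JSONL file, each example is one such line.
record = {
    "attributions": [0.0, 2.31, 0.07],   # one score per input token
    "idx": 429,                          # index in the original dataset
    "input_ids": [101, 6429, 3185],      # tokenizer input ids
    "label": 1,                          # gold label
    "predictions": [-2.4, 3.1],          # model output logits
}
print(json.dumps(record))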

Please follow the instructions for writing a dataset loading script in the official docs of datasets.

Provide the additional Thermostat metadata via the list of builder configs (see the Thermostat implementation of builder configs in the repository).

Necessary fields include ..., plus features, which you can copy from the codebox below:

features={"attributions": "attributions",
          "predictions": "predictions",
          "input_ids": "input_ids"}

While debugging, you can wrap your data in the Thermopack class to check that it is parsed correctly:

import thermostat
from datasets import load_dataset
data = load_dataset('your_dataset')
thermostat.Thermopack(data)

If you're successful, follow the official instructions for sharing a community-provided dataset on the Hugging Face Hub.

At first, all Thermostat contributions will have to be loaded via the code example above. Please notify us of existing explanation datasets by creating an Issue with the tag Contribution, and a maintainer of this repository will add your dataset to the Thermostat configs so that it can be accessed by everyone via thermostat.load().


Cite Thermostat

@inproceedings{feldhus2021thermostat,
    title = {Thermostat: A Large Collection of NLP Model Explanations and Analysis Tools},
    author = {Nils Feldhus and Robert Schwarzenberg and Sebastian M{\"o}ller},
    year = {2021},
    editor = {Heike Adel and Shuming Shi},
    booktitle = {Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
}

Disclaimer

We give no warranties for the correctness of the heatmaps or any other part of the data. This is evolving work and will be hot-patched continuously.

The Thermostat project follows the ACL and ACM Code of Ethics.

Acknowledgements

The majority of the codebase, especially regarding the combination of transformers and captum, stems from our other recent project Empirical Explainers.