<!-- title: 'DIANNA: Deep Insight And Neural Network Analysis' tags: - Python - explainable AI - deep neural networks - ONNX - benchmark sets authors: - name: Elena Ranguelova^[co-first author] # note this makes a footnote saying 'co-first author' orcid: 0000-0002-9834-1756 affiliation: 1 - name: Patrick Bos^[co-first author] # note this makes a footnote saying 'co-first author' orcid: 0000-0002-6033-960X affiliation: 1 - name: Yang Liu^[co-first author] # note this makes a footnote saying 'co-first author' orcid: 0000-0002-1966-8460 affiliation: 1 - name: Christiaan Meijer^[co-first author] # note this makes a footnote saying 'co-first author' orcid: 0000-0002-5529-5761 affiliation: 1 - name: Leon Oostrum^[co-first author] # note this makes a footnote saying 'co-first author' orcid: 0000-0001-8724-8372 affiliation: 1 - name: Giulia Crocioni^[co-first author] # note this makes a footnote saying 'co-first author' orcid: 0000-0002-0823-0121 affiliation: 1 - name: Laura Ootes^[co-first author] # note this makes a footnote saying 'co-first author' orcid: 0000-0002-2800-8309 affiliation: 1 - name: Pranav Chandramouli^[co-first author] # note this makes a footnote saying 'co-first author' orcid: 0000-0002-7896-2969 affiliation: 1 - name: Aron Jansen^[co-first author] # note this makes a footnote saying 'co-first author' orcid: 0000-0002-4764-9347 affiliation: 1 - name: Stef Smeets^[co-first author] # note this makes a footnote saying 'co-first author' orcid: 0000-0002-5413-9038 affiliation: 1 affiliations: - name: Netherlands eScience Center, Amsterdam, the Netherlands index: 1 -->


<img width="300" alt="Logo_ER10" src="https://user-images.githubusercontent.com/3244249/151994514-b584b984-a148-4ade-80ee-0f88b0aefa45.png"> <img align="left" src="https://user-images.githubusercontent.com/55382553/153408200-36c4de2e-7865-4934-956d-09eefd893e6a.png">

Deep Insight And Neural Network Analysis

DIANNA is a Python package that brings explainable AI (XAI) to your research project. It wraps carefully selected XAI methods in a simple, uniform interface. It's built by, with and for (academic) researchers and research software engineers working on machine learning projects.

Why DIANNA?

DIANNA addresses the needs of both (X)AI researchers and, above all, the domain scientists who use or will use AI models in their research without being experts in (X)AI. DIANNA is future-proof: it is one of the very few XAI libraries supporting the Open Neural Network Exchange (ONNX) format.

After studying the vast XAI landscape, we chose which methods, data modalities and problem types of the XAI taxonomy to focus on. Our choices, based on what is used most in the scientific literature, are shown graphically in the XAI taxonomy below:

<img src="https://github.com/dianna-ai/dianna/assets/3244249/9b864980-86f4-4d0e-8a83-af7d6be606f7" alt="XAI_taxonomy" width="80%"/>

The key points of DIANNA:

For more information on the unique strengths of DIANNA with comparison to other tools, please see the context landscape.

Installation


DIANNA can be installed from PyPI using pip on any of the supported Python versions (see badge):

python3 -m pip install dianna

To install the most recent development version directly from the GitHub repository run:

python3 -m pip install git+https://github.com/dianna-ai/dianna.git

If you get an error related to OpenMP when importing dianna, have a look at this issue for possible workarounds.
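
To confirm that the installation succeeded, a minimal sketch using only the Python standard library is shown below. The helper name `installed_version` is hypothetical and not part of DIANNA; it only reports what `pip` has recorded for the package.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("dianna")
if version is None:
    print("dianna is not installed in this environment")
else:
    print(f"dianna {version} is installed")
```

Run it with the same interpreter that you used for `python3 -m pip install`, so that both refer to the same environment.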

Pre-requisites only for Macbook Pro with M1 Pro chip users

Getting started

You need:

You get:

Template example for any data modality and explainer

  1. Provide your trained model and data item ( text, image, time series or tabular )
model_path = 'your_model.onnx'  # model trained on your data modality
data_item = <data_item> # data item for which the model's prediction needs to be explained 
  1. If the task is classification: which are the classes your model has been trained for?
labels = [class_a, class_b]   # example of binary classification labels

Which of these classes do you want an explanation for?

explained_class_index = labels.index(<explained_class>)  # explained_class can be any of the labels
  1. Run dianna with the explainer of your choice ('LIME', 'RISE' or 'KernelSHAP') and visualize the output:
explanation = dianna.<explanation_function>(model_path, data_item, explainer)
dianna.visualization.<visualization_function>(explanation[explained_class_index], data_item)
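
The template indexes the explanation by class: `explanation[explained_class_index]` holds the relevance of each part of the data item for one class. The sketch below illustrates that pattern without DIANNA, using a hypothetical toy model and a simple leave-one-out explainer (neither is part of DIANNA, and leave-one-out is not one of its explainers); it assumes a tokenised sentence as the data item.

```python
def toy_sentiment_model(words):
    """Hypothetical model: returns [p_negative, p_positive] from a tiny word list."""
    positive_words = {"great", "good"}
    negative_words = {"boring", "unoriginal"}
    score = sum(w in positive_words for w in words) - sum(w in negative_words for w in words)
    p_positive = 0.5 + 0.1 * score  # stays within [0, 1] for short sentences
    return [1.0 - p_positive, p_positive]


def leave_one_out(model, words):
    """Relevance of a word = drop in the class score when that word is removed."""
    baseline = model(words)
    explanation = [[] for _ in baseline]  # one list of (word, relevance) per class
    for i, word in enumerate(words):
        scores = model(words[:i] + words[i + 1:])
        for class_index, score in enumerate(scores):
            explanation[class_index].append((word, baseline[class_index] - score))
    return explanation


data_item = "the movie started great but the ending is boring and unoriginal".split()
labels = ["negative", "positive"]
explained_class_index = labels.index("positive")

explanation = leave_one_out(toy_sentiment_model, data_item)
relevance = dict(explanation[explained_class_index])
for word, value in relevance.items():
    print(f"{word:>12}: {value:+.2f}")
```

Words with positive relevance push the model towards the explained class, words with negative relevance push it away, and words with zero relevance do not affect this toy model.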

Text and image usage examples

Let's illustrate the template above with textual data. The data item of interest is a sentence that is (part of) a movie review, and the model has been trained to classify reviews into positive and negative sentiment classes. We are interested in which words contribute positively (red) and which negatively (blue) towards the model's decision to classify the review as positive, and we would like to use the LIME explainer:

import dianna

model_path = 'your_text_model.onnx'
# also define a model runner here (details in dedicated notebook)
review = 'The movie started great but the ending is boring and unoriginal.'
labels = ["negative", "positive"]
explained_class_index = labels.index("positive")
# explain the review defined above
explanation = dianna.explain_text(model_path, review, 'LIME')
dianna.visualization.highlight_text(explanation[explained_class_index], model_runner.tokenizer.tokenize(review))
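
The red/blue convention described above can be sketched as a mapping from a signed relevance score to a colour. This is a hypothetical helper for illustration only and not how `dianna.visualization` is implemented; the word scores are made-up example values.

```python
def relevance_to_rgb(relevance, max_abs):
    """Map a signed relevance to blue (negative), white (zero) or red (positive)."""
    if max_abs <= 0:
        return (255, 255, 255)
    strength = max(-1.0, min(1.0, relevance / max_abs))
    fade = round(255 * (1 - abs(strength)))  # weaker relevance fades towards white
    if strength >= 0:
        return (255, fade, fade)
    return (fade, fade, 255)


# Made-up relevance scores, for illustration only.
word_relevances = [("great", 0.4), ("ending", 0.0), ("boring", -0.8)]
max_abs = max(abs(score) for _, score in word_relevances)
for word, score in word_relevances:
    print(word, relevance_to_rgb(score, max_abs))
```

Scaling by the largest absolute relevance keeps the most influential word fully saturated, whatever the absolute magnitude of the scores.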


Here is another illustration of how to use dianna to explain which parts of a bee image contributed positively (red) or negatively (blue) towards classifying the image as a 'bee' using RISE. The ImageNet model has been trained to distinguish between 1000 classes (specified in labels). For images, which are data of higher dimension compared to text, there are also some specifics to consider:

model_path = 'your_image_model.onnx' 
image = PIL.Image.open('your_bee_image.jpeg') 
axis_labels = {2: 'channels'} 
explained_class_index = labels.index('bee') 
explanation = dianna.explain_image(model_path, image, 'RISE', axis_labels=axis_labels, labels=labels)
dianna.visualization.plot_image(explanation[explained_class_index], utils.img_to_array(image)/255., heatmap_cmap='bwr')
plt.show()
<img src="https://github.com/dianna-ai/dianna/assets/3244249/b03e4d4e-e3e8-4248-bf62-e3602b7f6d71" width="215" height="215">

And why would the ImageNet model think the same image shows a garden spider?

explained_class_index = labels.index('garden_spider') # interested in the image being classified as a garden spider
explanation = dianna.explain_image(model_path, image, 'RISE', axis_labels=axis_labels, labels=labels)
dianna.visualization.plot_image(explanation[explained_class_index], utils.img_to_array(image)/255., heatmap_cmap='bwr')
plt.show()
<img src="https://github.com/dianna-ai/dianna/assets/3244249/e7623803-2369-40ad-b4ef-4a6ae4e902f1" width="215" height="215">
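
For intuition about what RISE computes in the examples above, here is a heavily simplified sketch in plain Python. It assumes independent per-pixel random masks, whereas the published RISE method uses smooth, upsampled low-resolution masks. The toy model and all names are hypothetical, and this is not DIANNA's implementation.

```python
import random


def rise_like_saliency(model, image, n_masks=500, keep_prob=0.5, seed=0):
    """Average random binary masks, each weighted by the model score on the masked image."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    saliency = [[0.0] * width for _ in range(height)]
    for _ in range(n_masks):
        mask = [[1 if rng.random() < keep_prob else 0 for _ in range(width)]
                for _ in range(height)]
        masked = [[pixel * keep for pixel, keep in zip(image_row, mask_row)]
                  for image_row, mask_row in zip(image, mask)]
        score = model(masked)
        for r in range(height):
            for c in range(width):
                saliency[r][c] += score * mask[r][c]
    norm = n_masks * keep_prob
    return [[value / norm for value in row] for row in saliency]


def toy_model(image):
    """Hypothetical classifier: the class score is simply the top-left pixel value."""
    return image[0][0]


image = [[1.0, 0.2, 0.2],
         [0.2, 0.2, 0.2],
         [0.2, 0.2, 0.2]]
saliency = rise_like_saliency(toy_model, image)
for row in saliency:
    print(" ".join(f"{value:.2f}" for value in row))
```

Because the toy model only looks at the top-left pixel, that pixel receives the highest saliency; the other pixels receive a lower, roughly uniform value that reflects how often they happen to be visible together with it.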

Overview tutorial

There are full working examples on how to use the supported explainers and how to use dianna for all supported data modalities in our overview tutorial.

Demo movie (update planned):

Watch the video on YouTube

IMPORTANT: Sensitivity to hyperparameters

The explainers are sensitive to the choice of their hyperparameters! In this work, this sensitivity to hyperparameters is researched and useful conclusions are drawn. The default hyperparameters used in DIANNA for each explainer as well as the values for our tutorial examples are given in the Tutorials README.
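
As a toy illustration of such sensitivity (not taken from the referenced work, and not DIANNA code), the sketch below uses a simplified masking-based relevance estimate on four features. Only feature 0 matters to the hypothetical model, and the fraction of features kept per mask, here called `keep_prob`, is the hyperparameter being varied.

```python
import random


def masked_score(mask):
    """Hypothetical model: the score is 1.0 when feature 0 is visible, else 0.0."""
    return 1.0 if mask[0] else 0.0


def masking_relevance(n_features, n_masks, keep_prob, seed=0):
    """Average random masks weighted by the model score (simplified, RISE-like)."""
    rng = random.Random(seed)
    totals = [0.0] * n_features
    for _ in range(n_masks):
        mask = [rng.random() < keep_prob for _ in range(n_features)]
        score = masked_score(mask)
        for i, kept in enumerate(mask):
            if kept:
                totals[i] += score
    return [total / (n_masks * keep_prob) for total in totals]


def contrast(keep_prob):
    """Gap between the relevant feature (0) and an irrelevant one (1)."""
    relevance = masking_relevance(n_features=4, n_masks=2000, keep_prob=keep_prob)
    return relevance[0] - relevance[1]


for keep_prob in (0.5, 0.9):
    print(f"keep_prob={keep_prob}: contrast={contrast(keep_prob):.2f}")
```

In this toy setting the expected contrast is `1 - keep_prob`, so keeping almost everything visible yields a nearly flat relevance map even though the model is unchanged.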

Dashboard

Explore the explanations of your trained model using the DIANNA dashboard (for now, image, text and time series classification are supported). Click here for more information.

<a href="https://github.com/dianna-ai/dianna/tree/main/dianna/dashboard" target="_blank"> <img width="1000" align="center" alt="Dianna dashboard screenshot" src="https://raw.githubusercontent.com/dianna-ai/dianna/main/dianna/dashboard/dashboard-screenshot.png"> </a>

Datasets

DIANNA comes with simple datasets. Their main goal is to provide intuitive insight into the working of the XAI methods. They can be used as benchmarks for evaluation and comparison of existing and new XAI methods.

Images

| Dataset | Description | Examples | Generation |
| --- | --- | --- | --- |
| Binary MNIST <img width="25" alt="mnist_zero_and_one_half_size" src="https://user-images.githubusercontent.com/3244249/152354583-d7b68902-d402-4098-922b-b1a33b07e3e1.png"> | Greyscale images of the digits "1" and "0" - a 2-class subset from the famous MNIST dataset for handwritten digit classification. | <img width="120" alt="BinaryMNIST" src="https://user-images.githubusercontent.com/3244249/150808267-3d27eae0-78f2-45f8-8569-cb2561f2c2e9.png"> | Binary MNIST dataset generation |
| Simple Geometric (circles and triangles) <img width="20" alt="Simple Geometric Logo" src="https://user-images.githubusercontent.com/3244249/150808842-d35d741e-294a-4ede-bbe9-58e859483589.png"> | Images of circles and triangles for 2-class geometric shape classification. The shapes have varying size and orientation, and both the shapes and the background have varying uniform gray levels. | <img width="130" alt="SimpleGeometric" src="https://user-images.githubusercontent.com/3244249/150808125-e1576237-47fa-4e51-b01e-180904b7c7f6.png"> | Simple geometric shapes dataset generation |
| Simple Scientific (LeafSnap30) <img width="20" alt="LeafSnap30 Logo" src="https://user-images.githubusercontent.com/3244249/150815639-2da560d4-8b26-4eeb-9ab4-dabf221a264a.png"> | Color images of tree leaves - a 30-class post-processed subset from the LeafSnap dataset for automatic identification of North American tree species. | <img width="600" alt="LeafSnap" src="https://user-images.githubusercontent.com/3244249/150804246-f714e517-641d-48b2-af26-2f04166870d6.png"> | LeafSnap30 dataset generation |
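
To give a feel for how simple such benchmark images can be, the sketch below draws a circle with a uniform gray level on a uniform gray background, in the spirit of the Simple Geometric dataset. It is an illustrative assumption and not the actual generation notebook; the function name, image size and gray-level ranges are made up.

```python
import random


def make_circle_image(size=16, radius=5, shape_level=0.9, background_level=0.2):
    """Return a size x size grayscale image (nested lists) with a centred circle."""
    center = (size - 1) / 2
    image = []
    for row in range(size):
        line = []
        for col in range(size):
            inside = (row - center) ** 2 + (col - center) ** 2 <= radius ** 2
            line.append(shape_level if inside else background_level)
        image.append(line)
    return image


# Vary the size and the gray levels per sample, as the dataset description suggests.
rng = random.Random(0)
samples = [
    make_circle_image(radius=rng.randint(3, 6),
                      shape_level=rng.uniform(0.6, 1.0),
                      background_level=rng.uniform(0.0, 0.4))
    for _ in range(3)
]
print(len(samples), "images of", len(samples[0]), "x", len(samples[0][0]), "pixels")
```

Because the ground truth (where the shape is) is known by construction, relevance maps produced by an explainer on such images can be checked by eye.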

Text

| Dataset | Description | Examples | Generation |
| --- | --- | --- | --- |
| Stanford sentiment treebank <img width="20" alt="nlp-logo_half_size" src="https://user-images.githubusercontent.com/3244249/152355020-908c04f3-aa99-489d-b87a-7e6b1f586118.png"> | Dataset for predicting the sentiment, positive or negative, of movie reviews. | This movie was actually neither that funny, nor super witty. | Sentiment treebank |

Time series

| Dataset | Description | Examples | Generation |
| --- | --- | --- | --- |
| Coffee dataset <img width="25" alt="Coffe Logo" src="https://github.com/dianna-ai/dianna/assets/3244249/9ab50a0f-5da3-41d2-80e9-70d2c8769162"> | Food spectrographs time series dataset for a two-class problem to distinguish between Robusta and Arabica coffee beans. | <img width="500" alt="example image" src="https://github.com/dianna-ai/dianna/assets/3244249/763002c5-40ad-48cc-9de0-ea43d7fa8a75)"> | data source |
| Weather dataset <img width="25" alt="Weather Logo" src="https://github.com/dianna-ai/dianna/assets/3244249/3ff3d639-ed2f-4a38-b7ac-957c984bce9f"> | The light version of the weather prediction dataset, which contains daily observations (89 features) for 11 European locations through the years 2000 to 2010. | <img width="500" alt="example image" src="https://github.com/dianna-ai/dianna/assets/3244249/b0a505ac-8a6c-4e1c-b6ad-35e31e52f46d)"> | data source |

Tabular

| Dataset | Description | Examples | Generation |
| --- | --- | --- | --- |
| Penguin dataset <img width="75" alt="Penguins Logo" src="https://github.com/dianna-ai/dianna/assets/3244249/c7716ad3-f992-4557-80d9-1d8178c7ed57)"> | The Palmer Archipelago (Antarctica) penguin dataset is a great intro dataset for data exploration & visualization, similar to the famous Iris dataset. | <img width="500" alt="example image" src="https://github.com/allisonhorst/palmerpenguins/blob/main/man/figures/README-mass-flipper-1.png"> | data source |
| Weather dataset <img width="25" alt="Weather Logo" src="https://github.com/dianna-ai/dianna/assets/3244249/3ff3d639-ed2f-4a38-b7ac-957c984bce9f"> | The light version of the weather prediction dataset, which contains daily observations (89 features) for 11 European locations through the years 2000 to 2010. | <img width="500" alt="example image" src="https://github.com/dianna-ai/dianna/assets/3244249/b0a505ac-8a6c-4e1c-b6ad-35e31e52f46d)"> | data source |

ONNX models

<!-- TODO: Add all links, see issue https://github.com/dianna-ai/dianna/issues/135 -->

We work with ONNX! ONNX is a great unified neural network standard which can be used to boost reproducible science. Using ONNX for your model also gives you a boost in performance! In case your models are still in another popular DNN (deep neural network) format, here are some simple recipes to convert them:

More converters with examples and tutorials can be found on the ONNX tutorial page.
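
As one example of such a conversion, the sketch below wraps PyTorch's `torch.onnx.export` in a small helper. It assumes your model is a PyTorch `torch.nn.Module` with a single input tensor. The helper name, the `"input"`/`"output"` tensor names and the dynamic batch axis are illustrative choices, not DIANNA requirements.

```python
def export_pytorch_model_to_onnx(model, example_input, onnx_path="your_model.onnx"):
    """Export a trained PyTorch model to an ONNX file and return the file path."""
    import torch  # imported lazily, so PyTorch is only needed when the helper is called

    model.eval()  # switch off training-only behaviour such as dropout
    torch.onnx.export(
        model,
        (example_input,),       # example input used to trace the model
        onnx_path,
        input_names=["input"],
        output_names=["output"],
        dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    )
    return onnx_path
```

The resulting file can then be passed as `model_path` in the template from the "Getting started" section. The example input must have the shape and dtype the model expects, for instance a single preprocessed image with a leading batch dimension.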

And here are links to notebooks showing how we created our models on the benchmark datasets:

Images

| Models | Generation |
| --- | --- |
| Binary MNIST model | Binary MNIST model generation |
| Simple Geometric model | Simple geometric shapes model generation |
| Simple Scientific model | LeafSnap30 model generation |

Text

| Models | Generation |
| --- | --- |
| Movie reviews model | Stanford sentiment treebank model generation |

Time series

| Models | Generation |
| --- | --- |
| Coffee model | Coffee model generation |
| Season prediction model | Season prediction model generation |
| Fast Radio Burst classification model | Fast Radio Burst classification model generation |

Tabular

| Models | Generation |
| --- | --- |
| Penguin model (classification) | Penguin model generation |
| Sunshine hours prediction model (regression) | Sunshine hours prediction model generation |

We envision the birth of the ONNX Scientific models zoo soon...

Tutorials

DIANNA supports different data modalities and XAI methods (explainers). We have evaluated many explainers using objective criteria (see the How to find your AI explainer blog post). The table below contains links to the relevant XAI methods' papers (for some explanatory videos on the methods, please see the tutorials). The DIANNA tutorials cover each supported method and data modality on at least one dataset using the default or tuned hyperparameters. Our plans to expand DIANNA with more data modalities and explainers are given in the ROADMAP.

<!-- see issue: https://github.com/dianna-ai/dianna/issues/142, also related issue: https://github.com/dianna-ai/dianna/issues/148 -->
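| Data \ XAI | RISE | LIME | KernelSHAP |
| --- | --- | --- | --- |
| Images | | | |
| Text | | | |
| Timeseries | | | |
| Tabular | planned | | |
| Embedding | work in progress | | |
| Graphs* | next steps | ... | ... |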

LRP and PatternAttribution also feature in the top 5 of our thoroughly evaluated explainers. GradCAM has also recently been found to be semantically continuous! Contributions that add these and more (new) post-hoc explainability methods for ONNX models are very welcome!
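
For intuition about KernelSHAP, one of the explainers in the table above: it approximates Shapley values by sampling feature coalitions and fitting a weighted linear model. With only a few features, Shapley values can also be computed exactly by enumerating all coalitions, as in the sketch below. This is an illustration with a hypothetical toy model, not DIANNA's or the SHAP library's implementation.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values; 'absent' features are replaced by their baseline value."""
    n = len(x)

    def value(present):
        mixed = [x[i] if i in present else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                total += weight * (value(present | {i}) - value(present))
        phi.append(total)
    return phi


def toy_model(features):
    """Hypothetical additive model, chosen so the exact answer is easy to verify."""
    return 2 * features[0] + 3 * features[1] - features[2]


phi = shapley_values(toy_model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
print([round(value, 6) for value in phi])
```

The number of coalitions grows exponentially with the number of features, which is why sampling-based approximations such as KernelSHAP are used for images, text and larger tabular data.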

Scientific use-cases

Our goal is that the scientific community embraces XAI as a source of novel and unexplored perspectives on scientific problems. Here, we offer tutorials on specific scientific use-cases of using XAI:

| Use-case (data) \ XAI | RISE | LIME | KernelSHAP |
| --- | --- | --- | --- |
| Biology (Phytomorphology): Tree Leaves classification (images) | | | |
| Astronomy: Fast Radio Burst detection (timeseries) | | | |
| Geo-science (raster data) | planned | ... | ... |
| Social sciences (text) | work in progress | ... | ... |
| Climate | planned | ... | ... |

Reference documentation

For detailed information on using specific DIANNA functions, please visit the documentation page hosted at Readthedocs.

Contributing

If you want to contribute to the development of DIANNA, have a look at the contribution guidelines. See our developer documentation for information on developer installation, running tests, generating documentation, versioning and making a release.

How to cite us


If you use this package for your scientific work, please consider citing the software directly as:

Ranguelova, E., Bos, P., Liu, Y., Meijer, C., Oostrum, L., Crocioni, G., Ootes, L., Chandramouli, P., Jansen, A., Smeets, S. (2023). dianna (*[VERSION YOU USED]*). Zenodo. https://zenodo.org/record/5592606

or the JOSS paper as:

Ranguelova et al., (2022). DIANNA: Deep Insight And Neural Network Analysis. Journal of Open Source Software, 7(80), 4493, https://doi.org/10.21105/joss.04493

See also the Zenodo page or the JOSS page for exporting the software citation to BibTeX and other formats.

Credits

This package was created with Cookiecutter and the NLeSC/python-template.