Awesome XAI

Papers and code of Explainable AI, especially with respect to image classification.

2013 Conference Papers

| Title | Paper Title | Source Link | Code | Tags |
| --- | --- | --- | --- | --- |
| Visualization of CNN | Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps | CVPR2013 | PyTorch | Visualization, gradient-based saliency maps (sketch below) |
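
The entry above covers gradient-based saliency maps. For orientation, here is a minimal sketch of that idea rather than the paper's exact recipe; the pretrained torchvision ResNet-18 and the random tensor standing in for a preprocessed image are illustrative assumptions.

```python
# A minimal sketch of a vanilla gradient saliency map in the spirit of the entry above.
# The ResNet-18 and the random "image" are assumptions, not part of the original paper.
import torch
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # stand-in for a normalized input
logits = model(image)
logits[0, logits.argmax()].backward()                   # gradient of the top class score w.r.t. the input

# Saliency = per-pixel maximum of the absolute input gradient over color channels.
saliency = image.grad.abs().max(dim=1)[0].squeeze()     # shape (224, 224), ready to plot as a heatmap
```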

2016 Conference Papers

| Title | Paper Title | Source Link | Code | Tags |
| --- | --- | --- | --- | --- |
| CAM | Learning Deep Features for Discriminative Localization | CVPR2016 | PyTorch (Official) | class activation mapping (sketch below) |
| LIME | “Why Should I Trust You?” Explaining the Predictions of Any Classifier | KDD2016 | PyTorch (Official) | trust a prediction |
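
For orientation, here is a minimal sketch of the class activation mapping (CAM) idea from the entry above, assuming an architecture where global average pooling feeds a single linear classifier; the torchvision ResNet-18 and the input sizes are illustrative assumptions, not the official code.

```python
# A minimal sketch of Class Activation Mapping: weight the last conv block's feature maps
# by the linear classifier's weights for the predicted class, then upsample.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
image = torch.rand(1, 3, 224, 224)                              # stand-in for a preprocessed image

backbone = torch.nn.Sequential(*list(model.children())[:-2])    # everything before avgpool + fc
with torch.no_grad():
    feats = backbone(image)                                     # (1, 512, 7, 7) last conv feature maps
    class_idx = model(image).argmax(dim=1)                      # predicted class index
    weights = model.fc.weight[class_idx]                        # (1, 512) classifier weights of that class
    cam = torch.einsum('bc,bchw->bhw', weights, feats)          # weighted sum of the feature maps
    cam = F.interpolate(cam.unsqueeze(1), size=image.shape[-2:],
                        mode='bilinear', align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)    # normalize to [0, 1] for display
```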

2017 Conference Papers

| Title | Paper Title | Source Link | Code | Tags |
| --- | --- | --- | --- | --- |
| Grad-CAM | Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization | ICCV2017, CVPR2016 (original) | PyTorch | Visualization, gradient-based saliency maps (sketch below) |
| Network Dissection | Network Dissection: Quantifying Interpretability of Deep Visual Representations | CVPR2017 | PyTorch (Official) | Visualization |
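
A minimal sketch of the Grad-CAM idea from the entry above, not the official implementation; the ResNet-18 model, the hook placed on `layer4`, and the random input are illustrative assumptions.

```python
# A minimal sketch of Grad-CAM: feature maps of the last conv block weighted by the
# spatially averaged gradient of the predicted class score, then ReLU and upsample.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
image = torch.rand(1, 3, 224, 224)  # stand-in for a preprocessed image

store = {}
def capture(module, inputs, output):
    store['acts'] = output                                         # activations of the last conv block
    output.register_hook(lambda g: store.__setitem__('grads', g))  # their gradients during backward

handle = model.layer4.register_forward_hook(capture)
logits = model(image)
logits[0, logits.argmax()].backward()                              # backprop the top class score
handle.remove()

alpha = store['grads'].mean(dim=(2, 3), keepdim=True)              # per-channel importance weights
cam = F.relu((alpha * store['acts']).sum(dim=1))                   # (1, 7, 7)
cam = F.interpolate(cam.unsqueeze(1), size=image.shape[-2:], mode='bilinear', align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)           # normalize to [0, 1]
```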

2018 Conference Papers

| Title | Paper Title | Source Link | Code | Tags |
| --- | --- | --- | --- | --- |
| TCAV | Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) | ICML 2018 | Tensorflow 1.15.2 | interpretability method |
| Interpretable CNN | Interpretable Convolutional Neural Networks | CVPR 2018 | Tensorflow 1.x | explainability by design |
| Anchors | Anchors: High-Precision Model-Agnostic Explanations | AAAI 2018 | sklearn (Official) | model-agnostic |
| Sanity Checks | Sanity checks for saliency maps | NeurIPS 2018 | PyTorch | saliency methods vs edge detector (sketch below) |
| Grad-CAM++ | Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks | WACV 2018 | PyTorch | saliency maps |
| Interpretable Basis | Interpretable Basis Decomposition for Visual Explanation | ECCV 2018 | PyTorch | ibd |
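
The "Sanity Checks" entry above tests whether saliency maps actually depend on what the model learned. A minimal sketch of its model-randomization test follows, with plain gradient saliency standing in as the explainer and only the final layer re-initialized; both choices are illustrative assumptions rather than the paper's full cascading protocol.

```python
# Model-randomization sanity check (sketch): an explanation that stays rank-correlated
# after the classifier head is re-initialized is not sensitive to the learned model.
import copy
import torch
import torchvision.models as models

def gradient_saliency(model, image):
    image = image.clone().requires_grad_(True)
    logits = model(image)
    logits[0, logits.argmax()].backward()
    return image.grad.abs().max(dim=1)[0].flatten()

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
image = torch.rand(1, 3, 224, 224)                 # stand-in for a preprocessed image

randomized = copy.deepcopy(model)
torch.nn.init.normal_(randomized.fc.weight)        # randomization starting from the top layer

s_trained = gradient_saliency(model, image)
s_random = gradient_saliency(randomized, image)

# Rank correlation of the two maps; values near 1 mean the explanation barely changed,
# i.e. the saliency method fails the sanity check.
ranks = torch.stack([s_trained.argsort().argsort().float(),
                     s_random.argsort().argsort().float()])
print(f"rank correlation: {torch.corrcoef(ranks)[0, 1]:.3f}")
```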

2019 Conference Papers

| Title | Paper Title | Source Link | Code | Tags |
| --- | --- | --- | --- | --- |
| Full-grad | Full-Gradient Representation for Neural Network Visualization | NeurIPS2019 | PyTorch (Official), Tensorflow | saliency map representation |
| This looks like that | This Looks Like That: Deep Learning for Interpretable Image Recognition | NeurIPS2019 | PyTorch (Official) | object |
| Counterfactual visual explanations | Counterfactual Visual Explanations | ICML2019 | | interpretability |
| concept with contribution interpretable cnn | Explaining Neural Networks Semantically and Quantitatively | ICCV 2019 | | |
| SIS | What made you do this? Understanding black-box decisions with sufficient input subsets | AISTATS 2019 - Supplementary Material | Tensorflow 1.x | |
| Filter as concept detector | Filters in Convolutional Neural Networks as Independent Detectors of Visual Concepts | ACM | | |

2020 Papers

| Title | Paper Title | Source Link | Code | Tags |
| --- | --- | --- | --- | --- |
| INN | Making Sense of CNNs: Interpreting Deep Representations & Their Invariances with INNs | ECCV 2020 | PyTorch | explainability by design |
| X-Grad CAM | Axiom-based Grad-CAM: Towards Accurate Visualization and Explanation of CNNs | | PyTorch | |
| Revisiting BP saliency | There and Back Again: Revisiting Backpropagation Saliency Methods | CVPR 2020 | PyTorch | grad cam failure noted |
| Interacting with explanation | Making deep neural networks right for the right scientific reasons by interacting with their explanations | Nature Machine Intelligence | sklearn | |
| Class specific Filters | Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters | ECCV Supplementary Material | Code - not yet updated | ICLR rejected version with reviews |
| Interpretable Decoupling | Interpretable Neural Network Decoupling | ECCV 2020 | | |
| iCaps | iCaps: An Interpretable Classifier via Disentangled Capsule Networks | ECCV Supplementary Material | | |
| VQA | Interpretable Visual Reasoning via Probabilistic Formulation under Natural Supervision | ECCV 2020 | PyTorch | |
| When explanations lie | When Explanations Lie: Why Many Modified BP Attributions Fail | ICML 2020 | PyTorch | |
| Similarity models | Towards Visually Explaining Similarity Models | Arxiv | | |
| Quantify trust | How Much Should I Trust You? Modeling Uncertainty of Black Box Explanations | NeurIPS 2020 submission | | hima_lakkaraju, sameer_singh, model-agnostic |
| Concepts for segmentation task | ABSTRACTING DEEP NEURAL NETWORKS INTO CONCEPT GRAPHS FOR CONCEPT LEVEL INTERPRETABILITY | Arxiv | Tensorflow 1.14 | brain tumour segmentation |
| Deep Lift based Network Pruning | Utilizing Explainable AI for Quantization and Pruning of Deep Neural Networks | Arxiv NeurIPS format | | nas, deep_lift |
| Unified Attribution Framework | A Unified Taylor Framework for Revisiting Attribution Methods | Arxiv | updated | taylor, attribution_framework |
| Global Concept Attribution | Towards Global Explanations of Convolutional Neural Networks with Concept Attribution | CVPR 2020 | | |
| relevance estimation | Determining the Relevance of Features for Deep Neural Networks | ECCV 2020 | | |
| localized concept maps | Explaining AI-based Decision Support Systems using Concept Localization Maps | Arxiv | Just repository created | |
| quantify saliency | Quantifying Explainability of Saliency Methods in Deep Neural Networks | Arxiv | PyTorch | |
| generalization of LIME - MeLIME | MeLIME: Meaningful Local Explanation for Machine Learning Models | Arxiv | Tensorflow 1.15 | |
| global counterfactual explanations | Interpretable and Interactive Summaries of Actionable Recourses | Arxiv | | |
| fine grained counterfactual heatmaps | SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition | Arxiv | PyTorch | scouter |
| quantify trust | How Much Can We Really Trust You? Towards Simple, Interpretable Trust Quantification Metrics for Deep Neural Networks | Arxiv | | |
| Non-negative concept activation vectors | IMPROVING INTERPRETABILITY OF CNN MODELS USING NON-NEGATIVE CONCEPT ACTIVATION VECTORS | Arxiv | | |
| different layer activations | Explaining Neural Networks by Decoding Layer Activations | Arxiv | | |
| concept bottleneck networks | Concept Bottleneck Models | ICML 2020 | PyTorch | sketch after this table |
| attribution | Visualizing the Impact of Feature Attribution Baselines | Distill | | |
| CSI | Contextual Semantic Interpretability | Arxiv | | explainable_by_design |
| Improve black box via explanation | Introspective Learning by Distilling Knowledge from Online Self-explanation | Arxiv | | knowledge_distillation |
| Patch explanations | Information-Theoretic Visual Explanation for Black-Box Classifiers | Arxiv | Tensorflow 1.13.1 | patch_sampling, information_theory |
| Causality | Long-Tailed Classification by Keeping the Good and Removing the Bad Momentum Causal Effect | NeurIPS 2020 | PyTorch | |
| Concept in Time series data | Conceptual Explanations of Neural Network Prediction for Time Series | IJCNN 2020 | | time series, see if useful someway |
| Explainable by Design | Trustworthy Convolutional Neural Networks: A Gradient Penalized-based Approach | Arxiv | | |
| Colorwise Saliency | Visualizing Color-wise Saliency of Black-Box Image Classification Models | Arxiv | | |
| concept based | Concept Discovery for The Interpretation of Landscape Scenicness | Downloadable File | | |
| Integrated Score CAM | IS-CAM: Integrated Score-CAM for axiomatic-based explanations | Arxiv | | |
| Grad LAM | Grad-LAM: Visualization of Deep Neural Networks for Unsupervised Learning | EURASIP 2020 | | |
| Cites TCAV | Integrating Intrinsic and Extrinsic Explainability: The Relevance of Understanding Neural Networks for Human-Robot Interaction | AAAI 2020 | | |
| Attribution | Learning Propagation Rules for Attribution Map Generation | Arxiv | | |
| Zoom CAM | Zoom-CAM: Generating Fine-grained Pixel Annotations from Image Labels | Arxiv | | must read before modularity proposal |
| Masking based saliency maps investigation | INVESTIGATING AND SIMPLIFYING MASKING-BASED SALIENCY MAP METHODS FOR MODEL INTERPRETABILITY | Arxiv | PyTorch | |
| Evaluation | Evaluating Attribution Methods using White-Box LSTMs | EMNLP Workshop | PyTorch | cites TCAV, says all explanations fail their test |
| Interpretable Bayesian Neural Networks | Incorporating Interpretable Output Constraints in Bayesian Neural Networks | NeurIPS 2020 | PyTorch | |
| Survey - Counterfactual explanations | Counterfactual Explanations for Machine Learning: A Review | Arxiv | | |
| Standardised Explainability | The Need for Standardised Explainability | ICML 2020 Workshop | | |
| CME | Now You See Me (CME): Concept-based Model Extraction | CIKM 2020 workshop | sklearn | |
| Q FIT | Q-FIT: The Quantifiable Feature Importance Technique for Explainable Machine Learning | Arxiv | | |
| Outside black box | Learning outside the Black-Box: The pursuit of interpretable models | NeurIPS 2020 | sklearn | |
| Discrete Mask | Interpreting Image Classifiers by Generating Discrete Masks | IEEE - PAMI | | |
| Contrastive explanations | Learning Global Transparent Models Consistent with Local Contrastive Explanations | NeurIPS 2020 | | |
| Empirical study of Ideal Explanations | How Can I Explain This to You? An Empirical Study of Deep Neural Network Explanation Methods | NeurIPS 2020 | tensorflow 1.15 | Example based matching library |
| This Looks Like That + Relevance | This Looks Like That, Because ... Explaining Prototypes for Interpretable Image Recognition | Arxiv | PyTorch | must read before relevance |
| Concept based posthoc | ProtoViewer: Visual Interpretation and Diagnostics of Deep Neural Networks with Factorized Prototypes | Paper | | refer human subject experiments |
| Shapley Flow | Shapley Flow: A Graph-based Approach to Interpreting Model Predictions | Arxiv | | |
| Attention Vs Saliency and Beyond | The elephant in the interpretability room: Why use attention as explanation when we have saliency methods? | Arxiv | | |
| Unification of removal methods | Feature Removal Is A Unifying Principle For Model Explanation Methods | NeurIPS 2020 workshop | PyTorch | from the authors of SHAP; Extended Arxiv version |
| Robust and Stable Black Box Explanations | Robust and Stable Black Box Explanations | ICML 2020 | | hima lakkaraju |
| Debugging test | Debugging Tests for Model Explanations | Arxiv | | |
| AISTATS 2020 submission | Ensuring Actionable Recourse via Adversarial Training | Arxiv | | hima lakkaraju |
| Layer wise explanation | Investigating Learning in Deep Neural Networks using Layer-Wise Weight Change | ResearchGate | | |
| cites TCAV | Debiasing Convolutional Neural Networks via Meta Orthogonalization | Arxiv | Code page not found | |
| Introducing concepts | SeXAI: Introducing Concepts into Black Boxes for Explainable Artificial Intelligence | Paper | Tensorflow 1.4 | |
| Additive explainers | Learning simplified functions to understand | Paper | | |
| BIN | Born Identity Network: Multi-way Counterfactual Map Generation to Explain a Classifier’s Decision | Arxiv | Tensorflow 2.2 | counterfactual explanations |
| Explanation using Generative models | Explaining image classifiers by removing input features using generative models | ACCV 2020 | Tensorflow 1.12 & Pytorch 1.1 | Nguyen's paper |
| Action Recognition Explanation | Play Fair: Frame Attributions in Video Models | ACCV 2020 | PyTorch | |
| Concepts in VQA | Interpretable Visual Reasoning via Induced Symbolic Space | Arxiv | Code not yet updated, just repository created | |
| Recourses | Beyond Individualized Recourse: Interpretable and Interactive Summaries of Actionable Recourses | NeurIPS 2020 | | hima lakkaraju |
| Feature Importance of CNN | Measuring Feature Importance of Convolutional Neural Networks | IEEE | | |
| Causal Inference | Causal inference using deep neural networks | Arxiv | Keras | |
| Match up | Match Them Up: Visually Explainable Few-shot Image Classification | Arxiv | PyTorch | |
| Right for the Right Concept | Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations | Arxiv | | |
| MALC | Transparency Promotion with Model-Agnostic Linear Competitors | ICML 2020 | | |
| Shapley Taylor Index | The Shapley Taylor Interaction Index | ICML 2020 | | |
| Concept based explanation + user feedback | Teaching the Machine to Explain Itself using Domain Knowledge | Openreview | | |
| Counterfactual produces Adversarial | Semantics and explanation: why counterfactual explanations produce adversarial examples in deep neural networks | AIJ submission | | |
| MEME | MEME: Generating RNN Model Explanations via Model Extraction | OpenReview | Keras | RNN-specific LIME; see if any improvisations for MACE come from here |
| ProtoPShare | ProtoPShare: Prototype Sharing for Interpretable Image Classification and Similarity Discovery | Arxiv - Accepted at ACM SIGKDD 2021 | PyTorch | Improved ProtoPNet (This looks like that) |
| RANCC | RANCC: Rationalizing Neural Networks via Concept Clustering | ACL | Tensorflow 1.x | |
| EAN | Efficient Attention Network: Accelerate Attention by Searching Where to Plug | Arxiv | PyTorch | |
| LIME Analysis | Why model why? Assessing the strengths and limitations of LIME | Arxiv | sklearn | |
| Rethink positive aggregation | Rethinking Positive Aggregation and Propagation of Gradients in Gradient-based Saliency Methods | ICML 2020 workshop WHI | | |
| Pixel wise interpretation metric | A Metric to Compare Pixel-wise Interpretation Methods for Neural Networks | IEEE | | |
| Latent space debiasing | Fair Attribute Classification through Latent Space De-biasing | Arxiv | PyTorch | |
| Explanation - Teacher Student | Evaluating Explanations: How much do explanations from the teacher aid students? | Arxiv | | |
| Neural Prototype Trees | Neural Prototype Trees for Interpretable Fine-grained Image Recognition | Arxiv | PyTorch | same group as This looks like that + relevance |
| FixOut | FixOut: an ensemble approach to fairer models | Paper | | |
| Concepts on Tabular data | Learning Interpretable Concept-Based Models with Human Feedback | Arxiv | | |
| BayLIME | BayLIME: Bayesian Local Interpretable Model-Agnostic Explanations | Arxiv | Keras | |
| PPI | Proactive Pseudo-Intervention: Causally Informed Contrastive Learning For Interpretable Vision Models | Arxiv | Anonymous PyTorch code link given | |
| Generalized distillation | Understanding Interpretability by generalized distillation in Supervised Classification | AAAI 2021 submission | Code will be public upon acceptance | |
| RIG | A Singular Value Perspective on Model Robustness | Arxiv | | |
| Activation analysis | Explaining Predictions of Deep Neural Classifier via Activation Analysis | Arxiv | | |
| Evaluation metrics | Evaluating Explainable Methods for Predictive Process Analytics: A Functionally-Grounded Approach | Arxiv | sklearn | |
| Explanations based on train set | Explainable Artificial Intelligence: How Subsets of the Training Data Affect a Prediction | Arxiv | | |
| DAX | DAX: Deep Argumentative eXplanation for Neural Networks | Arxiv | | |
| Debiased CAM | Debiased-CAM for bias-agnostic faithful visual explanations of deep convolutional networks | Arxiv | Tensorflow 2.1.0 | lots of human subject experiments found |
| Bias via explanation | Investigating Bias in Image Classification using Model Explanations | ICML WHI 2020 | | |
| Shapley Credit Allocation | On Shapley Credit Allocation for Interpretability | Arxiv | | |
| Dependency Decomposition | Dependency Decomposition and a Reject Option for Explainable Models | Arxiv | | |
| Interpretation Network | xRAI: Explainable Representations through AI | Arxiv | | |
| Explainable by Design | Evolutionary Generative Contribution Mappings | IEEE | | explainable by design |
| Transformer Explanation | Transformer Interpretability Beyond Attention Visualization | Arxiv CVPR format | PyTorch | |
| MANE | MANE: Model-Agnostic Non-linear Explanations for Deep Learning Model | IEEE | | see how similar to MAIRE |
| Why and Why Not Explanations | On Relating ‘Why?’ and ‘Why Not?’ Explanations | Arxiv | sklearn | gives theoretical relationship between feature importance and counterfactual techniques |
| cites ACE | Analyzing Representations inside Convolutional Neural Networks | Arxiv | PyTorch | |
| CEN | CEN: Concept Evolution Network for Image Classification Tasks | ACM RICAI 2020 | | explainable by design |
| Quantitative evaluation metrics | Quantitative Evaluations on Saliency Methods: An Experimental Study | Arxiv | | |
| Integrating black box and Interpretable model | IB-M: A Flexible Framework to Align an Interpretable Model and a Black-box Model | IEEE - BIBM 2020 | | |
| X-GradCAM | Axiom-based Grad-CAM: Towards Accurate Visualization and Explanation of CNNs | BMVC 2020 | | |
| RCAV | Robust Semantic Interpretability: Revisiting Concept Activation Vectors | ICML WHI 2020 | PyTorch | |
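
Among the entries above, Concept Bottleneck Models (ICML 2020) is an "explainable by design" architecture that first predicts human-interpretable concepts and derives the label only from them. A minimal sketch of that structure follows; the layer sizes, number of concepts, and equal joint-loss weighting are illustrative assumptions, not the paper's configuration.

```python
# A minimal sketch of the concept-bottleneck idea: input -> concepts -> label,
# with supervision on both the concept predictions and the final label.
import torch
import torch.nn as nn

class ConceptBottleneck(nn.Module):
    def __init__(self, backbone_dim=512, n_concepts=10, n_classes=5):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.LazyLinear(backbone_dim), nn.ReLU())
        self.concept_head = nn.Linear(backbone_dim, n_concepts)   # predicts concept logits
        self.label_head = nn.Linear(n_concepts, n_classes)        # label depends only on concepts

    def forward(self, x):
        concepts = torch.sigmoid(self.concept_head(self.backbone(x)))
        return concepts, self.label_head(concepts)

model = ConceptBottleneck()
x = torch.rand(4, 3, 64, 64)                                      # toy batch of images
concepts, logits = model(x)

# Joint training objective (toy targets shown here for illustration only).
concept_targets = torch.randint(0, 2, concepts.shape).float()
label_targets = torch.randint(0, 5, (4,))
loss = nn.functional.binary_cross_entropy(concepts, concept_targets) \
     + nn.functional.cross_entropy(logits, label_targets)
```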

2021 Papers

| Title | Paper Title | Source Link | Code | Tags |
| --- | --- | --- | --- | --- |
| Debiasing concepts | Debiasing Concept Bottleneck Models with Instrumental Variables | ICLR 2021 submissions page - Accepted as Poster | | causality |
| Prototype Trajectory | Interpretable Sequence Classification Via Prototype Trajectory | ICLR 2021 submissions page | | this looks like that styled RNN |
| Shapley dependence assumption | Shapley explainability on the data manifold | ICLR 2021 submissions page | | |
| High dimension Shapley | Human-interpretable model explainability on high-dimensional data | ICLR 2021 submissions page | | |
| L2X like paper | A Learning Theoretic Perspective on Local Explainability | ICLR 2021 submissions page | | |
| Evaluation | Evaluation of Similarity-based Explanations | ICLR 2021 submissions page | | like Adebayo paper for this looks like that styled methods |
| Model correction | Defuse: Debugging Classifiers Through Distilling Unrestricted Adversarial Examples | ICLR 2021 submissions page | | |
| Subspace explanation | Constraint-Driven Explanations of Black-Box ML Models | ICLR 2021 submissions page | | to see how close to MUSE by Hima Lakkaraju 2019 |
| Catastrophic forgetting | Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting | ICLR 2021 submissions page | Code available in their Supplementary zip file | |
| Non trivial counterfactual explanations | Beyond Trivial Counterfactual Generations with Diverse Valuable Explanations | ICLR 2021 submissions page | | |
| Explainable by Design | Interpretability Through Invertibility: A Deep Convolutional Network With Ideal Counterfactuals And Isosurfaces | ICLR 2021 submissions page | | |
| Gradient attribution | Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability | ICLR 2021 submissions page | | looks like extension of Sixt et al paper |
| Mask based Explainable by Design | Investigating and Simplifying Masking-based Saliency Methods for Model Interpretability | ICLR 2021 submissions page | | |
| NBDT - Explainable by Design | NBDT: Neural-Backed Decision Tree | ICLR 2021 submissions page | | |
| Variational Saliency Maps | Variational saliency maps for explaining model's behavior | ICLR 2021 submissions page | | |
| Network dissection with coherency or stability metric | Importance and Coherence: Methods for Evaluating Modularity in Neural Networks | ICLR 2021 submissions page | | |
| Modularity | Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks | ICLR 2021 submissions page | Code made anonymous for review, link given in paper | |
| Explainable by design | A self-explanatory method for the black problem on discrimination part of CNN | ICLR 2021 submissions page | | seems concepts of game theory applied |
| Attention not Explanation | Why is Attention Not So Interpretable? | ICLR 2021 submissions page | | |
| Ablation Saliency | Ablation Path Saliency | ICLR 2021 submissions page | | |
| Explainable Outlier Detection | Explainable Deep One-Class Classification | ICLR 2021 submissions page | | |
| XAI without approximation | Explainable AI Without Interpretable Model | Arxiv | | |
| Learning theoretic Local Interpretability | A LEARNING THEORETIC PERSPECTIVE ON LOCAL EXPLAINABILITY | Arxiv | | |
| GANMEX | GANMEX: ONE-VS-ONE ATTRIBUTIONS USING GAN-BASED MODEL EXPLAINABILITY | Arxiv | | |
| Evaluating Local Explanations | Evaluating local explanation methods on ground truth | Artificial Intelligence Journal Elsevier | sklearn | |
| Structured Attention Graphs | Structured Attention Graphs for Understanding Deep Image Classifications | AAAI 2021 | PyTorch | see how close to MACE |
| Ground truth explanations | Data Representing Ground-Truth Explanations to Evaluate XAI Methods | AAAI 2021 | sklearn | trained models available in their github repository |
| AGF | Visualization of Supervised and Self-Supervised Neural Networks via Attribution Guided Factorization | AAAI 2021 | PyTorch | |
| RSP | Interpreting Deep Neural Networks with Relative Sectional Propagation by Analyzing Comparative Gradients and Hostile Activations | AAAI 2021 | | |
| HyDRA | HYDRA: Hypergradient Data Relevance Analysis for Interpreting Deep Neural Networks | AAAI 2021 | PyTorch | |
| SWAG | SWAG: Superpixels Weighted by Average Gradients for Explanations of CNNs | WACV 2021 | | |
| FastIF | FASTIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging | Arxiv | PyTorch | |
| EVET | EVET: Enhancing Visual Explanations of Deep Neural Networks Using Image Transformations | WACV 2021 | | |
| Local Attribution Baselines | On Baselines for Local Feature Attributions | AAAI 2021 | PyTorch | |
| Differentiated Explanations | Differentiated Explanation of Deep Neural Networks with Skewed Distributions | IEEE - TPAMI journal | PyTorch | |
| Human game based survey | Explainable AI and Adoption of Algorithmic Advisors: an Experimental Study | Arxiv | | |
| Explainable by design | Learning Semantically Meaningful Features for Interpretable Classifications | Arxiv | | |
| Expred | Explain and Predict, and then Predict again | ACM WSDM 2021 | PyTorch | |
| Progressive Interpretation | An Information-theoretic Progressive Framework for Interpretation | Arxiv | PyTorch | |
| UCAM | Uncertainty Class Activation Map (U-CAM) using Gradient Certainty method | IEEE - TIP, Project Page | PyTorch | |
| progressive GAN explainability - smiling dataset - ICLR 2020 group | Explaining the Black-box Smoothly - A Counterfactual Approach | Arxiv | | |
| Head pasted in another image - experimented | WHAT DO DEEP NETS LEARN? CLASS-WISE PATTERNS REVEALED IN THE INPUT SPACE | Arxiv | | |
| Model correction | ExplOrs Explanation Oracles and the architecture of explainability | Paper | | |
| Explanations - Knowledge Representation | A Basic Framework for Explanations in Argumentation | IEEE | | |
| Eigen CAM | Eigen-CAM: Visual Explanations for Deep Convolutional Neural Networks | Springer | | |
| Evaluation of Posthoc | How can I choose an explainer? An Application-grounded Evaluation of Post-hoc Explanations | ACM | | |
| GLocalX | GLocalX - From Local to Global Explanations of Black Box AI Models | Arxiv | | |
| Consistent Interpretations | Explainable Models with Consistent Interpretations | AAAI 2021 | | |
| SIDU | Introducing and assessing the explainable AI (XAI) method: SIDU | Arxiv | | |
| cites This looks like that | Explaining black-box classifiers using post-hoc explanations-by-example: The effect of explanations and error-rates in XAI user studies | AIJ | | |
| i-Algebra | i-Algebra: Towards Interactive Interpretability of Deep Neural Networks | AAAI 2021 | | |
| Shape texture bias | SHAPE OR TEXTURE: UNDERSTANDING DISCRIMINATIVE FEATURES IN CNNS | ICLR 2021 | | |
| Class agnostic features | THE MIND’S EYE: VISUALIZING CLASS-AGNOSTIC FEATURES OF CNNS | Arxiv | | |
| IBEX | A Multi-layered Approach for Tailored Black-box Explanations | Paper | Code | |
| Relevant explanations | Learning Relevant Explanations | Paper | | |
| Guided Zoom | Guided Zoom: Zooming into Network Evidence to Refine Fine-grained Model Decisions | IEEE | | |
| XAI survey | A Survey on Understanding, Visualizations, and Explanation of Deep Neural Networks | Arxiv | | |
| Pattern theory | Convolutional Neural Network Interpretability with General Pattern Theory | Arxiv | PyTorch | |
| Gaussian Process based explanations | Bandits for Learning to Explain from Explanations | AAAI 2021 | sklearn | |
| LIFT CAM | LIFT-CAM: Towards Better Explanations for Class Activation Mapping | Arxiv | | |
| ObAIEx | Right for the Right Reasons: Making Image Classification Intuitively Explainable | Paper | tensorflow | |
| VAE based explainer | Combining an Autoencoder and a Variational Autoencoder for Explaining the Machine Learning Model Predictions | IEEE | | |
| Segmentation based explanation | Deep Co-Attention Network for Multi-View Subspace Learning | Arxiv | PyTorch | |
| Integrated CAM | INTEGRATED GRAD-CAM: SENSITIVITY-AWARE VISUAL EXPLANATION OF DEEP CONVOLUTIONAL NETWORKS VIA INTEGRATED GRADIENT-BASED SCORING | ICASSP 2021 | PyTorch | |
| Human study | VitrAI - Applying Explainable AI in the Real World | Arxiv | | |
| Attribution Mask | Attribution Mask: Filtering Out Irrelevant Features By Recursively Focusing Attention on Inputs of DNNs | Arxiv | PyTorch | |
| LIME faithfulness | What does LIME really see in images? | Arxiv | Tensorflow 1.x | |
| Assess model reliability | Intuitively Assessing ML Model Reliability through Example-Based Explanations and Editing Model Inputs | Arxiv | | |
| Perturbation + Gradient unification | Towards the Unification and Robustness of Perturbation and Gradient Based Explanations | Arxiv | | hima lakkaraju |
| Gradients faithful? | Do Input Gradients Highlight Discriminative Features? | Arxiv | PyTorch | |
| Untrustworthy predictions | Identifying Untrustworthy Predictions in Neural Networks by Geometric Gradient Analysis | Arxiv | | |
| Explaining misclassification | Explaining Inaccurate Predictions of Models through k-Nearest Neighbors | Paper | | cites Oscar Li AAAI 2018 prototypes paper |
| Explanations inside predictions | Have We Learned to Explain?: How Interpretability Methods Can Learn to Encode Predictions in their Interpretations | AISTATS 2021 | | |
| Layerwise interpretation | LAYER-WISE INTERPRETATION OF DEEP NEURAL NETWORKS USING IDENTITY INITIALIZATION | Arxiv | | |
| Visualizing Rule Sets | Visualizing Rule Sets: Exploration and Validation of a Design Space | Arxiv | PyTorch | |
| Human experiments | Are Explanations Helpful? A Comparative Study of the Effects of Explanations in AI-Assisted Decision-Making | IUI 2021 | | |
| Attention fine-grained classification | Interpretable Attention Guided Network for Fine-grained Visual Classification | Arxiv | | |
| Concept construction | Explaining Classifiers by Constructing Familiar Concepts | Paper | PyTorch | |
| EbD | Human-Understandable Decision Making for Visual Recognition | Arxiv | | |
| Bridging XAI algorithm, Human needs | Towards Connecting Use Cases and Methods in Interpretable Machine Learning | Arxiv | | |
| Generative trustworthy classifiers | Generative Classifiers as a Basis for Trustworthy Image Classification | Paper | Github | |
| Counterfactual explanations | Generating Interpretable Counterfactual Explanations By Implicit Minimisation of Epistemic and Aleatoric Uncertainties | AISTATS 2021 | PyTorch | |
| Role categorization of CNN units | Quantitative Effectiveness Assessment and Role Categorization of Individual Units in Convolutional Neural Networks | ICML 2021 | | |
| Non-trivial counterfactual explanations | Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations | Arxiv | | |
| NP-ProtoPNet | These do not Look Like Those: An Interpretable Deep Learning Model for Image Recognition | IEEE | | |
| Correcting neural networks based on explanations | Refining Neural Networks with Compositional Explanations | Arxiv | Code link given in paper, but page not found | |
| Contrastive reasoning | Contrastive Reasoning in Neural Networks | Arxiv | | |
| Concept based | Intersection Regularization for Extracting Semantic Attributes | Arxiv | | |
| Boundary explanations | Boundary Attributions Provide Normal (Vector) Explanations | Arxiv | PyTorch | |
| Generative Counterfactuals | ECINN: Efficient Counterfactuals from Invertible Neural Networks | Arxiv | | |
| ICE | Invertible Concept-based Explanations for CNN Models with Non-negative Concept Activation Vectors | AAAI 2021 | | |
| Group CAM | Group-CAM: Group Score-Weighted Visual Explanations for Deep Convolutional Networks | Arxiv | PyTorch | |
| HMM interpretability | Towards interpretability of Mixtures of Hidden Markov Models | AAAI 2021 | sklearn | |
| Empirical Explainers | Efficient Explanations from Empirical Explainers | Arxiv | PyTorch | |
| FixNorm | FIXNORM: DISSECTING WEIGHT DECAY FOR TRAINING DEEP NEURAL NETWORKS | Arxiv | | |
| CoDA-Net | Convolutional Dynamic Alignment Networks for Interpretable Classifications | CVPR 2021 | Code link given in paper. Repository not yet created | |
| Like Dr. Chandru sir's (IITPKD) XAI work | Neural Response Interpretation through the Lens of Critical Pathways | Arxiv | PyTorch - Pathway Grad, PyTorch - ROAR | |
| Inaugment | InAugment: Improving Classifiers via Internal Augmentation | Arxiv | Code yet to be updated | |
| Gradual Grad CAM | Enhancing Deep Neural Network Saliency Visualizations with Gradual Extrapolation | Arxiv | PyTorch | |
| A-FMI | A-FMI: LEARNING ATTRIBUTIONS FROM DEEP NETWORKS VIA FEATURE MAP IMPORTANCE | Arxiv | | |
| Trust - Regression | To Trust or Not to Trust a Regressor: Estimating and Explaining Trustworthiness of Regression Predictions | AAAI 2021 | sklearn | |
| Concept based explanations - study | IS DISENTANGLEMENT ALL YOU NEED? COMPARING CONCEPT-BASED & DISENTANGLEMENT APPROACHES | ICLR 2021 workshop | tensorflow 2.3 | |
| Faithful attribution | Mutual Information Preserving Back-propagation: Learn to Invert for Faithful Attribution | Arxiv | | |
| Counterfactual explanation | Counterfactual attribute-based visual explanations for classification | Springer | | |
| User based explanations | “That's (not) the output I expected!” On the role of end user expectations in creating explanations of AI systems | AIJ | | |
| Human understandable concept based explanations | Towards Human-Understandable Visual Explanations: Imperceptible High-frequency Cues Can Better Be Removed | Arxiv | | |
| Improved attribution | Improving Attribution Methods by Learning Submodular Functions | Arxiv | | |
| SHAP tractability | On the Complexity of SHAP-Score-Based Explanations: Tractability via Knowledge Compilation and Non-Approximability Results | Arxiv | | |
| SHAP explanation network | SHAPLEY EXPLANATION NETWORKS | ICLR 2021 | PyTorch | |
| Concept based dataset shift explanation | FAILING CONCEPTUALLY: CONCEPT-BASED EXPLANATIONS OF DATASET SHIFT | ICLR 2021 workshop | tensorflow 2 | |
| EbD | Towards Human-Understandable Visual Explanations: Imperceptible High-frequency Cues Can Better Be Removed | Arxiv | | |
| Evaluating CAM | Revisiting The Evaluation of Class Activation Mapping for Explainability: A Novel Metric and Experimental Analysis | Arxiv | | |
| EFC-CAM | Exclusive Feature Constrained Class Activation Mapping for Better Visual Explanation | IEEE | | |
| Causal Interpretation | Instance-wise Causal Feature Selection for Model Interpretation | Arxiv | PyTorch | |
| Fairness in Learning | Learning to Learn to be Right for the Right Reasons | Arxiv | | |
| Feature attribution correctness | Do Feature Attribution Methods Correctly Attribute Features? | Arxiv | Code not yet updated | |
| NICE | NICE: AN ALGORITHM FOR NEAREST INSTANCE COUNTERFACTUAL EXPLANATIONS | Arxiv | Own Python Package | |
| SCG | A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts | Arxiv | | |
| Visual Concepts | A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts | Arxiv | | |
| This looks like that - drawback | This Looks Like That... Does it? Shortcomings of Latent Space Prototype Interpretability in Deep Networks | Arxiv | PyTorch | |
| Exemplar based classification | Visualizing Association in Exemplar-Based Classification | ICASSP 2021 | | |
| Correcting classification | CORRECTING CLASSIFICATION: A BAYESIAN FRAMEWORK USING EXPLANATION FEEDBACK TO IMPROVE CLASSIFICATION ABILITIES | Arxiv | | |
| Concept Bottleneck Networks | DO CONCEPT BOTTLENECK MODELS LEARN AS INTENDED? | ICLR workshop 2021 | | |
| Sanity for saliency | Sanity Simulations for Saliency Methods | Arxiv | | |
| Concept based explanations | Cause and Effect: Concept-based Explanation of Neural Networks | Arxiv | | |
| CLIMEP | How to Explain Neural Networks: A perspective of data space division | Arxiv | | |
| Sufficient explanations | Probabilistic Sufficient Explanations | Arxiv | Empty Repository | |
| SHAP baseline | Learning Baseline Values for Shapley Values | Arxiv | | |
| Explainable by Design | EXoN: EXplainable encoder Network | Arxiv | tensorflow 2.4.0 | explainable VAE |
| Concept based explanations | Aligning Artificial Neural Networks and Ontologies towards Explainable AI | AAAI 2021 | | |
| XAI via Bayesian teaching | ABSTRACTION, VALIDATION, AND GENERALIZATION FOR EXPLAINABLE ARTIFICIAL INTELLIGENCE | Arxiv | | |
| Explanation blind spots | DO NOT EXPLAIN WITHOUT CONTEXT: ADDRESSING THE BLIND SPOT OF MODEL EXPLANATIONS | Arxiv | | |
| BLA | Bounded logit attention: Learning to explain image classifiers | Arxiv | tensorflow | L2X++ |
| Interpretability - mathematical model | The Definitions of Interpretability and Learning of Interpretable Models | Arxiv | | |
| Similar to our ICML workshop 2021 work | The effectiveness of feature attribution methods and its correlation with automatic evaluation scores | Arxiv | | |
| EDDA | EDDA: Explanation-driven Data Augmentation to Improve Model and Explanation Alignment | Arxiv | | |
| Relevant set explanations | Efficient Explanations With Relevant Sets | Arxiv | | |
| Model transfer | Making CNNs Interpretable by Building Dynamic Sequential Decision Forests with Top-down Hierarchy Learning | Arxiv | | |
| Model correction | Finding and Fixing Spurious Patterns with Explanations | Arxiv | | |
| Neuron graph communities | On the Evolution of Neuron Communities in a Deep Learning Architecture | Arxiv | | |
| Mid level features explanations | A general approach for Explanations in terms of Middle Level Features | Arxiv | | see how different from MUSE by Hima Lakkaraju group |
| Concept based knowledge distillation | Towards Black-Box Explainability with Gaussian Discriminant Knowledge Distillation | CVPR 2021 workshop | | compare and contrast with network dissection |
| CNN high frequency bias | Dissecting the High-Frequency Bias in Convolutional Neural Networks | CVPR 2021 workshop | Tensorflow | |
| Explainable by design | Entropy-based Logic Explanations of Neural Networks | Arxiv | PyTorch | concept based |
| CALM | Keep CALM and Improve Visual Feature Attribution | Arxiv | PyTorch | |
| Relevance CAM | Relevance-CAM: Your Model Already Knows Where to Look | CVPR 2021 | PyTorch | |
| S-LIME | S-LIME: Stabilized-LIME for Model Explanation | Arxiv | sklearn | |
| Local + Global | Best of both worlds: local and global explanations with human-understandable concepts | Arxiv | | Been Kim's group |
| Guided integrated gradients | Guided Integrated Gradients: an Adaptive Path Method for Removing Noise | CVPR 2021 | | Integrated Gradients sketch after this table |
| Concept based | Meaningfully Explaining a Model’s Mistakes | Arxiv | | |
| Explainable by design | It’s FLAN time! Summing feature-wise latent representations for interpretability | Arxiv | | |
| SimAM | SimAM: A Simple, Parameter-Free Attention Module for Convolutional Neural Networks | ICML 2021 | PyTorch | |
| DANCE | DANCE: Enhancing saliency maps using decoys | ICML 2021 | Tensorflow 1.x | |
| EbD Concept formation | Explore Visual Concept Formation for Image Classification | ICML 2021 | PyTorch | |
| Explainable by design | Interpretable Compositional Convolutional Neural Networks | Arxiv | | |
| Attribution aggregation | Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation | AAAI 2021 - pdf | | |
| Perturbation based activation | A Novel Visual Interpretability for Deep Neural Networks by Optimizing Activation Maps with Perturbation | AAAI 2021 | | |
| Global explanations | Feature Synergy, Redundancy, and Independence in Global Model Explanations using SHAP Vector Decomposition | Arxiv | Github package | |
| L2E | Learning to Explain: Generating Stable Explanations Fast | ACL 2021 | PyTorch | NLE |
| Joint Shapley | Joint Shapley values: a measure of joint feature importance | Arxiv | | |
| Explainable by design | Align Yourself: Self-supervised Pre-training for Fine-grained Recognition via Saliency Alignment | Arxiv | | |
| Explainable by design | SONG: SELF-ORGANIZING NEURAL GRAPHS | Arxiv | | |
| Explainable by design | Designing Shapelets for Interpretable Data-Agnostic Classification | AIES 2021 | sklearn | Interpretable block of time series extended to other data modalities like image, text, tabular |
| Global explanations + Model correction | Where do Models go Wrong? Parameter-Space Saliency Maps for Explainability | Arxiv | PyTorch | |
| HIL - Model correction | Human-in-the-loop Extraction of Interpretable Concepts in Deep Learning Models | Arxiv | | |
| Activation based Cause Analysis | Activation-Based Cause Analysis Method for Neural Networks | IEEE Access 2021 | | |
| Local explanations | Leveraging Latent Features for Local Explanations | ACM SIGKDD 2021 | | Amit Dhurandhar group |
| Fairness | Adequate and fair explanations | Arxiv - Accepted in CD-MAKE 2021 | | |
| Global explanations | Finding Representative Interpretations on Convolutional Neural Networks | ICCV 2021 | | |
| Groupwise explanations | Learning Groupwise Explanations for Black-Box Models | IJCAI 2021 | PyTorch | |
| Mathematical | On Smoother Attributions using Neural Stochastic Differential Equations | IJCAI 2021 | | |
| AGI | Explaining Deep Neural Network Models with Adversarial Gradient Integration | IJCAI 2021 | PyTorch | |
| Accountable attribution | Longitudinal Distance: Towards Accountable Instance Attribution | Arxiv | Tensorflow Keras | |
| Global explanation | Understanding of Kernels in CNN Models by Suppressing Irrelevant Visual Features in Images | Arxiv | | |
| Concepts based - Explainable by design | Inducing Semantic Grouping of Latent Concepts for Explanations: An Ante-Hoc Approach | Arxiv | | IITH Vineeth sir group |
| Explainable by design | This looks more like that: Enhancing Self-Explaining Models by Prototypical Relevance Propagation | Arxiv | | |
| MIL | ProtoMIL: Multiple Instance Learning with Prototypical Parts for Fine-Grained Interpretability | Arxiv | | |
| Concept based explanations | Instance-wise or Class-wise? A Tale of Neighbor Shapley for Concept-based Explanation | Arxiv | | |
| Counterfactual explanation + Theory of Mind | CX-ToM: Counterfactual Explanations with Theory-of-Mind for Enhancing Human Trust in Image Recognition Models | Arxiv | | |
| Evaluation metric | Counterfactual Evaluation for Explainable AI | Arxiv | | |
| CIM - FSC | CIM: Class-Irrelevant Mapping for Few-Shot Classification | Arxiv | | |
| Causal Concepts | Unsupervised Causal Binary Concepts Discovery with VAE for Black-box Model Explanation | Arxiv | | |
| ECE | Ensemble of Counterfactual Explainers | Paper | Code - seems hybrid of tf and torch | |
| Structured Explanations | From Heatmaps to Structured Explanations of Image Classifiers | Arxiv | | |
| XAI metric | An Objective Metric for Explainable AI - How and Why to Estimate the Degree of Explainability | Arxiv | | |
| DisCERN | DisCERN: Discovering Counterfactual Explanations using Relevance Features from Neighbourhoods | Arxiv | | |
| PSEM | Towards Better Model Understanding with Path-Sufficient Explanations | Arxiv | | Amit Dhurandhar sir group |
| Evaluation traps | The Logic Traps in Evaluating Post-hoc Interpretations | Arxiv | | |
| Interactive explanations | Explainability Requires Interactivity | Arxiv | PyTorch | |
| CounterNet | CounterNet: End-to-End Training of Counterfactual Aware Predictions | Arxiv | PyTorch | |
| Evaluation metric - Concept based explanation | Detection Accuracy for Evaluating Compositional Explanations of Units | Arxiv | | |
| Explanation - Uncertainty | Effects of Uncertainty on the Quality of Feature Importance Explanations | Arxiv | | |
| Survey Paper | TOWARDS USER-CENTRIC EXPLANATIONS FOR EXPLAINABLE MODELS: A REVIEW | JISTM Journal Paper | | |
| Feature attribution | The Struggles and Subjectivity of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets | AAAI 2021 workshop | | |
| Contextual explanation | Context-based image explanations for deep neural networks | Image and Vision Computing Journal | | |
| Causal + Counterfactual | Counterfactual Instances Explain Little | Arxiv | | |
| Case based Posthoc | Explaining Deep Learning using examples: Optimal feature weighting methods for twin systems using post-hoc, explanation-by-example in XAI | Elsevier | | |
| Debugging gray box model | Toward a Unified Framework for Debugging Gray-box Models | Arxiv | | |
| Explainable by design | Optimising for Interpretability: Convolutional Dynamic Alignment Networks | Arxiv | | |
| XAI negative effect | Explainability Pitfalls: Beyond Dark Patterns in Explainable AI | Arxiv | | |
| Evaluate attributions | WHO EXPLAINS THE EXPLANATION? QUANTITATIVELY ASSESSING FEATURE ATTRIBUTION METHODS | Arxiv | | |
| Counterfactual explanations | Designing Counterfactual Generators using Deep Model Inversion | Arxiv | | |
| Model correction using explanation | Consistent Explanations by Contrastive Learning | Arxiv | | |
| Visualize feature maps | Visualizing Feature Maps for Model Selection in Convolutional Neural Networks | ICCV 2021 Workshop | Tensorflow 1.15 | |
| SPS | Stochastic Partial Swap: Enhanced Model Generalization and Interpretability for Fine-grained Recognition | ICCV 2021 | PyTorch | |
| DMBP | Generating Attribution Maps with Disentangled Masked Backpropagation | ICCV 2021 | | |
| Better CAM | Towards Better Explanations of Class Activation Mapping | ICCV 2021 | | |
| LEG | Statistically Consistent Saliency Estimation | ICCV 2021 | Keras | |
| IBA | Fine-Grained Neural Network Explanation by Identifying Input Features with Predictive Information | NeurIPS 2021 | PyTorch | |
| Looks similar to This Looks Like That | Interpretable Image Recognition by Constructing Transparent Embedding Space | ICCV 2021 | Code not yet publicly released | |
| Causal Imagenet | CAUSAL IMAGENET: HOW TO DISCOVER SPURIOUS FEATURES IN DEEP LEARNING? | Arxiv | | |
| Model correction | Logic Constraints to Feature Importances | Arxiv | | |
| Receptive field Misalignment CAM | On the Receptive Field Misalignment in CAM-based Visual Explanations | Pattern Recognition Letters | PyTorch | |
| Simplex | Explaining Latent Representations with a Corpus of Examples | Arxiv | PyTorch | |
| Sanity checks | Revisiting Sanity Checks for Saliency Maps | Arxiv - NeurIPS 2021 workshop | | |
| Model correction | Debugging the Internals of Convolutional Networks | PDF | | |
| SITE | Self-Interpretable Model with Transformation Equivariant Interpretation | Arxiv - Accepted at NeurIPS 2021 | | EbD |
| Influential examples | Revisiting Methods for Finding Influential Examples | Arxiv | | |
| SOBOL | Look at the Variance! Efficient Black-box Explanations with Sobol-based Sensitivity Analysis | NeurIPS 2021 | Tensorflow and PyTorch | |
| Feature vectors | Beyond Importance Scores: Interpreting Tabular ML by Visualizing Feature Semantics | Arxiv | | global interpretability |
| OOD in explainability | The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations | NeurIPS 2021 | sklearn | |
| RPS LJE | Representer Point Selection via Local Jacobian Expansion for Post-hoc Classifier Explanation of Deep Neural Networks and Ensemble Models | NeurIPS 2021 | PyTorch | |
| Model correction | Editing a Classifier by Rewriting Its Prediction Rules | NeurIPS 2021 | Code | |
| suppressor variable litmus test | Scrutinizing XAI using linear ground-truth data with suppressor variables | Arxiv | | |
| Explainable knowledge distillation | Learning Interpretation with Explainable Knowledge Distillation | Arxiv | | |
| STEEX | STEEX: Steering Counterfactual Explanations with Semantics | Arxiv | Code | |
| Binary counterfactual explanation | Counterfactual Explanations via Latent Space Projection and Interpolation | Arxiv | | |
| ECLAIRE | Efficient Decompositional Rule Extraction for Deep Neural Networks | Arxiv | R | |
| CartoonX | Cartoon Explanations of Image Classifiers | Researchgate | | |
| concept based explanation | Explanations in terms of Hierarchically organised Middle Level Features | Paper | | see how close to MACE and PACE |
| Concept ball | Ontology-based n-ball Concept Embeddings Informing Few-shot Image Classification | Paper | | |
| SPARROW | SPARROW: Semantically Coherent Prototypes for Image Classification | BMVC 2021 | | |
| XAI evaluation criteria | Objective criteria for explanations of machine learning models | Paper | | |
| Code inversion with human perception | EXPLORING ALIGNMENT OF REPRESENTATIONS WITH HUMAN PERCEPTION | Arxiv | | |
| Deformable ProtoPNet | Deformable ProtoPNet: An Interpretable Image Classifier Using Deformable Prototypes | Arxiv | | |
| ICSN | Interactive Disentanglement: Learning Concepts by Interacting with their Prototype Representations | Arxiv | | |
| HIVE | HIVE: Evaluating the Human Interpretability of Visual Explanations | Arxiv | Project Page | |
| Jitter CAM | Jitter-CAM: Improving the Spatial Resolution of CAM-Based Explanations | BMVC 2021 | PyTorch | |
| Interpreting last layer | Identifying Class Specific Filters with L1 Norm Frequency Histograms in Deep CNNs | Arxiv | | |
| FCP | Forward Composition Propagation for Explainable Neural Reasoning | Arxiv | | |
| Protopool | Interpretable Image Classification with Differentiable Prototypes Assignment | Arxiv | | |
| PRELIM | Pedagogical Rule Extraction for Learning Interpretable Models | Arxiv | | |
| Fair correction vectors | FAIR INTERPRETABLE LEARNING VIA CORRECTION VECTORS | ICLR 2021 | | |
| Smooth LRP | SmoothLRP: Smoothing LRP by Averaging over Stochastic Input Variations | ESANN 2021 | | |
| Causal CAM | EXTRACTING CAUSAL VISUAL FEATURES FOR LIMITED LABEL CLASSIFICATION | ICIP 2021 | | |
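
Several 2021 entries above (Integrated Grad-CAM, Guided Integrated Gradients, AGI) build on plain Integrated Gradients, so a minimal sketch of that baseline is included here for orientation; the ResNet-18, the all-zeros baseline input, and the 32-step path approximation are illustrative assumptions, not any of those papers' exact methods.

```python
# A minimal sketch of Integrated Gradients: average the input gradients along a
# straight-line path from a baseline to the input, then scale by (input - baseline).
import torch
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
image = torch.rand(1, 3, 224, 224)          # stand-in for a preprocessed image
baseline = torch.zeros_like(image)          # "absence of signal" reference input
target = model(image).argmax()              # class to attribute

steps = 32
total_grad = torch.zeros_like(image)
for alpha in torch.linspace(0.0, 1.0, steps):
    point = (baseline + alpha * (image - baseline)).requires_grad_(True)
    model(point)[0, target].backward()      # gradient of the target score at this path point
    total_grad += point.grad

# Average gradient along the path, scaled by the difference from the baseline.
attribution = (image - baseline) * total_grad / steps
```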

2022 Papers

| Title | Paper Title | Source Link | Code | Tags |
| --- | --- | --- | --- | --- |
| SNI | Semantic Network Interpretation | WACV 2022 | | |
| F-CAM | F-CAM: Full Resolution Class Activation Maps via Guided Parametric Upscaling | WACV 2022 | PyTorch | |
| PCACE | PCACE: A Statistical Approach to Ranking Neurons for CNN Interpretability | Arxiv | | |
| Evaluating Attribution methods | Evaluating Attribution Methods in Machine Learning Interpretability | IEEE International Conference on Big Data | | |
| X-decision making | Explainable Decision Making with Lean and Argumentative Explanations | Arxiv | | |
| Include domain knowledge to neural network | A review of some techniques for inclusion of domain-knowledge into deep neural networks | Nature | | |
| CNN Hierarchical Decomposition | Deeply Explain CNN via Hierarchical Decomposition | Arxiv | | |
| Explanatory learning | EXPLANATORY LEARNING: BEYOND EMPIRICISM IN NEURAL NETWORKS | Arxiv | | |
| Conceptor CAM | Conceptor Learning for Class Activation Mapping | IEEE-TIP | | |
| Classifier orthogonalization | CONTROLLING DIRECTIONS ORTHOGONAL TO A CLASSIFIER | ICLR 2022 | PyTorch | |
| Attention not explanation | Attention cannot be an Explanation | Arxiv | | |
| CNN sensitivity analysis | A Comprehensive Study of Image Classification Model Sensitivity to Foregrounds, Backgrounds, and Visual Attributes | Arxiv | | |
| Trusting extrapolation | To what extent should we trust AI models when they extrapolate? | Arxiv | | |
| LAP | LAP: An Attention-Based Module for Faithful Interpretation and Knowledge Injection in Convolutional Neural Networks | Arxiv | | concept based explanations |
| Saliency map evaluation metrics | Metrics for saliency map evaluation of deep learning explanation methods | Arxiv | | |
| LINEX | Locally Invariant Explanations: Towards Stable and Unidirectional Explanations through Local Invariant Learning | Arxiv | | |
| ROAD | Evaluating Feature Attribution: An Information-Theoretic Perspective | Arxiv | PyTorch | |
| CBM-AUC | Concept Bottleneck Model with Additional Unsupervised Concepts | Arxiv | | |
| Explainability as dialogue | Rethinking Explainability as a Dialogue: A Practitioner’s Perspective | Arxiv | | |
| IAA | Aligning Eyes between Humans and Deep Neural Network through Interactive Attention Alignment | Arxiv | | |
| Plug in | A Novel Plug-in Module for Fine-Grained Visual Classification | Arxiv | PyTorch | |
| Hierarchical concepts | Cause and Effect: Hierarchical Concept-based Explanation of Neural Networks | Arxiv | | |
| Model correction by design | LEARNING ROBUST CONVOLUTIONAL NEURAL NETWORKS WITH RELEVANT FEATURE FOCUSING VIA EXPLANATIONS | Arxiv | | |
| Concept discovery | Discovering Concepts in Learned Representations using Statistical Inference and Interactive Visualization | Arxiv | | |
| Rare spurious correlation | Understanding Rare Spurious Correlations in Neural Networks | Arxiv | PyTorch | |
| Causal | Matching Learned Causal Effects of Neural Networks with Domain Priors | Arxiv | | |
| PYLON | Improved image classification explainability with high accuracy heatmaps | iScience Journal | | |
| Causal counterfactual | REALISTIC COUNTERFACTUAL EXPLANATIONS BY LEARNED RELATIONS | Arxiv | | |
| Argumentative Causal explanation | Forging Argumentative Explanations from Causal Models | Paper | | |
| EVA | Don’t Lie to Me! Robust and Efficient Explainability with Verified Perturbation Analysis | Arxiv | | |
| Conceptual modelling | ConceptSuperimposition: Using Conceptual Modeling Method for Explainable AI | Paper | | |
| SIDU | Visual Explanation of Black-Box Model: Similarity Difference and Uniqueness (SIDU) Method | Pattern Recognition Journal | Tensorflow 2.x | |
| Explainable representations | Explaining, Evaluating and Enhancing Neural Networks’ Learned Representations | Arxiv | | |
| XAI Overview | Explanatory Paradigms in Neural Networks | Arxiv | | |
| Evaluating attribution methods | Evaluating Feature Attribution Methods in the Image Domain | Arxiv | PyTorch | |
| Prototype vector + perturbation | The Need for Empirical Evaluation of Explanation Quality | Arxiv | | |
| ADVISE | ADVISE: ADaptive Feature Relevance and VISual Explanations for Convolutional Neural Networks | Arxiv | Matlab | |
| Improving Grad CAM | Improving the Interpretability of GradCAMs in Deep Classification Networks | Science Direct | | |
| Explainable by design | Interpretable part-whole hierarchies and conceptual-semantic relationships in neural networks | CVPR 2022 | PyTorch | |
| CAMP | Do Explanations Explain? Model Knows Best | Arxiv | PyTorch | |
| Attribution stability | RETHINKING STABILITY FOR ATTRIBUTION-BASED EXPLANATIONS | Arxiv | | |
| SSCCD | Sparse Subspace Clustering for Concept Discovery (SSCCD) | Arxiv | | |
| Model improvement | Beyond Explaining: Opportunities and Challenges of XAI-Based Model Improvement | Arxiv | | |
| Causal explanations | Trying to Outrun Causality in Machine Learning: Limitations of Model Explainability Techniques for Identifying Predictive Variables | Arxiv | sklearn | |
| Causal explanations | Diffusion Causal Models for Counterfactual Estimation | Arxiv | | |
| Causal inference influence functions | A Free Lunch with Influence Functions? Improving Neural Network Estimates with Concepts from Semiparametric Statistics | Arxiv | PyTorch | |
| Causal discovery | Causal discovery for observational sciences using supervised machine learning | Arxiv | | |
| Causal DA | Causal Domain Adaptation with Copula Entropy based Conditional Independence Test | Arxiv | | |
| Causal experimental design | Interventions, Where and How? Experimental Design for Causal Models at Scale | Arxiv | | seems ICML format |
| Causal discovery | SCORE MATCHING ENABLES CAUSAL DISCOVERY OF NONLINEAR ADDITIVE NOISE MODELS | Arxiv | | |
| Causal Explanation - Cynthia Rudin | WHY INTERPRETABLE CAUSAL INFERENCE IS IMPORTANT FOR HIGH-STAKES DECISION MAKING FOR CRITICALLY ILL PATIENTS AND HOW TO DO IT | Arxiv | | |
| Semantically consistent counterfactuals | Making Heads or Tails: Towards Semantically Consistent Visual Counterfactuals | Arxiv | | |
| Posthoc global hypersphere | Post-hoc Global Explanation using Hypersphere Sets | ICAART 2022 | | |
| CapsNet explanation | Investigation of Capsule Networks Regarding their Potential of Explainability and Image Rankings | ICAART 2022 | | |
| XAI evaluation | A Unified Study of Machine Learning Explanation Evaluation Metrics | Arxiv | | |
| Concept based counterfactual explanations | DISSECT: Disentangled Simultaneous Explanations via Concept Traversals | ICLR 2022 | tensorflow 1.12 | Been Kim's group |
| concept evolution | ConceptEvo: Interpreting Concept Evolution in Deep Learning Training | Arxiv | | |
| Poly-CAM | Backward recursive Class Activation Map refinement for high resolution saliency map | Paper | | |
| Interactive Concept explanation | ConceptExplainer: Interactive Explanation for Deep Neural Networks from a Concept Perspective | Arxiv | | |
| Quasi ProtoPNet | Think positive: An interpretable neural network for image recognition | Neural Networks Journal | | |
| TAM | VISUALIZING DEEP NEURAL NETWORKS WITH TOPOGRAPHIC ACTIVATION MAPS | Arxiv | | |
| S-XAI | Semantic interpretation for convolutional neural networks: What makes a cat a cat? | Arxiv | | |
| See through DNN | Perception Visualization: Seeing Through the Eyes of a DNN | Arxiv | | |
| IOM | Understanding CNNs from excitations | Arxiv | | |
| KICE | Integrating Prior Knowledge in Post-hoc Explanations | Arxiv | | |