Interpretability

Papers published on a yearly basis

Papers

Proceedings Article

A unified approach to interpreting model predictions

Scott M. Lundberg¹, Su-In Lee¹

University of Washington¹

4 Dec 2017

TL;DR: Presents SHAP (SHapley Additive exPlanations), a unified framework for interpreting predictions that assigns each feature an importance value for a particular prediction.

Abstract: Understanding why a model makes a certain prediction can be as crucial as the prediction's accuracy in many applications. However, the highest accuracy for large modern datasets is often achieved by complex models that even experts struggle to interpret, such as ensemble or deep learning models, creating a tension between accuracy and interpretability. In response, various methods have recently been proposed to help users interpret the predictions of complex models, but it is often unclear how these methods are related and when one method is preferable over another. To address this problem, we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature an importance value for a particular prediction. Its novel components include: (1) the identification of a new class of additive feature importance measures, and (2) theoretical results showing there is a unique solution in this class with a set of desirable properties. The new class unifies six existing methods, notable because several recent methods in the class lack the proposed desirable properties. Based on insights from this unification, we present new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.
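
To make the idea of an additive per-feature importance value concrete, here is a minimal sketch that computes exact Shapley values for a single prediction by enumerating every feature coalition. It is not the paper's estimation method and does not use the shap library. The names `shapley_values` and `toy_model` are hypothetical, and replacing a "missing" feature with a fixed baseline value is a simplifying assumption made only for illustration. The enumeration grows exponentially with the number of features, so it is only practical for tiny examples.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for the single prediction model(x).

    Assumption: a feature outside the coalition is "missing" and is
    replaced by its baseline value.
    """
    n = len(x)

    def value(coalition):
        # Coalition features come from x, all others from the baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Classic Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical model: one linear term plus one interaction term.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

In this toy case the attributions sum to `toy_model(x) - toy_model(baseline)`, and the interaction term is split evenly between the two features that take part in it.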

19,241 citations

Journal Article•10.1098/RSTA.2015.0202

Principal component analysis: a review and recent developments

Ian T. Jolliffe¹, Jorge Cadima²,³

University of Exeter¹, Instituto Superior de Agronomia², University of Lisbon³

13 Apr 2016-Philosophical Transactions of the Royal Society A

TL;DR: Introduces the basic ideas of PCA, discusses what it can and cannot do, and describes variants of the technique tailored to different data types and structures.

Abstract: Large datasets are increasingly common and are often difficult to interpret. Principal component analysis (PCA) is a technique for reducing the dimensionality of such datasets, increasing interpretability but at the same time minimizing information loss. It does so by creating new uncorrelated variables that successively maximize variance. Finding such new variables, the principal components, reduces to solving an eigenvalue/eigenvector problem, and the new variables are defined by the dataset at hand, not a priori, hence making PCA an adaptive data analysis technique. It is adaptive in another sense too, since variants of the technique have been developed that are tailored to various different data types and structures. This article will begin by introducing the basic ideas of PCA, discussing what it can and cannot do. It will then describe some variants of PCA and their application.
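
The eigenvalue/eigenvector formulation mentioned in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the article: the function name `pca` and the synthetic data are assumptions, and it uses the sample covariance matrix of the centered data.

```python
import numpy as np


def pca(X, k):
    """Top-k principal components from the eigendecomposition of the covariance matrix."""
    Xc = X - X.mean(axis=0)                 # center each variable
    cov = np.cov(Xc, rowvar=False)          # sample covariance (columns are variables)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]   # indices of the k largest eigenvalues
    components = eigvecs[:, order]          # columns are the principal directions
    scores = Xc @ components                # the new, uncorrelated variables
    return eigvals[order], components, scores


# Synthetic correlated data, made up for illustration only.
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.5, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

variances, components, scores = pca(X, 2)
print(variances)
```

The covariance matrix of `scores` is diagonal up to rounding error, with the selected eigenvalues on the diagonal, which is the "uncorrelated variables that successively maximize variance" property in matrix form.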

7,488 citations

Journal Article•10.1038/S42256-019-0138-9

From Local Explanations to Global Understanding with Explainable AI for Trees.

Scott M. Lundberg¹,², Gabriel G. Erion¹, Hugh Chen¹, Alex J. DeGrave¹, Jordan M. Prutkin¹, Bala G. Nair¹, Ronit Katz¹, Jonathan Himmelfarb¹, Nisha Bansal¹, Su-In Lee¹

University of Washington¹, Microsoft²

17 Jan 2020-Nature Machine Intelligence

TL;DR: An explanation method for trees is presented that enables the computation of optimal local explanations for individual predictions, and the authors demonstrate their method on three medical datasets.

Abstract: Tree-based machine learning models such as random forests, decision trees and gradient boosted trees are popular nonlinear predictive models, yet comparatively little attention has been paid to explaining their predictions. Here we improve the interpretability of tree-based models through three main contributions. (1) A polynomial time algorithm to compute optimal explanations based on game theory. (2) A new type of explanation that directly measures local feature interaction effects. (3) A new set of tools for understanding global model structure based on combining many local explanations of each prediction. We apply these tools to three medical machine learning problems and show how combining many high-quality local explanations allows us to represent global structure while retaining local faithfulness to the original model. These tools enable us to (1) identify high-magnitude but low-frequency nonlinear mortality risk factors in the US population, (2) highlight distinct population subgroups with shared risk characteristics, (3) identify nonlinear interaction effects among risk factors for chronic kidney disease and (4) monitor a machine learning model deployed in a hospital by identifying which features are degrading the model’s performance over time. Given the popularity of tree-based machine learning models, these improvements to their interpretability have implications across a broad set of domains. Tree-based machine learning models are widely used in domains such as healthcare, finance and public services. The authors present an explanation method for trees that enables the computation of optimal local explanations for individual predictions, and demonstrate their method on three medical datasets.

5,629 citations

Journal Article•10.1145/3236009

A Survey of Methods for Explaining Black Box Models

Riccardo Guidotti¹, Anna Monreale¹, Salvatore Ruggieri¹, Franco Turini¹, Fosca Giannotti², Dino Pedreschi¹

University of Pisa¹, Istituto di Scienza e Tecnologie dell'Informazione²

22 Aug 2018-ACM Computing Surveys

TL;DR: This survey classifies the main problems addressed in the literature with respect to the notion of explanation and the type of black box decision support system. Given a problem definition, a black box type, and a desired explanation, it should help researchers find the proposals most useful for their own work.

Abstract: In recent years, many accurate decision support systems have been constructed as black boxes, that is as systems that hide their internal logic to the user. This lack of explanation constitutes both a practical and an ethical issue. The literature reports many approaches aimed at overcoming this crucial weakness, sometimes at the cost of sacrificing accuracy for interpretability. The applications in which black box decision systems can be used are various, and each approach is typically developed to provide a solution for a specific problem and, as a consequence, it explicitly or implicitly delineates its own definition of interpretability and explanation. The aim of this article is to provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box system. Given a problem definition, a black box type, and a desired explanation, this survey should help the researcher to find the proposals more useful for his own work. The proposed classification of approaches to open black box models should also be useful for putting the many research open questions in perspective.

4,490 citations

Posted Content

Towards A Rigorous Science of Interpretable Machine Learning

Finale Doshi-Velez, Been Kim

28 Feb 2017-arXiv: Machine Learning

TL;DR: This position paper defines interpretability and describes when interpretability is needed (and when it is not), and suggests a taxonomy for rigorous evaluation and exposes open questions towards a more rigorous science of interpretable machine learning.

Abstract: As machine learning systems become ubiquitous, there has been a surge of interest in interpretable machine learning: systems that provide explanation for their outputs. These explanations are often used to qualitatively assess other criteria such as safety or non-discrimination. However, despite the interest in interpretability, there is very little consensus on what interpretable machine learning is and how it should be measured. In this position paper, we first define interpretability and describe when interpretability is needed (and when it is not). Next, we suggest a taxonomy for rigorous evaluation and expose open questions towards a more rigorous science of interpretable machine learning.

4,042 citations

Year	Papers
2026	90
2025	2,681
2024	2,028
2023	4,151
2022	5,270
2021	1,521
