TNT: An Interpretable Tree-Network-Tree Learning Framework using Knowledge Distillation.
TL;DR: A Tree-Network-Tree (TNT) learning framework for explainable decision-making is proposed, in which knowledge is alternately transferred between tree models and DNNs; extensive experiments demonstrate the effectiveness of the method.
Abstract: Deep Neural Networks (DNNs) usually work in an end-to-end manner. This makes trained DNNs easy to use, but their decision process remains opaque for every test case. Unfortunately, the interpretability of decisions is crucial in some scenarios, such as medical or financial data mining and decision-making. In this paper, we propose a Tree-Network-Tree (TNT) learning framework for explainable decision-making, in which knowledge is alternately transferred between the tree model and DNNs. Specifically, the proposed TNT learning framework exploits the advantages of different models at different stages: (1) a novel James–Stein Decision Tree (JSDT) is proposed to generate better knowledge representations for DNNs, especially when the input data are low-frequency or low-quality; (2) the DNNs output high-performing predictions from the knowledge-embedding inputs and act as a teacher model for the subsequent tree model; and (3) a novel distillable Gradient Boosted Decision Tree (dGBDT) is proposed to learn interpretable trees from the soft labels and make predictions comparable to those of the DNNs. Extensive experiments on various machine learning tasks demonstrate the effectiveness of the proposed method.
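The network-to-tree stage described in (2) and (3) can be illustrated with a minimal sketch. This is not the paper's dGBDT: it assumes scikit-learn is available, uses an `MLPClassifier` as a stand-in teacher and a plain `GradientBoostingRegressor` as a stand-in student, and omits the JSDT embedding stage entirely. The function name `distill_teacher_into_trees`, the toy dataset, and all hyperparameters are hypothetical choices made for illustration only.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier


def distill_teacher_into_trees(seed=0):
    """Train a small network, then fit boosted trees to its soft labels."""
    # Toy binary classification data (an assumption, not from the paper).
    X, y = make_moons(n_samples=600, noise=0.25, random_state=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )

    # Teacher: a small MLP trained on the hard labels.
    teacher = MLPClassifier(
        hidden_layer_sizes=(32, 32), max_iter=2000, random_state=seed
    )
    teacher.fit(X_tr, y_tr)

    # Soft labels: the teacher's predicted probability of class 1.
    soft_tr = teacher.predict_proba(X_tr)[:, 1]

    # Student: shallow boosted regression trees fitted to the soft labels
    # instead of the hard labels.
    student = GradientBoostingRegressor(
        n_estimators=100, max_depth=2, random_state=seed
    )
    student.fit(X_tr, soft_tr)

    # Fraction of held-out points on which student and teacher agree.
    teacher_pred = teacher.predict(X_te)
    student_pred = (student.predict(X_te) >= 0.5).astype(int)
    agreement = float(np.mean(teacher_pred == student_pred))
    return teacher, student, agreement


if __name__ == "__main__":
    _, _, agreement = distill_teacher_into_trees()
    print(f"student/teacher agreement on held-out data: {agreement:.3f}")
```

Regressing on the teacher's probabilities rather than on the hard labels is what lets the student see how confident the teacher is near the decision boundary; the shallow trees (`max_depth=2`) keep each individual tree easy to read.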
Citations
Deep Neural Networks and Tabular Data: A Survey
TL;DR: This survey gives a comprehensive overview of deep learning approaches for tabular data, categorizing them into three groups: data transformations, specialized architectures, and regularization models.
29
Deep Multi-Modal Discriminative and Interpretability Network for Alzheimer’s Disease Diagnosis
Qi Zhou,Bingliang Xu,Jiashuang Huang,Heyang Wang,Ruting Xu,Wei Shao,Daoqiang Zhang
- 01 May 2023
TL;DR: This paper proposes a deep multi-modal discriminative and interpretability network (DMDIN) that pulls samples of the same category together while pushing samples of different categories apart.
23
Unlocking the black box: an in-depth review on interpretability, explainability, and reliability in deep learning
Emrullah ŞAHİN,Naciye Nur Arslan,Durmuş Özdemir
22
Proto2Proto: Can you recognize the car, the way I do?
Monish Keswani,Sriranjani Ramakrishnan,Nishant Reddy,Vineeth N Balasubramanian
- 25 Apr 2022
TL;DR: Proto2Proto, a novel method to transfer interpretability of one prototypical part network to another via knowledge distillation, achieves interpretability transfer from teacher to student while simultaneously exhibiting competitive performance.
15
References
Random Forests
Leo Breiman
- 01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation; these are used to show the response to increasing the number of features used in the splitting, and the ideas also apply to regression.
•Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
Alex Krizhevsky,Ilya Sutskever,Geoffrey E. Hinton
- 03 Dec 2012
TL;DR: A deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieved state-of-the-art ImageNet classification performance.
XGBoost: A Scalable Tree Boosting System
Tianqi Chen,Carlos Guestrin
TL;DR: This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.
Greedy function approximation: A gradient boosting machine.
TL;DR: A general gradient descent boosting paradigm is developed for additive expansions based on any fitting criterion, and specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification.
•Posted Content
Distilling the Knowledge in a Neural Network
TL;DR: This work shows that it can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model and introduces a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse.
21.2K
Related Papers (5)
Huimin Zhao,Atish P. Sinha
- 01 Sep 2005
Dmitry Yu. Ignatov,Andrey Ignatov
- 25 Apr 2017