MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network
Zizhao Zhang, Yuanpu Xie, Fuyong Xing, Mason McGough, Lin Yang
- 08 Jul 2017
- pp 3549-3557
TL;DR: This paper proposes MDNet to establish a direct multimodal mapping between medical images and diagnostic reports that can read images, generate diagnostic reports, retrieve images by symptom descriptions, and visualize attention, to provide justifications of the network diagnosis process.
Abstract: The inability to interpret model predictions in semantically and visually meaningful ways is a well-known shortcoming of most existing computer-aided diagnosis methods. In this paper, we propose MDNet to establish a direct multimodal mapping between medical images and diagnostic reports. MDNet can read images, generate diagnostic reports, retrieve images by symptom descriptions, and visualize attention to provide justifications of the network diagnosis process. MDNet includes an image model and a language model. The image model is proposed to enhance multi-scale feature ensembles and utilization efficiency. The language model, integrated with our improved attention mechanism, aims to read and explore discriminative image feature descriptions from reports to learn a direct mapping from sentence words to image pixels. The overall network is trained end-to-end using an optimization strategy we developed. Using a dataset of bladder cancer pathology images and their diagnostic reports (BCIDR), we conduct experiments to demonstrate that MDNet outperforms comparative baselines. The proposed image model obtains state-of-the-art performance on two CIFAR datasets as well.
Citations
Adaptive Mask-Based Interpretable Convolutional Neural Network (AMI-CNN) for Modulation Format Identification
Xiyue Zhu, Yu Cheng, Jiafeng He, Juan Guo
TL;DR: An Adaptive Mask-Based Interpretable Convolutional Neural Network (AMI-CNN) that utilizes a mask structure for feature selection during neural network training and feeds the selected features into the classifier for decision making is proposed.
GNNFormer: A Graph-based Framework for Cytopathology Report Generation
TL;DR: Wang et al. propose GNNFormer, a graph-based framework that integrates a graph neural network (GNN) and a Transformer in a single framework for cytopathology report generation.
A Novel Black-Box Complementary Explanation Approach for Thorax Multi-Label Classification
Khaled Bouabdallah, Ahlem Drif, Lars Kaderali
- 16 May 2024
TL;DR: A novel black-box complementary explanation approach for thorax multi-label classification, based on case-based similarities, is proposed. The approach extracts global and local features of the image, compares them with similar cases, and provides explanations based on the most similar cases.
SPSVO: a self-supervised surgical perception stereo visual odometer for endoscopy
TL;DR: A novel self-supervised Surgical Perception Stereo Visual Odometer framework is proposed to accurately estimate endoscopic pose and better assist surgeons in locating and diagnosing lesions.
Intelligent Robotics and Applications: 12th International Conference, ICIRA 2019, Shenyang, China, August 8–11, 2019, Proceedings, Part III
Haibin Yu, Jinguo Liu, Lianqing Liu, Zhaojie Ju, Yuwang Liu, Dalin Zhou
- 04 Aug 2019
TL;DR: This work hardwired a theoretical model of empathy in rational reinforcement learning-based agents to enable affective state sharing between agents and found that empathetic agents showed a strong sense of fairness in the ultimatum game which resulted in an evenhanded allocation scheme on resources.
References
Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
- 27 Jun 2016
TL;DR: The authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting networks won 1st place in the ILSVRC 2015 classification task.
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
- 04 Sep 2014
TL;DR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting, using an architecture with very small convolution filters, and shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Long short-term memory
TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced. It can learn to bridge minimal time lags in excess of 1000 discrete time steps by enforcing constant error flow through constant error carousels within special units.
Going deeper with convolutions
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich
- 07 Jun 2015
TL;DR: Inception is a deep convolutional neural network architecture that set a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).