MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network
Zizhao Zhang, Yuanpu Xie, Fuyong Xing, Mason McGough, Lin Yang
- 08 Jul 2017
- pp 3549-3557
TL;DR: This paper proposes MDNet to establish a direct multimodal mapping between medical images and diagnostic reports that can read images, generate diagnostic reports, retrieve images by symptom descriptions, and visualize attention, to provide justifications of the network diagnosis process.
Abstract: The inability to interpret the model prediction in semantically and visually meaningful ways is a well-known shortcoming of most existing computer-aided diagnosis methods. In this paper, we propose MDNet to establish a direct multimodal mapping between medical images and diagnostic reports. MDNet can read images, generate diagnostic reports, retrieve images by symptom descriptions, and visualize attention to provide justifications of the network diagnosis process. MDNet includes an image model and a language model. The image model is proposed to enhance multi-scale feature ensembles and utilization efficiency. The language model, integrated with our improved attention mechanism, aims to read and explore discriminative image feature descriptions from reports to learn a direct mapping from sentence words to image pixels. The overall network is trained end-to-end using our developed optimization strategy. On a dataset of pathology bladder cancer images and their diagnostic reports (BCIDR), we conduct experiments demonstrating that MDNet outperforms comparative baselines. The proposed image model also obtains state-of-the-art performance on two CIFAR datasets.
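The abstract says attention links sentence words to image regions and can be visualized, but gives no formulas. As a rough illustration only, the sketch below shows generic dot-product soft attention, not MDNet's improved mechanism. Each image location gets a relevance score for the current word, a softmax turns the scores into weights, and the weights pool the features and form a displayable map. The names `soft_attention`, `dot_scores`, `features`, and `query`, along with the grid size and the random inputs, are assumptions made for this example and do not come from the paper.

```python
import math
import random


def dot_scores(features, query):
    """Score each location by the dot product of its feature vector with the query."""
    return [sum(f_d * q_d for f_d, q_d in zip(f, query)) for f in features]


def soft_attention(features, scores):
    """Softmax-normalize per-location scores and pool the features.

    features: list of N feature vectors, one per image location (hypothetical).
    scores:   list of N relevance scores for the current word.
    Returns (context_vector, attention_weights).
    """
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    context = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return context, weights


# Toy inputs: a 4x4 grid of 3-dimensional features and one word-level query vector.
random.seed(0)
grid, dim = 4, 3
features = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(grid * grid)]
query = [random.gauss(0, 1) for _ in range(dim)]

context, weights = soft_attention(features, dot_scores(features, query))

# Reshape the flat weights into a grid that could be upsampled and overlaid on the image.
attention_map = [weights[r * grid:(r + 1) * grid] for r in range(grid)]
```

In this toy setup, `attention_map` plays the role of the visual justification: larger entries mark locations that contributed more to the pooled `context` vector for that word.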
Citations
SPECHT: Self-tuning Plausibility Based Object Detection Enables Quantification of Conflict in Heterogeneous Multi-scale Microscopy
Ben Cardoen, Timothy H. Wong, Parsa Alan, Sieun Lee, Joanne A Matsubara, Ivan R. Nabi, Ghassan Hamarneh
- 10 Aug 2021
TL;DR: Self-tuning unsupervised analysis of STED super resolution of fluorescent labelled Caveolin-1, confocal microscopy of retina tissue.
The application of panoramic segmentation network to medical image segmentation
Li Wang, RunZe Zhang, YongFang Chen, Yanjiang Wang
- 06 Dec 2020
TL;DR: This article proposes a panoramic image segmentation network in which the BiSeNet architecture is integrated into the segmentation branch and the Mask R-CNN network is used for the image detection branch.
Modularity-Constrained Dynamic Representation Learning for Interpretable Brain Disorder Analysis with Functional MRI
Qianqian Wang, Mengqi Wu, Yuqi Fang, Wei Wang, Lishan Qiao, Ming-Xia Liu
TL;DR: Experimental results validate that the proposed modularity-constrained dynamic representation learning (MDRL) framework outperforms several state-of-the-art methods in fMRI-based brain disorder analysis and can be potentially used to improve clinical diagnosis.
Tracking System for a Coal Mine Drilling Robot for Low-Illumination Environments
TL;DR: This paper proposes a Low-illumination Long-term Correlation Tracker (LLCT) and a visual tracking system for coal mine drilling robots that combines image enhancement strategies with long-term tracking.
Face Masked and Unmasked Humans Detection and Tracking in Video Surveillance
- 22 Oct 2022
TL;DR: This article proposes to improve current face detection and long-term tracking technology by extracting facial features from the upper regions of the face, taking into account the eyes, eyebrows, and forehead.