MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network

doi:10.1109/CVPR.2017.378

Open Access•Proceedings Article•10.1109/CVPR.2017.378

Zizhao Zhang, +4 more

- 08 Jul 2017

- pp 3549-3557

393

TL;DR: This paper proposes MDNet, which establishes a direct multimodal mapping between medical images and diagnostic reports. The network can read images, generate diagnostic reports, retrieve images by symptom descriptions, and visualize attention to justify its diagnosis process.

Citations

•Journal Article•10.1088/1361-6560/ac678a

Towards a safe and efficient clinical implementation of machine learning in radiation oncology by exploring model interpretability, explainability and data-model dependency

Ana M. Barragan-Montero, +12 more

- 14 Apr 2022

- Physics in Medicine and Biology

TL;DR: The main risks and current solutions when applying machine learning to radiation oncology workflows are reviewed, and the core concepts of interpretability, explainability, and data-model dependency are formally defined and illustrated with examples.

54

•Proceedings Article•10.1145/3313831.3376238

OralCam: Enabling Self-Examination and Awareness of Oral Health Using a Smartphone Camera

Yuan Liang, +9 more

- 21 Apr 2020

TL;DR: OralCam is presented, the first interactive app that enables end users to self-examine five common oral conditions (diseases or early disease signals) by taking smartphone photos of their oral cavity, which are analyzed with a deep-learning-based framework.

54

•Journal Article•10.1016/J.PATCOG.2021.107856

Automatic medical image interpretation: State of the art and future directions

Hareem Ayesha, +8 more

- 01 Jun 2021

- Pattern Recognition

TL;DR: This article presents a comprehensive review of recent research on medical image captioning published in international conferences and journals. It extracts common parameters to compare the methods, performance, strengths, and limitations of the surveyed work, and discusses their recommendations.

52

•Posted Content

Interpretable Spatio-temporal Attention for Video Action Recognition

Lili Meng, +6 more

- 01 Oct 2018

- arXiv: Computer Vision and Pattern Recog...

TL;DR: This work proposes an interpretable, easy-to-plug-in spatio-temporal attention mechanism for video action recognition. It employs a convolutional LSTM based attention mechanism to identify the most relevant frames from an input video, and a set of regularizers to ensure that the attention mechanism attends to coherent regions in space and time.

51

•Journal Article•10.1038/s41598-023-31223-5

Medical image captioning via generative pretrained transformers

Alexander Selivanov, +5 more

- 28 Sep 2022

- Scientific Reports

TL;DR: In this paper, a model for automatic clinical image caption generation combines the analysis of radiological scans with structured patient information from textual records, using two language models, Show-Attend-Tell and GPT-3, to generate comprehensive and descriptive radiology records.

51

References

•Proceedings Article•10.1109/CVPR.2016.90

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

- 27 Jun 2016

TL;DR: The authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; an ensemble of these residual nets won 1st place on the ILSVRC 2015 classification task.

198.7K

•Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

- 04 Sep 2014

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

102.6K

•Journal Article•10.1162/NECO.1997.9.8.1735

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997

- Neural Computation

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

99K

•Proceedings Article•10.1109/CVPR.2015.7298594

Going deeper with convolutions

Christian Szegedy, +8 more

- 07 Jun 2015

TL;DR: Inception is a deep convolutional neural network architecture that achieved a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

56.6K

•Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

- 01 Jan 2015

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.

51.9K

Related Papers (5)

Deep Residual Learning for Image Recognition

Very Deep Convolutional Networks for Large-Scale Image Recognition

U-Net: Convolutional Networks for Biomedical Image Segmentation

ImageNet: A large-scale hierarchical image database

ImageNet Classification with Deep Convolutional Neural Networks