Weakly labeled data augmentation for deep learning: A study on COVID-19 detection in chest X-rays
Sivaramakrishnan Rajaraman, Sameer Antani
- 30 May 2020
- Vol. 10, Iss: 6, pp 358
TL;DR: Interestingly, adding COVID-19 CXRs to simple weakly labeled augmented training data significantly improves the performance, suggesting that COVID-19, though viral in origin, creates a uniquely different presentation in CXRs compared with other viral pneumonia manifestations.
Abstract: The novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused a pandemic resulting in over 2.7 million infected individuals and over 190,000 deaths and growing. Assertions in the literature suggest that respiratory disorders due to COVID-19 commonly present with pneumonia-like symptoms which are radiologically confirmed as opacities. Radiology serves as an adjunct to the reverse transcription-polymerase chain reaction test for confirmation and evaluating disease progression. While computed tomography (CT) imaging is more specific than chest X-rays (CXR), its use is limited due to cross-contamination concerns. CXR imaging is commonly used in high-demand situations, placing a significant burden on radiology services. The use of artificial intelligence (AI) has been suggested to alleviate this burden. However, there is a dearth of sufficient training data for developing image-based AI tools. We propose increasing training data for recognizing COVID-19 pneumonia opacities using weakly labeled data augmentation. This follows from a hypothesis that the COVID-19 manifestation would be similar to that caused by other viral pathogens affecting the lungs. We expand the training data distribution for supervised learning through the use of weakly labeled CXR images, automatically pooled from publicly available pneumonia datasets, to classify them into those with bacterial or viral pneumonia opacities. Next, we use these selected images in a stage-wise, strategic approach to train convolutional neural network-based algorithms and compare against those trained with non-augmented data. Weakly labeled data augmentation expands the learned feature space in an attempt to encompass variability in unseen test distributions, enhance inter-class discrimination, and reduce the generalization error. 
Empirical evaluations demonstrate that simple weakly labeled data augmentation (Acc: 0.5555 and Acc: 0.6536) is better than baseline non-augmented training (Acc: 0.2885 and Acc: 0.5028) in identifying COVID-19 manifestations as viral pneumonia. Interestingly, adding COVID-19 CXRs to simple weakly labeled augmented training data significantly improves the performance (Acc: 0.7095 and Acc: 0.8889), suggesting that COVID-19, though viral in origin, creates a uniquely different presentation in CXRs compared with other viral pneumonia manifestations.
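The weak-labeling step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a classifier already trained on labeled bacterial/viral CXRs has produced class probabilities for an unlabeled pool of pneumonia images. The names `weak_labels`, `augment_training_set`, and `CLASSES`, and the optional `min_confidence` filter, are hypothetical and do not appear in the abstract.

```python
import numpy as np

# Hypothetical class order for the bacterial/viral pneumonia classifier.
CLASSES = ("bacterial", "viral")


def weak_labels(probs, min_confidence=0.0):
    """Convert predicted class probabilities into weak labels.

    probs: array of shape (n_images, n_classes) from a classifier trained
           on images with known bacterial/viral labels.
    min_confidence: assumption for illustration only; images whose top
           probability falls below it are left out of the augmented set.
    Returns (indices of kept images, weak label index for each kept image).
    """
    probs = np.asarray(probs, dtype=float)
    labels = probs.argmax(axis=1)          # most likely class per image
    confidence = probs.max(axis=1)         # probability of that class
    keep = np.flatnonzero(confidence >= min_confidence)
    return keep, labels[keep]


def augment_training_set(train_x, train_y, pool_x, pool_probs,
                         min_confidence=0.0):
    """Append weakly labeled pool images to the labeled training set."""
    keep, labels = weak_labels(pool_probs, min_confidence)
    aug_x = np.concatenate([train_x, np.asarray(pool_x)[keep]], axis=0)
    aug_y = np.concatenate([train_y, labels], axis=0)
    return aug_x, aug_y


# Toy data: 2 labeled images and 3 unlabeled pool images (4 features each).
train_x = np.zeros((2, 4))
train_y = np.array([0, 1])
pool_x = np.ones((3, 4))
pool_probs = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]]

aug_x, aug_y = augment_training_set(train_x, train_y, pool_x, pool_probs,
                                    min_confidence=0.7)
print(aug_x.shape, [CLASSES[i] for i in aug_y])
```

With `min_confidence=0.0`, every pool image receives a weak label. A model trained on the augmented set would then be compared against one trained on the original labeled data alone, which is the comparison the abstract reports.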
Citations
Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal
Laure Wynants, Ben Van Calster, Gary S. Collins, Richard D Riley, Georg Heinze, Ewoud Schuit, Marc J.M. Bonten, Darren Dahly, Johanna A A G Damen, Thomas P. A. Debray, Valentijn M.T. de Jong, Maarten De Vos, Paula Dhiman, Maria C Haller, Michael O. Harhay, Liesbet Henckaerts, Pauline Heus, Michael Kammer, Nina Kreuzberger, Anna Lohmann, Kim Luijken, Jie Ma, Glen P. Martin, David J. McLernon, Constanza L Andaur Navarro, Johannes B. Reitsma, Jamie C. Sergeant, Chunhu Shi, Nicole Skoetz, Luc J.M. Smits, Kym I E Snell, Matthew Sperrin, René Spijker, Ewout W. Steyerberg, Toshihiko Takada, Ioanna Tzoulaki, Sander M. J. van Kuijk, Bas C T van Bussel, Iwan C. C. van der Horst, Florien S. van Royen, Jan Y Verbakel, Christine Wallisch, Jack Wilkinson, Robert Wolff, Lotty Hooft, Karel G.M. Moons, Maarten van Smeden
TL;DR: Proposed models for covid-19 are poorly reported, at high risk of bias, and their reported performance is probably optimistic, according to a review of published and preprint reports.
3.1K
Leveraging Data Science to Combat COVID-19: A Comprehensive Review
Siddique Latif, Muhammad Usman, Sanaullah Manzoor, Waleed Iqbal, Junaid Qadir, Gareth Tyson, Ignacio Castro, Adeel Razi, Maged N. Kamel Boulos, Adrian Weller, Jon Crowcroft
- 02 Sep 2020
TL;DR: This paper attempts to systematise the various COVID-19 research activities leveraging data science, where data science is defined broadly to encompass the various methods and tools that can be used to store, process, and extract insights from data.
A Review on Deep Learning Techniques for the Diagnosis of Novel Coronavirus (COVID-19)
TL;DR: This paper reviews deep learning-based systems for detecting the novel coronavirus (COVID-19), which could potentially be further utilized to combat the outbreak.
Applications of artificial intelligence in battling against covid-19: A literature review.
TL;DR: An overview on the applications of AI in a variety of fields including diagnosis of the disease via different types of tests and symptoms, monitoring patients, identifying severity of a patient, processing covid-19 related imaging tests, epidemiology, pharmaceutical studies, etc.
Medical image analysis based on deep learning approach.
Muralikrishna Puttagunta, S. Ravi
TL;DR: Deep learning approaches (DLA) have been widely used in medical imaging to detect the presence or absence of disease; most implementations concentrate on X-ray images, computerized tomography, mammography images, and digital histopathology images.
References
Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
- 27 Jun 2016
TL;DR: The authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting residual networks won 1st place on the ILSVRC 2015 classification task.
•Posted Content
Deep Residual Learning for Image Recognition
TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
117.9K
•Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
- 04 Sep 2014
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
102.6K
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger, Philipp Fischer, Thomas Brox
- 05 Oct 2015
TL;DR: Ronneberger et al. propose a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
ImageNet: A large-scale hierarchical image database
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, Li Fei-Fei
- 20 Jun 2009
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.