Weakly labeled data augmentation for deep learning: A study on COVID-19 detection in chest X-rays

doi:10.3390/DIAGNOSTICS10060358

Open Access•Journal Article•10.3390/DIAGNOSTICS10060358

Sivaramakrishnan Rajaraman, +1 more

- 30 May 2020

- Vol. 10, Iss: 6, pp 358

89

TL;DR: Interestingly, adding COVID-19 CXRs to simple weakly labeled augmented training data significantly improves the performance, suggesting that COVID-19, though viral in origin, creates a uniquely different presentation in CXR compared with other viral pneumonia manifestations.

Abstract: The novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused a pandemic resulting in over 2.7 million infections and over 190,000 deaths, with both counts still growing. Assertions in the literature suggest that respiratory disorders due to COVID-19 commonly present with pneumonia-like symptoms, which are radiologically confirmed as opacities. Radiology serves as an adjunct to the reverse transcription-polymerase chain reaction test for confirming the disease and evaluating its progression. While computed tomography (CT) imaging is more specific than chest X-rays (CXR), its use is limited due to cross-contamination concerns. CXR imaging is commonly used in high-demand situations, placing a significant burden on radiology services. The use of artificial intelligence (AI) has been suggested to alleviate this burden. However, there is a dearth of training data for developing image-based AI tools. We propose increasing training data for recognizing COVID-19 pneumonia opacities using weakly labeled data augmentation. This follows from a hypothesis that the COVID-19 manifestation would be similar to that caused by other viral pathogens affecting the lungs. We expand the training data distribution for supervised learning through the use of weakly labeled CXR images, automatically pooled from publicly available pneumonia datasets, to classify them into those with bacterial or viral pneumonia opacities. Next, we use these selected images in a stage-wise, strategic approach to train convolutional neural network-based algorithms and compare against those trained with non-augmented data. Weakly labeled data augmentation expands the learned feature space in an attempt to encompass variability in unseen test distributions, enhance inter-class discrimination, and reduce the generalization error.
Empirical evaluations demonstrate that simple weakly labeled data augmentation (Acc: 0.5555 and Acc: 0.6536) is better than baseline non-augmented training (Acc: 0.2885 and Acc: 0.5028) in identifying COVID-19 manifestations as viral pneumonia. Interestingly, adding COVID-19 CXRs to simple weakly labeled augmented training data significantly improves the performance (Acc: 0.7095 and Acc: 0.8889), suggesting that COVID-19, though viral in origin, creates a uniquely different presentation in CXRs compared with other viral pneumonia manifestations.
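
The weak-labeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how pooled images are selected, so the confidence threshold, the class order in `CLASSES`, and the helper names `weak_label` and `augment` are all assumptions made for illustration. The class probabilities would come from a classifier already trained on labeled bacterial and viral pneumonia CXRs; here they are supplied as a plain array.

```python
import numpy as np

# Assumed class order for this sketch; the paper does not prescribe it.
CLASSES = ("bacterial", "viral")


def weak_label(probs, threshold=0.5):
    """Turn predicted class probabilities into weak labels.

    probs: array-like of shape (n_images, n_classes).
    Returns (kept_indices, weak_labels) for the rows whose highest
    class probability is at least `threshold` (a hypothetical
    selection rule, not taken from the paper).
    """
    probs = np.asarray(probs, dtype=float)
    labels = probs.argmax(axis=1)
    confidence = probs.max(axis=1)
    keep = np.flatnonzero(confidence >= threshold)
    return keep, labels[keep]


def augment(train_x, train_y, pool_x, pool_probs, threshold=0.5):
    """Append the selected weakly labeled pool samples to the training set."""
    keep, weak_y = weak_label(pool_probs, threshold)
    aug_x = np.concatenate([np.asarray(train_x), np.asarray(pool_x)[keep]], axis=0)
    aug_y = np.concatenate([np.asarray(train_y), weak_y], axis=0)
    return aug_x, aug_y


# Toy data standing in for image features: 4 labeled and 3 unlabeled samples.
train_x = np.zeros((4, 8))
train_y = np.array([0, 0, 1, 1])
pool_x = np.ones((3, 8))
pool_probs = np.array([[0.90, 0.10], [0.55, 0.45], [0.20, 0.80]])

aug_x, aug_y = augment(train_x, train_y, pool_x, pool_probs, threshold=0.7)
print(aug_x.shape)     # (6, 8)
print(aug_y.tolist())  # [0, 0, 1, 1, 0, 1]
```

With a threshold of 0.7, the second pooled sample (top probability 0.55) is discarded, and the other two join the training set with their predicted labels. A model retrained on `aug_x` and `aug_y` would then be compared against one trained on the non-augmented data, as the abstract describes.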

Citations

•Journal Article•10.1136/BMJ.M1328

Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal

Laure Wynants, +55 more

- 07 Apr 2020

- BMJ

TL;DR: Proposed models for covid-19 are poorly reported, at high risk of bias, and their reported performance is probably optimistic, according to a review of published and preprint reports.

3.1K

•Journal Article•10.1109/TAI.2020.3020521

Leveraging Data Science to Combat COVID-19: A Comprehensive Review

Siddique Latif, +10 more

- 02 Sep 2020

TL;DR: This paper attempts to systematise the various COVID-19 research activities leveraging data science, where data science is defined broadly to encompass the various methods and tools that can be used to store, process, and extract insights from data.

266

•Journal Article•10.1109/ACCESS.2021.3058537

A Review on Deep Learning Techniques for the Diagnosis of Novel Coronavirus (COVID-19)

Md. Milon Islam, +3 more

- 01 Jan 2021

- IEEE Access

TL;DR: This paper reviews deep learning based systems for detecting the novel coronavirus (COVID-19), which could potentially be further utilized to combat the outbreak.

262

•Journal Article•10.1016/J.CHAOS.2020.110338

Applications of artificial intelligence in battling against covid-19: A literature review.

Mohammad-H. Tayarani N.

- 01 Jan 2021

- Chaos Solitons & Fractals

TL;DR: An overview on the applications of AI in a variety of fields including diagnosis of the disease via different types of tests and symptoms, monitoring patients, identifying severity of a patient, processing covid-19 related imaging tests, epidemiology, pharmaceutical studies, etc.

204

•Journal Article•10.1007/S11042-021-10707-4

Medical image analysis based on deep learning approach.

Muralikrishna Puttagunta, +1 more

- 06 Apr 2021

- Multimedia Tools and Applications

TL;DR: Deep learning approaches (DLA) have been widely used in medical imaging to detect the presence or absence of disease, and most implementations concentrate on X-ray images, computerized tomography, mammography images, and digital histopathology images.

184
