Inflation of test accuracy due to data leakage in deep learning-based classification of OCT images
TL;DR: This paper demonstrates the effect of improper dataset splitting on model evaluation for three classification tasks, using three widely used open-access OCT datasets: Kermany's and Srinivasan's ophthalmology datasets and the AIIMS breast tissue dataset.
Abstract: In the application of deep learning on optical coherence tomography (OCT) data, it is common to train classification networks using 2D images originating from volumetric data. Given the micrometer resolution of OCT systems, consecutive images are often very similar in both visible structures and noise. Thus, an inappropriate data split can result in overlap between the training and testing sets, an aspect that a large portion of the literature overlooks. In this study, the effect of improper dataset splitting on model evaluation is demonstrated for three classification tasks using three widely used open-access OCT datasets: Kermany's and Srinivasan's ophthalmology datasets and the AIIMS breast tissue dataset. Results show that the classification performance is inflated by 0.07 to 0.43 in terms of Matthews Correlation Coefficient (accuracy: 5% to 30%) for models tested on datasets with improper splitting, highlighting the considerable effect of dataset handling on model evaluation. This study intends to raise awareness of the importance of dataset splitting, given the increased research interest in implementing deep learning on OCT data.
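The splitting issue described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `split_by_volume` and `binary_mcc`, the toy data (10 volumes of 20 images each), and the 20% test fraction are all hypothetical choices made for illustration. The idea is to assign whole volumes, rather than individual 2D images, to either the training or the testing set, so that near-identical consecutive images cannot end up on both sides. A binary form of the Matthews Correlation Coefficient is included as a simplification to show how it can differ from accuracy.

```python
import math
import random


def split_by_volume(volume_ids, test_fraction=0.2, seed=0):
    """Split image indices so that every volume lands entirely in one set."""
    unique_volumes = sorted(set(volume_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_volumes)
    # Choose whole volumes for testing (at least one).
    n_test = max(1, round(test_fraction * len(unique_volumes)))
    test_volumes = set(unique_volumes[:n_test])
    train_idx = [i for i, v in enumerate(volume_ids) if v not in test_volumes]
    test_idx = [i for i, v in enumerate(volume_ids) if v in test_volumes]
    return train_idx, test_idx


def binary_mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention used here: return 0.0 when the denominator is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data: 10 volumes, 20 images per volume.
volume_ids = [v for v in range(10) for _ in range(20)]
train_idx, test_idx = split_by_volume(volume_ids)

# No volume should contribute images to both sets.
shared = {volume_ids[i] for i in train_idx} & {volume_ids[i] for i in test_idx}
print("volumes present in both sets:", len(shared))

# Hypothetical predictions on an imbalanced toy set.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
mcc = binary_mcc(y_true, y_pred)
print(f"accuracy = {accuracy:.3f}, MCC = {mcc:.3f}")
```

A random split over individual images would instead shuffle `range(len(volume_ids))` directly, which places images from the same volume in both sets. In practice the grouping key may need to be the subject rather than the volume, depending on how a dataset is organized.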
Citations
Predicting Adolescent Mental Health Outcomes Across Cultures: A Machine Learning Approach
W. Andrew Rothenberg, Andrea Bizzego, Gianluca Esposito, Jennifer E. Lansford, Suha M. Al-Hassan, Dario Bacchini, Marc H. Bornstein, Lei Chang, Kirby Deater-Deckard, Laura Di Giunta, Kenneth A. Dodge, Sevtap Gurdal, Qin Liu, Qian Long, Paul Odhiambo Oburu, Concetta Pastorelli, Ann T. Skinner, Emma Sorbring, Sombat Tapanya, Laurence Steinberg, Liliana Maria Uribe Tirado, Saengduean Yotanyamaneewong, Liane Peña Alampay
TL;DR: This paper used machine learning models to identify the most important preadolescent risk factors in predicting adolescent mental health, including family context, parenting behaviors, individual child characteristics, and neighborhood and cultural variables.
Barometers behaving badly II: A critical evaluation of Cpx-only and Cpx-Liq thermobarometry in variably-hydrous arc magmas
TL;DR: In this paper, the average Clinopyroxene-Liquid (Cpx-Liq) compositions from N=543 variably-hydrous experiments at crustal conditions (1 bar to 17 kbar) were used to assess the performance of different thermobarometers and to identify the most accurate and precise expressions for application to subduction zone magmas.
22
ConvAttenMixer: Brain Tumor Detection and Type Classification using Convolutional Mixer with External and Self-Attention Mechanisms
TL;DR: ConvAttenMixer, a transformer model, combines convolutional layers with self-attention and external attention mechanisms to enhance brain tumor detection and classification in MRI images, outperforming state-of-the-art baselines with higher precision, recall, and accuracy (0.9794).
20
AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning
TL;DR: This paper presents an open-source machine learning framework, AutoPrognosis 2.0, to facilitate the development of diagnostic and prognostic models.
Vision Transformers and Transfer Learning Approaches for Arabic Sign Language Recognition
Nojood M. Alharthi, Salha Alzahrani
TL;DR: This study aimed to create robust transfer learning models trained on a dataset of 54,049 images depicting 32 alphabets from an ArSL dataset and demonstrated the effectiveness and robustness of using transfer learning with vision transformers for sign language recognition for other low-resourced languages.
18
References
A survey on deep learning in medical image analysis
Geert Litjens, Thijs Kooi, Babak Ehteshami Bejnordi, Arnaud Arindra Adiyoso Setio, Francesco Ciompi, Mohsen Ghafoorian, Jeroen van der Laak, Bram van Ginneken, Clara I. Sánchez
TL;DR: This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year, to survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks.
12.5K
•Book
Applied Predictive Modeling
Max Kuhn, Kjell Johnson
- 17 May 2013
TL;DR: This book covers the overall predictive modeling process, including data preprocessing, data splitting, model tuning, and regression and classification techniques.
5.9K
A systematic analysis of performance measures for classification tasks
Marina Sokolova, Guy Lapalme
TL;DR: This paper presents a systematic analysis of twenty-four performance measures used in the complete spectrum of machine learning classification tasks (binary, multi-class, multi-labelled, and hierarchical) to produce a measure invariance taxonomy with respect to all relevant label distribution changes in a classification problem.
The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation
Davide Chicco, Giuseppe Jurman
TL;DR: This article shows how MCC produces a more informative and truthful score in evaluating binary classifications than accuracy and F1 score, first by explaining its mathematical properties and then by demonstrating the advantages of MCC in six synthetic use cases and in a real genomics scenario.
Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning
Daniel S. Kermany, Michael H. Goldbaum, Wenjia Cai, Carolina C. S. Valentim, Huiying Liang, Sally L. Baxter, Alex McKeown, Ge Yang, Xiaokang Wu, Fangbing Yan, Justin Dong, Made K. Prasadha, Jacqueline Pei, Magdalene Yin Lin Ting, Jie Zhu, Christina Li, Sierra Hewett, Jason Dong, Ian Ziyar, Alexander Shi, Runze Zhang, Lianghong Zheng, Rui Hou, William Shi, Xin Fu, Yaou Duan, Viet Anh Nguyen Huu, Cindy Wen, Edward Zhang, Charlotte Zhang, Oulan Li, Xiaobo Wang, Michael A Singer, Xiaodong Sun, Jie Xu, Ali Tafreshi, M. Anthony Lewis, Huimin Xia, Kang Zhang
TL;DR: A diagnostic tool based on a deep-learning framework for the screening of patients with common treatable blinding retinal diseases, which demonstrates performance comparable to that of human experts in classifying age-related macular degeneration and diabetic macular edema.
4.2K