Overfitting of Neural Nets Under Class Imbalance: Analysis and Improvements for Segmentation
Zeju Li, Konstantinos Kamnitsas, Ben Glocker
- 13 Oct 2019
- pp 402-410
TL;DR: When a network is trained with few data samples, the logit activations for unseen test samples of the under-represented class tend to shift towards and even across the decision boundary, while the over-represented class seems unaffected.
Abstract: Overfitting in deep learning has been the focus of a number of recent works, yet its exact impact on the behavior of neural networks is not well understood. This study analyzes overfitting by examining how the distribution of logits alters in relation to how much the model overfits. Specifically, we find that when training with few data samples, the distribution of logit activations when processing unseen test samples of an under-represented class tends to shift towards and even across the decision boundary, while the over-represented class seems unaffected. In image segmentation, foreground samples are often heavily under-represented. We observe that sensitivity of the model drops as a result of overfitting, while precision remains mostly stable. Based on our analysis, we derive asymmetric modifications of existing loss functions and regularizers including a large margin loss, focal loss, adversarial training and mixup, which specifically aim at reducing the shift observed when embedding unseen samples of the under-represented class. We study the case of binary segmentation of brain tumor core and show that our proposed simple modifications lead to significantly improved segmentation performance over the symmetric variants.
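The observation that sensitivity drops while precision stays stable can be illustrated with a toy example. The sketch below is not the paper's experiment: the logit values, the decision boundary at 0, and the helper names `sensitivity_precision` and `shift_foreground` are all assumptions made for illustration. It shifts only the foreground logits towards and across the boundary and recomputes both metrics.

```python
def sensitivity_precision(logits, labels, threshold=0.0):
    """Return (sensitivity, precision) for binary predictions.

    A sample is predicted as foreground (1) when its logit exceeds
    `threshold`, which plays the role of the decision boundary.
    """
    tp = fp = fn = 0
    for z, y in zip(logits, labels):
        pred = 1 if z > threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision


def shift_foreground(logits, labels, shift):
    """Move only the foreground logits towards the boundary by `shift`."""
    return [z - shift if y == 1 else z for z, y in zip(logits, labels)]


# Hypothetical logits: four foreground samples, four background samples.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
logits = [0.5, 1.0, 1.5, 2.0, -2.0, -1.5, -1.0, -0.5]

before = sensitivity_precision(logits, labels)
after = sensitivity_precision(shift_foreground(logits, labels, 1.2), labels)
print("before shift:", before)
print("after shift: ", after)
```

Because the background logits are untouched, no new false positives appear, so precision is unchanged in this toy setting, while the foreground samples that cross the boundary become false negatives and lower sensitivity.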
Figures

Fig. 3. Illustration of the proposed asymmetric modifications of four existing techniques. Each modification pushes the logit activations of the foreground class away from the decision boundary by introducing a bias for the foreground class in a different way.
Table 1. Evaluation of the tumor core segmentation with different amounts of training data and different techniques to counter overfitting. 
Fig. 4. Activations of the classification layer when processing (top) foreground and (bottom) background samples, using 5% training data. Asymmetric modifications lead to better separation of the logits of unseen foreground samples.
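To make the idea of a foreground-only bias concrete, here is a minimal sketch of one possible asymmetric large margin loss for a single binary logit. It is an assumption-laden illustration, not the paper's exact formulation: the decision boundary is taken to be at logit 0, and the names `margin_loss`, `symmetric_margin_loss`, and `asymmetric_margin_loss` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def margin_loss(z, y, margin_fg, margin_bg):
    """Binary cross-entropy on logit `z` with a per-class margin.

    The margin is subtracted from the true-class logit before the sigmoid,
    so the loss stays high until the sample is at least `margin` away from
    the decision boundary (assumed here to sit at z = 0).
    """
    if y == 1:  # foreground: want z well above 0
        return -math.log(sigmoid(z - margin_fg))
    return -math.log(sigmoid(-z - margin_bg))  # background: want z well below 0


def symmetric_margin_loss(z, y, margin):
    """Same margin for both classes."""
    return margin_loss(z, y, margin, margin)


def asymmetric_margin_loss(z, y, margin):
    """Margin only for the under-represented foreground class (assumption)."""
    return margin_loss(z, y, margin, 0.0)


for z, y in [(0.5, 1), (0.5, 0)]:
    print(y, symmetric_margin_loss(z, y, 1.0), asymmetric_margin_loss(z, y, 1.0))
```

In this sketch the asymmetric variant matches the symmetric one on foreground samples and reduces to plain cross-entropy on background samples, so only the foreground logits are pushed further from the boundary.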
Citations
Discriminative ensemble learning for few-shot chest x-ray diagnosis.
TL;DR: The proposed method for few-shot diagnosis of diseases and conditions from chest x-rays using discriminative ensemble learning is modular and easily adaptable to new tasks requiring the training of only the saliency-based classifier.
Multimodal brain tumor detection using multimodal deep transfer learning
TL;DR: The authors propose a multimodal deep transfer learning method for MRI brain image segmentation that transfers knowledge between and within modalities to handle differing distributions between the training and test sets.
A review of fake news detection approaches: A critical analysis of relevant studies and highlighting key challenges associated with the dataset, feature representation, and data fusion
TL;DR: The investigation of fake news detection studies relied on the following aspects and their impact on detection accuracy, namely datasets, overfitting/underfitting, image-based features, feature vector representation, machine learning models, and data fusion.
Revolutionizing Groundwater Management with Hybrid AI Models: A Practical Review
M Zaresefat, Reza Derakhshani
TL;DR: This paper reviews state-of-the-art hybrid machine learning (ML) models used for groundwater management, including the most cited hybrid ML models in this domain.
Enhancing MR image segmentation with realistic adversarial data augmentation
01 Nov 2022
TL;DR: The authors propose AdvChain, a generic adversarial data augmentation framework that improves both the diversity and effectiveness of training data for medical image segmentation by generating randomly chained photometric and geometric transformations.
References
Focal Loss for Dense Object Detection
Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár
- 07 Aug 2017
TL;DR: This paper proposes to address the extreme foreground-background class imbalance encountered during training of dense detectors by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples, and develops a novel Focal Loss, which focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
•Posted Content
Explaining and Harnessing Adversarial Examples
TL;DR: The authors argue that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, which is supported by new quantitative results while giving the first explanation of the most intriguing fact about adversarial examples: their generalization across architectures and training sets.
•Proceedings Article
mixup: Beyond Empirical Risk Minimization
Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz
- 25 Oct 2017
TL;DR: This work proposes mixup, a simple learning principle that trains a neural network on convex combinations of pairs of examples and their labels, which improves the generalization of state-of-the-art neural network architectures.
•Posted Content
mixup: Beyond Empirical Risk Minimization
TL;DR: Mixup trains a neural network on convex combinations of pairs of examples and their labels, which regularizes the network to favor simple linear behavior in between training examples and improves the generalization of state-of-the-art neural network architectures.
Efficient Multi-Scale 3D CNN with Fully Connected CRF for Accurate Brain Lesion Segmentation
Konstantinos Kamnitsas, Christian Ledig, Virginia F. J. Newcombe, Joanna P. Simpson, Andrew D. Kane, David K. Menon, Daniel Rueckert, Ben Glocker
TL;DR: This work proposes an efficient and effective dense training scheme that joins the processing of adjacent image patches into one pass through the network while automatically adapting to the inherent class imbalance in the data, and improves on the state of the art for all three applications.
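Two of the referenced techniques are simple enough to sketch directly. The block below shows the standard (symmetric) focal loss for a single sample and a mixup of one pair of examples, written with the Python standard library only. The function names, the default `gamma`, `alpha`, and `mix_alpha` values, and the toy inputs are assumptions for illustration; the asymmetric variants studied in the paper are not reproduced here.

```python
import math
import random


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one sample, given the probability of its true class.

    The factor (1 - p_true) ** gamma down-weights well-classified examples;
    gamma = 0 recovers the (alpha-weighted) cross-entropy.
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


def mixup_pair(x_i, y_i, x_j, y_j, mix_alpha=0.2, rng=random):
    """Mix two examples and their one-hot labels with lam ~ Beta(a, a)."""
    lam = rng.betavariate(mix_alpha, mix_alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y, lam


# An easy example (p = 0.9) is attenuated far more than a hard one (p = 0.1).
for p in (0.9, 0.1):
    print(p, focal_loss(p), focal_loss(p, gamma=0.0))

# Mix a hypothetical foreground sample with a hypothetical background sample.
x, y, lam = mixup_pair([0.0, 2.0], [1.0, 0.0], [4.0, 6.0], [0.0, 1.0],
                       rng=random.Random(0))
print(lam, x, y)
```

Passing a seeded `random.Random` instance keeps the mixing coefficient reproducible; in training, a fresh coefficient would typically be drawn for every batch or pair.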