Handling data irregularities in classification: Foundations, trends, and future challenges

doi:10.1016/J.PATCOG.2018.03.008

Journal Article•10.1016/J.PATCOG.2018.03.008

Swagatam Das, +2 more

- 01 Sep 2018

- Pattern Recognition

- Vol. 81, pp 674-693

195

TL;DR: This article provides a bird's eye view of data irregularities, beginning with a taxonomy and characterization of various distribution-based and feature-based irregularities, and discusses the notable and recent approaches that have been taken to make the existing stand-alone as well as ensemble classifiers robust against such irregularities.

Citations

Proceedings Article•10.1109/ICCV.2019.00178

Generative Adversarial Minority Oversampling

Sankha Subhra Mullick, +2 more

- 22 Mar 2019

TL;DR: In this article, a three-player adversarial game between a convex generator, a multi-class classifier network, and a real/fake discriminator is proposed to perform oversampling in deep learning systems.

253

Journal Article•10.1016/J.INS.2019.08.062

Neighbourhood-based undersampling approach for handling imbalanced and overlapped data

Pattaramon Vuttipittayamongkol, +1 more

- 01 Jan 2020

- Information Sciences

TL;DR: Four methods based on neighbourhood searching with different criteria to identify potential overlapped instances are proposed in this paper and show comparable performance with state-of-the-art methods across different common metrics with exceptional and statistically significant improvements in sensitivity.

243

Journal Article•10.1016/J.KNOSYS.2020.106631

On the class overlap problem in imbalanced data classification.

Pattaramon Vuttipittayamongkol, +2 more

- 05 Jan 2021

- Knowledge Based Systems

TL;DR: A critical discussion and objective evaluation of class overlap in the context of imbalanced data and of its impact on classification accuracy are provided, along with an in-depth technical review of existing approaches to handling imbalanced datasets.

206

Journal Article•10.1007/S10462-019-09719-2

A survey of swarm and evolutionary computing approaches for deep learning

Ashraf Darwish, +2 more

- 01 Mar 2020

- Artificial Intelligence Review

TL;DR: A comprehensive survey of the most recent approaches involving the hybridization of SI and EC algorithms for DL, the architecture of DNNs, and DNN training to improve the classification accuracy is presented.

180

Journal Article•10.1016/J.INS.2018.12.002

Enabling Smart Data: Noise filtering in Big Data classification

Diego García-Gil, +4 more

- 01 Apr 2019

- Information Sciences

TL;DR: In this article, two Big Data preprocessing approaches to remove noisy examples are proposed: a homogeneous ensemble filter and a heterogeneous ensemble filter, with special emphasis on their scalability and performance traits.

151

References

Journal Article•10.1109/5.726791

Gradient-based learning applied to document recognition

Yann LeCun, +6 more

- 01 Jan 1998

TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.

53.5K

Journal Article•10.1023/A:1022627411411

Support-Vector Networks

Corinna Cortes, +1 more

- 15 Sep 1995

- Machine Learning

TL;DR: The high generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated, and the performance of the support-vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.

42K

Gradient-based learning applied to document recognition

Yann LeCun, +7 more

- 01 Jan 2001

TL;DR: This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task; convolutional neural networks are shown to outperform all other techniques.

32.7K

Journal Article•10.1038/323533A0

Learning representations by back-propagating errors

David E. Rumelhart, +2 more

- 01 Jan 1988

- Nature

TL;DR: Back-propagation repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector, which helps to represent important features of the task domain.

30.1K

Journal Article•10.1613/JAIR.953

SMOTE: synthetic minority over-sampling technique

Nitesh V. Chawla, +3 more

- 01 Jan 2002

- Journal of Artificial Intelligence Resea...

TL;DR: This article proposes over-sampling the minority class by creating synthetic minority class examples; the method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.

27.7K
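
Illustrative sketch: the SMOTE summary above describes creating synthetic minority class examples. One common way to picture this is interpolation between a minority sample and one of its nearest minority neighbours. The code below is a minimal sketch under that assumption, not the authors' implementation; the function name `smote_sketch`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric tuples (minority-class samples only).
    k: number of nearest minority neighbours considered (illustrative default).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical toy minority class in two dimensions.
minority_points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority_points, n_new=5)
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class rather than duplicating existing samples exactly.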