MWMOTE--Majority Weighted Minority Oversampling Technique for Imbalanced Data Set Learning

Journal Article•doi:10.1109/TKDE.2012.232

Sukarna Barua, +3 more

- 01 Feb 2014

- IEEE Transactions on Knowledge and Data ...

- Vol. 26, Iss: 2, pp 405-425

1.1K

TL;DR: A new method, called Majority Weighted Minority Oversampling TEchnique (MWMOTE), is presented for efficiently handling imbalanced learning problems and is better than or comparable with some other existing methods in terms of various assessment metrics.

Abstract: Imbalanced learning problems contain an unequal distribution of data samples among different classes and pose a challenge to any classifier, because the minority class samples become hard to learn. Synthetic oversampling methods address this problem by generating synthetic minority class samples to balance the distribution between the majority and minority classes. This paper identifies that most existing oversampling methods may generate wrong synthetic minority samples in some scenarios and so make learning tasks harder. To this end, a new method, called Majority Weighted Minority Oversampling TEchnique (MWMOTE), is presented for efficiently handling imbalanced learning problems. MWMOTE first identifies the hard-to-learn informative minority class samples and assigns them weights according to their Euclidean distance from the nearest majority class samples. It then generates the synthetic samples from the weighted informative minority class samples using a clustering approach. This is done in such a way that all the generated samples lie inside some minority class cluster. MWMOTE has been evaluated extensively on four artificial and 20 real-world data sets. The simulation results show that the method is better than or comparable with some other existing methods in terms of various assessment metrics, such as geometric mean (G-mean) and area under the receiver operating characteristic (ROC) curve, usually known as area under curve (AUC).
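The two steps named in the abstract (weighting minority samples by their distance to the nearest majority sample, then generating synthetic samples inside a minority cluster) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `weighted_cluster_oversample`, the inverse-distance weighting, and the assumption that cluster labels are already available are all hypothetical choices made for illustration, and the step that filters for "informative" minority samples is skipped entirely.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_cluster_oversample(minority, majority, clusters, n_new, seed=0):
    """Illustrative sketch only, not the published MWMOTE procedure.

    minority: list of minority-class points (tuples of floats)
    majority: list of majority-class points
    clusters: one cluster id per minority point (assumed precomputed)
    n_new:    number of synthetic points to generate
    """
    rng = random.Random(seed)

    # Assumption: a minority point closer to the majority class is harder
    # to learn, so it receives a larger selection weight (inverse distance).
    weights = []
    for p in minority:
        nearest = min(euclidean(p, q) for q in majority)
        weights.append(1.0 / (nearest + 1e-12))

    indices = range(len(minority))
    synthetic = []
    for _ in range(n_new):
        # Pick a seed point with probability proportional to its weight.
        i = rng.choices(indices, weights=weights, k=1)[0]
        # Pick a partner from the same cluster, so the new point lies on a
        # segment between two members of one minority cluster.
        mates = [j for j, c in enumerate(clusters) if c == clusters[i]]
        j = rng.choice(mates)
        alpha = rng.random()
        synthetic.append(
            tuple(a + alpha * (b - a) for a, b in zip(minority[i], minority[j]))
        )
    return synthetic
```

Restricting the partner to the seed point's own cluster is what keeps each synthetic point between two members of a single minority cluster, rather than between distant minority regions where it might land among majority samples.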


Citations

Journal Article•10.1109/tii.2021.3112988

An Imbalance Modified Convolutional Neural Network With Incremental Learning for Chemical Fault Diagnosis

01 Jun 2022

- IEEE Transactions on Industrial Informat...

TL;DR: Wang et al. proposed an incremental imbalance modified convolutional neural network (IMM-CNN) for data-driven fault diagnosis in chemical processes.

12

•Journal Article•10.1016/J.ESWA.2015.10.001

Associative learning on imbalanced environments

L. Cleofas-Sánchez, +3 more

- 15 Jul 2016

- Expert Systems With Applications

TL;DR: A large-scale experimental evaluation with 31 databases, seven classification models and four resampling algorithms is carried out, along with a non-parametric statistical test to discover any significant differences between each pair of classifiers.

12

•Journal Article•10.1109/access.2022.3218463

A Hybrid Sampling Approach for Imbalanced Binary and Multi-Class Data Using Clustering Analysis

01 Jan 2022

- IEEE Access

TL;DR: CBHSID uses the calculated mean as a threshold value to segregate majority and minority classes and, during under-sampling, removes observations that lie far from the center of their sub-cluster.

12

Proceedings Article•10.1109/ICARCV.2014.7064454

Multi-exemplar based clustering for imbalanced data

Yangtao Wang, +1 more

- 01 Dec 2014

TL;DR: This paper proposes a new approach for imbalanced data, multi-exemplar merging clustering (MEMC), which consists of two processing stages: a multiple-exemplar identification stage and an exemplar merging stage.

12

Journal Article•10.1007/s11831-023-10059-2

A Systematic Literature Review on Swarm Intelligence Based Intrusion Detection System: Past, Present and Future

Dukka Karun Kumar Reddy, +5 more

- 01 Mar 2024

- Archives of Computational Methods in Eng...

12

References

•Journal Article•10.1613/JAIR.953

SMOTE: synthetic minority over-sampling technique

Nitesh V. Chawla, +3 more

- 01 Jan 2002

- Journal of Artificial Intelligence Resea...

TL;DR: This article proposes over-sampling the minority class by creating synthetic minority class examples; the method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.

27.7K

•Book

C4.5: Programs for Machine Learning

J. Ross Quinlan

- 15 Oct 1992

TL;DR: A complete guide to the C4.5 system as implemented in C for the UNIX environment, which starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and overfitting.

27.2K

•Journal Article•10.1016/J.PATREC.2005.10.010

An introduction to ROC analysis

Tom Fawcett

- 01 Jun 2006

- Pattern Recognition Letters

TL;DR: The purpose of this article is to serve as an introduction to ROC graphs and as a guide for using them in research.

21.3K

•Journal Article•10.1023/A:1022643204877

Induction of Decision Trees

J. R. Quinlan

- 25 Mar 1986

- Machine Learning

TL;DR: This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems and describes one such system, ID3, in detail; a reported shortcoming of the basic algorithm is also discussed.

18.8K

•Journal Article•10.1006/JCSS.1997.1504

A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting

Yoav Freund, +1 more

- 01 Aug 1997

TL;DR: The model studied can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting, and it is shown that the multiplicative weight-update Littlestone-Warmuth rule can be adapted to this model, yielding bounds that are slightly weaker in some cases, but applicable to a considerably more general class of learning problems.

18.6K