Sequential targeting: A continual learning approach for data imbalance in text classification

doi:10.1016/J.ESWA.2021.115067

Journal Article

Joel Jang, +3 more

- 01 Oct 2021

- Expert Systems With Applications

- Vol. 179, pp 115067

16

TL;DR: A novel training method, Sequential Targeting (ST), is proposed. Independent of the effectiveness of the representation method, it enforces an incremental learning setting by splitting the data into mutually exclusive subsets and training the learner adaptively.

Abstract: Text classification has numerous use cases, including sentiment analysis, spam detection, document classification, and hate speech detection. In realistic settings, text classification must cope with imbalanced data, where the classes of interest usually make up a small fraction of the examples. Deep neural networks used for text classification, such as recurrent neural networks and transformer networks, lack efficient methods for handling imbalanced data. Traditional data-level methods for mitigating distributional skew include oversampling and undersampling. Oversampling degrades the quality of the original language representation of the sparse minority-class data, whereas undersampling fails to fully utilize the rich context of the majority classes. We address these issues in data-driven approaches by partitioning the training data into mutually exclusive subsets and performing continual learning over them, treating each subset as a distinct task. We demonstrate the effectiveness of our method through experiments on the IMDB dataset and on datasets constructed from real-world data. Compared to the baseline method, the proposed method improves the F1-score by 56.38 percentage points on the IMDB dataset and by 16.89 and 34.76 percentage points on the constructed datasets.
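
The partition-then-train idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the subsets are composed or ordered, so the two-subset split (surplus majority examples first, a balanced subset last) is an assumption, and the names `partition_for_sequential_training`, `train_sequentially`, and `update_fn` are hypothetical. A real continual learning setup would also need some mechanism against forgetting earlier subsets, which is omitted here.

```python
import random
from collections import Counter


def partition_for_sequential_training(labels, minority_label, seed=0):
    """Split example indices into two mutually exclusive subsets.

    Assumption (not taken from the paper): the first subset holds the
    surplus majority examples, and the second subset is balanced, holding
    every minority example plus an equal number of majority examples.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    rng.shuffle(majority)

    n_balanced = min(len(minority), len(majority))
    balanced = minority + majority[:n_balanced]
    rng.shuffle(balanced)
    skewed = majority[n_balanced:]
    return [skewed, balanced]


def train_sequentially(update_fn, state, subsets):
    """Visit the subsets in order, treating each one as a separate task.

    `update_fn(state, indices, task_id)` is a hypothetical hook that runs
    one training phase on the given example indices and returns the new
    learner state.
    """
    for task_id, indices in enumerate(subsets):
        state = update_fn(state, indices, task_id)
    return state


# Toy data: 90 majority examples (label 0) and 10 minority examples (label 1).
labels = [0] * 90 + [1] * 10
subsets = partition_for_sequential_training(labels, minority_label=1)

# Class counts per subset, to confirm the split behaves as described.
counts = [Counter(labels[i] for i in subset) for subset in subsets]

# Stand-in "training" step that only records which task saw how many examples.
history = train_sequentially(
    lambda state, indices, task_id: state + [(task_id, len(indices))],
    [],
    subsets,
)
```

In practice `update_fn` would wrap a training loop for the chosen classifier; the stand-in above only makes the ordering of the phases visible.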

Citations

Posted Content

Learning without Forgetting

Zhizhong Li, +1 more

- 29 Jun 2016

- arXiv: Computer Vision and Pattern Recog...

TL;DR: This work proposes the Learning without Forgetting method, which uses only new task data to train the network while preserving the original capabilities, and performs favorably compared to commonly used feature extraction and fine-tuning adaptation techniques.

2.6K

Journal Article•10.1016/j.aej.2023.08.038

A survey on hate speech detection and sentiment analysis using machine learning and deep learning models

Malliga Subramanian, +4 more

- 01 Oct 2023

- Alexandria Engineering Journal

TL;DR: This survey article provides a comprehensive overview of recent advancements in hate speech detection and sentiment analysis using machine learning and deep learning models, highlighting methodologies, datasets, challenges, and areas for future research to promote a more inclusive online environment.

27

Journal Article•10.1016/j.eswa.2023.119658

Text FCG: Fusing Contextual Information via Graph Learning for text classification

Yizhao Wang, +4 more

- 01 Feb 2023

- Expert Systems With Applications

TL;DR: Wang et al. propose TextFCGNN (Text Contextual Information via Graph Neural Networks), which constructs a single graph for all words in each text and labels the edges by fusing their various contextual relations.

24

Journal Article•10.1155/2022/4725639

Evolving Long Short-Term Memory Network-Based Text Classification

Arjun Singh, +7 more

- 21 Feb 2022

- Computational Intelligence and Neuroscie...

TL;DR: An evolving LSTM (ELSTM) network is proposed that uses a multiobjective genetic algorithm (MOGA) to optimize the architecture and weights of the LSTM.

17

Journal Article•10.1016/j.datak.2024.102306

Effective Text Classification using BERT, MTM LSTM, and DT

Saman Jamshidi, +8 more

- 01 Apr 2024

10

References

Journal Article•10.3156/JSOFT.29.5_177_2

Generative Adversarial Nets

Ian Goodfellow, +7 more

- 08 Dec 2014

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are trained simultaneously: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.

48.6K

Proceedings Article•10.3115/V1/D14-1162

GloVe: Global Vectors for Word Representation

Jeffrey Pennington, +2 more

- 01 Oct 2014

TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.

41.6K

Journal Article•10.1613/JAIR.953

SMOTE: synthetic minority over-sampling technique

Nitesh V. Chawla, +3 more

- 01 Jan 2002

- Journal of Artificial Intelligence Resea...

TL;DR: This article proposes over-sampling the minority class by creating synthetic minority class examples; the method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.

27.7K

Proceedings Article

Efficient Estimation of Word Representations in Vector Space

Tomas Mikolov, +3 more

- 16 Jan 2013

TL;DR: Two novel model architectures for computing continuous vector representations of words from very large data sets are proposed and it is shown that these vectors provide state-of-the-art performance on the authors' test set for measuring syntactic and semantic word similarities.

27.5K

Proceedings Article•10.18653/V1/N19-1423

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, +3 more

- 11 Oct 2018

TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

24.6K