Open Access • Proceedings Article
Multitask Semi-Supervised Learning for Class-Imbalanced Discourse Classification
Alexander Spangher, Jonathan May, Sz-Rung Shiang, Lingjia Deng
- 01 Nov 2021
- pp 498-517
TL;DR: This article shows that a multitask learning approach can combine discourse datasets from similar and diverse domains to improve discourse classification, yielding a 4.9% Micro F1-score improvement over current state-of-the-art benchmarks on the NewsDiscourse dataset, one of the largest discourse datasets recently published.
Abstract: As labeling schemas evolve over time, small differences can render datasets following older schemas unusable. This prevents researchers from building on top of previous annotation work and results in the existence, in discourse learning in particular, of many small class-imbalanced datasets. In this work, we show that a multitask learning approach can combine discourse datasets from similar and diverse domains to improve discourse classification. We show an improvement of 4.9% Micro F1-score over current state-of-the-art benchmarks on the NewsDiscourse dataset, one of the largest discourse datasets recently published, due in part to label correlations across tasks, which improve performance for underrepresented classes. We also offer an extensive review of additional techniques proposed to address resource-poor problems in NLP, and show that none of these approaches can improve classification accuracy in our setting.
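The abstract reports Micro F1 while stressing underrepresented classes, so it may help to see how micro-averaged F1 behaves under class imbalance. The following is a minimal sketch, not the authors' evaluation code: the function name `f1_scores`, the labels `common` and `rare`, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multiclass predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro


# Hypothetical toy data: 8 "common" items and 2 "rare" items.
y_true = ["common"] * 8 + ["rare"] * 2
y_pred = ["common"] * 10  # a classifier that never predicts the rare class
micro, macro = f1_scores(y_true, y_pred)
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

In this toy case the micro score stays high even though the rare class is never predicted, while the macro score drops sharply. This is one reason gains on underrepresented classes are worth inspecting per class as well as in aggregate.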
Citations
A survey on deep learning tools dealing with data scarcity: definitions, challenges, solutions, tips, and applications
Laith Alzubaidi, Jinshuai Bai, Aiman Al-Sabaawi, José Santamaría, Ahmed Shihab Albahri, Bashar S. Al-Dabbagh, Mohammed A. Fadhel, Mohamed Manoufali, Ali H. Al-Timemy, Ye Duan, Laith Farhan, Yi Lu, Ashish Gupta, Yuantong Gu
TL;DR: In this article, the authors survey state-of-the-art techniques for training DL models under three challenges: small datasets, imbalanced datasets, and lack of generalization.
On Supervised Class-Imbalanced Learning: An Updated Perspective and Some Key Challenges
TL;DR: In this article, the authors provide a comprehensive summary of the rich pool of research works attempting to combat the adversarial effects of class imbalance efficiently and highlight the need for techniques tailored for such a paradigm.
•Posted Content
Sequential Sentence Classification in Research Papers using Cross-Domain Multi-Task Learning.
TL;DR: This paper proposed a uniform deep learning architecture and multi-task learning to improve sequential sentence classification in scientific texts across domains by exploiting training data from multiple domains, which can enhance academic search engines to support researchers in finding and exploring research literature more effectively.
A Survey of Methods for Addressing Class Imbalance in Deep-Learning Based Natural Language Processing
Sophie Henning, William Beluch, Alexander Fraser, Annemarie Friedrich
- 01 Jan 2023
TL;DR: A survey of methods for addressing class imbalance in deep-learning based natural language processing (NLP) tasks. Covers various types of imbalance, approaches based on sampling, data augmentation, loss functions, staged learning, and model design.
Cross-domain multi-task learning for sequential sentence classification in research papers
20 Jun 2022
TL;DR: In this paper, a novel uniform deep learning architecture and multi-task learning for cross-domain sequential sentence classification in scientific texts are proposed. However, the authors do not consider the issue of the different text structure of full papers and abstracts.
References
•Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin
- 12 Jun 2017
TL;DR: This paper proposed a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely and achieved state-of-the-art performance on English-to-French translation.
•Posted Content
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Attention Is All You Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin
- 01 Jan 2017
Abstract: The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
SMOTE: synthetic minority over-sampling technique
TL;DR: In this article, a method of over-sampling the minority class involves creating synthetic minority class examples, which is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
•Posted Content
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Michael Lewis, Luke Zettlemoyer, Veselin Stoyanov
TL;DR: This study finds that BERT was significantly undertrained and can match or exceed the performance of every model published after it; the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.