Open Access • Posted Content
A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios
TL;DR: This paper surveys methods for low-resource natural language processing, covering data augmentation and distant supervision, which create additional labeled data, and transfer learning settings that reduce the need for target supervision.
Abstract: Deep neural networks and huge language models are becoming omnipresent in natural language applications. As they are known for requiring large amounts of training data, there is a growing body of work to improve the performance in low-resource settings. Motivated by the recent fundamental changes towards neural models and the popular pre-train and fine-tune paradigm, we survey promising approaches for low-resource natural language processing. After a discussion about the different dimensions of data availability, we give a structured overview of methods that enable learning when training data is sparse. This includes mechanisms to create additional labeled data like data augmentation and distant supervision as well as transfer learning settings that reduce the need for target supervision. A goal of our survey is to explain how these methods differ in their requirements as understanding them is essential for choosing a technique suited for a specific low-resource setting. Further key aspects of this work are to highlight open issues and to outline promising directions for future research.
Citations
A Sentiment Analysis Approach to Predict an Individual's Awareness of the Precautionary Procedures to Prevent COVID-19 Outbreaks in Saudi Arabia.
Sumayh S. Aljameel, Dina A. Alabbad, Norah A. Alzahrani, Shouq M. Alqarni, Fatimah A. Alamoudi, Lana M. Babili, Somiah K. Aljaafary, Fatima M. Alshamrani
TL;DR: A model that predicts an individual’s awareness of the precautionary procedures in five main regions in Saudi Arabia can support the medical sectors and decision-makers to decide the appropriate procedures for each region based on their attitudes towards the pandemic.
A Survey on Recent Named Entity Recognition and Relationship Extraction Techniques on Clinical Texts
Priyankar Bose, Sriram Srinivasan, William C. Sleeman, Jatinder R. Palta, Rishabh Kapoor, Preetam Ghosh
TL;DR: This comprehensive survey on clinical NER and RE encompasses current challenges, state-of-the-art practices, and future directions in information extraction from clinical text.
•Posted Content
Domain Adaptation and Multi-Domain Adaptation for Neural Machine Translation: A Survey.
TL;DR: The authors focus on domain adaptation for NMT, particularly the case where a system may need to translate sentences from multiple domains, and divide techniques into those relating to data selection, model architecture, parameter adaptation procedure, and inference procedure.
•Posted Content
An Empirical Survey of Data Augmentation for Limited Data Learning in NLP
TL;DR: The authors provided an empirical survey of recent progress on data augmentation for NLP in the limited labeled data setting, summarizing the landscape of methods and carrying out experiments on 11 datasets covering topics/news classification, inference tasks, paraphrasing tasks, and single-sentence tasks.
Domain Adaptation and Multi-Domain Adaptation for Neural Machine Translation: A Survey
TL;DR: The authors survey approaches to domain adaptation for NMT, particularly where a system may need to translate across multiple domains, and highlight the benefits of domain adaptation and multidomain adaptation techniques to other lines of NMT research.
References
On the Importance of Subword Information for Morphological Tasks in Truly Low-Resource Languages.
Yi Zhu, Benjamin Heinzerling, Ivan Vulić, Michael Strube, Roi Reichart, Anna Korhonen
- 01 Nov 2019
TL;DR: The authors provided a comprehensive analysis focused on the usefulness of subwords for word representation learning in truly low-resource scenarios and for three representative morphological tasks: fine-grained entity typing, morphological tagging, and named entity recognition.
Leveraging 2-hop Distant Supervision from Table Entity Pairs for Relation Extraction
Xiang Deng, Huan Sun
TL;DR: A new strategy named 2-hop DS is proposed to enhance distantly supervised RE, based on the observation that a large number of relational tables on the Web contain entity pairs that share common relations.
Denoising Multi-Source Weak Supervision for Neural Text Classification
Wendi Ren, Yinghao Li, Hanting Su, David Kartchner, Cassie S. Mitchell, Chao Zhang
- 01 Nov 2020
TL;DR: The authors propose a label denoiser that estimates source reliability using a conditional soft attention mechanism and then reduces label noise by aggregating rule-annotated weak labels.
Learning Structured Representations of Entity Names using Active Learning and Weak Supervision
Kun Qian, Poornima Chozhiyath Raman, Yunyao Li, Lucian Popa
- 01 Nov 2020
TL;DR: This paper presents a novel learning framework that combines active learning and weak supervision to learn implicit structured representations of entity names without relying on context or external knowledge.
Viability of Neural Networks for Core Technologies for Resource-Scarce Languages
TL;DR: The neural network models evaluated perform better than the baselines for compound analysis, are viable and comparable to the baseline on most languages for POS tagging and NER, and are viable, but not on par with the baseline, for Afrikaans lemmatization.
Related Papers (5)
Jing Shao, Siyu Chen, Yangguang Li, Kun Wang, Zhenfei Yin, Yinan He, Teng Jianing, Qinghong Sun, Mengya Gao, Jihao Liu, Huang Gengshi, Guanglu Song, Yichao Wu, Yuming Huang, Fenggang Liu, Huan Peng, Shuo Qin, Chengyu Wang, Yujie Wang, Conghui He, Ding Liang, Yu Liu, Fengwei Yu, Junjie Yan, Dahua Lin, Xiaogang Wang, Yu Qiao