An Analysis of Active Learning Strategies for Sequence Labeling Tasks
Burr Settles, Mark Craven
- 25 Oct 2008
- pp 1070-1079
TL;DR: This paper surveys previously used query selection strategies for sequence models, proposes several novel algorithms to address their shortcomings, and conducts a large-scale empirical comparison.
Abstract: Active learning is well-suited to many problems in natural language processing, where unlabeled data may be abundant but annotation is slow and expensive. This paper aims to shed light on the best active learning approaches for sequence labeling tasks such as information extraction and document segmentation. We survey previously used query selection strategies for sequence models, and propose several novel algorithms to address their shortcomings. We also conduct a large-scale empirical comparison using multiple corpora, which demonstrates that our proposed methods advance the state of the art.
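To make "query selection strategy" concrete, here is a minimal sketch of two widely used uncertainty-based scores for sequence models, least confidence and mean token entropy. It is illustrative only and is not taken from the paper: the function names, the toy pool, and the assumption that per-token posterior marginals are already available (for example, from a CRF's forward-backward pass) are all hypothetical.

```python
import math


def least_confidence(best_path_prob):
    """Uncertainty as 1 minus the probability of the most likely labeling."""
    return 1.0 - best_path_prob


def token_entropy(marginals):
    """Mean per-token entropy of the label marginals for one sequence.

    `marginals` is a list with one probability distribution per token.
    """
    total = 0.0
    for dist in marginals:
        total -= sum(p * math.log(p) for p in dist if p > 0.0)
    return total / len(marginals)


def select_queries(pool, score_fn, batch_size):
    """Return the ids of the `batch_size` highest-scoring sequences.

    `pool` is a list of (sequence_id, model_output) pairs, where
    model_output is whatever `score_fn` expects (hypothetical layout).
    """
    ranked = sorted(pool, key=lambda item: score_fn(item[1]), reverse=True)
    return [seq_id for seq_id, _ in ranked[:batch_size]]


# Toy pool: two unlabeled sequences with made-up token marginals.
pool = [
    ("confident", [[0.9, 0.1], [0.95, 0.05]]),
    ("uncertain", [[0.5, 0.5], [0.6, 0.4]]),
]
queries = select_queries(pool, token_entropy, batch_size=1)
```

In an active learning loop, the selected sequences would be sent to an annotator, added to the training set, and the model retrained before the next round of scoring.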
Citations
Active Learning Literature Survey
Burr Settles
- 01 Jan 2009
TL;DR: This report provides a general introduction to active learning and a survey of the literature, including a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date.
6.7K
A Survey on Data Collection for Machine Learning: A Big Data - AI Integration Perspective
TL;DR: This survey performs a comprehensive study of data collection from a data management point of view, providing a research landscape of these operations and guidelines on which technique to use when, and identifying interesting research challenges.
858
Learning Loss for Active Learning
Donggeun Yoo, In So Kweon
- 15 Jun 2019
TL;DR: The authors propose a novel active learning method that is simple but task-agnostic and works efficiently with deep networks: a small parametric module, named the "loss prediction module", is attached to a target network and trained to predict the target losses of unlabeled inputs.
•Book
Data Mining: The Textbook
Charu C. Aggarwal
- 27 Apr 2015
TL;DR: This textbook explores the different aspects of data mining from the fundamentals to the complex data types and their applications, capturing the wide diversity of problem domains for data mining issues.
786
Knockoff Nets: Stealing Functionality of Black-Box Models
Tribhuvanesh Orekondy, Bernt Schiele, Mario Fritz
- 18 Jun 2019
TL;DR: This work formulates model functionality stealing as a two-step approach: querying a black-box model with a set of input images to obtain predictions, and training a "knockoff" on the queried image-prediction pairs.
References
Reducing labeling effort for structured prediction tasks
Aron Culotta, Andrew McCallum
- 09 Jul 2005
TL;DR: A new active learning paradigm is proposed which reduces not only how many instances the annotator must label, but also how difficult each instance is to annotate, which can vary widely in structured prediction tasks.
•Proceedings Article
Query Learning Strategies Using Boosting and Bagging
Naoki Abe, Hiroshi Mamitsuka
- 24 Jul 1998
465
BioCreAtIvE Task 1A: gene mention finding evaluation
TL;DR: The 80% plus F-measure results are good, but still somewhat lag the best scores achieved in some other domains such as newswire, due in part to the complexity and length of gene names, compared to person or organization names in newswire.
Sample Selection for Statistical Parsing
TL;DR: It is found that sample selection can significantly reduce the size of annotated training corpora and that uncertainty is a robust predictive criterion that can be easily applied to different learning models.
MMR-based Active Machine Learning for Bio Named Entity Recognition
Seokhwan Kim, Yu Song, Kyungduk Kim, Jeong-Won Cha, Gary Geunbae Lee
- 04 Jun 2006
TL;DR: A new active learning paradigm is presented that considers not only the uncertainty of the classifier but also the diversity of the corpus, incorporating the MMR-based active learning idea into a biomedical named-entity recognition system.