An Analysis of Active Learning Strategies for Sequence Labeling Tasks
Burr Settles, Mark Craven
- 25 Oct 2008
- pp 1070-1079
TL;DR: This paper surveys previously used query selection strategies for sequence models, proposes several novel algorithms to address their shortcomings, and conducts a large-scale empirical comparison.
Abstract: Active learning is well-suited to many problems in natural language processing, where unlabeled data may be abundant but annotation is slow and expensive. This paper aims to shed light on the best active learning approaches for sequence labeling tasks such as information extraction and document segmentation. We survey previously used query selection strategies for sequence models, and propose several novel algorithms to address their shortcomings. We also conduct a large-scale empirical comparison using multiple corpora, which demonstrates that our proposed methods advance the state of the art.
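To make "query selection strategy" concrete, here is a minimal sketch of one common uncertainty-based heuristic for sequence models: rank each unlabeled sequence by the mean entropy of its per-token label marginals and send the most uncertain ones to the annotator. This is an illustration under stated assumptions, not necessarily one of the methods the paper evaluates. The names `token_entropy` and `select_queries` and the toy pool are hypothetical, and the marginals are assumed to come from an already trained sequence model such as a CRF.

```python
import math
from typing import Dict, List

# marginals[t][k] = model's probability of label k at token position t.
# Assumption: these come from an already trained sequence model.
Marginals = List[List[float]]


def token_entropy(marginals: Marginals) -> float:
    """Mean per-token entropy of a sequence; higher means more uncertain."""
    if not marginals:
        return 0.0
    total = 0.0
    for dist in marginals:
        # Skip zero-probability labels, whose entropy contribution is zero.
        total -= sum(p * math.log(p) for p in dist if p > 0.0)
    return total / len(marginals)


def select_queries(pool: Dict[str, Marginals], batch_size: int) -> List[str]:
    """Return the ids of the batch_size most uncertain sequences in the pool."""
    ranked = sorted(pool, key=lambda sid: token_entropy(pool[sid]), reverse=True)
    return ranked[:batch_size]


# Toy pool of three two-token sequences over two labels (hypothetical values).
pool = {
    "a": [[0.5, 0.5], [0.5, 0.5]],    # maximally uncertain
    "b": [[0.99, 0.01], [0.9, 0.1]],  # confident
    "c": [[0.6, 0.4], [1.0, 0.0]],    # mixed
}
queries = select_queries(pool, batch_size=2)
```

Dividing by sequence length keeps long sequences from being selected merely because they contain more tokens. Whether that normalization is desirable depends on how annotation cost scales with length.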
Citations
Active Learning Literature Survey
Burr Settles
- 01 Jan 2009
TL;DR: This report provides a general introduction to active learning and a survey of the literature, including a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date.
A Survey on Data Collection for Machine Learning: A Big Data - AI Integration Perspective
TL;DR: This survey performs a comprehensive study of data collection from a data management point of view, providing a research landscape of these operations and guidelines on which technique to use when, and identifying interesting research challenges.
Learning Loss for Active Learning
Donggeun Yoo, In So Kweon
- 15 Jun 2019
TL;DR: The authors propose a novel active learning method that is simple yet task-agnostic and works efficiently with deep networks: a small parametric module, named the "loss prediction module", is attached to a target network and trained to predict the target losses of unlabeled inputs.
•Book
Data Mining: The Textbook
Charu C. Aggarwal
- 27 Apr 2015
TL;DR: This textbook explores the different aspects of data mining from the fundamentals to the complex data types and their applications, capturing the wide diversity of problem domains for data mining issues.
Knockoff Nets: Stealing Functionality of Black-Box Models
Tribhuvanesh Orekondy, Bernt Schiele, Mario Fritz
- 18 Jun 2019
TL;DR: This work formulates model functionality stealing as a two-step approach: querying the black-box model with a set of input images to obtain predictions, and training a "knockoff" on the queried image-prediction pairs.
References
•Proceedings Article
Learning to Extract Signature and Reply Lines from Email.
Vitor R. Carvalho, William W. Cohen
- 01 Jan 2004
TL;DR: Methods for automatically identifying signature blocks and reply lines in plain-text email messages are described, based on applying machine learning methods to a sequential representation of an email message, in which each email is represented as a sequence of lines.
A probability analysis on the value of unlabeled data for classification problems
Tong Zhang, Frank J. Oles
- 01 Jan 2000
TL;DR: It is demonstrated that Fisher information matrices can be used to judge the asymptotic value of unlabeled data, and this methodology is applied to both passive partially supervised learning and active learning.