Active learning using adaptive resampling

doi:10.1145/347090.347110

Proceedings Article•10.1145/347090.347110

Vijay S. Iyengar, +2 more

- 01 Aug 2000

- pp 91-98

112

TL;DR: An active learning method is presented that uses adaptive resampling in a natural way to significantly reduce the size of the required labeled set and generates a classification model that achieves the high accuracies possible with current adaptive resampling methods.

Abstract: Classification modeling (a.k.a. supervised learning) is an extremely useful analytical technique for developing predictive and forecasting applications. The explosive growth in data warehousing and internet usage has made large amounts of data potentially available for developing classification models. For example, natural language text is widely available in many forms (e.g., electronic mail, news articles, reports, and web page contents). Categorization of data is a common activity which can be automated to a large extent using supervised learning methods. Examples of this include routing of electronic mail, satellite image classification, and character recognition. However, these tasks require labeled data sets of sufficiently high quality with adequate instances for training the predictive models. Much of the on-line data, particularly the unstructured variety (e.g., text), is unlabeled. Labeling is usually an expensive manual process done by domain experts. Active learning is an approach to solving this problem: it identifies a subset of the data that needs to be labeled and uses this subset to generate classification models. We present an active learning method that uses adaptive resampling in a natural way to significantly reduce the size of the required labeled set and generates a classification model that achieves the high accuracies possible with current adaptive resampling methods.
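The abstract does not spell out the selection procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a pool-based setting, uses AdaBoost-style boosting by resampling with one-dimensional decision stumps as a stand-in for "adaptive resampling", and queries the unlabeled item on which the boosted committee's weighted vote is most evenly split. All names (`fit_stump`, `boost_by_resampling`, `margin`, `active_learn`), the toy pool, and the interval-shaped `oracle` are hypothetical and chosen for illustration.

```python
import math
import random

def stump_predict(stump, x):
    """A stump is (threshold, polarity): predict polarity at or above the threshold."""
    threshold, polarity = stump
    return polarity if x >= threshold else -polarity

def fit_stump(xs, ys):
    """Return the stump with the fewest errors on an unweighted sample."""
    best, best_err = None, None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(stump_predict((threshold, polarity), x) != y
                      for x, y in zip(xs, ys))
            if best_err is None or err < best_err:
                best, best_err = (threshold, polarity), err
    return best

def boost_by_resampling(xs, ys, rounds, rng):
    """AdaBoost-style loop: each round trains on a sample drawn by example weight."""
    n = len(xs)
    weights = [1.0 / n] * n
    committee = []
    for _ in range(rounds):
        sample = rng.choices(range(n), weights=weights, k=n)
        stump = fit_stump([xs[i] for i in sample], [ys[i] for i in sample])
        wrong = [stump_predict(stump, xs[i]) != ys[i] for i in range(n)]
        err = sum(w for w, bad in zip(weights, wrong) if bad)
        if err >= 0.5:
            weights = [1.0 / n] * n  # weak learner failed: restart from uniform
            continue
        err = max(err, 1e-6)  # avoid log of infinity when the stump is perfect
        alpha = 0.5 * math.log((1 - err) / err)
        committee.append((alpha, stump))
        # Raise the weight of misclassified examples, lower the rest, renormalize.
        weights = [w * math.exp(alpha if bad else -alpha)
                   for w, bad in zip(weights, wrong)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return committee

def margin(committee, x):
    """Normalized weighted vote in [-1, 1]; near 0 means the committee is split."""
    total = sum(alpha for alpha, _ in committee)
    if total == 0:
        return 0.0  # empty committee: treat every item as maximally uncertain
    return sum(alpha * stump_predict(s, x) for alpha, s in committee) / total

def active_learn(pool, oracle, seed_indices, budget, rounds=15, seed=0):
    """Repeatedly retrain, then ask the oracle about the most contested pool item."""
    rng = random.Random(seed)
    labels = {i: oracle(pool[i]) for i in seed_indices}
    for _ in range(budget):
        idx = sorted(labels)
        committee = boost_by_resampling(
            [pool[i] for i in idx], [labels[i] for i in idx], rounds, rng)
        unlabeled = [i for i in range(len(pool)) if i not in labels]
        if not unlabeled:
            break
        query = min(unlabeled, key=lambda i: abs(margin(committee, pool[i])))
        labels[query] = oracle(pool[query])
    idx = sorted(labels)
    final = boost_by_resampling(
        [pool[i] for i in idx], [labels[i] for i in idx], rounds, rng)
    return final, labels

# Toy problem (hypothetical): the positive class is an interval on [0, 1].
pool = [i / 99 for i in range(100)]
oracle = lambda x: 1 if 0.3 <= x < 0.7 else -1
committee, labels = active_learn(pool, oracle, seed_indices=[0, 50, 99], budget=12)
print(len(labels), "of", len(pool), "pool items were labeled")
```

The design choice to illustrate is that the same resampled committee serves two purposes: it is the classifier, and its vote margin is the signal for which item to label next. A real text-classification setting would replace the stumps with a stronger base learner and the scalar pool with feature vectors.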

Citations

Active Learning Literature Survey

Burr Settles

- 01 Jan 2009

TL;DR: This report provides a general introduction to active learning and a survey of the literature, including a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date.

6.7K

Proceedings Article•10.1145/775047.775087

Interactive deduplication using active learning

Sunita Sarawagi, +1 more

- 23 Jul 2002

TL;DR: This work presents the design of a learning-based deduplication system that uses a novel method of interactively discovering challenging training pairs using active learning and investigates various design issues that arise in building a system to provide interactive response, fast convergence, and interpretable output.

840

Journal Article•10.1007/S10791-008-9070-Z

A machine learning approach to sentiment analysis in multilingual Web texts

Erik Boiy, +1 more

- 01 Oct 2009

- Information Retrieval

TL;DR: This paper presents machine learning experiments with regard to sentiment analysis in blog, review and forum texts found on the World Wide Web and written in English, Dutch and French and investigates the role of active learning techniques for reducing the number of examples to be manually annotated.

515

Journal Article•10.1021/CI200409X

Machine Learning Methods for Property Prediction in Chemoinformatics: Quo Vadis?

Alexandre Varnek, +2 more

- 25 May 2012

- Journal of Chemical Information and Mode...

TL;DR: This paper focuses on new approaches and concepts that may provide efficient solutions to common problems in chemoinformatics: improving the predictive performance of structure-property (activity) models, generating structures with desirable properties, defining the model applicability domain, modeling properties with functional endpoints, and accounting for multiple molecular species.

233

Journal Article•10.1023/B:MACH.0000011806.12374.C3

Active Sampling for Class Probability Estimation and Ranking

Maytal Saar-Tsechansky, +1 more

- 01 Feb 2004

TL;DR: In this paper, the authors present a sampling-based active learning method for estimating class probabilities and class-based rankings, which uses weighted sampling to account for a potential example's informative value for the rest of the input space.

197

References

UCI Repository of machine learning databases

Catherine Blake

- 01 Jan 1998

14.1K

Proceedings Article

Experiments with a new boosting algorithm

Yoav Freund, +1 more

- 03 Jul 1996

TL;DR: This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.

9.1K

Journal Article•10.1214/AOS/1016218223

Additive Logistic Regression: A Statistical View of Boosting

Jerome H. Friedman, +2 more

- 01 Apr 2000

- Annals of Statistics

TL;DR: This work shows that this seemingly mysterious phenomenon of boosting can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood, and develops more direct approximations and shows that they exhibit nearly identical results to boosting.

7.4K

Experiment with a new boosting algorithm

Y. Freund

- 01 Jan 1996

7.4K

Proceedings Article

A Comparative Study on Feature Selection in Text Categorization

Yiming Yang, +1 more

- 08 Jul 1997

TL;DR: This paper finds strong correlations between the DF, IG, and CHI values of a term and suggests that DF thresholding, the simplest method with the lowest computational cost, can be reliably used instead of IG or CHI when computing these measures is too expensive.

5.6K

Related Papers (5)

Query by committee

A sequential algorithm for training text classifiers

Improving Generalization with Active Learning

Support vector machine active learning with applications to text classification

Heterogeneous uncertainty sampling for supervised learning