Proceedings Article, DOI: 10.1145/347090.347110
Active learning using adaptive resampling
Vijay S. Iyengar,Chidanand Apte,Tong Zhang +2 more
- 01 Aug 2000
- pp 91-98
TL;DR: An active learning method is presented that uses adaptive resampling in a natural way to significantly reduce the size of the required labeled set and generates a classification model that achieves the high accuracies possible with current adaptive resampling methods.
Abstract: Classification modeling (a.k.a. supervised learning) is an extremely useful analytical technique for developing predictive and forecasting applications. The explosive growth in data warehousing and internet usage has made large amounts of data potentially available for developing classification models. For example, natural language text is widely available in many forms (e.g., electronic mail, news articles, reports, and web page contents). Categorization of data is a common activity which can be automated to a large extent using supervised learning methods. Examples of this include routing of electronic mail, satellite image classification, and character recognition. However, these tasks require labeled data sets of sufficiently high quality with adequate instances for training the predictive models. Much of the on-line data, particularly the unstructured variety (e.g., text), is unlabeled. Labeling is usually an expensive manual process done by domain experts. Active learning is an approach to solving this problem: it identifies a subset of the data that needs to be labeled and uses this subset to generate classification models. We present an active learning method that uses adaptive resampling in a natural way to significantly reduce the size of the required labeled set and generates a classification model that achieves the high accuracies possible with current adaptive resampling methods.
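The general idea described in the abstract, labeling only the examples a resampled set of models finds hardest, can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give its details, so the sketch substitutes plain bootstrap resampling and a committee of one-dimensional threshold classifiers for the adaptive resampling method. All names (`train_stump`, `committee`, `disagreement`, `active_learn`), the toy pool, and the oracle are hypothetical and chosen only for illustration.

```python
import random

def train_stump(sample):
    """Fit a 1-D threshold classifier: predict 1 when x > threshold."""
    xs = sorted(x for x, _ in sample)
    # Candidate thresholds: below all points, midpoints of neighbours, above all points.
    candidates = [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1.0]

    def errors(t):
        return sum((x > t) != bool(y) for x, y in sample)

    return min(candidates, key=errors)

def committee(labeled, size, rng):
    """Train `size` stumps, each on a bootstrap resample of the labeled set."""
    return [train_stump([rng.choice(labeled) for _ in labeled]) for _ in range(size)]

def disagreement(x, thresholds):
    """0.0 when the committee is unanimous, 0.5 when it is evenly split."""
    p = sum(x > t for t in thresholds) / len(thresholds)
    return min(p, 1 - p)

def active_learn(pool, oracle, seed_points, rounds, size=15, seed=0):
    """Query one label per round: the pool point the committee disagrees on most."""
    rng = random.Random(seed)
    labeled = [(x, oracle(x)) for x in seed_points]
    unlabeled = [x for x in pool if x not in seed_points]
    for _ in range(rounds):
        members = committee(labeled, size, rng)
        query = max(unlabeled, key=lambda x: disagreement(x, members))
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))  # the only place labels are paid for
    return labeled, train_stump(labeled)

# Toy setup (assumed): 101 points in [0, 1]; the "expert" labels x > 0.5 as class 1.
pool = [i / 100 for i in range(101)]
oracle = lambda x: int(x > 0.5)
labeled, threshold = active_learn(pool, oracle, seed_points=[0.0, 1.0], rounds=10)
print(len(labeled), "labels used out of", len(pool))
```

In this sketch the oracle is called only for the two seed points and the ten queried points, so the final classifier is trained on 12 labels rather than the full pool of 101. How close it gets to the true boundary depends on the committee and the random seed.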
Citations
Active Learning Literature Survey
Burr Settles
- 01 Jan 2009
TL;DR: This report provides a general introduction to active learning and a survey of the literature, including a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date.
6.7K
Interactive deduplication using active learning
Sunita Sarawagi,Anuradha Bhamidipaty +1 more
- 23 Jul 2002
TL;DR: This work presents the design of a learning-based deduplication system that uses a novel method of interactively discovering challenging training pairs using active learning and investigates various design issues that arise in building a system to provide interactive response, fast convergence, and interpretable output.
A machine learning approach to sentiment analysis in multilingual Web texts
Erik Boiy,Marie-Francine Moens +1 more
TL;DR: This paper presents machine learning experiments with regard to sentiment analysis in blog, review and forum texts found on the World Wide Web and written in English, Dutch and French and investigates the role of active learning techniques for reducing the number of examples to be manually annotated.
Machine Learning Methods for Property Prediction in Chemoinformatics: Quo Vadis?
TL;DR: This review focuses on new approaches and concepts that may provide efficient solutions to common problems in chemoinformatics: improving the predictive performance of structure-property (activity) models, generating structures that possess desirable properties, defining a model's applicability domain, modeling properties with functional endpoints, and accounting for multiple molecular species.
233
Active Sampling for Class Probability Estimation and Ranking
Maytal Saar-Tsechansky,Foster Provost +1 more
- 01 Feb 2004
TL;DR: In this paper, the authors present a sampling-based active learning method for estimating class probabilities and class-based rankings, which uses weighted sampling to account for a potential example's informative value for the rest of the input space.
References
•Proceedings Article
Experiments with a new boosting algorithm
Yoav Freund,Robert E. Schapire +1 more
- 03 Jul 1996
TL;DR: This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.
Additive Logistic Regression: A Statistical View of Boosting
TL;DR: This work shows that this seemingly mysterious phenomenon of boosting can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood, and develops more direct approximations and shows that they exhibit nearly identical results to boosting.
•Proceedings Article
A Comparative Study on Feature Selection in Text Categorization
Yiming Yang,Jan O. Pedersen +1 more
- 08 Jul 1997
TL;DR: This paper finds strong correlations between the DF, IG, and CHI values of a term and suggests that DF thresholding, the simplest method with the lowest cost in computation, can be reliably used instead of IG or CHI when the computation of these measures is too expensive.
5.6K
Related Papers (5)
[...]
H. S. Seung,Manfred Opper,Haim Sompolinsky +2 more
- 01 Jul 1992
David D. Lewis,William A. Gale +1 more
- 01 Aug 1994
David D. Lewis,Jason A. Catlett +1 more
- 10 Jul 1994