Journal Article
DOI: 10.3233/IDA-2003-7305
Iterative cross-training: An algorithm for web page categorization
Nuanwan Soonthornphisaj,Boonserm Kijsirikul +1 more
- 01 Aug 2003
- Vol. 7, Iss: 3, pp 233-253
TL;DR: This paper proposes a novel approach called Iterative Cross-Training (ICT) for the Web page categorization task. ICT is evaluated against supervised learning algorithms, Co-Training, and Expectation Maximization, and is found to be effective.
Abstract: The goal of Web page categorization is to classify Web documents into a fixed number of predefined categories. Previous work in this area relied on a large number of labeled training documents for supervised learning. The problem is that labeled training documents are difficult to create: manually categorizing documents to build training data is laborious, whereas unlabeled documents are easy to collect. We therefore investigate a new machine learning algorithm that overcomes this difficulty by making effective use of unlabeled documents. We propose a novel approach called Iterative Cross-Training (ICT) and apply it to Web page categorization on three data sets. The performance of ICT is evaluated and compared with supervised learning algorithms, Co-Training, and Expectation Maximization. We find that ICT is an effective approach for the Web page categorization task.
Citations
Distributed Approach to Web Page Categorization Using Map-Reduce Programming Model
P. Malarvizhi,Ramachandra V. Pujeri +1 more
- 01 Jan 2012
TL;DR: The experimental results show that the proposed parallel web page categorization approach achieves satisfactory results in finding the right category for any given web page.
3
Experiments on the Use of Machine Learning Classification Methods in Online Crime Text Filtering and Classification
Abstract: With the exponential growth of textual information available from the Internet, there has been an emergent need to find relevant and in-time knowledge about crimes from crime types terms selection and SVM classifier exhibits the best performance on classifying crime documents into their appropriate crime types.
2
Semi-supervised online structure learning for composite event recognition
Evangelos Michelioudakis,Alexander Artikis,Georgios Paliouras +4 more
TL;DR: This work presents a novel approach for completing the supervision of a semi-supervised structure learning task that incorporates graph-cut minimisation, a technique that derives labels for unlabelled data, based on their distance to their labelled counterparts, and employs a suitable structural distance for measuring the distance between sets of logical atoms.
References
•Book
The Nature of Statistical Learning Theory
Vladimir Vapnik
- 01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
46K
•Book
Data Mining: Practical Machine Learning Tools and Techniques
Ian H. Witten,Eibe Frank,Mark Hall +2 more
- 25 Oct 1999
TL;DR: This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.
25.4K
Bagging predictors
Leo Breiman
- 01 Aug 1996
TL;DR: Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.
Text Categorization with Support Vector Machines: Learning with Many Relevant Features
Thorsten Joachims
- 21 Apr 1998
TL;DR: This paper explores the use of Support Vector Machines for learning text classifiers from examples and analyzes the particular properties of learning with text data and identifies why SVMs are appropriate for this task.