Proceedings Article, DOI: 10.1109/ICIECS.2009.5366100
Class Selection Based Iterative Supervised Latent Semantic Indexing for Text Categorization
Ming-Bo Wang, Cheng-Lin Liu +1 more
- 28 Dec 2009
- pp 1-4
TL;DR: This paper proposes an iterative SLSI framework based on class selection, together with a method that selects one class at each iteration using a simple classifier and computes the main bias vector of that class only.
Abstract: Latent Semantic Indexing (LSI) is an effective technique for feature extraction in text mining, and supervised LSI (SLSI) algorithms have been proposed to exploit the class labels of training data. In this paper, we propose an iterative SLSI framework based on class selection. We show that a previous iterative SLSI algorithm is an instance of the framework. We also propose a method under our framework, which selects a class at each iteration using a simple classifier and computes the main bias vector of one class only. Our experiments demonstrate that the proposed method both improves the classification accuracy and reduces the computation cost. Keywords-Supervised Latent Semantic Indexing; Text
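As background for the abstract, the following is a minimal sketch of plain (unsupervised) LSI used as feature extraction. It is not the paper's SLSI method: the class-selection step and the bias-vector computation are not specified here and are not reproduced. The function names, the toy document-term matrix, and the use of power iteration in place of a full SVD are all assumptions made for illustration.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k_right_singular_vectors(X, k, iters=200, seed=0):
    """Approximate the top-k right singular vectors of X (documents x terms).

    Uses power iteration on X^T X, removing previously found directions at
    every step. Assumes k does not exceed the rank of X.
    """
    rng = random.Random(seed)
    n_docs, n_terms = len(X), len(X[0])
    basis = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(n_terms)]
        for _ in range(iters):
            # w = X^T (X v)
            Xv = [dot(row, v) for row in X]
            w = [sum(X[i][j] * Xv[i] for i in range(n_docs))
                 for j in range(n_terms)]
            # Deflate: keep w orthogonal to the directions already found.
            for b in basis:
                c = dot(w, b)
                w = [wi - c * bi for wi, bi in zip(w, b)]
            norm = math.sqrt(dot(w, w))
            v = [wi / norm for wi in w]
        basis.append(v)
    return basis


def lsi_project(X, basis):
    """Map each document (row of X) to its coordinates in the latent space."""
    return [[dot(row, b) for b in basis] for row in X]


# Hypothetical toy corpus: documents 0-1 use terms 0-1, documents 2-3 use terms 2-4.
X = [
    [2, 1, 0, 0, 0],
    [1, 2, 0, 0, 0],
    [0, 0, 1, 2, 1],
    [0, 0, 2, 1, 1],
]
basis = top_k_right_singular_vectors(X, k=2)
Z = lsi_project(X, basis)
```

In this toy case, documents that share vocabulary receive the same two-dimensional coordinates, while documents from the two groups land on orthogonal axes. A supervised variant would additionally use class labels when choosing the projection directions.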
References
Indexing by Latent Semantic Analysis
TL;DR: A new method for automatic indexing and retrieval to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries.
•Book
Introduction to Modern Information Retrieval
Gerard Salton, Michael J. McGill
- 01 Jan 1983
Machine learning in automated text categorization
TL;DR: This survey discusses the main approaches to text categorization that fall within the machine learning paradigm and discusses in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.
•Proceedings Article
A Comparative Study on Feature Selection in Text Categorization
Yiming Yang, Jan O. Pedersen
- 08 Jul 1997
TL;DR: This paper finds strong correlations between the DF, IG, and CHI values of a term, and suggests that DF thresholding, the simplest method with the lowest computational cost, can be reliably used instead of IG or CHI when computing those measures is too expensive.
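The DF thresholding mentioned in the last entry can be sketched as follows. This is a minimal illustration, assuming documents are already tokenized into lists of terms; the function names, the `min_df` parameter, and the toy corpus are hypothetical and do not come from the cited paper.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # set(): a term counts once per document
    return df


def df_threshold(docs, min_df):
    """Keep only terms that appear in at least min_df documents."""
    df = document_frequency(docs)
    return sorted(term for term, count in df.items() if count >= min_df)


# Hypothetical toy corpus of tokenized documents.
docs = [
    ["latent", "semantic", "indexing"],
    ["semantic", "text", "categorization"],
    ["text", "semantic", "text"],
]
vocab = df_threshold(docs, min_df=2)
```

Here `vocab` keeps only `semantic` and `text`, the two terms that occur in at least two documents. Repeated occurrences inside one document do not raise a term's document frequency.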