Book Chapter, DOI: 10.1007/978-81-322-1602-5_75
Text Classification Using Machine Learning Methods-A Survey
Basant Agarwal, Namita Mittal +1 more
- 01 Jan 2014
- pp 701-709
90
TL;DR: This paper presents various text classification approaches using machine learning techniques, and feature selection techniques for reducing the high-dimensional feature vector.
Abstract: Text classification is used to organize documents into a predefined set of classes. It is very useful in Web content management, search engines, email filtering, etc. Text classification is a difficult task because the high-dimensional feature vector contains noisy and irrelevant features. Various feature reduction methods have been proposed for eliminating irrelevant features as well as for reducing the dimension of the feature vector. The relevant, reduced feature vector is then used by a machine learning model to obtain better classification results. This paper presents various text classification approaches using machine learning techniques, and feature selection techniques for reducing the high-dimensional feature vector.
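The workflow the abstract describes (reduce the feature vector, then train a classifier on the reduced features) can be sketched as follows. This is only an illustrative sketch, not the method of the surveyed paper: the choice of chi-square scoring and multinomial naive Bayes, the whitespace tokenizer, the toy documents, and all function names (`tokenize`, `chi_square_scores`, `select_features`, `train_nb`, `predict`) are assumptions made for the example.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for the toy data.
    return text.lower().split()

def chi_square_scores(docs, labels):
    """Score each term by its maximum chi-square statistic over all classes,
    using term presence/absence per document."""
    n = len(docs)
    doc_terms = [set(tokenize(d)) for d in docs]
    vocab = set().union(*doc_terms)
    scores = {}
    for term in vocab:
        best = 0.0
        for cls in set(labels):
            a = sum(1 for t, y in zip(doc_terms, labels) if term in t and y == cls)
            b = sum(1 for t, y in zip(doc_terms, labels) if term in t and y != cls)
            c = sum(1 for t, y in zip(doc_terms, labels) if term not in t and y == cls)
            d = n - a - b - c
            denom = (a + c) * (b + d) * (a + b) * (c + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return scores

def select_features(docs, labels, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi_square_scores(docs, labels)
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return sorted(ranked[:k])

def train_nb(docs, labels, features):
    """Multinomial naive Bayes with Laplace smoothing over the selected features."""
    feature_set = set(features)
    class_docs = Counter(labels)
    term_counts = {cls: Counter() for cls in class_docs}
    for doc, y in zip(docs, labels):
        term_counts[y].update(t for t in tokenize(doc) if t in feature_set)
    model = {}
    for cls, counts in term_counts.items():
        total = sum(counts.values()) + len(feature_set)
        log_prior = math.log(class_docs[cls] / len(docs))
        log_like = {t: math.log((counts[t] + 1) / total) for t in feature_set}
        model[cls] = (log_prior, log_like)
    return model

def predict(model, text):
    """Return the class with the highest log-posterior; unselected terms are ignored."""
    tokens = tokenize(text)
    def score(cls):
        log_prior, log_like = model[cls]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(sorted(model), key=score)

# Toy data, invented for illustration only.
docs = ["cheap pills buy now", "buy cheap watches now",
        "meeting agenda for monday", "monday project meeting notes"]
labels = ["spam", "spam", "ham", "ham"]

features = select_features(docs, labels, k=4)
model = train_nb(docs, labels, features)
print(features)
print(predict(model, "buy cheap pills"), predict(model, "monday meeting"))
```

Only the four selected terms reach the classifier here; on a real corpus the same selection step would cut a vocabulary of many thousands of terms down to a chosen size before training.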
Citations
A novel multivariate filter method for feature selection in text classification problems
TL;DR: A novel filter method for feature selection, called Multivariate Relative Discrimination Criterion (MRDC), is proposed for text classification, which focuses on the reduction of redundant features using minimal-redundancy and maximal-relevancy concepts.
223
Mobile Virtual Reality as an Educational Platform: A Pilot Study on the Impact of Immersion and Positive Emotion Induction in the Learning Process
E. Olmos-Raya, J. Ferreira-Cavalcanti, Manuel Contero, M.C. Castellanos-Baena, I.A. Chicci-Giglioli, Mariano Alcañiz +5 more
TL;DR: In this paper, the authors evaluated the influence of emotional induction and level of immersion on knowledge acquisition and motivation, and found that positive emotion induction had a positive effect on the interest subscale of the motivation assessment tool used for both immersive conditions.
Supervised learning Methods for Bangla Web Document Categorization
Ashis Kumar Mandal, Rikta Sen +1 more
TL;DR: In this paper, the authors explored the use of machine learning approaches, specifically four supervised learning methods, namely Decision Tree (C4.5), KNN, Naïve Bayes (NB), and Support Vector Machine (SVM), for the categorization of Bangla web documents.
•Journal Article
Effective methods for improving naive Bayes text classifiers
TL;DR: This paper proposes and evaluates some general and effective techniques for improving performance of the naive Bayes text classifier, and suggests document model based parameter estimation and document length normalization to alleviate the problems in the traditional multinomial approach for text classification.
81
Term-weighting learning via genetic programming for text classification
Hugo Jair Escalante, Mauricio Garcia-Limon, Alicia Morales-Reyes, Mario Graff, Manuel Montes-y-Gómez, Eduardo F. Morales, Jose Martinez-Carranza +6 more
TL;DR: A novel approach to learning term-weighting schemes (TWS) in the context of text classification is described and it is shown that TWS learned with the genetic program outperform traditional schemes and other TWSs proposed in recent works.
References
•Proceedings Article
Experiments with a new boosting algorithm
Yoav Freund, Robert E. Schapire +1 more
- 03 Jul 1996
TL;DR: This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.
An algorithm for suffix stripping
TL;DR: An algorithm for suffix stripping is described, which has been implemented as a short, fast program in BCPL, and performs slightly better than a much more elaborate system with which it has been compared.
9.1K
Machine learning in automated text categorization
TL;DR: This survey discusses the main approaches to text categorization that fall within the machine learning paradigm and discusses in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.
•Proceedings Article
A Comparative Study on Feature Selection in Text Categorization
Yiming Yang, Jan O. Pedersen +1 more
- 08 Jul 1997
TL;DR: This paper finds strong correlations between the document frequency (DF), information gain (IG), and chi-square (CHI) values of a term, and suggests that DF thresholding, the simplest method with the lowest computational cost, can be reliably used instead of IG or CHI when computing those measures is too expensive.
5.6K
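The DF thresholding mentioned in the Yang and Pedersen entry above is simple enough to sketch in a few lines: count the number of documents each term appears in and drop terms below a cutoff. The sketch below is a minimal illustration under stated assumptions; the function names (`document_frequency`, `df_threshold`), the `min_df` parameter, the whitespace tokenizer, and the sample documents are all hypothetical and not taken from that paper.

```python
from collections import Counter

def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        # A set ensures each term is counted at most once per document.
        df.update(set(doc.lower().split()))
    return df

def df_threshold(docs, min_df=2):
    """Keep only terms that occur in at least `min_df` documents."""
    df = document_frequency(docs)
    return sorted(term for term, count in df.items() if count >= min_df)

# Toy corpus, invented for illustration only.
sample_docs = ["the cat sat", "the dog sat down", "a cat ran"]
print(df_threshold(sample_docs, min_df=2))
```

Because it needs only one pass over the corpus and no class labels, this kind of filter is cheap compared with per-class statistics such as IG or CHI, which is the trade-off the TL;DR points to.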