Centroid-Based Document Classification: Analysis and Experimental Results

doi:10.1007/3-540-45372-5_46

Open Access•Book Chapter•10.1007/3-540-45372-5_46

Eui-Hong Han, +1 more

- 13 Sep 2000

- pp 424-431

471

TL;DR: The authors' experiments show that the centroid-based classifier consistently and substantially outperforms other algorithms, such as Naive Bayesian, k-nearest-neighbors, and C4.5, on a wide range of datasets.

Citations

Journal Article•10.1145/242224.242229

Machine learning

Thomas G. Dietterich

- 01 Dec 1996

- ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

14K

•Journal Article

An extensive empirical study of feature selection metrics for text classification

George Forman

- 01 Mar 2003

- Journal of Machine Learning Research

TL;DR: An empirical comparison of twelve feature selection methods evaluated on a benchmark of 229 text classification problem instances, revealing that a new feature selection metric, called 'Bi-Normal Separation' (BNS), outperformed the others by a substantial margin in most situations and was the top single choice for all goals except precision.

2.8K

•Proceedings Article

Computing semantic relatedness using Wikipedia-based explicit semantic analysis

Evgeniy Gabrilovich, +1 more

- 06 Jan 2007

TL;DR: This work proposes Explicit Semantic Analysis (ESA), a novel method that represents the meaning of texts in a high-dimensional space of concepts derived from Wikipedia, which results in substantial improvements in the correlation of computed relatedness scores with human judgments.

2.4K

•Journal Article•10.3390/INFO10040150

Text Classification Algorithms: A Survey

Kamran Kowsari, +5 more

- 17 Apr 2019

- Information-an International Interdiscip...

TL;DR: This article gives a brief overview of text classification algorithms, covering text feature extraction, dimensionality reduction methods, existing algorithms and techniques, and evaluation methods, along with the limitations of each technique and its application to real-world problems.

1.2K

•Book Chapter•10.1007/978-1-4614-3223-4_6

A survey of text classification algorithms

Charu C. Aggarwal, +1 more

- 01 Aug 2012

TL;DR: A survey of a wide variety of text classification algorithms for a number of diverse domains, including target marketing, medical diagnosis, news group filtering, and document organization is provided.

1K

References

Genetic algorithms in search, optimization and machine learning

David E. Goldberg

- 01 Jan 1989

TL;DR: This book brings together the computer techniques, mathematical tools, and research results that will enable both students and practitioners to apply genetic algorithms to problems in many fields.

58.6K

•Book

Genetic algorithms in search, optimization, and machine learning

David E. Goldberg

- 01 Sep 1988

TL;DR: In this article, the authors present the computer techniques, mathematical tools, and research results that will enable both students and practitioners to apply genetic algorithms to problems in many fields, including computer programming and mathematics.

52.8K

•Book

The Nature of Statistical Learning Theory

Vladimir Vapnik

- 01 Jan 1995

TL;DR: Topics covered: setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

46K

•Journal Article•10.1023/A:1022627411411

Support-Vector Networks

Corinna Cortes, +1 more

- 15 Sep 1995

- Machine Learning

TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated, and the performance of the support-vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.

42K

•Book

Statistical methods

George W. Snedecor

- 01 Jan 1967

29.9K

Related Papers (5)

Machine learning in automated text categorization

Text Categorization with Support Vector Machines: Learning with Many Relevant Features

Term Weighting Approaches in Automatic Text Retrieval

An algorithm for suffix stripping

Indexing by Latent Semantic Analysis