Document classification

Papers published on a yearly basis

Papers

Proceedings Article•10.18653/V1/N16-1174•

Hierarchical Attention Networks for Document Classification

Zichao Yang¹, Diyi Yang¹, Chris Dyer¹, Xiaodong He², Alexander J. Smola¹, Eduard Hovy¹•Institutions (2)

Carnegie Mellon University¹, Microsoft²

13 Jun 2016

TL;DR: Experiments conducted on six large scale text classification tasks demonstrate that the proposed architecture outperforms previous methods by a substantial margin.

Abstract: We propose a hierarchical attention network for document classification. Our model has two distinctive characteristics: (i) it has a hierarchical structure that mirrors the hierarchical structure of documents; (ii) it has two levels of attention mechanisms applied at the word- and sentence-level, enabling it to attend differentially to more and less important content when constructing the document representation. Experiments conducted on six large scale text classification tasks demonstrate that the proposed architecture outperforms previous methods by a substantial margin. Visualization of the attention layers illustrates that the model selects qualitatively informative words and sentences.

5,773 citations
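
The two levels of attention described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it omits the recurrent encoders and the learned hidden projection of the published model, and it scores each vector by a plain dot product with a context vector. The names `attention_pool`, `encode_document`, `word_context`, and `sentence_context` are hypothetical, chosen only for this illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Return (weighted average of vectors, attention weights).

    Assumption: the score of each vector is its dot product with `context`.
    """
    scores = [sum(x * c for x, c in zip(vec, context)) for vec in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, vectors))
              for d in range(dim)]
    return pooled, weights


def encode_document(document, word_context, sentence_context):
    """document: list of sentences, each a list of word vectors.

    Word-level attention builds one vector per sentence; sentence-level
    attention then builds the document vector from those sentence vectors.
    """
    sentence_vectors = [attention_pool(sentence, word_context)[0]
                        for sentence in document]
    return attention_pool(sentence_vectors, sentence_context)
```

The returned sentence weights sum to one, so they can be read as the relative importance the sketch assigns to each sentence, which is the kind of quantity the abstract's attention visualizations refer to.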

Journal Article•10.1016/J.PATCOG.2004.03.009•

Learning multi-label scene classification

Matthew Boutell¹, Jiebo Luo², Xipeng Shen¹, Chris Brown¹•Institutions (2)

University of Rochester¹, Eastman Kodak Company²

01 Sep 2004-Pattern Recognition

TL;DR: A framework to handle semantic scene classification, where a natural scene may contain multiple objects such that the scene can be described by multiple class labels, is presented and appears to generalize to other classification problems of the same nature.

2,554 citations

Proceedings Article•10.1145/775152.775226•

Mining the peanut gallery: opinion extraction and semantic classification of product reviews

Kushal B. Dave¹, Steve Lawrence¹, David M. Pennock•Institutions (1)

Princeton University¹

20 May 2003

TL;DR: This work develops a method for automatically distinguishing between positive and negative reviews and draws on information retrieval techniques for feature extraction and scoring, and the results for various metrics and heuristics vary depending on the testing situation.

Abstract: The web contains a wealth of product reviews, but sifting through them is a daunting task. Ideally, an opinion mining tool would process a set of search results for a given item, generating a list of product attributes (quality, features, etc.) and aggregating opinions about each of them (poor, mixed, good). We begin by identifying the unique properties of this problem and develop a method for automatically distinguishing between positive and negative reviews. Our classifier draws on information retrieval techniques for feature extraction and scoring, and the results for various metrics and heuristics vary depending on the testing situation. The best methods work as well as or better than traditional machine learning. When operating on individual sentences collected from web searches, performance is limited due to noise and ambiguity. But in the context of a complete web-based tool and aided by a simple method for grouping sentences into attributes, the results are qualitatively quite useful.

2,464 citations

Proceedings Article•

From Word Embeddings To Document Distances

Matt J. Kusner¹, Yu Sun¹, Nicholas I. Kolkin¹, Kilian Q. Weinberger¹•Institutions (1)

Washington University in St. Louis¹

6 Jul 2015

TL;DR: It is demonstrated on eight real world document classification data sets, in comparison with seven state-of-the-art baselines, that the Word Mover's Distance metric leads to unprecedented low k-nearest neighbor document classification error rates.

Abstract: We present the Word Mover's Distance (WMD), a novel distance function between text documents. Our work is based on recent results in word embeddings that learn semantically meaningful representations for words from local cooccurrences in sentences. The WMD distance measures the dissimilarity between two text documents as the minimum amount of distance that the embedded words of one document need to "travel" to reach the embedded words of another document. We show that this distance metric can be cast as an instance of the Earth Mover's Distance, a well studied transportation problem for which several highly efficient solvers have been developed. Our metric has no hyperparameters and is straight-forward to implement. Further, we demonstrate on eight real world document classification data sets, in comparison with seven state-of-the-art baselines, that the WMD metric leads to unprecedented low k-nearest neighbor document classification error rates.

2,252 citations
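
The "travel" idea in the abstract above can be sketched without a transportation solver by computing a relaxed lower bound instead of the exact Word Mover's Distance: each word sends all of its weight to the nearest word in the other document, and the larger of the two one-sided costs is returned. This is a simplification, not the exact metric, and the function names and the toy `embeddings` dictionary are hypothetical.

```python
import math
from collections import Counter


def nbow(tokens):
    """Normalized bag-of-words: word -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def one_sided_cost(src, dst, embeddings):
    """Cost when every source word moves all its weight to its nearest target word."""
    return sum(
        weight * min(euclidean(embeddings[word], embeddings[other])
                     for other in dst)
        for word, weight in src.items()
    )


def relaxed_wmd(tokens_a, tokens_b, embeddings):
    """Lower bound on the exact WMD: the max of the two one-sided relaxations."""
    a, b = nbow(tokens_a), nbow(tokens_b)
    return max(one_sided_cost(a, b, embeddings),
               one_sided_cost(b, a, embeddings))


# Toy 2-D embeddings, made up for illustration only.
embeddings = {"a": [0.0, 0.0], "b": [3.0, 4.0]}
```

With these toy vectors, `relaxed_wmd(["a"], ["b"], embeddings)` is the distance between the two embedded words. The exact distance would instead require solving the full transportation problem that the abstract casts as an Earth Mover's Distance.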

Journal Article•10.5555/944790.944808•

One-class svms for document classification

Larry M. Manevitz¹, Malik Yousef¹•Institutions (1)

University of Haifa¹

01 Mar 2002-Journal of Machine Learning Research

TL;DR: The SVM approach as represented by Schoelkopf was superior to all the methods except the neural network one, where it was, although occasionally worse, essentially comparable.

Abstract: We implemented versions of the SVM appropriate for one-class classification in the context of information retrieval. The experiments were conducted on the standard Reuters data set. For the SVM implementation we used both a version of Schoelkopf et al. and a somewhat different version of one-class SVM based on identifying "outlier" data as representative of the second-class. We report on experiments with different kernels for both of these implementations and with different representations of the data, including binary vectors, tf-idf representation and a modification called "Hadamard" representation. Then we compared it with one-class versions of the algorithms prototype (Rocchio), nearest neighbor, naive Bayes, and finally a natural one-class neural network classification method based on "bottleneck" compression generated filters. The SVM approach as represented by Schoelkopf was superior to all the methods except the neural network one, where it was, although occasionally worse, essentially comparable. However, the SVM methods turned out to be quite sensitive to the choice of representation and kernel in ways which are not well understood; therefore, for the time being leaving the neural network approach as the most robust.

1,470 citations

Year	Papers
2025	5
2024	9
2023	25
2022	55
2021	152
2020	220

Related Topics (5)

Performance Metrics