From frequency to meaning: vector space models of semantics

doi:10.1613/JAIR.2934

Open Access•Journal Article•10.1613/JAIR.2934

Peter D. Turney, +1 more

- 01 Jan 2010

- Journal of Artificial Intelligence Resea...

- Vol. 37, Iss: 1, pp 141-188

3.2K

TL;DR: The goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs, and to provide pointers into the literature for those who are less familiar with the field.

Citations

•Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

Tomas Mikolov, +4 more

- 05 Dec 2013

TL;DR: This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.

24.1K

•Posted Content

Distributed Representations of Words and Phrases and their Compositionality

Tomas Mikolov, +4 more

- 16 Oct 2013

- arXiv: Computation and Language

TL;DR: This paper presents several extensions to the Skip-gram model, which learns high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships; the extensions improve both the quality of the vectors and the training speed.

22.9K

•Journal Article•10.1145/242224.242229

Machine learning

Thomas G. Dietterich

- 01 Dec 1996

- ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

14K

•Journal Article•10.1162/TACL_A_00051

Enriching Word Vectors with Subword Information

Piotr Bojanowski, +3 more

- 12 Jun 2017

- Transactions of the Association for Comp...

TL;DR: This paper proposes a new approach based on the skip-gram model in which each word is represented as a bag of character n-grams and a word's vector is the sum of its n-gram representations, which allows models to be trained quickly on large corpora and word representations to be computed for words that did not appear in the training data.

10.3K

•Proceedings Article

Distributed Representations of Sentences and Documents

Quoc V. Le, +1 more

- 21 Jun 2014

TL;DR: Paragraph Vector is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of text, such as sentences, paragraphs, and documents; its construction gives the algorithm the potential to overcome the weaknesses of bag-of-words models.

8.9K

References

•Proceedings Article•10.1145/509907.509965

Similarity estimation techniques from rounding algorithms

Moses Charikar

- 19 May 2002

TL;DR: It is shown that rounding algorithms for LPs and SDPs used in the context of approximation algorithms can be viewed as locality sensitive hashing schemes for several interesting collections of objects.

2.9K

•Journal Article

An extensive empirical study of feature selection metrics for text classification

George Forman

- 01 Mar 2003

- Journal of Machine Learning Research

TL;DR: An empirical comparison of twelve feature selection methods, evaluated on a benchmark of 229 text classification problem instances, reveals that a new feature selection metric called 'Bi-Normal Separation' (BNS) outperformed the others by a substantial margin in most situations and was the top single choice for all goals except precision.

2.8K

•Book Chapter•10.1007/BFB0020217

Kernel Principal Component Analysis

Bernhard Schölkopf, +2 more

- 08 Oct 1997

TL;DR: A new method for performing a nonlinear form of Principal Component Analysis by the use of integral operator kernel functions is proposed and experimental results on polynomial feature extraction for pattern recognition are presented.

2.6K

•Journal Article•10.1037//0096-3445.115.1.39

Attention, similarity, and the identification-categorization relationship.

Robert M. Nosofsky

- 01 Mar 1986

- Journal of Experimental Psychology: Gene...

TL;DR: This paper proposes and tests a unified quantitative approach to modeling subjects' identification and categorization of multidimensional perceptual stimuli, in which subjects identify and categorize the same set of perceptually confusable stimuli varying on separable dimensions.

2.5K