Normalized Google distance

Papers published on a yearly basis

Papers

Journal Article•10.2307/417141•

WordNet: An Electronic Lexical Database

01 Sep 2000-Language

TL;DR: This volume presents the WordNet lexical database (nouns, modifiers, and a semantic network of English verbs) together with applications of WordNet such as building semantic concordances.

Abstract: Part 1, The lexical database: nouns in WordNet (George A. Miller); modifiers in WordNet (Katherine J. Miller); a semantic network of English verbs (Christiane Fellbaum); design and implementation of the WordNet lexical database and searching software (Randee I. Tengi).

Part 2: automated discovery of WordNet relations (Marti A. Hearst); representing verb alternations in WordNet (Karen T. Kohl et al.); the formalization of WordNet by methods of relational concept analysis (Uta E. Priss).

Part 3, Applications of WordNet: building semantic concordances (Shari Landes et al.); performance and confidence in a semantic annotation task (Christiane Fellbaum et al.); WordNet and class-based probabilities (Philip Resnik); combining local context and WordNet similarity for word sense identification (Claudia Leacock and Martin Chodorow); using WordNet for text retrieval (Ellen M. Voorhees); lexical chains as representations of context for the detection and correction of malapropisms (Graeme Hirst and David St-Onge); temporal indexing through lexical chaining (Reem Al-Halimi and Rick Kazman); COLOR-X, using knowledge from WordNet for conceptual modelling (J.F.M. Burg and R.P. van de Riet); knowledge processing on an extended WordNet (Sanda M. Harabagiu and Dan I. Moldovan).

Appendix: obtaining and using WordNet.

14,437 citations

Proceedings Article•

An Information-Theoretic Definition of Similarity

Dekang Lin

24 Jul 1998

TL;DR: This work presents an information-theoretic definition of similarity that is applicable as long as there is a probabilistic model, and demonstrates how this definition can be used to measure similarity in a number of different domains.

Abstract: Similarity is an important and widely used concept. Previous definitions of similarity are tied to a particular application or a form of knowledge representation. We present an information-theoretic definition of similarity that is applicable as long as there is a probabilistic model. We demonstrate how our definition can be used to measure the similarity in a number of different domains.

4,637 citations

Journal Article•10.1109/TKDE.2007.48•

The Google Similarity Distance

Rudi Cilibrasi, Paul M. B. Vitányi

01 Mar 2007-IEEE Transactions on Knowledge and Data Engineering

TL;DR: A new theory of similarity between words and phrases, based on information distance and Kolmogorov complexity, is presented and applied to construct a method, the Google similarity distance, that automatically extracts the similarity of words and phrases from the WWW using Google page counts.

Abstract: Words and phrases acquire meaning from the way they are used in society, from their relative semantics to other words and phrases. For computers, the equivalent of "society" is "database," and the equivalent of "use" is "a way to search the database". We present a new theory of similarity between words and phrases based on information distance and Kolmogorov complexity. To fix thoughts, we use the World Wide Web (WWW) as the database, and Google as the search engine. The method is also applicable to other search engines and databases. This theory is then applied to construct a method to automatically extract similarity, the Google similarity distance, of words and phrases from the WWW using Google page counts. The WWW is the largest database on earth, and the context information entered by millions of independent users averages out to provide automatic semantics of useful quality. We give applications in hierarchical clustering, classification, and language translation. We give examples that distinguish between colors and numbers, cluster names of paintings by 17th century Dutch masters and names of books by English novelists, show the ability to understand emergencies and primes, and demonstrate a simple automatic English-Spanish translation. Finally, we use the WordNet database as an objective baseline against which to judge the performance of our method. We conduct a massive randomized trial in binary classification using support vector machines to learn categories based on our Google distance, resulting in a mean agreement of 87 percent with the expert-crafted WordNet categories.

1,888 citations
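
As a rough illustration of the page-count method the abstract describes, the normalized Google distance between two terms x and y is usually written as NGD(x, y) = (max{log f(x), log f(y)} - log f(x, y)) / (log N - min{log f(x), log f(y)}), where f(x) is the number of pages containing x, f(x, y) the number containing both, and N the total number of indexed pages. The sketch below only evaluates that formula; the function name `ngd` and the counts in the usage lines are made up for illustration and do not come from any search engine.

```python
import math


def ngd(fx: float, fy: float, fxy: float, n: float) -> float:
    """Normalized Google distance from page counts.

    fx, fy : pages containing term x, term y (hypothetical inputs)
    fxy    : pages containing both terms
    n      : total number of pages indexed (normalizing constant)
    """
    if fx <= 0 or fy <= 0:
        raise ValueError("each term must occur on at least one page")
    if n <= min(fx, fy):
        raise ValueError("n must exceed the smaller of the two page counts")
    if fxy <= 0:
        # Terms never co-occur: the distance is taken to be infinite.
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


# Illustrative, made-up counts (not real search results).
print(ngd(1000, 1000, 1000, 1e6))  # terms always co-occur
print(ngd(1000, 1000, 10, 1e6))    # terms rarely co-occur
```

Values near 0 indicate terms that almost always appear together; larger values indicate weaker association. The base of the logarithm cancels in the ratio, so natural logarithms are used here for convenience.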

Combining Local Context and WordNet Similarity for Word Sense Identification

Claudia Leacock, Martin Chodorow

1 Jan 1998

TL;DR: This chapter contains sections titled: Introduction, Training and Testing Data, Experiment 1: The Local Context Classifier, Experiment 2: Measuring Word Similarity in WordNet, and Experiment 3: Combining Local Context and WordNet Similarity Measures.

Abstract: This chapter contains sections titled: Introduction, Training and Testing Data, Experiment 1: The Local Context Classifier, Experiment 2: Measuring Word Similarity in WordNet, Experiment 3: Combining Local Context and WordNet Similarity Measures, Conclusions, References.

1,888 citations

Proceedings Article•

WordNet::Similarity - Measuring the Relatedness of Concepts

Ted Pedersen¹, Siddharth Patwardhan², Jason Michelizzi¹

University of Minnesota¹, University of Utah²

25 Jul 2004

TL;DR: WordNet::Similarity is a freely available software package that makes it possible to measure the semantic similarity and relatedness between a pair of concepts (or synsets).

Abstract: WordNet::Similarity is a freely available software package that makes it possible to measure the semantic similarity or relatedness between a pair of concepts (or word senses). It provides six measures of similarity and three measures of relatedness, all of which are based on the lexical database WordNet. These measures are implemented as Perl modules that take two concepts as input and return a numeric value representing the degree to which they are similar or related.

1,484 citations

Year	Papers
2021	1
2020	8
2019	7
2018	2
2017	5
2016	6

Related Topics (5)

Performance Metrics