Proceedings Article, DOI: 10.1145/502716.502746
Automatically indexing documents: content vs. reference
Shannon Bradshaw,Kristian J. Hammond +1 more
- 13 Jan 2002
- pp 180-181
TL;DR: A study comparing reference (the text surrounding citations) with content as bases for indexing documents indicates that reference identifies the value of documents more accurately and with a greater diversity of language than content.
Abstract: Authors cite other work in many types of documents. Notable among these are research papers and web pages. Recently, several researchers have proposed using the text surrounding citations (references) as a means of automatically indexing documents for search engines, claiming that this technique is superior to indexing documents based on their content [1,2]. While we ourselves have made this claim, we acknowledge that little empirical data has been presented to support it. Therefore, in the limited space available we present a terse overview of a study comparing reference to content as bases for indexing documents. This study indicates that reference identifies the value of documents more accurately and with a greater diversity of language than content.
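The distinction between the two indexing bases can be made concrete with a minimal sketch. It is not the authors' system: the function names (`build_content_index`, `build_reference_index`, `search`), the whitespace-and-punctuation tokenizer, and the toy documents and citation contexts are all assumptions invented for illustration. A content index maps terms to documents whose own text contains them, while a reference index maps terms to documents that other authors cite in text containing those terms.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_content_index(documents):
    """Map each term to the ids of documents whose own text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def build_reference_index(citation_contexts):
    """Map each term to the ids of documents cited in text containing it.

    citation_contexts is a list of (cited_doc_id, surrounding_text) pairs.
    """
    index = defaultdict(set)
    for cited_id, context in citation_contexts:
        for term in tokenize(context):
            index[term].add(cited_id)
    return index


def search(index, query):
    """Return the ids of documents that match every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Hypothetical toy data, invented for this example only.
documents = {
    "paperA": "We describe a crawler and a ranking function based on link structure.",
    "paperB": "A study of query logs from a public search service.",
}
citation_contexts = [
    ("paperA", "Google is a large-scale web search engine [1]."),
    ("paperA", "PageRank exploits hyperlinks to rank web pages [1]."),
    ("paperB", "Most users type very short queries [2]."),
]

content_index = build_content_index(documents)
reference_index = build_reference_index(citation_contexts)

print(search(content_index, "web search engine"))
print(search(reference_index, "web search engine"))
```

In this toy data the query "web search engine" finds nothing in the content index, because paperA never uses those words about itself, but it finds paperA in the reference index, because a citing author described it that way. This only illustrates the mechanism; it says nothing about the study's measured results.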
Citations
Patent
Automatic method and system for formulating and transforming representations of context used by information services
Kristian J. Hammond,Jerome Budzik,Lawrence Birnbaum +2 more
- 30 Jul 2003
TL;DR: An information retrieval system automatically retrieves information related to the context of an active task being manipulated by a user. The system observes the operation of the active task and user interactions and uses predetermined criteria to generate a context representation.
206
Patent
Method and system for assessing relevant properties of work contexts for use by information services
Kristian J. Hammond,Jerome Budzik,Lawrence Birnbaum +2 more
- 11 Apr 2014
TL;DR: An information retrieval system automatically retrieves information related to the context of an active task being manipulated by a user. The system observes the operation of the active task and user interactions, and uses predetermined criteria to generate a context representation and to generate queries or search terms for conducting an information search.
125
Patent
Query preprocessing and pipelining
Eric B. Watson,Marcelo A. F. Calbucci,Sally Salas,Darren A. Shakib +3 more
- 26 Jan 2004
TL;DR: The authors modify queries by grouping terms as phrases, correcting spelling errors, and augmenting the query with category terms that trigger query execution on certain data sources to better reflect the user's intent.
49
Patent
Index partitioning based on document relevance for document indexes
Darren A. Shakib,Gaurav Sareen,Michael Burrows +2 more
- 22 Jan 2004
TL;DR: Index queries reference the first partition and move to a subsequent partition when a static rank for the subsequent partition is higher than a weighted portion of the target score added to a weighted part of a dynamic rank corresponding to the relevance of the results set generated thus far.
45
An unsupervised approach to automatic classification of scientific literature utilizing bibliographic metadata
TL;DR: An unsupervised approach for automatic classification of scientific literature archived in digital libraries and repositories according to a standard library classification scheme based on identifying all the references cited in the document to be classified.
35
References
The anatomy of a large-scale hypertextual Web search engine
Sergey Brin,Lawrence Page +1 more
- 01 Apr 1998
TL;DR: This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
• Journal Article
The Anatomy of a Large-Scale Hypertextual Web Search Engine.
Sergey Brin,Lawrence Page +1 more
TL;DR: Google is a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems.
13.3K
• Proceedings Article
The Anatomy of Large-scale Hypertextual Web Search Engine
S. Brin
- 01 Jan 1998
TL;DR: We present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext to produce better search results.
9.7K
The vocabulary problem in human-system communication
TL;DR: It is shown how this fundamental property of language limits the success of various design methodologies for vocabulary-driven interaction, and an optimal strategy, unlimited aliasing, is derived and shown to be capable of several-fold improvements.
1.6K
Searching the Web: the public and their queries
TL;DR: It is found that most people use few search terms, few modified queries, view few Web pages, and rarely use advanced search features, and the language of Web queries is distinctive.