Topic

Metasearch engine

About: Metasearch engine is a research topic. Over its lifetime, 2,590 publications have been published within this topic, receiving 79,273 citations.

Papers

Journal Article•10.1016/S0169-7552(98)00110-X•

The anatomy of a large-scale hypertextual Web search engine

Sergey Brin¹, Lawrence Page¹•Institutions (1)

Stanford University¹

1 Apr 1998

TL;DR: This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.

Abstract: In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.

16,670 citations

Proceedings Article•10.1145/371920.372165•

Rank aggregation methods for the Web

Cynthia Dwork, Ravi Kumar¹, Moni Naor², Dandapani Sivakumar¹•Institutions (2)

IBM¹, Weizmann Institute of Science²

1 Apr 2001

TL;DR: A set of techniques for the rank aggregation problem is developed and its performance is compared to that of well-known methods, with the goal of designing rank aggregation techniques that can combat spam in Web searches.

Abstract: We consider the problem of combining ranking results from various sources. In the context of the Web, the main applications include building meta-search engines, combining ranking functions, selecting documents based on multiple criteria, and improving search precision through word associations. We develop a set of techniques for the rank aggregation problem and compare their performance to that of well-known methods. A primary goal of our work is to design rank aggregation techniques that can effectively combat "spam," a serious problem in Web searches. Experiments show that our methods are simple, efficient, and effective.

2,235 citations
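
To make the rank aggregation problem described in the abstract above concrete, here is a minimal sketch of Borda count, a classic positional method for merging ranked lists from several engines. It is shown only as a familiar baseline and is not claimed to be the technique the authors develop. The function name `borda_aggregate`, the scoring convention (items absent from a list score zero there), and the alphabetical tie-break are assumptions made for this illustration.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Merge several ranked lists into one using Borda count.

    rankings: list of lists, each ordered best-first. Lists may be partial.
    Assumption: an item missing from a list earns 0 points from that list.
    """
    # Collect every item that appears in at least one list.
    universe = set()
    for ranking in rankings:
        universe.update(ranking)
    n = len(universe)

    # An item at position pos (0-based) in a list earns n - pos points.
    scores = defaultdict(float)
    for ranking in rankings:
        for pos, item in enumerate(ranking):
            scores[item] += n - pos

    # Highest total first; ties broken alphabetically for determinism.
    return sorted(universe, key=lambda item: (-scores[item], item))


if __name__ == "__main__":
    # Three hypothetical engines returning overlapping results.
    engine_results = [
        ["a", "b", "c"],
        ["b", "a", "d"],
        ["b", "c", "a"],
    ]
    print(borda_aggregate(engine_results))  # ['b', 'a', 'c', 'd']
```

Positional scoring like this is simple to compute, but a single source can shift the outcome by ranking an item very high, which is one reason robustness to spam is a stated design goal of the paper.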

Journal Article•10.1002/1097-4571(2000)9999:9999<::AID-ASI1591>3.3.CO;2-I•

Searching the Web: the public and their queries

Amanda Spink¹, Dietmar Wolfram², Major B. J. Jansen³, Tefko Saracevic⁴•Institutions (4)

Pennsylvania State University¹, University of Wisconsin–Milwaukee², University of Maryland, College Park³, Rutgers University⁴

01 Feb 2001-Journal of the Association for Information Science and Technology

TL;DR: It is found that most people use few search terms, few modified queries, view few Web pages, and rarely use advanced search features, and the language of Web queries is distinctive.

Abstract: In studying actual Web searching by the public at large, we analyzed over one million Web queries by users of the Excite search engine. We found that most people use few search terms, few modified queries, view few Web pages, and rarely use advanced search features. A small number of search terms are used with high frequency, and a great many terms are unique; the language of Web queries is distinctive. Queries about recreation and entertainment rank highest. Findings are compared to data from two other large studies of Web queries. This study provides an insight into the public practices and choices in Web searching.

1,193 citations

Proceedings Article•10.1145/1148170.1148177•

Improving web search ranking by incorporating user behavior information

Eugene Agichtein¹, Eric D. Brill¹, Susan T. Dumais¹•Institutions (1)

Microsoft¹

6 Aug 2006

TL;DR: In this paper, the authors show that incorporating implicit feedback can augment other features, improving the accuracy of a competitive web search ranking algorithm by as much as 31% relative to the original performance.

Abstract: We show that incorporating user behavior data can significantly improve the ordering of top results in a real web search setting. We examine alternatives for incorporating feedback into the ranking process and explore the contributions of user feedback compared to other common web search features. We report results of a large-scale evaluation over 3,000 queries and 12 million user interactions with a popular web search engine. We show that incorporating implicit feedback can augment other features, improving the accuracy of a competitive web search ranking algorithm by as much as 31% relative to the original performance.

1,177 citations

Journal Article•10.1613/JAIR.587•

Learning to order things

William W. Cohen¹, Robert E. Schapire¹, Yoram Singer¹•Institutions (1)

AT&T Labs¹

01 Jan 1999-Journal of Artificial Intelligence Research

TL;DR: An on-line algorithm for learning preference functions that is based on Freund and Schapire's "Hedge" algorithm is considered, and it is shown that the problem of finding the ordering that agrees best with a learned preference function is NP-complete.

Abstract: There are many applications in which it is desirable to order rather than classify instances. Here we consider the problem of learning how to order instances given feedback in the form of preference judgments, i.e., statements to the effect that one instance should be ranked ahead of another. We outline a two-stage approach in which one first learns by conventional means a binary preference function indicating whether it is advisable to rank one instance before another. Here we consider an on-line algorithm for learning preference functions that is based on Freund and Schapire's "Hedge" algorithm. In the second stage, new instances are ordered so as to maximize agreement with the learned preference function. We show that the problem of finding the ordering that agrees best with a learned preference function is NP-complete. Nevertheless, we describe simple greedy algorithms that are guaranteed to find a good approximation. Finally, we show how metasearch can be formulated as an ordering problem, and present experimental results on learning a combination of "search experts," each of which is a domain-specific query expansion strategy for a web search engine.

1,080 citations
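
The second stage described in the abstract above, ordering instances to agree with a pairwise preference function, can be sketched with a simple greedy procedure: repeatedly pick the item whose net preference over the remaining items is largest. This is a minimal illustration in the spirit of the greedy algorithms the abstract mentions, not a reproduction of the authors' code. The names `greedy_order`, `toy_pref`, and the relevance scores in the example are hypothetical.

```python
def greedy_order(items, pref):
    """Greedily order items to agree with a pairwise preference function.

    pref(u, v) returns a number in [0, 1]; larger values mean stronger
    advice to rank u ahead of v.
    """
    remaining = list(items)

    # Potential of v: how strongly v is preferred over the other remaining
    # items, minus how strongly they are preferred over v.
    potential = {
        v: sum(pref(v, u) - pref(u, v) for u in remaining if u != v)
        for v in remaining
    }

    ordering = []
    while remaining:
        # Take the item with the highest potential (first one wins on ties).
        best = max(remaining, key=lambda v: potential[v])
        ordering.append(best)
        remaining.remove(best)
        # Remove best's contribution from the potentials that are left.
        for v in remaining:
            potential[v] += pref(best, v) - pref(v, best)
    return ordering


if __name__ == "__main__":
    # Hypothetical relevance scores standing in for a learned preference.
    relevance = {"x": 3, "y": 1, "z": 2}

    def toy_pref(u, v):
        return 1.0 if relevance[u] > relevance[v] else 0.0

    print(greedy_order(["x", "y", "z"], toy_pref))  # ['x', 'z', 'y']
```

Because finding the ordering that agrees best with the preference function is NP-complete, as the abstract states, a greedy pass like this trades optimality for a cost that is quadratic in the number of items.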

Performance Metrics

Papers: 2,652

Citations: 31,382

No. of papers in the topic in previous years
Year	Papers
2025	2
2024	5
2023	19
2022	21
2021	12
2020	9
