Efficient processing of complex features for information retrieval

Open Access

- 01 Jan 2008

24

TL;DR: The TupleFlow framework, an extension of MapReduce, provides a basis for custom binned indexes, which store feature data efficiently. Work on binning probabilities shows how to map language model probabilities effectively into the space of small positive integers.

Citations

Book

Search Engines: Information Retrieval in Practice

W. Bruce Croft, +2 more

- 16 Feb 2009

TL;DR: This text provides the background and tools needed to evaluate, compare, and modify search engines. Its numerous programming exercises make extensive use of Galago, a Java-based open source search engine.

1.1K

Journal Article•10.14778/1687553.1687568

Building a high-level dataflow system on top of Map-Reduce: the Pig experience

Alan Gates, +8 more

- 01 Aug 2009

TL;DR: Pig is a high-level dataflow system that aims at a sweet spot between SQL and Map-Reduce. Performance comparisons between Pig execution and raw Map-Reduce execution are reported.

472

Proceedings Article

CHI '01 Extended Abstracts on Human Factors in Computing Systems

Marilyn Tremaine

- 31 Mar 2001

TL;DR: The CHI Conference provides a forum for people to meet, both formally and informally, to share and to learn. The organizers trust that attendees will find the intellectually exciting and personally rewarding experiences that bring people back to the conference year after year.

399

Book

Faceted Search

Daniel Tunkelang

- 29 Jun 2009

TL;DR: This lecture explores the history, theory, and practice of faceted search, and offers a self-contained treatment of the topic, with an extensive bibliography for those who would like to pursue particular aspects in more depth.

365

Synthesis Lectures on Information Concepts, Retrieval, and Services

Daniel Tunkelang, +1 more

- 01 Jan 2009

TL;DR: This book takes a horizontal approach, gathering the foundations of TF-IDF, PRF, BIR, Poisson, BM25, LM, probabilistic inference networks (PINs), and divergence-based models to create a consolidated and balanced view of the main models.

231

References

Journal Article•10.21276/IJRE.2018.5.5.4

MapReduce: simplified data processing on large clusters

Jeffrey Dean, +1 more

- 06 Dec 2004

TL;DR: This paper presents MapReduce, a programming model and an associated implementation for processing and generating large data sets that runs on large clusters of commodity machines and is highly scalable.

22.7K

Journal Article•10.1145/1327452.1327492

MapReduce: simplified data processing on large clusters

Jeffrey Dean, +1 more

- 01 Jan 2008

- Communications of The ACM

TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.

18.6K

Journal Article•10.1016/S0169-7552(98)00110-X

The anatomy of a large-scale hypertextual Web search engine

Sergey Brin, +1 more

- 01 Apr 1998

TL;DR: This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.

16.6K

Proceedings Article

The PageRank Citation Ranking: Bringing Order to the Web

Lawrence Page, +3 more

- 11 Nov 1999

TL;DR: This paper describes PageRank, a method for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them, and shows how to efficiently compute PageRank for large numbers of pages.

16.4K

Journal Article

The Anatomy of a Large-Scale Hypertextual Web Search Engine

Sergey Brin, +1 more

- 01 Jan 1998

- Computer Networks

TL;DR: Google is a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. It is designed to crawl and index the Web efficiently and to produce much more satisfying search results than existing systems.

13.3K

Related Papers (5)

Query evaluation: strategies and optimizations

Efficient query evaluation using a two-level retrieval process

Relevance-Based Language Models

Mining correlations between medically dependent features and image retrieval models for query classification

Relevance feedback in information retrieval