Patent
High-Performance Streaming Dictionary
Michael A. Bender,Martin Farach-Colton,Yonatan R. Fogel,Zardosht Kasheff,Bradley C. Kuszmaul,Vincenzo Liberatore,Barry Perlman,Richard F. Prohaska,David S. Wells +8 more
- 06 Apr 2010
146
TL;DR: This patent defines a high-performance dictionary data structure for storing data in a disk storage system. The dictionary supports full transactional semantics, concurrent access from multiple transactions, and logging and recovery.
Abstract: A method, apparatus and computer program product for storing data in a disk storage system is presented. A high-performance dictionary data structure is defined. The dictionary data structure is stored on a disk storage system. Key-value pairs can be inserted into and deleted from the dictionary data structure. Updates run faster than one insertion per disk-head movement. The structure can also be stored on any system with two or more levels of memory. The dictionary is high performance and supports full transactional semantics, concurrent access from multiple transactions, and logging and recovery. Keys can be looked up with only a logarithmic number of transfers, even for keys that have been recently inserted or deleted. Queries can be performed on ranges of key-value pairs, including recently inserted or deleted pairs, at a constant fraction of the bandwidth of the disk.
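The abstract does not disclose the on-disk layout, so the following is only a minimal sketch of one generic way to get updates that cost less than one disk-head movement each: keep sorted runs of doubling size and merge them sequentially, so each seek is shared by a whole batch of insertions. The class name `ToyStreamingDict`, the `_TOMBSTONE` marker, and the level layout are assumptions made for illustration; this is an in-memory toy, not the patented structure, and it omits transactions, concurrency, logging, and recovery.

```python
import bisect

_TOMBSTONE = object()  # hypothetical marker recording that a key was deleted


class ToyStreamingDict:
    """Toy model: level i is empty or a sorted run of at most 2**i entries."""

    def __init__(self):
        self.levels = []  # levels[i] is a list of (key, value), sorted by key

    @staticmethod
    def _merge(newer, older):
        # Sequentially merge two sorted runs; on equal keys the newer entry wins.
        out, i, j = [], 0, 0
        while i < len(newer) and j < len(older):
            if newer[i][0] < older[j][0]:
                out.append(newer[i]); i += 1
            elif newer[i][0] > older[j][0]:
                out.append(older[j]); j += 1
            else:
                out.append(newer[i]); i += 1; j += 1
        out.extend(newer[i:])
        out.extend(older[j:])
        return out

    def _put(self, key, value):
        # Carry the new run upward like a binary counter: merge with each
        # occupied level until an empty level is found.
        run = [(key, value)]
        for i in range(len(self.levels)):
            if not self.levels[i]:
                self.levels[i] = run
                return
            run = self._merge(run, self.levels[i])
            self.levels[i] = []
        self.levels.append(run)

    def insert(self, key, value):
        self._put(key, value)

    def delete(self, key):
        # A delete is an insert of a tombstone, so it is as cheap as an insert.
        self._put(key, _TOMBSTONE)

    def get(self, key, default=None):
        # Lower levels hold newer data, so the first hit is the current value.
        for run in self.levels:
            idx = bisect.bisect_left(run, (key,))
            if idx < len(run) and run[idx][0] == key:
                value = run[idx][1]
                return default if value is _TOMBSTONE else value
        return default

    def range(self, lo, hi):
        # Merge the matching slice of every level, newest first, so recent
        # inserts and deletes are reflected in the result.
        acc = []
        for run in self.levels:
            start = bisect.bisect_left(run, (lo,))
            piece = []
            for k, v in run[start:]:
                if k > hi:
                    break
                piece.append((k, v))
            acc = self._merge(acc, piece)
        return [(k, v) for k, v in acc if v is not _TOMBSTONE]


d = ToyStreamingDict()
for n in range(10):
    d.insert(n, n * n)
d.delete(3)
d.insert(4, "x")
print(d.get(4), d.get(3), d.range(2, 5))
```

In this sketch each entry is rewritten at most about log2(N) times, always as part of a sequential merge, which is what lets the per-insertion cost fall below one seek when the runs live on disk. Lookups here run a separate binary search in every level, so the toy does not reach the logarithmic transfer bound that the abstract claims for the actual structure.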
Citations
Patent
Method and system for indexing and searching timed media information based upon relevance intervals
Michael Scott Morton,Sibley Verbeck Simon,Noam Carl Unger,Robert Rubinoff,Anthony R. Davis,Kyle Aveni-Deforge +5 more
- 31 Jul 2013
TL;DR: In this paper, a method and system for indexing, searching, and retrieving information from timed media files based on relevance intervals is presented, where a portion of a timed media file is returned, which is selected specifically to be relevant to the given information representations.
276
Patent
Multi-tier caching
Shrikar Archak,Sagar Dixit,Richard P. Spillane,Erez Zadok +3 more
- 13 Jun 2011
TL;DR: In this article, a method for maintaining an index in a multi-tier data structure includes providing a plurality of storage devices forming the multi-tier data structure and caching an index of key-value pairs across the multi-tier data structure.
115
Patent
Semantic Discovery and Mapping Between Data Sources
Alexander Gorelik,Lingling Yan +1 more
- 06 Oct 2011
TL;DR: In this paper, an apparatus and method are described for the discovery of semantics, relationships and mappings between data in different software applications, databases, files, reports, messages, or systems.
100
Patent
Disambiguation and tagging of entities
David F. Houghton
- 12 May 2010
TL;DR: In this paper, a disambiguation process is used to reduce the potential matches to a single known entity by ranking the candidate entities according to a hierarchy of criteria.
75
Patent
Generating Topic-Specific Language Models
David F. Houghton,Seth Murray,Sibley Verbeck Simon +2 more
- 30 Jun 2010
TL;DR: In this article, a topic-specific language model is created by performing an initial pass on an audio signal using a generic or basis language model, determining topics relating to the audio signal based on the words identified in the initial pass, and retrieving a corpus of text relating to those topics.
61