Shard

Papers published on a yearly basis

Papers

Patent

Large distributed database clustering systems and methods

[...]

27 Jun 2013

TL;DR: A system for managing asynchronous replication in a distributed database environment while allowing the database to scale: a cluster of nodes is assigned roles for managing partitions of data within the database and processing database requests.

Abstract: Systems and methods are provided for managing asynchronous replication in a distributed database environment, while providing for scaling of the distributed database. A cluster of nodes can be assigned roles for managing partitions of data within the database and processing database requests. In one embodiment, each cluster includes a node with a primary role to process write operations and manage asynchronous replication of the operations to at least one secondary node. Each cluster or set of nodes can host one or more partitions of database data. Collectively, the clusters or sets of nodes define a shard cluster that hosts all the data of the distributed database. Each shard cluster, individual node, or set of nodes can be configured to manage the size of any hosted partitions, splitting database partitions, migrating partitions, and/or managing expansion of shard clusters to encompass new systems.

209 citations
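
The primary/secondary arrangement in the abstract above can be illustrated with a minimal single-process sketch. It assumes an in-memory operation log that secondaries replay later; the names `Node`, `ReplicaSet`, `oplog`, and `replicate` are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical replica holding a key-value copy of one partition."""
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of oplog entries this node has applied


@dataclass
class ReplicaSet:
    """One primary plus secondaries that catch up asynchronously."""
    primary: Node = field(default_factory=Node)
    secondaries: list = field(default_factory=list)
    oplog: list = field(default_factory=list)  # ordered (key, value) writes

    def write(self, key, value):
        # The primary applies the write and logs it without waiting
        # for any secondary to acknowledge.
        self.primary.data[key] = value
        self.oplog.append((key, value))
        self.primary.applied = len(self.oplog)

    def replicate(self, secondary, batch=None):
        # A secondary replays the oplog entries it has not yet seen,
        # optionally limited to `batch` entries per call.
        end = len(self.oplog)
        if batch is not None:
            end = min(end, secondary.applied + batch)
        for key, value in self.oplog[secondary.applied:end]:
            secondary.data[key] = value
        secondary.applied = end
```

Until `replicate` runs, a secondary may lag behind the primary, which is the defining trade-off of asynchronous replication.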

Patent

System and method for performing shard migration to support functions of a cloud-based service

[...]

Tamar Bercovici, Florian Jourda, Benjamin Trombley-Shapiro

8 Jul 2013

TL;DR: A horizontally scaled database, partitioned by data ownership, for a cloud-based collaboration and/or storage platform: the system database comprises multiple shard databases, and all files and folders owned by a user are stored on a single shard database.

Abstract: Systems and methods of maintaining a horizontally scaled database based on data ownership for a cloud-based platform (e.g., cloud-based collaboration and/or storage platform/service) are disclosed. The system database comprises multiple shard databases, and all files and folders owned by a user are stored on a single shard database. When a user transfers ownership of a file and/or a folder to a second user, the transferred file and/or folder is stored on the shard database that stores all of the data for the second user.

107 citations
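
As a rough illustration of ownership-based sharding as described in the abstract above, the sketch below keeps every item on its owner's shard and moves an item when ownership is transferred. The class `OwnershipShardedStore`, its methods, and the least-loaded placement rule for new users are assumptions made for the example, not details from the patent.

```python
class OwnershipShardedStore:
    """Toy model: each user's items all live on that user's shard."""

    def __init__(self, num_shards):
        if num_shards < 1:
            raise ValueError("need at least one shard")
        # Each shard maps item_id -> owner.
        self.shards = [dict() for _ in range(num_shards)]
        self.user_shard = {}  # user -> shard index

    def shard_of(self, user):
        # Assumption: a new user is placed on the shard that currently
        # holds the fewest items (ties go to the lowest index).
        if user not in self.user_shard:
            self.user_shard[user] = min(
                range(len(self.shards)), key=lambda i: len(self.shards[i])
            )
        return self.user_shard[user]

    def add_item(self, user, item_id):
        self.shards[self.shard_of(user)][item_id] = user

    def transfer(self, item_id, old_owner, new_owner):
        # Move the item to the shard that stores the new owner's data.
        src = self.shard_of(old_owner)
        dst = self.shard_of(new_owner)
        if self.shards[src].get(item_id) != old_owner:
            raise KeyError(f"{item_id!r} is not owned by {old_owner!r}")
        del self.shards[src][item_id]
        self.shards[dst][item_id] = new_owner
```

Because a user's data never spans shards in this model, any query scoped to one owner touches exactly one shard; the cost is a cross-shard move on every ownership transfer.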

Proceedings Article • DOI: 10.1145/1996014.1996021

Clause-iteration with MapReduce to scalably query datagraphs in the SHARD graph-store

[...]

Kurt Rohloff¹, Richard E. Schantz¹

BBN Technologies¹

8 Jun 2011

TL;DR: The Clause-Iteration algorithms form the basis of the scalable, SHARD graph-store built on the Hadoop implementation of MapReduce, which performs favorably when compared to existing "industrial" graph-stores on a standard benchmark graph with 800 million edges.

Abstract: Graph data processing is an emerging application area for cloud computing because there are few other information infrastructures that cost-effectively permit scalable graph data processing. We present a scalable cloud-based approach to process queries on graph data utilizing the MapReduce model. We call this approach the Clause-Iteration approach. We present algorithms that, when used in conjunction with a MapReduce framework, respond to SPARQL queries over RDF data. Our innovation in the Clause-Iteration approach comes from 1) the iterative construction of query responses by incrementally growing the number of query clauses considered in a response, and 2) our use of flagged keys to join the results of these incremental responses. The Clause-Iteration algorithms form the basis of our scalable, SHARD graph-store built on the Hadoop implementation of MapReduce. SHARD performs favorably when compared to existing "industrial" graph-stores on a standard benchmark graph with 800 million edges. We discuss design considerations and alternatives associated with constructing scalable graph processing technologies.

85 citations
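
The idea of growing a query response one clause at a time, as described in the abstract above, can be sketched in plain Python. This is a single-process analogue under simplifying assumptions: it does not use MapReduce or the paper's flagged keys, and the function names `match_clause` and `clause_iteration` are hypothetical.

```python
def match_clause(clause, triple, binding):
    """Extend `binding` so that `clause` matches `triple`.

    Terms starting with "?" are variables; all other terms are constants.
    Returns the extended binding, or None if the triple does not fit.
    """
    extended = dict(binding)
    for term, value in zip(clause, triple):
        if term.startswith("?"):
            # A variable must keep any value it was bound to earlier.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def clause_iteration(triples, clauses):
    """Build answers incrementally, considering one more clause per pass."""
    bindings = [{}]  # a single empty partial answer to start from
    for clause in clauses:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_clause(clause, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings
```

Each pass joins the partial answers from earlier clauses with the matches for the next clause on their shared variables, so the set of bindings only ever narrows or extends, never restarts.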

Patent

Sharding method and apparatus using directed graphs

[...]

Arif Merchant¹, Mahesh Kallahalla¹, Ram Swaminathan¹

Hewlett-Packard¹

16 May 2003

TL;DR: A method and apparatus divide a storage volume into shards using a directed graph that has a vertex for each block in the storage volume and directed edges between pairs of vertices, each edge representing a shard of blocks.

Abstract: A method and apparatus is used to divide a storage volume into shards (202-210). The division is made using a directed graph having a vertex for each block in the storage volume and directed-edges between pairs of vertices representing a shard of blocks (304), associating a weight with each directed edge that represents the dissimilarity for the shard of blocks between the corresponding pair of vertices (306), selecting a maximum number of shards (K) for dividing the storage volume (402), identifying a minimum aggregate weight associated with a current vertex for a combination of no more than K shards (512-514), performing the identification of the minimum aggregate weight for vertices in the directed graph (406), and picking the smallest aggregated weight associated with the last vertex to determine a sharding that spans the storage volume and provides a minimal dissimilarity among no more than K shards of blocks (408).

72 citations
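
The minimum-aggregate-weight search over at most K shards, as outlined in the abstract above, resembles a shortest-path dynamic program on a directed graph. The sketch below is one possible reading, not the patented procedure: it uses block boundaries 0..n as vertices, treats an edge (i, j) as the shard `blocks[i:j]`, and takes the dissimilarity measure as a caller-supplied function. The name `shard_volume` and its parameters are hypothetical.

```python
def shard_volume(blocks, max_shards, dissimilarity):
    """Split `blocks` into at most `max_shards` contiguous shards.

    Returns (total_weight, [(start, end), ...]) minimising the sum of
    dissimilarity(blocks[start:end]) over the chosen shards.
    """
    n = len(blocks)
    if n == 0:
        return 0.0, []
    if max_shards < 1:
        raise ValueError("max_shards must be at least 1")
    inf = float("inf")
    # best[k][j]: minimum weight to cover blocks[:j] with exactly k shards.
    best = [[inf] * (n + 1) for _ in range(max_shards + 1)]
    prev = [[None] * (n + 1) for _ in range(max_shards + 1)]
    best[0][0] = 0.0
    for k in range(1, max_shards + 1):
        for j in range(1, n + 1):
            for i in range(j):
                if best[k - 1][i] == inf:
                    continue
                cost = best[k - 1][i] + dissimilarity(blocks[i:j])
                if cost < best[k][j]:
                    best[k][j] = cost
                    prev[k][j] = i
    # Pick the smallest aggregate weight at the last vertex over all k <= K.
    k = min(range(1, max_shards + 1), key=lambda c: best[c][n])
    total = best[k][n]
    shards, j = [], n
    while k > 0:
        i = prev[k][j]
        shards.append((i, j))
        j, k = i, k - 1
    return total, shards[::-1]
```

For example, with per-block access rates as `blocks` and `lambda s: max(s) - min(s)` as an assumed dissimilarity measure, the function groups blocks with similar rates into the same shard.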

Patent

Partitioning database data in a sharded database

[...]

Cory M. Isaacson, Andrew F. Grove

4 Oct 2013

TL;DR: A sharded database system for partitioning data among a plurality of shard servers, comprising a first shard server, a second shard server, and a shard control record.

Abstract: A sharded database system configured for partitioning data amongst a plurality of shard servers is provided. In one implementation the sharded database system comprises a sharded database including a first shard server, a second shard server, and a shard control record. The shard control record is configured to define a first data structure for distributing a first plurality of data records or rows based on a first sharding by monotonic key range across the first and second shard servers. The sharded database is also configured to further distribute the first plurality of records or rows across the first shard server and the second shard server via a subsidiary hashing method. A method of partitioning data of a database is also provided.

70 citations
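
One way to picture the combination of key-range sharding with a subsidiary hash, as described in the abstract above, is a lookup that first selects a key range and then hashes the key to choose among the servers assigned to that range. This is only one possible reading of the abstract; the class `ShardControlRecord`, its fields, and the use of SHA-256 are assumptions made for the example.

```python
import bisect
import hashlib


class ShardControlRecord:
    """Toy control record: key ranges first, then a subsidiary hash."""

    def __init__(self, upper_bounds, servers_per_range):
        # upper_bounds[i] is the exclusive upper key bound of range i,
        # sorted ascending; servers_per_range[i] lists that range's servers.
        if len(upper_bounds) != len(servers_per_range):
            raise ValueError("one server list is required per range")
        if list(upper_bounds) != sorted(upper_bounds):
            raise ValueError("upper_bounds must be sorted ascending")
        self.upper_bounds = list(upper_bounds)
        self.servers_per_range = [list(s) for s in servers_per_range]

    def locate(self, key):
        # Step 1: range lookup on the monotonic key.
        idx = bisect.bisect_right(self.upper_bounds, key)
        if idx >= len(self.servers_per_range):
            raise KeyError(f"key {key!r} is beyond the last range")
        servers = self.servers_per_range[idx]
        # Step 2: a stable subsidiary hash spreads keys within the range.
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

Range sharding alone sends all of the newest monotonic keys to the same server; the hash step is what spreads those writes across several servers within the active range.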

Year	Papers
2021	26
2020	34
2019	35
2018	32
2017	30
2016	25
