Failover

Papers published on a yearly basis

Papers

Patent

High-availability cluster virtual server system

Omar M. A. Gadir, Kartik Subbanna, Ananda R. Vayyala, Hariprasad Shanmugam, Amod P. Bodas, Tarun Kumar Tripathy, Ravi S. Indurkar, Kurma H. Rao

23 Jul 2001

TL;DR: This patent proposes assigning failover priorities to virtual servers in a cluster of two or more autonomous server nodes. Each virtual server has one or more virtual IP addresses, and load balancing can be provided by distributing the virtual servers of a failed node across multiple other nodes.

Abstract: Systems and methods, including computer program products, providing high availability in server systems. In one implementation, a server system is a cluster of two or more autonomous server nodes, each running one or more virtual servers. When a node fails, its virtual servers are migrated to one or more other nodes. Connectivity between nodes and clients is based on virtual IP addresses, where each virtual server has one or more virtual IP addresses. Virtual servers can be assigned failover priorities, and, in failover, higher-priority virtual servers can be migrated before lower-priority ones. Load balancing can be provided by distributing virtual servers from a failed node to multiple different nodes. When a port within a node fails, the node can reassign virtual IP addresses from the failed port to other ports on the node until no good ports remain, and only then migrate virtual servers to another node or nodes.
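
The node-level behaviour described in this abstract (migrate higher-priority virtual servers first, and spread them over several surviving nodes) can be illustrated with a minimal Python sketch. This is not the patented implementation. The class names, the `fail_over` function, the "higher number means higher priority" convention, and the "fewest hosted servers" placement rule are all assumptions made for illustration, and the port-level reassignment step is left out.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualServer:
    name: str
    priority: int            # assumed convention: larger value = migrated earlier
    virtual_ips: list[str]   # each virtual server owns one or more virtual IPs


@dataclass
class Node:
    name: str
    healthy: bool = True
    servers: list[VirtualServer] = field(default_factory=list)


def fail_over(failed: Node, cluster: list[Node]) -> dict[str, str]:
    """Move every virtual server off `failed`.

    Returns a mapping {virtual server name: name of the node that took it over}.
    """
    failed.healthy = False
    survivors = [n for n in cluster if n.healthy and n is not failed]
    if not survivors:
        raise RuntimeError("no healthy node left to take over")

    placement: dict[str, str] = {}
    # Higher-priority virtual servers are migrated before lower-priority ones.
    for vs in sorted(failed.servers, key=lambda v: v.priority, reverse=True):
        # Hypothetical load-balancing rule: choose the survivor that currently
        # hosts the fewest virtual servers (ties go to the earlier node).
        target = min(survivors, key=lambda n: len(n.servers))
        target.servers.append(vs)
        placement[vs.name] = target.name
    failed.servers = []
    return placement
```

With three nodes, where the failing node hosts servers of priority 10, 5, and 1, the servers are processed in that order and end up on more than one surviving node rather than all on a single standby.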

1,351 citations

Journal Article • DOI: 10.14778/1454159.1454167

PNUTS: Yahoo!'s hosted data serving platform

Brian F. Cooper¹, Raghu Ramakrishnan¹, Utkarsh Srivastava¹, Adam Silberstein¹, Philip Bohannon¹, Hans-Arno Jacobsen¹, Nick Puz¹, Daniel Weaver¹, Ramana Yerneni¹

Yahoo!¹

1 Aug 2008

TL;DR: PNUTS provides data storage organized as hashed or ordered tables, low latency for large numbers of concurrent requests including updates and queries, and novel per-record consistency guarantees. It uses automated load balancing and failover to reduce operational complexity.

Abstract: We describe PNUTS, a massively parallel and geographically distributed database system for Yahoo!'s web applications. PNUTS provides data storage organized as hashed or ordered tables, low latency for large numbers of concurrent requests including updates and queries, and novel per-record consistency guarantees. It is a hosted, centrally managed, and geographically distributed service, and utilizes automated load-balancing and failover to reduce operational complexity. The first version of the system is currently serving in production. We describe the motivation for PNUTS and the design and implementation of its table storage and replication layers, and then present experimental results.

1,182 citations

Proceedings Article

Megastore: Providing Scalable, Highly Available Storage for Interactive Services

Jason Baker¹, Christopher N. Bond¹, James C. Corbett¹, J. J. Furman¹, Andrey Khorlin¹, James Larson¹, Jean-Michel Leon¹, Yawei Li¹, Alexander Lloyd¹, Vadim Yushprakh¹

Google¹

1 Jan 2011

TL;DR: Megastore provides fully serializable ACID semantics within fine-grained partitions of data, which allows it to synchronously replicate each write across a wide area network with reasonable latency and to support seamless failover between datacenters.

Abstract: Megastore is a storage system developed to meet the requirements of today’s interactive online services. Megastore blends the scalability of a NoSQL datastore with the convenience of a traditional RDBMS in a novel way, and provides both strong consistency guarantees and high availability. We provide fully serializable ACID semantics within fine-grained partitions of data. This partitioning allows us to synchronously replicate each write across a wide area network with reasonable latency and support seamless failover between datacenters. This paper describes Megastore’s semantics and replication algorithm. It also describes our experience supporting a wide range of Google production services built with Megastore.

849 citations

Patent

Failover processing in a storage system

Richard Meyer, Kumar Gajjar, Chan Ng, Andrey Gusev

13 Feb 2002

TL;DR: This patent describes a failover recovery approach that encapsulates the knowledge of failover recovery between components within a storage server and between storage server systems, including which components participate in a Failover Set, how they are configured for failover, what the Fail-Stop policy is, and which steps to perform when "failing over" a component.

Abstract: Failover processing in a storage server system utilizes policies for managing fault tolerance (FT) and high availability (HA) configurations. The approach encapsulates the knowledge of failover recovery between components within a storage server and between storage server systems. This knowledge includes information about which components participate in a Failover Set, how they are configured for failover, what the Fail-Stop policy is, and which steps to perform when “failing over” a component.

506 citations

Proceedings Article • DOI: 10.1145/2619239.2626309

Fastpass: a centralized "zero-queue" datacenter network

Jonathan Perry¹, Amy Ousterhout¹, Hari Balakrishnan¹, Devavrat Shah¹, Hans Fugal²

Massachusetts Institute of Technology¹, Facebook²

17 Aug 2014

TL;DR: This paper describes Fastpass, a datacenter network architecture in which a centralized arbiter decides when each packet is transmitted and which path it follows. Fastpass achieves high throughput comparable to current networks at a 240x reduction in queue lengths, and much fairer and more consistent flow throughputs than the baseline TCP.

Abstract: An ideal datacenter network should provide several properties, including low median and tail latency, high utilization (throughput), fair allocation of network resources between users or applications, deadline-aware scheduling, and congestion (loss) avoidance. Current datacenter networks inherit the principles that went into the design of the Internet, where packet transmission and path selection decisions are distributed among the endpoints and routers. Instead, we propose that each sender should delegate to a centralized arbiter the control of when each packet should be transmitted and what path it should follow. This paper describes Fastpass, a datacenter network architecture built using this principle. Fastpass incorporates two fast algorithms: the first determines the time at which each packet should be transmitted, while the second determines the path to use for that packet. In addition, Fastpass uses an efficient protocol between the endpoints and the arbiter and an arbiter replication strategy for fault-tolerant failover. We deployed and evaluated Fastpass in a portion of Facebook's datacenter network. Our results show that Fastpass achieves high throughput comparable to current networks at a 240x reduction in queue lengths (4.35 Mbytes reducing to 18 Kbytes), achieves much fairer and more consistent flow throughputs than the baseline TCP (5200x reduction in the standard deviation of per-flow throughput with five concurrent connections), scales from 1 to 8 cores in the arbiter implementation with the ability to schedule 2.21 Terabits/s of traffic in software on eight cores, and yields a 2.5x reduction in the number of TCP retransmissions in a latency-sensitive service at Facebook.

488 citations

Year	Papers
2025	45
2024	38
2023	46
2022	64
2021	50
2020	118

Related Topics (5)

Performance Metrics