About: Disk mirroring is a research topic. Over its lifetime, 1370 publications have been published within this topic, receiving 38318 citations. The topic is also known as: mirroring.
TL;DR: Five levels of RAID are introduced with their relative cost/performance, and RAID is compared to an IBM 3380 and a Fujitsu Super Eagle.
Abstract: Increasing performance of CPUs and memories will be squandered if not matched by a similar performance increase in I/O. While the capacity of Single Large Expensive Disks (SLED) has grown rapidly, the performance improvement of SLED has been modest. Redundant Arrays of Inexpensive Disks (RAID), based on the magnetic disk technology developed for personal computers, offers an attractive alternative to SLED, promising improvements of an order of magnitude in performance, reliability, power consumption, and scalability. This paper introduces five levels of RAIDs, giving their relative cost/performance, and compares RAID to an IBM 3380 and a Fujitsu Super Eagle.
TL;DR: A disk drive system and method capable of dynamically allocating data is provided, in which a RAID subsystem and disk manager dynamically allocate data across the pool of storage and a plurality of disk drives based on RAID-to-disk mapping.
Abstract: A disk drive system and method capable of dynamically allocating data is provided. The disk drive system may include a RAID subsystem having a pool of storage, for example a page pool of storage that maintains a free list of RAIDs, or a matrix of disk storage blocks that maintain a null list of RAIDs, and a disk manager having at least one disk storage system controller. The RAID subsystem and disk manager dynamically allocate data across the pool of storage and a plurality of disk drives based on RAID-to-disk mapping. The RAID subsystem and disk manager determine whether additional disk drives are required, and a notification is sent if the additional disk drives are required. Dynamic data allocation and data progression allow a user to acquire a disk drive later in time when it is needed. Dynamic data allocation also allows efficient data storage of snapshots/point-in-time copies of virtual volume pool of storage, instant data replay and data instant fusion for data backup, recovery etc., remote data storage, and data progression, etc.
TL;DR: This paper presents a detailed characterization of low-level disk accesses on three different systems over a two-month period, and analyzes optimizations that could be applied at the disk level to improve performance.
Abstract: Disk access patterns are becoming ever more important to understand as the gap between processor and disk performance increases. The study presented here is a detailed characterization of every low-level disk access generated by three quite different systems over a two-month period. The contributions of this paper are the detailed information we provide about the disk accesses on these systems (many of our results are significantly different from those reported in the literature, which provide summary data only for file-level access on small-memory systems); and the analysis of a set of optimizations that could be applied at the disk level to improve performance. Our traces show that the majority of all operations are writes; disk accesses are rarely sequential; 25-50% of all accesses are asynchronous; only 13-41% of accesses are to user data (the rest result from swapping, metadata, and program execution); and I/O activity is very bursty: mean request queue lengths seen by an incoming request range from 1.7 to 8.9 (1.2-1.9 for reads, 2.0-14.8 for writes), while we saw 95th percentile queue lengths as large as 89 entries, and maxima of over 1000. Using a simulator to analyze the effect of write caching at the disk level, we found that using a small non-volatile cache at each disk allowed writes to be serviced considerably faster than with a regular disk. In particular, short bursts of writes go much faster, and such bursts are common: writes rarely come singly. Adding even 8 KB of non-volatile memory per disk could reduce disk traffic by 10-18%, and 90% of metadata write traffic can be absorbed with as little as 0.2 MB per disk of non-volatile RAM. Even 128 KB of NVRAM cache in each disk can improve write performance by as much as a factor of three. FCFS scheduling for the cached writes gave better performance than a more advanced technique at small cache sizes.
Our results provide quantitative input to people investigating improved file system designs (such as log-based ones), as well as to I/O subsystem and disk controller designers.
TL;DR: A data processing system with a RAID cache disk subsystem utilizes three RAID cache disk controllers to provide increased performance along with increased reliability, especially in the event of a failure of one of the disk controllers.
Abstract: A data processing system with a RAID cache disk subsystem utilizes three RAID cache disk controllers to provide increased performance along with increased reliability, especially in the event of a failure of one of the disk controllers. Disk writes are mirrored in two disk controllers in order to guarantee integrity in the event of a disk controller or interface failure. Typically this write caching must be terminated when one of the controllers fails in order to maintain integrity. In the present invention, write caching continues utilizing the two remaining disk controllers.
TL;DR: This dissertation presents analytic models for disk-array lifetime, evaluates these against event-driven simulation, and applies them to an example redundant disk array, showing that a 10% overhead for an N + 1-parity encoding plus a 10% overhead for on-line spares can provide higher reliability than the 100% overhead of conventional mirrored disks.
Abstract: During the past decade, advances in processor and memory technology have given rise to increases in computational performance that far outstrip increases in the performance of secondary storage technology. Coupled with emerging small-disk technology, disk arrays provide the cost, volume, and capacity of current disk subsystems but, by leveraging parallelism, many times their performance. Unfortunately, arrays of small disks may have much higher failure rates than the single large disks they replace. Redundant Arrays of Inexpensive Disks (RAID) use simple redundancy schemes to provide high data reliability. This dissertation investigates the data encoding, performance, and reliability of redundant disk arrays.
Organizing redundant data into a disk array is treated as a coding problem in this dissertation. Among alternatives examined, codes as simple as parity are shown to effectively correct single, self-identifying disk failures.
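As a minimal sketch of why a code as simple as parity can correct a single, self-identifying disk failure: the parity block is the byte-wise XOR of the data blocks, so once the failed disk is known, XOR-ing the surviving blocks with the parity block yields the missing block. The function names and the sample blocks below are illustrative assumptions, not taken from the dissertation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_block(data_blocks):
    """Parity for one stripe: the XOR of all data blocks (hypothetical helper)."""
    return xor_blocks(data_blocks)


def reconstruct(surviving_blocks, parity):
    """Rebuild the one missing block from the survivors plus parity.

    This only works because the failure is self-identifying: the caller
    already knows which disk is missing and passes in the others.
    """
    return xor_blocks(list(surviving_blocks) + [parity])


# Illustrative stripe across three data disks (made-up contents).
data = [b"disk0 bl", b"disk1 bl", b"disk2 bl"]
parity = parity_block(data)

failed = 1  # index of the disk assumed to have failed
survivors = [blk for i, blk in enumerate(data) if i != failed]
recovered = reconstruct(survivors, parity)
```

Parity alone cannot locate an error; it recovers the data here only because the disk reports its own failure.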
The performance advantages of striping data across multiple disks are reviewed in this dissertation. For large transfers this parallelism reduces response time. Striping data also automatically distributes independent, small accesses across disks to increase throughput. This dissertation evaluates the performance lost to the maintenance of redundant data. This loss is negligible for large transfers but can be significant for small writes because of increases in aggregate disk service time.
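The contrast between large transfers and small writes can be sketched with a simple I/O count. The sketch assumes round-robin block striping and a single parity block per stripe that is updated by read-modify-write; this layout and the helper names are assumptions chosen for illustration, and the dissertation may analyze other organizations.

```python
def locate(logical_block, num_data_disks):
    """Round-robin striping: map a logical block to (disk index, offset on disk)."""
    return logical_block % num_data_disks, logical_block // num_data_disks


def small_write_ios(redundant):
    """Disk accesses needed to update one block.

    With parity (assumed read-modify-write): read old data, read old parity,
    write new data, write new parity. Without redundancy: one write.
    """
    return 4 if redundant else 1


def full_stripe_write_ios(num_data_disks, redundant):
    """Disk accesses needed to write a whole stripe.

    Parity is computed from the new data alone, so redundancy adds only
    one extra write per stripe.
    """
    return num_data_disks + (1 if redundant else 0)


disks = 4  # assumed number of data disks
small_overhead = small_write_ios(True) / small_write_ios(False)
large_overhead = full_stripe_write_ios(disks, True) / full_stripe_write_ios(disks, False)
```

Under these assumptions the redundancy overhead of a full-stripe write shrinks as the number of data disks grows, while a small write always costs four accesses instead of one.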
Because disk arrays include redundancy to protect against the high failure rates caused by large numbers of disk components, it is crucial that disk failures be characterized. This dissertation provides evidence that disk lifetimes can be modeled as exponential random variables.
Building on an exponential model for disk lifetimes, this dissertation presents analytic models for disk-array lifetime, evaluates these against event-driven simulation, and applies them to an example redundant disk array. These models incorporate the effects of independent and dependent disk failures (shared support hardware) as well as the effects of on-line spare disks. For the example redundant disk array, these models show that a 10% overhead for an N + 1-parity encoding plus a 10% overhead for on-line spares can provide higher reliability than the 100% overhead of conventional mirrored disks.
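To illustrate how an exponential lifetime model leads to a closed-form array lifetime, the sketch below uses the common textbook approximation for a group that tolerates one failure: data is lost when a second disk in the group fails while the first is being repaired. It covers independent failures only and is not the dissertation's model, which also accounts for dependent failures. All numeric values are assumptions for illustration; in particular, the short repair time for the parity array stands in for the effect of on-line spares.

```python
def group_mttdl(disk_mttf_h, repair_h, disks_in_group):
    """Approximate mean time to data loss (hours) for one redundancy group.

    Assumes independent, exponentially distributed disk lifetimes and a
    repair time much shorter than the disk MTTF.
    """
    n = disks_in_group
    return disk_mttf_h ** 2 / (n * (n - 1) * repair_h)


def array_mttdl(disk_mttf_h, repair_h, disks_in_group, num_groups):
    """An array of independent groups loses data when any one group does."""
    return group_mttdl(disk_mttf_h, repair_h, disks_in_group) / num_groups


MTTF_H = 150_000.0  # assumed per-disk mean time to failure, in hours

# 100 data disks as ten 10+1 parity groups; an on-line spare is assumed to
# cut the repair window to 1 hour.
parity_with_spares = array_mttdl(MTTF_H, repair_h=1.0, disks_in_group=11, num_groups=10)

# 100 data disks as 100 mirrored pairs; replacement is assumed manual, 24 hours.
mirrored = array_mttdl(MTTF_H, repair_h=24.0, disks_in_group=2, num_groups=100)
```

With equal repair times this simple model favors mirroring, so the comparison is sensitive to the assumed repair window, which is the quantity that on-line spares shorten.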