Data consistency

Papers published on a yearly basis

Papers

Journal Article•10.1145/360363.360369•

The notions of consistency and predicate locks in a database system

Kapali P. Eswaran¹, Jim Gray¹, Raymond A. Lorie¹, Irving L. Traiger¹•Institutions (1)

IBM¹

01 Nov 1976-Communications of The ACM

TL;DR: It is argued that a transaction needs to lock a logical rather than a physical subset of the database, and an implementation of predicate locks which satisfies the consistency condition is suggested.

Abstract: In database systems, users access shared data under the assumption that the data satisfies certain consistency constraints. This paper defines the concepts of transaction, consistency and schedule and shows that consistency requires that a transaction cannot request new locks after releasing a lock. Then it is argued that a transaction needs to lock a logical rather than a physical subset of the database. These subsets may be specified by predicates. An implementation of predicate locks which satisfies the consistency condition is suggested.

2,140 citations
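
The two ideas in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the operation encoding, the function names, and the idea of testing predicate overlap against a finite list of candidate tuples are all assumptions made for illustration. The first function checks the stated condition that a transaction requests no new lock after releasing one; the second treats two predicate locks as conflicting when at least one is a write and some candidate tuple satisfies both predicates.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Op = Tuple[str, str]            # (action, item); action is "lock" or "unlock"
Row = Dict[str, object]         # hypothetical tuple representation
Predicate = Callable[[Row], bool]


def is_two_phase(ops: Iterable[Op]) -> bool:
    """Return True if no lock is requested after the first unlock."""
    released = False
    for action, _item in ops:
        if action == "unlock":
            released = True
        elif action == "lock" and released:
            return False
    return True


def predicates_conflict(p: Predicate, q: Predicate, candidates: List[Row],
                        p_writes: bool, q_writes: bool) -> bool:
    """Assumed conflict test: one side writes and some tuple satisfies both."""
    if not (p_writes or q_writes):
        return False            # two read locks never conflict
    return any(p(t) and q(t) for t in candidates)


well_formed = [("lock", "a"), ("lock", "b"), ("unlock", "a"), ("unlock", "b")]
violating = [("lock", "a"), ("unlock", "a"), ("lock", "b"), ("unlock", "b")]

in_sales = lambda t: t["dept"] == "sales"
high_paid = lambda t: t["salary"] > 50000
candidates = [{"dept": "sales", "salary": 60000},
              {"dept": "hr", "salary": 40000}]

print(is_two_phase(well_formed), is_two_phase(violating))
print(predicates_conflict(in_sales, high_paid, candidates, True, False))
```

Locking by predicate rather than by existing record means a lock can also cover tuples that do not exist yet, which is why the sketch evaluates predicates on candidate tuples instead of on stored rows.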

Journal Article•10.1038/NSB0497-269•

Improved R-factors for diffraction data analysis in macromolecular crystallography

Kay Diederichs¹, P. Andrew Karplus²•Institutions (2)

University of Konstanz¹, Cornell University²

01 Apr 1997-Nature Structural & Molecular Biology

TL;DR: It is proved that Rsym is seriously flawed, because it has an implicit dependence on the redundancy of the data, and a corrected R-factor, Rmeas, is introduced as the equivalent robust indicator of data consistency.

Abstract: The quantity Rsym (also called Rmerge) is almost universally used for describing X-ray diffraction data quality. Here, we prove that Rsym is seriously flawed, because it has an implicit dependence on the redundancy of the data. A corrected R-factor, Rmeas, is introduced as the equivalent robust indicator of data consistency. In addition, we introduce Rmrgd, an R-factor that reflects the gain in accuracy upon averaging of equivalent reflections, as a useful indicator of the quality of reduced data. These new data quality indicators better reveal the benefits of highly redundant data and should stimulate improvements in data quality through increased merging of data from multiple crystals.

903 citations
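
The redundancy dependence described in the abstract above can be made concrete with a short Python sketch. The abstract itself does not give the formulas; the code assumes the definitions as they are usually stated, where Rmerge sums |I_i - <I>| over all measurements of each unique reflection and divides by the summed intensities, and Rmeas weights each reflection's deviation sum by sqrt(n / (n - 1)), with n the number of measurements of that reflection. The function name, the dictionary layout, and the choice to skip reflections measured only once are assumptions.

```python
import math
from typing import Dict, List, Tuple


def r_factors(obs: Dict[str, List[float]]) -> Tuple[float, float]:
    """Return (Rmerge, Rmeas) for intensities grouped by unique reflection."""
    num_merge = 0.0
    num_meas = 0.0
    denom = 0.0
    for intensities in obs.values():
        n = len(intensities)
        if n < 2:
            continue  # assumption: singly measured reflections are skipped
        mean = sum(intensities) / n
        dev = sum(abs(i - mean) for i in intensities)
        num_merge += dev
        num_meas += math.sqrt(n / (n - 1)) * dev  # redundancy correction
        denom += sum(intensities)
    return num_merge / denom, num_meas / denom


# Hypothetical data: one reflection measured twice, one measured four times.
observations = {
    "1 0 0": [100.0, 110.0],
    "0 1 0": [50.0, 55.0, 45.0, 50.0],
}
r_merge, r_meas = r_factors(observations)
print(f"Rmerge = {r_merge:.4f}, Rmeas = {r_meas:.4f}")
```

Because sqrt(n / (n - 1)) is largest for n = 2 and approaches 1 as n grows, the correction matters most for low-redundancy data, which is where the uncorrected value looks misleadingly good.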

Proceedings Article•

NOVA: a log-structured file system for hybrid volatile/non-volatile main memories

Jian Xu¹, Steven Swanson¹•Institutions (1)

University of California, San Diego¹

22 Feb 2016

TL;DR: NOVA is presented, a file system designed to maximize performance on hybrid memory systems while providing strong consistency guarantees, which adapts conventional log-structured file system techniques to exploit the fast random access that NVMs provide.

Abstract: Fast non-volatile memories (NVMs) will soon appear on the processor memory bus alongside DRAM. The resulting hybrid memory systems will provide software with sub-microsecond, high-bandwidth access to persistent data, but managing, accessing, and maintaining consistency for data stored in NVM raises a host of challenges. Existing file systems built for spinning or solid-state disks introduce software overheads that would obscure the performance that NVMs should provide, but proposed file systems for NVMs either incur similar overheads or fail to provide the strong consistency guarantees that applications require. We present NOVA, a file system designed to maximize performance on hybrid memory systems while providing strong consistency guarantees. NOVA adapts conventional log-structured file system techniques to exploit the fast random access that NVMs provide. In particular, it maintains separate logs for each inode to improve concurrency, and stores file data outside the log to minimize log size and reduce garbage collection costs. NOVA's logs provide metadata, data, and mmap atomicity and focus on simplicity and reliability, keeping complex metadata structures in DRAM to accelerate lookup operations. Experimental results show that in write-intensive workloads, NOVA provides 22% to 216× throughput improvement compared to state-of-the-art file systems, and 3.1× to 13.5× improvement compared to file systems that provide equally strong data consistency guarantees.

547 citations
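
The layout described in the NOVA abstract above (one log per inode, file data stored outside the log) can be sketched as a toy in-memory model in Python. This is not NOVA's code or on-media format: the class names, the (page, block id) entry layout, and the commit-by-advancing-a-tail step are assumptions for illustration, and nothing here models persistence, cache flushes, or garbage collection.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Inode:
    # Each log entry is (file page number, data block id): metadata only.
    log: List[Tuple[int, int]] = field(default_factory=list)
    tail: int = 0  # number of committed entries (assumed commit point)


class ToyLogFS:
    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}   # file data lives outside the logs
        self.inodes: Dict[int, Inode] = {}   # one independent log per inode
        self.next_block = 0

    def write(self, ino: int, page: int, data: bytes) -> None:
        inode = self.inodes.setdefault(ino, Inode())
        block_id = self.next_block
        self.next_block += 1
        self.blocks[block_id] = data          # 1. write data to a fresh block
        inode.log.append((page, block_id))    # 2. append a small log entry
        inode.tail = len(inode.log)           # 3. commit by moving the tail

    def read(self, ino: int, page: int) -> Optional[bytes]:
        inode = self.inodes.get(ino)
        if inode is None:
            return None
        # The newest committed entry for the page wins.
        for logged_page, block_id in reversed(inode.log[:inode.tail]):
            if logged_page == page:
                return self.blocks[block_id]
        return None


fs = ToyLogFS()
fs.write(1, 0, b"old")
fs.write(1, 0, b"new")
fs.write(2, 0, b"other file")
print(fs.read(1, 0), fs.read(2, 0))
```

Writes to inode 1 and inode 2 touch different logs, which is the property the abstract credits for improved concurrency, and the log entries stay small because they carry block ids rather than data.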

Proceedings Article•

Improving data quality: consistency and accuracy

Gao Cong¹, Wenfei Fan², Floris Geerts³, Xibei Jia², Shuai Ma²•Institutions (3)

Microsoft¹, University of Edinburgh², Transnational University Limburg³

23 Sep 2007

TL;DR: This paper proposes two algorithms: one for automatically computing a repair D' that satisfies a given set of CFDs, and the other for incrementally finding a repair in response to updates to a clean database.

Abstract: Two central criteria for data quality are consistency and accuracy. Inconsistencies and errors in a database often emerge as violations of integrity constraints. Given a dirty database D, one needs automated methods to make it consistent, i.e., find a repair D' that satisfies the constraints and "minimally" differs from D. Equally important is to ensure that the automatically-generated repair D' is accurate, or makes sense, i.e., D' differs from the "correct" data within a predefined bound. This paper studies effective methods for improving both data consistency and accuracy. We employ a class of conditional functional dependencies (CFDs) proposed in [6] to specify the consistency of the data, which are able to capture inconsistencies and errors beyond what their traditional counterparts can catch. To improve the consistency of the data, we propose two algorithms: one for automatically computing a repair D' that satisfies a given set of CFDs, and the other for incrementally finding a repair in response to updates to a clean database. We show that both problems are intractable. Although our algorithms are necessarily heuristic, we experimentally verify that the methods are effective and efficient. Moreover, we develop a statistical method that guarantees that the repairs found by the algorithms are accurate above a predefined rate without incurring excessive user interaction.

382 citations
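
To show how a conditional functional dependency can catch errors that a plain functional dependency would miss, as the abstract above claims, here is a minimal Python sketch of violation detection. It covers detection only, not the paper's repair algorithms. The encoding of a CFD as a left-hand pattern dictionary plus a right-hand (attribute, pattern) pair, the `_` wildcard, the function name, and the sample rows are all assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Dict[str, str]
WILDCARD = "_"  # matches any value


def cfd_violations(rows: List[Row], lhs: Dict[str, str],
                   rhs: Tuple[str, str]) -> List[int]:
    """Return indices of rows that violate the CFD (lhs pattern -> rhs)."""
    rhs_attr, rhs_pattern = rhs
    groups = defaultdict(list)
    for idx, row in enumerate(rows):
        # Only rows matching the left-hand pattern are constrained.
        if all(p == WILDCARD or row[a] == p for a, p in lhs.items()):
            groups[tuple(row[a] for a in lhs)].append(idx)
    bad = set()
    for idxs in groups.values():
        if len({rows[i][rhs_attr] for i in idxs}) > 1:
            bad.update(idxs)  # same left-hand values, different right-hand value
        if rhs_pattern != WILDCARD:
            bad.update(i for i in idxs if rows[i][rhs_attr] != rhs_pattern)
    return sorted(bad)


rows = [
    {"country": "UK", "zip": "EH1", "street": "Mayfield"},
    {"country": "UK", "zip": "EH1", "street": "Crichton"},
    {"country": "US", "zip": "07974", "street": "Mountain Ave"},
    {"country": "US", "zip": "07974", "street": "Main St"},
]
# Hypothetical CFD: within the UK only, zip determines street.
print(cfd_violations(rows, {"country": "UK", "zip": WILDCARD},
                     ("street", WILDCARD)))
```

The two US rows also disagree on street for the same zip, but they are not flagged because the pattern restricts the dependency to UK rows; an unconditional dependency zip -> street would have flagged all four.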

Proceedings Article•10.5555/2750482.2750495•

NV-Tree: reducing consistency cost for NVM-based single level systems

Jun Yang¹, Qingsong Wei¹, Cheng Chen¹, Chundong Wang¹, Khai Leong Yong¹, Bingsheng He²•Institutions (2)

Data Storage Institute¹, Nanyang Technological University²

16 Feb 2015

TL;DR: NV-Tree, a consistent and cache-optimized B+Tree variant with reduced CPU cacheline flush, and NV-Store, a key-value store based on NV-Tree, are implemented and evaluated on an NVDIMM server.

Abstract: Non-volatile memory (NVM) has DRAM-like performance and disk-like persistency, which make it possible to replace both disk and DRAM to build single level systems. Keeping data consistent in such systems is non-trivial because memory writes may be reordered by the CPU and memory controller. In this paper, we study the consistency cost for an important and common data structure, B+Tree. Although the memory fence and CPU cacheline flush instructions can order memory writes to achieve data consistency, they introduce a significant overhead (more than 10X slower in performance). Based on our quantitative analysis of consistency cost, we propose NV-Tree, a consistent and cache-optimized B+Tree variant with reduced CPU cacheline flush. We implement and evaluate NV-Tree and NV-Store, a key-value store based on NV-Tree, on an NVDIMM server. NV-Tree outperforms the state-of-the-art consistent tree structures by up to 12X under write-intensive workloads. NV-Store increases the throughput by up to 4.8X under YCSB workloads compared to Redis.

356 citations

Year	Papers
2025	20
2024	26
2023	40
2022	64
2021	32
2020	51
