About: Delta encoding is a research topic. Over its lifetime, 233 publications have been published on this topic, receiving 4572 citations. The topic is also known as: delta compression.
TL;DR: It is shown that delta encoding can provide remarkable improvements in response size and response delay for an important subset of HTTP content types, and that the combination of delta encoding and data compression yields the best results.
Abstract: Caching in the World Wide Web currently follows a naive model, which assumes that resources are referenced many times between changes. The model also provides no way to update a cache entry if a resource does change, except by transferring the resource's entire new value. Several previous papers have proposed updating cache entries by transferring only the differences, or "delta," between the cached entry and the current value. In this paper, we make use of dynamic traces of the full contents of HTTP messages to quantify the potential benefits of delta-encoded responses. We show that delta encoding can provide remarkable improvements in response size and response delay for an important subset of HTTP content types. We also show the added benefit of data compression, and that the combination of delta encoding and data compression yields the best results. We propose specific extensions to the HTTP protocol for delta encoding and data compression. These extensions are compatible with existing implementations and specifications, yet allow efficient use of a variety of encoding techniques.
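The idea of sending only the difference between a cached entry and the current value, then compressing it, can be illustrated with a minimal Python sketch. It is not the HTTP extension the paper proposes: the copy/add delta format, the JSON serialization, and the names `make_delta`, `apply_delta`, and `wire_sizes` are all assumptions made for illustration, and `difflib` and `zlib` stand in for whatever differencing and compression algorithms a real deployment would use.

```python
import difflib
import json
import zlib


def make_delta(old: str, new: str) -> list:
    """Describe `new` as copies from `old` plus literal additions (assumed format)."""
    ops = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(["copy", i1, i2])      # reuse old[i1:i2] from the cache
        elif tag in ("replace", "insert"):
            ops.append(["add", new[j1:j2]])   # ship only the new text
        # "delete" emits nothing: the old text is simply not copied
    return ops


def apply_delta(old: str, ops: list) -> str:
    """Rebuild the current value from the cached entry and the delta."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            parts.append(old[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)


def wire_sizes(old: str, new: str) -> dict:
    """Byte counts for the four transfer options discussed in the abstract."""
    full = new.encode()
    delta = json.dumps(make_delta(old, new)).encode()
    return {
        "full": len(full),
        "full+zlib": len(zlib.compress(full)),
        "delta": len(delta),
        "delta+zlib": len(zlib.compress(delta)),
    }


if __name__ == "__main__":
    cached = "".join(f"<li>item {i}: price {i * 3}</li>\n" for i in range(40))
    current = cached.replace("price 30<", "price 31<")
    assert apply_delta(cached, make_delta(cached, current)) == current
    print(wire_sizes(cached, current))
```

For very small deltas the fixed overhead of the compressed stream can outweigh its savings, so the comparison is most informative on realistic response bodies.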
TL;DR: This contribution studies the application of delta compression during the transfer of memory pages in order to increase migration throughput and thus reduce downtime, discusses some general effects of delta compression on live migration, and analyzes when it is beneficial to use this technique.
Abstract: Despite the widespread support for live migration of Virtual Machines (VMs) in current hypervisors, these have significant shortcomings when it comes to migration of certain types of VMs. More specifically, with existing algorithms, there is a high risk of service interruption when migrating VMs with high workloads and/or over low-bandwidth networks. In these cases, VM memory pages are dirtied faster than they can be transferred over the network, which leads to extended migration downtime. In this contribution, we study the application of delta compression during the transfer of memory pages in order to increase migration throughput and thus reduce downtime. The delta compression live migration algorithm is implemented as a modification to the KVM hypervisor. Its performance is evaluated by migrating VMs running different types of workloads, and the evaluation demonstrates a significant decrease in migration downtime in all test cases. In a benchmark scenario the downtime is reduced by a factor of 100. In another scenario a streaming video server is live migrated with no perceivable downtime to the clients, while the picture is frozen for eight seconds using standard approaches. Finally, in an enterprise application scenario, the delta compression algorithm successfully live migrates a very large system that fails after migration using the standard algorithm. We also discuss some general effects of delta compression on live migration and analyze when it is beneficial to use this technique.
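One plausible way to delta-compress a re-dirtied memory page is to XOR it with the copy that was already sent and run-length encode the zero bytes that result. The abstract does not state which encoding is used, so the Python sketch below is an assumption, as are the 4096-byte page size and all function names.

```python
from typing import List, Tuple

PAGE_SIZE = 4096  # assumed page size for the example


def xor_pages(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR; unchanged bytes become zero."""
    if len(a) != len(b):
        raise ValueError("pages must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def encode_zero_runs(delta: bytes) -> List[Tuple[int, bytes]]:
    """Encode as (number of zero bytes to skip, literal non-zero bytes) pairs."""
    pairs = []
    i, n = 0, len(delta)
    while i < n:
        start = i
        while i < n and delta[i] == 0:
            i += 1
        zeros = i - start
        start = i
        while i < n and delta[i] != 0:
            i += 1
        pairs.append((zeros, delta[start:i]))
    return pairs


def decode_zero_runs(pairs: List[Tuple[int, bytes]]) -> bytes:
    """Inverse of encode_zero_runs."""
    out = bytearray()
    for zeros, literal in pairs:
        out.extend(bytes(zeros))  # bytes(n) is n zero bytes
        out.extend(literal)
    return bytes(out)


def send_page(previously_sent: bytes, dirty: bytes) -> List[Tuple[int, bytes]]:
    """Source side: what would go on the wire instead of the full page."""
    return encode_zero_runs(xor_pages(previously_sent, dirty))


def receive_page(previously_received: bytes, pairs: List[Tuple[int, bytes]]) -> bytes:
    """Destination side: rebuild the dirty page from the old copy and the delta."""
    return xor_pages(previously_received, decode_zero_runs(pairs))


if __name__ == "__main__":
    old = bytes(range(256)) * (PAGE_SIZE // 256)
    new = bytearray(old)
    new[100] ^= 0xFF
    new[2000] ^= 0x01
    wire = send_page(old, bytes(new))
    assert receive_page(old, wire) == bytes(new)
    print(len(wire), "runs instead of", PAGE_SIZE, "bytes")
```

The scheme pays off only when the source still holds the previously sent version of the page and few bytes changed; a page that was rewritten entirely produces a delta as large as the page itself.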
TL;DR: This work evaluates the two main inter-file compression techniques, data chunking and delta encoding, compares them with traditional intra-file compression, and reports on experimental results from a range of representative archival data sets.
Abstract: The ever-increasing volume of archival data that need to be retained for long periods of time has motivated the design of low-cost, high-efficiency storage systems. Inter-file compression has been proposed as a technique to improve storage efficiency by exploiting the high degree of similarity among archival data. We evaluate the two main inter-file compression techniques, data chunking and delta encoding, and compare them with traditional intra-file compression. We report on experimental results from a range of representative archival data sets.
TL;DR: In this article, a redundancy elimination mechanism is proposed that divides file objects into content-defined blocks, or chunks, and applies aspects of duplicate block elimination and delta encoding at the block level.
Abstract: A redundancy elimination mechanism is provided, which applies aspects of duplicate block elimination and delta encoding at the block level. The redundancy elimination mechanism divides file objects into content-defined blocks or “chunks.” Identical chunks are suppressed. The redundancy elimination mechanism also performs resemblance detection on remaining chunks to identify chunks with sufficient redundancy to benefit from delta encoding of individual chunks. Any remaining chunks that do not benefit from delta encoding are compressed. Resemblance detection is optimized by merging groups of fingerprints into super fingerprints. This merging can be constructed to ensure that if two objects have a single super fingerprint in common, they are extremely likely to be substantially similar.
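The three stages described above (content-defined chunking, suppression of identical chunks, and resemblance detection via super fingerprints) can be sketched in Python as follows. This is an illustration under stated assumptions, not the mechanism from the source: the window size, the boundary mask, the min-hash style features, and every function name are hypothetical, and a production system would use a true rolling hash such as Rabin fingerprints instead of rehashing each window.

```python
import hashlib
from typing import Dict, List, Tuple

WINDOW = 16  # bytes examined at each position (assumed value)
MASK = 0x3F  # cut where the low 6 hash bits are zero (assumed value)


def _hash(data: bytes, size: int = 8) -> bytes:
    return hashlib.blake2b(data, digest_size=size).digest()


def split_chunks(data: bytes) -> List[bytes]:
    """Cut wherever the hash of the last WINDOW bytes matches the mask,
    so boundaries depend on content rather than on absolute offsets."""
    chunks, start = [], 0
    for end in range(WINDOW, len(data) + 1):
        value = int.from_bytes(_hash(data[end - WINDOW:end], 4), "big")
        if value & MASK == 0 and end - start >= WINDOW:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes form the last chunk
    return chunks


def suppress_duplicates(chunks: List[bytes]) -> Tuple[Dict[str, bytes], List[str]]:
    """Store each distinct chunk once; the recipe lists chunk ids in order."""
    store: Dict[str, bytes] = {}
    recipe: List[str] = []
    for chunk in chunks:
        key = hashlib.sha256(chunk).hexdigest()
        store.setdefault(key, chunk)
        recipe.append(key)
    return store, recipe


def features(chunk: bytes, count: int = 8, shingle: int = 8) -> List[bytes]:
    """For each of `count` salted hashes, keep the minimum over all shingles."""
    last = max(1, len(chunk) - shingle + 1)
    shingles = {chunk[i:i + shingle] for i in range(last)}
    return [min(_hash(bytes([salt]) + s) for s in shingles) for salt in range(count)]


def super_fingerprints(chunk: bytes, count: int = 8, group: int = 4) -> List[str]:
    """Merge groups of features into super fingerprints. Two chunks that share
    one are likely (not guaranteed) to be similar enough for delta encoding."""
    feats = features(chunk, count)
    return [
        hashlib.sha256(b"".join(feats[i:i + group])).hexdigest()
        for i in range(0, count, group)
    ]


def resembles(a: bytes, b: bytes) -> bool:
    """True when the two chunks have at least one super fingerprint in common."""
    return bool(set(super_fingerprints(a)) & set(super_fingerprints(b)))
```

Grouping several features into one super fingerprint trades recall for precision: a match requires every feature in the group to agree, which is why a single shared super fingerprint is strong evidence of similarity.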
TL;DR: In this paper, the authors present a new technique for replicating backup datasets across a wide area network (WAN) that not only eliminates duplicate regions of files (deduplication) but also compresses similar areas of files with delta compression, which is available as a feature of EMC Data Domain systems.
Abstract: Replicating data off site is critical for disaster recovery reasons, but the current approach of transferring tapes is cumbersome and error prone. Replicating across a wide area network (WAN) is a promising alternative, but fast network connections are expensive or impractical in many remote locations, so improved compression is needed to make WAN replication truly practical. We present a new technique for replicating backup datasets across a WAN that not only eliminates duplicate regions of files (deduplication) but also compresses similar regions of files with delta compression, which is available as a feature of EMC Data Domain systems. Our main contribution is an architecture that adds stream-informed delta compression to already existing deduplication systems and eliminates the need for new, persistent indexes. Unlike techniques based on knowing a file's version or that use a memory cache, our approach achieves delta compression across all data replicated to a server at any time in the past. From a detailed analysis of datasets and statistics from hundreds of customers using our product, we achieve an additional 2X compression from delta compression beyond deduplication and local compression, which enables customers to replicate data that would otherwise fail to complete within their backup window.
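The layering described above (deduplication first, then delta compression against a similar region, with local compression as the fallback) can be sketched in Python. The abstract does not name the delta encoder or describe the on-wire format, so this example makes assumptions throughout: zlib's preset-dictionary feature stands in for the delta encoder, the `("ref" | "delta" | "raw", fingerprint, payload)` tuples are a hypothetical format, and the choice of a similar base chunk is left to the caller.

```python
import hashlib
import zlib
from typing import Dict, Optional, Tuple


def delta_compress(target: bytes, base: bytes) -> bytes:
    """Compress `target` with `base` as a preset dictionary, so byte runs
    shared with the base are encoded as short back-references."""
    compressor = zlib.compressobj(zdict=base)
    return compressor.compress(target) + compressor.flush()


def delta_decompress(payload: bytes, base: bytes) -> bytes:
    """Inverse of delta_compress; the receiver must hold the same base."""
    decompressor = zlib.decompressobj(zdict=base)
    return decompressor.decompress(payload) + decompressor.flush()


def encode_chunk(
    chunk: bytes,
    known: Dict[str, bytes],
    base: Optional[bytes] = None,
) -> Tuple[str, str, bytes]:
    """Pick the cheapest representation for one chunk (hypothetical format).

    known: chunks the destination already holds, keyed by fingerprint.
    base:  a similar chunk chosen by the caller, if one was found.
    """
    fingerprint = hashlib.sha256(chunk).hexdigest()
    if fingerprint in known:
        return ("ref", fingerprint, b"")  # duplicate: send the id only
    known[fingerprint] = chunk
    if base is not None:
        return ("delta", fingerprint, delta_compress(chunk, base))
    return ("raw", fingerprint, zlib.compress(chunk))  # local compression


if __name__ == "__main__":
    base = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(32))
    similar = bytearray(base)
    similar[500] ^= 0xFF
    similar = bytes(similar)
    print("local:", len(zlib.compress(similar)),
          "delta:", len(delta_compress(similar, base)))
```

Because zlib only looks back 32 KB, this stand-in suits small chunks; larger regions would need a dedicated delta encoder.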