TL;DR: A decoding scheme for Huffman codes is proposed, which requires only a few computations per codeword, independent of the number of codewords n, the height of the Huffman tree h, or the length of a codeword.
Abstract: Huffman (1952) codes are widely used in image and video compression. We propose a decoding scheme for Huffman codes that requires only a few computations per codeword, independent of the number of codewords n, the height of the Huffman tree h, or the length of a codeword. The memory requirement of the proposed scheme depends on the Huffman tree; for sparse Huffman trees (JPEG, H.263, MPEG), it is O(n).
TL;DR: In this paper, the variable length coding process is performed with reference to a variable length code table showing how variable length codes are allocated, and, in a comparison process between an event derived from the quantized transform coefficients and a reference event included in the variable length code table, a transformation process is performed to increase the possibility of performing variable length coding with satisfactory coding efficiency.
Abstract: In an image coding method of the present invention, after a process such as DCT is performed on digital image data, a quantization process is performed, and then a variable length coding process is performed on the resultant quantized transform coefficients with reference to a variable length code table showing how variable length codes are allocated, and, in a comparison process between an event derived from the quantized transform coefficients and a reference event included in the variable length code table, a transformation process is performed to increase the possibility of performing variable length coding with satisfactory coding efficiency.
TL;DR: With this algorithm, a compression ratio higher than that of the Lossless JPEG method for 512×512 images can be obtained and the newly proposed algorithm provides a good means for lossless image compression.
Abstract: A novel lossless image-compression scheme is proposed in this paper. A two-stage structure is embedded in this scheme. A linear predictor is used to decorrelate the raw image data in the first stage. Then in the second stage, an effective scheme based on the Huffman coding method is developed to encode the residual image. This newly proposed scheme could reduce the cost for the Huffman coding table while achieving high compression ratio. With this algorithm, a compression ratio higher than that of the Lossless JPEG method for 512×512 images can be obtained. In other words, the newly proposed algorithm provides a good means for lossless image compression.
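The first stage described above can be illustrated with a minimal sketch. It assumes the simplest possible linear predictor (each pixel is predicted by its left neighbour), which is not necessarily the predictor used in the paper; the function names and the sample row are hypothetical. The empirical entropy is computed only to show why the residual is a better input for a Huffman coder than the raw pixels.

```python
import math
from collections import Counter

def predict_residuals(row):
    """Stage 1 (assumed predictor): residual = pixel - previous pixel."""
    residuals, prev = [], 0
    for pixel in row:
        residuals.append(pixel - prev)
        prev = pixel
    return residuals

def reconstruct(residuals):
    """Invert the predictor exactly, so the scheme stays lossless."""
    row, prev = [], 0
    for r in residuals:
        prev += r
        row.append(prev)
    return row

def entropy_bits(values):
    """Empirical zeroth-order entropy in bits per value."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

row = [100, 101, 102, 103, 104, 105, 106, 107]   # hypothetical smooth image row
residuals = predict_residuals(row)
print(residuals)
print(entropy_bits(row), entropy_bits(residuals))
```

In the second stage, a Huffman code would be built for the residual values rather than for the pixel values; smooth image regions give a residual distribution concentrated on a few small values.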
TL;DR: A new data structure is investigated, which allows fast decoding of texts encoded by canonical Huffman codes, with storage requirements much lower than for conventional Huffman trees, and decoding is faster, because a part of the bit-comparisons necessary for the decoding may be saved.
Abstract: A new data structure is investigated, which allows fast decoding of texts encoded by canonical Huffman codes. The storage requirements are much lower than for conventional Huffman trees, O(log^2 n) for trees of depth O(log n), and decoding is faster, because a part of the bit-comparisons necessary for the decoding may be saved. Empirical results on large real-life distributions show a reduction of up to 50% and more in the number of bit operations. The basic idea is then generalized, yielding further savings.
TL;DR: This work considers the problem of constructing and transmitting the prelude for Huffman (1952) coding and proposes structures that are of direct relevance in applications that mimic one-pass operation through the use of semistatic compression on a block-by-block basis.
Abstract: We consider the problem of constructing and transmitting the prelude for Huffman (1952) coding. With careful organization of the required operations and an appropriate representation for the prelude, it is possible to make semistatic coding efficient even when S, the size of the source alphabet, is of the same magnitude as m, the length of the message being coded. The proposed structures are of direct relevance in applications that mimic one-pass operation through the use of semistatic compression on a block-by-block basis.
TL;DR: A simple parallel algorithm for decoding a Huffman encoded file is presented, exploiting the tendency of Huffman codes to resynchronize quickly in most cases.
Abstract: A simple parallel algorithm for decoding a Huffman encoded file is presented, exploiting the tendency of Huffman codes to resynchronize quickly in most cases. An extension to JPEG decoding is mentioned.
TL;DR: In this article, the Huffman codebook selection section includes a code length calculation section for calculating the code length which would result from a Huffman encoding operation of each group of data using each Huffman codebook.
Abstract: An encoder of the present invention includes: a number G of storage sections (G is an integer equal to or greater than 1) for storing a number G of groups of data; a Huffman codebook selection section for selecting one of a number H of Huffman codebooks (H is an integer equal to or greater than 1) for each of the groups of data stored in the respective storage sections, each of the Huffman codebooks having a codebook number; a number G of Huffman encoding sections, each of the Huffman encoding sections Huffman-encoding a corresponding one of the G groups of data by using one of the Huffman codebooks which is selected by the Huffman codebook selection section for the one group of data; and a codebook number encoding section for encoding the codebook number of each Huffman codebook selected by the Huffman codebook selection section. The Huffman codebook selection section includes a code length calculation section for calculating a code length which would result from a Huffman encoding operation of each of the G groups of data using each Huffman codebook, and a control section for selecting one of the Huffman codebooks which is suitable for the group of data based on the code length calculated by the code length calculation section. When the Huffman codebook selected is an unsigned codebook, a number of bits required for sign information has previously been added to the code length calculated by the code length calculation section.
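The selection rule described above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the codebook layout (a dict with a `signed` flag and a `lengths` table), the function names, and the choice of one sign bit per nonzero value for unsigned codebooks are all hypothetical.

```python
def coded_length(group, codebook):
    """Bits needed to Huffman-encode `group` with `codebook`."""
    lengths = codebook["lengths"]
    if codebook["signed"]:
        # Signed codebook: the sign is part of the codeword itself.
        return sum(lengths[v] for v in group)
    # Unsigned codebook: code the magnitude, then add the sign bits
    # (assumed here: one extra bit per nonzero value).
    sign_bits = sum(1 for v in group if v != 0)
    return sum(lengths[abs(v)] for v in group) + sign_bits

def select_codebook(group, codebooks):
    """Return the codebook number giving the shortest total code length."""
    return min(range(len(codebooks)),
               key=lambda i: coded_length(group, codebooks[i]))

# Two hypothetical codebooks, given only by their codeword lengths.
codebooks = [
    {"signed": True,  "lengths": {-2: 3, -1: 3, 0: 1, 1: 3, 2: 3}},
    {"signed": False, "lengths": {0: 2, 1: 1, 2: 2}},
]
print(select_codebook([1, -1, 1, -1], codebooks))
print(select_codebook([0, 0, 0, 2], codebooks))
```

Adding the sign bits before the comparison keeps signed and unsigned codebooks comparable on the same scale, which is the point made in the last sentence of the abstract.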
TL;DR: In this article, an encoder of the present invention includes: G storage sections for storing G groups of data; a selection section for selecting one of H Huffman codebooks having codebook numbers for each of the groups; G encoding sections Huffman-encoding the G groups by using the selected Huffman codebook; and an encoding section for encoding the codebook number of each Huffman codebook selected.
Abstract: An encoder of the present invention includes: G storage sections for storing G groups of data; a selection section for selecting one of H Huffman codebooks having codebook numbers for each of the groups of data; G encoding sections Huffman-encoding the G groups of data by using the selected Huffman codebook; and an encoding section for encoding the codebook number of each Huffman codebook selected. The selection section includes a calculation section for calculating a code length and a control section for selecting one of the Huffman codebooks. When the Huffman codebook selected is an unsigned codebook, a number of bits required for sign information has previously been added to the calculated code length.
TL;DR: This paper presents an efficient algorithm for decoding canonical Huffman codes with lookup tables, which reads a number of bits at each step of the decoding process and outputs the corresponding symbol.
Abstract: Summary form only given. This paper presents an efficient algorithm for decoding canonical Huffman codes with lookup tables. Canonical codes are a subclass of Huffman codes, that have a numerical sequence property, i.e., codewords with the same length are binary representations of consecutive integers. In the case of decoding with look-up tables we read a number of bits at each step of the decoding process. We look up the value of the read bit sequence in a table and if this bit sequence contains a codeword, we output the corresponding symbol. Otherwise we proceed with the decoding, using the next look-up table or using some other method.
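The look-up step described above can be sketched in its simplest form: a single table indexed by a window as wide as the longest codeword. This is an assumption made for brevity; the paper's multi-table arrangement is not reproduced here, and all function names are hypothetical.

```python
def canonical_codes(lengths):
    """Assign canonical codes: same-length codewords are consecutive integers."""
    codes, code, prev_len = {}, 0, 0
    for length, sym in sorted((l, s) for s, l in enumerate(lengths) if l > 0):
        code <<= length - prev_len        # append zeros when the length grows
        codes[sym] = (code, length)
        code += 1
        prev_len = length
    return codes

def build_table(codes, width):
    """Map every `width`-bit window to the (symbol, length) of its leading codeword."""
    table = [None] * (1 << width)
    for sym, (code, length) in codes.items():
        base = code << (width - length)
        for pad in range(1 << (width - length)):
            table[base + pad] = (sym, length)
    return table

def decode(bitstring, table, width, n_symbols):
    """Decode `n_symbols` symbols from a string of '0'/'1' characters."""
    padded = bitstring + "0" * width      # lets the last window be read safely
    out, pos = [], 0
    for _ in range(n_symbols):
        sym, length = table[int(padded[pos:pos + width], 2)]
        out.append(sym)
        pos += length                     # consume only the codeword's own bits
    return out

lengths = [2, 2, 2, 3, 3]                 # hypothetical codeword lengths
codes = canonical_codes(lengths)
table = build_table(codes, max(lengths))
message = [3, 0, 4, 2, 1]
bitstring = "".join(format(codes[s][0], "0{}b".format(codes[s][1])) for s in message)
print(bitstring, decode(bitstring, table, max(lengths), len(message)))
```

A single table costs 2^width entries, which is why practical decoders read fewer bits at first and fall back to a further table or another method for long codewords, as the abstract notes.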
TL;DR: The algorithm and data structure described in this work allow fast decoding of Huffman codes, that can be efficiently implemented without using bit operations.
Abstract: Huffman codes, or minimum-redundancy prefix codes, are one of the most widespread compression techniques. Canonical codes are a subclass of Huffman codes. The canonical codes have a numerical sequence property, i.e. codewords with the same length are binary representations of consecutive integers. Once the length of the current codeword is known, it can be decoded by several arithmetic operations. The algorithm and data structure described in this work allow fast decoding of Huffman codes, that can be efficiently implemented without using bit operations.
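The numerical sequence property can be made concrete with a short sketch of the textbook canonical decoder, which uses only additions, multiplications by two, and comparisons. It reads one bit at a time for clarity and is not the data structure proposed in the paper; the function names and sample lengths are hypothetical.

```python
def canonical_tables(lengths):
    """Return (count per length, symbols ordered by (length, symbol))."""
    count = [0] * (max(lengths) + 1)
    for length in lengths:
        if length > 0:
            count[length] += 1
    symbols = sorted((s for s, l in enumerate(lengths) if l > 0),
                     key=lambda s: (lengths[s], s))
    return count, symbols

def encode_canonical(message, lengths):
    """Encode a list of symbols into a list of 0/1 integers."""
    count, symbols = canonical_tables(lengths)
    codes, code, index = {}, 0, 0
    for length in range(1, len(count)):
        for _ in range(count[length]):
            codes[symbols[index]] = [(code // 2 ** (length - 1 - i)) % 2
                                     for i in range(length)]
            code += 1
            index += 1
        code *= 2                          # first code of the next length
    return [bit for s in message for bit in codes[s]]

def decode_canonical(bits, lengths, n_symbols):
    """Decode by comparing the running value with the first code of each length."""
    count, symbols = canonical_tables(lengths)
    out, pos = [], 0
    for _ in range(n_symbols):
        code = first = index = 0
        for length in range(1, len(count)):
            code = 2 * code + bits[pos]
            pos += 1
            if code - first < count[length]:   # codeword has this length
                out.append(symbols[index + code - first])
                break
            index += count[length]
            first = 2 * (first + count[length])
    return out

lengths = [2, 2, 2, 3, 3]                  # hypothetical codeword lengths
message = [3, 0, 4, 2, 1]
bits = encode_canonical(message, lengths)
print(bits, decode_canonical(bits, lengths, len(message)))
```

Once the length is known, the symbol is found as `symbols[index + code - first]`, which is the "several arithmetic operations" step mentioned in the abstract.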
TL;DR: In this article, Huffman tables are reduced in size by testing for the length of valid Huffman codes in a compressed data stream and using an offset corresponding to a test criterion yielding a particular test result to provide a direct index into Huffman table symbol values.
Abstract: Huffman encoding, particularly from a packed data format, is simplified by using two different table formats depending on code length. Huffman tables are also reduced in size thereby. Decoding is performed in reduced time by testing for the length of valid Huffman codes in a compressed data stream and using an offset corresponding to a test criterion yielding a particular test result to provide a direct index into Huffman table symbol values while greatly reducing the size of look-up tables used for such a purpose.
TL;DR: A data compression system that employs adaptive Huffman method for generating variable-length codes and is very effective for compressing database files (provides compression ratio up to 52.51%) in a real-time environment.
Abstract: A number of data compression techniques have been introduced to reduce the text/data storage and transmission costs. This paper describes the development of a data compression system that employs adaptive Huffman method for generating variable-length codes. Construction of the tree is discussed for gathering latest information about the entered message. The encoder process of the system encodes frequently occurring characters with shorter bit codes and infrequently appearing characters with longer bit codes. Adaptive, sibling, swapping, escape code, and re-scaling mechanisms of the model are briefly explained as they are extremely useful in enhancing compression efficiency. The decoder process expands the encoded text back to the original text and works very much like the encoder process. Experimental results are tabulated which demonstrate that the developed system is very effective for compressing database files (provides compression ratio up to 52.51%) in a real-time environment.
TL;DR: A novel method for representing and coding wideband signals using permutations and a novel algorithm for encoding the permutation information efficiently that achieves coding gains over Huffman coding is introduced.
Abstract: We introduce a novel method for representing and coding wideband signals using permutations. The signal samples are first sorted, and then encoded using differential pulse code modulation. We show that our method is optimal for DPCM coding and develop a novel algorithm for encoding the permutation information efficiently. We show that the new algorithm achieves coding gains over Huffman coding.
TL;DR: This paper presents a memory-efficient data structure to represent the single-side growing Huffman tree, and designs an O (1) -time parallel Huffman decoding algorithm on a concurrent read exclusive write parallel random-access machine (CREW PRAM) using d processors.
TL;DR: An efficient implementation of a Huffman code is based on the Shannon-Fano construction, and an important question is: how complex is such an implementation?
Abstract: An efficient implementation of a Huffman code is based on the Shannon-Fano construction. An important question is: how complex is such an implementation? In the past authors have considered this question assuming an ordered source symbol alphabet. For the compression of blocks of binary symbols this ordering must be performed explicitly and it turns out to be the complexity bottleneck.
TL;DR: In this article, gain-adaptive quantization is used to represent digital audio signal components more efficiently using non-uniform length symbols than can be represented by other coding techniques using uniform length symbols.
Abstract: Techniques like Huffman coding can be used to represent digital audio signal components more efficiently using non-uniform length symbols than can be represented by other coding techniques using uniform length symbols. Unfortunately, the coding efficiency that can be achieved by Huffman coding depends on the probability density function of the information to be coded and the Huffman coding process itself requires considerable processing and memory resources. A coding process that uses gain-adaptive quantization according to the present invention can realize the advantage of using non-uniform length symbols while overcoming the shortcomings of Huffman coding. In gain-adaptive quantization, the magnitudes of signal components to be encoded are compared to one or more thresholds and placed into classes according to the results of the comparison. The magnitudes of the components placed into one of the classes are modified according to a gain factor that is related to the threshold used to classify the components. Preferably, the gain factor may be expressed as a function of only the threshold value. Gain-adaptive quantization may be used to encode frequency subband signals in split-band audio coding systems. Additional features including cascaded gain-adaptive quantization, intra-frame coding, split-interval and non-overloading quantizers are disclosed.
TL;DR: It is shown that with a simple modification to the Huffman algorithm, it is possible to construct a unique Huffman code so that the longest codewords are as short as possible.
Abstract: A Huffman code is an iterative algorithm built over the associated Huffman tree, in which the two nodes with lowest weights are combined into a new node with a weight that is the sum of the weights of its two children. Such a construction is not unique but fortunately with a simple modification to the Huffman algorithm, it is possible to construct a unique Huffman code so that the longest codewords are as short as possible. Here we deal with such modified Huffman codes and present precise asymptotic results on the average redundancy of such codes for memoryless sources.
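The non-uniqueness mentioned above comes from ties between equal weights. A minimal sketch of one well-known tie-breaking rule follows: among nodes of equal weight, merge the shallower subtree first. This is assumed here to be the kind of modification meant; the abstract does not spell out its rule, and the function name and weights are hypothetical.

```python
import heapq

def huffman_lengths(weights):
    """Codeword lengths from Huffman merging with ties broken by subtree height."""
    if len(weights) == 1:
        return [1]
    # Heap entries: (weight, subtree height, unique id, symbols in the subtree).
    heap = [(w, 0, i, [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    lengths = [0] * len(weights)
    next_id = len(weights)
    while len(heap) > 1:
        w1, h1, _, s1 = heapq.heappop(heap)
        w2, h2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1                # every merged symbol moves one level down
        heapq.heappush(heap, (w1 + w2, max(h1, h2) + 1, next_id, s1 + s2))
        next_id += 1
    return lengths

weights = [4, 2, 2, 1, 1]
print(huffman_lengths(weights))
```

For these weights the lengths 1, 2, 3, 4, 4 are also optimal, with the same weighted total of 22 bits, but the tie-breaking rule picks the assignment whose longest codeword has 3 bits instead of 4.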
TL;DR: A lossless coding scheme for the encoding of MPEG-1 Layer III encoded audio bitstreams is described which uses a combination of linear predictive coding and arithmetic coding to exploit redundancies.
Abstract: In this paper we describe a lossless coding scheme for the encoding of MPEG-1 Layer III encoded audio bitstreams. Commonly known as MP3, the MPEG-1 Layer III standard has proved widely popular for the transmission of encoded audio files (MP3's) over the Internet. However, the MPEG-1 Layer III standard has been designed with a wide range of applications in mind. As such, the frame sizes are kept small and redundancies between samples in neighboring frames are not exploited. We propose a design which uses a combination of linear predictive coding and arithmetic coding to exploit such redundancies. The proposed coder was tested on a number of Layer III encoded audio (MP3) files and shown to produce an average coding gain of 12.2% over the original Layer III encoded files.
TL;DR: The tree structure is revived in an enhanced form that allows encoding to progress naturally from root to leaf and no post-encoding reversal is demanded resulting in constant-latency operation regardless of codeword length.
Abstract: The Huffman compression algorithm makes reference to a binary tree abstraction that can be employed directly as a data structure for decoding. Unfortunately, the same convenient arrangement has heretofore not served the encoding task. In this paper, the tree structure is revived in an enhanced form that allows encoding to progress naturally from root to leaf. Because this solution is tree based, codewords are not subject to length limitation. Yet, in marked contrast with other unbounded encoders, memory outlay is fixed by the size of the alphabet. Moreover this storage expense is low in comparison with non-tree-based solutions. Also unlike previous tree structures, no post-encoding reversal is demanded resulting in constant-latency operation regardless of codeword length. Furthermore, only simple addition operators are required at each step. Despite its advantages, implementation is uncomplicated and codebook formatting is trivial.
TL;DR: In this article, a hybrid linear prediction coding (HLC) was proposed to provide high perceptual quality of reproduced speech signals having substantial differences of energy in various frequency bands, such as having a significant amount of information at low frequency and high frequency.
Abstract: A speech coding system that employs hybrid linear prediction coding during extraction of linear prediction coefficients within ITU-Recommendation speech coding standards. The present invention is operable within linear prediction speech coding systems including code-excited linear prediction speech coding systems, and it provides for a substantially improved perceptual quality of reproduced speech signals when compared to conventional speech coding methods that employ the commonly known auto-correlation method that is based on minimizing the linear prediction coding (LPC) prediction error energy. The invention is operable to provide for high perceptual quality of reproduced speech signals having substantial differences of energy in various frequency bands. For example, for speech signals having information dispersed broadly across the frequency spectrum, such as having a significant amount of information at low frequency and a significant amount of information at high frequency, the invention provides a way to maintain a high perceptual quality across the broad frequency range. The invention generates a single set of linear prediction coefficients (LPCs) either directly from the speech signal in certain embodiments of the invention, or alternatively, interveningly through the use of line spectral frequencies (LSFs) that are generated from different sets of linear prediction coefficients (LPCs) generated from the speech signal itself in other embodiments of the invention.
TL;DR: A zonal morphologically based model instead of the traditional word based one is proposed and a new term is introduced, namely semantic coverage, which is an analogue of the compact set over all possible words domain.
Abstract: Summary form only given. We discuss some actual problems of natural language processing. We consider the flexile language case. We propose a zonal morphologically based model instead of the traditional word based one. If we deal with flexile language there should be an interim layer of language units. We assume that the morphs layer should be considered. It seems to be natural and non-restrictive. By their word order we divide them into four categories or zones: prefixes (P), roots (R), suffixes (S), and endings (E). We introduce a new term, namely semantic coverage. Semantic coverage is an analogue of the compact set over all possible words domain. We survey some aspects of the architecture of the morphological processing system. We consider modified Huffman coding that is used in facsimile hardware. We know that a facsimile machine processes only black and white pixel series. Furthermore, they alternate constantly. We may map black and white pixel series to morphological zones. Huffman coding prefixes can be redefined to fit the four zone structure. Another way to fit the facsimile paradigm is the two-step application of Huffman coding, i.e., we join zones by pairs then apply the coding inside joint pairs, and finally, we use the coding for outside pairs. We note that branching of the multi-level system should be reasonable. All modifications of the basic architecture should influence the root part only. The variable part of the root zone is regulated by a threshold for overflow control. Other parts should be considered as unchangeable because of their constancy as morphological units of the language. The perspectives and problems of flexile language text processing are discussed too.
TL;DR: This work introduces conditional Huffman encoding of DCT run-length events to improve the coding efficiency of low- and medium-bit rate video compression algorithms and classifies blocks according to coding mode and signal type and energy.
Abstract: We introduce conditional Huffman encoding of DCT run-length events to improve the coding efficiency of low- and medium-bit rate video compression algorithms. We condition the Huffman code for each run-length event on a classification of the current block. We classify blocks according to coding mode and signal type, which are known to the decoder, and according to energy, which the decoder must receive as side information. Our classification schemes improve coding efficiency with little or no increased running time and some increased memory use.
TL;DR: A combined compression and encryption scheme using Dynamic Huffman Coding is described, it is shown how to make it resistant against known attacks, and the result of incorporating this system into JPEG is given.
Abstract: Recent developments in the Internet and Web based technologies require faster communication of multimedia data in a secure form. Standard compression algorithms such as JPEG and MPEG use an entropy coding stage. By incorporating security in this stage it is possible to add security to compression systems at very low cost. In this paper we describe a combined compression and encryption scheme using Dynamic Huffman Coding. We examine the security of the system and show how to make it resistant against known attacks. We also give the result of incorporating this system into JPEG.
TL;DR: It is found that a large saving in complexity, execution time, and memory size is achieved when the commonly-used source encoding algorithms are applied to the nth-order extension of the resulting binary source.
Abstract: In this paper, we propose an efficient source encoding technique based on mapping a non-binary information source with a large alphabet onto an equivalent binary source using weighted fixed-length code assignments. The weighted codes are chosen such that the entropy of the resulting binary source multiplied by the code length is made as close as possible to that of the original non-binary source. It is found that a large saving in complexity, execution time, and memory size is achieved when the commonly-used source encoding algorithms are applied to the nth-order extension of the resulting binary source. This saving is due to the large reduction in the number of symbols in the alphabet of the new extended binary source. As an example to validate the effectiveness of this approach, text compression using a Huffman encoder applied to the nth-order extended binary source is studied. It is found that the bit-wise Huffman encoder of the 4th-order extended binary source (16 symbols) achieves compression efficiency close to that of the conventional Huffman encoder (256 symbols).
TL;DR: Agarwal et al. describe a method for compressing connected component objects (300) of bi-level images.
Abstract: Methods, apparatus, and computer readable medium for compressing connected component objects (300) of bi-level images. The compression apparatus (204) can take various forms including apparatus for coding a stroke of an object (300) or for coding the entirety of the object (300), including plural strokes. The compression apparatus (204) typically includes a referencing module (205) for identifying at least one reference node (310), a coding module (206) for successively coding pixel runs (311-314) such that at least one run (311) is coded relative to the reference node (310) and other runs (312-314) are coded relative to previously coded runs, and a closing module (207) for terminating the process. Certain forms of the apparatus operate in a horizontal or a vertical mode only, never operate in horizontal mode during two consecutive coding operations, code each run using two code-words, and/or utilize modified Huffman coding techniques. Various compression methods of the general nature described above are also disclosed.
TL;DR: It is argued that this class of coders, the Tunstall-like coder, is very appropriate for many situations, especially when the probabilities vary, when channel errors may occur, or when fast operation is needed.
Abstract: Entropy coders fall into several general categories: Huffman and Huffman-like coders that parse the input into fixed length pieces and encode each with a variable length output, arithmetic coders that take an arbitrarily long string as an input and encode with a single output string, and Tunstall-like coders that parse the input into variable length strings and encode each with a fixed length output. This paper is about a Tunstall-like coder called BAC (for Block Arithmetic Coding). We argue that this class of coders is very appropriate for many situations, especially when the probabilities vary, when channel errors may occur, or when fast operation is needed. In particular, we discuss how an H.263 encoder/decoder can be modified to replace the syntax arithmetic code with a block arithmetic code to get greater speed and better error resiliency.
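The variable-to-fixed idea behind Tunstall-like coders can be illustrated with the classic Tunstall dictionary construction. This is a sketch of that textbook algorithm for a memoryless source, not of BAC itself; the function name, the probabilities, and the 2-bit output size are hypothetical.

```python
import heapq

def tunstall_dictionary(probs, bits):
    """Build variable-length parse strings that each map to a `bits`-bit index."""
    # Max-heap on probability, stored as negated values for heapq.
    heap = [(-p, s) for s, p in probs.items()]
    heapq.heapify(heap)
    k = len(probs)
    # Expanding one word replaces it by k longer words: net growth of k - 1.
    while len(heap) + k - 1 <= 2 ** bits:
        neg_p, word = heapq.heappop(heap)          # most probable word so far
        for s, q in probs.items():
            heapq.heappush(heap, (neg_p * q, word + s))
    return sorted(word for _, word in heap)

words = tunstall_dictionary({"a": 0.7, "b": 0.3}, 2)
print(words)
```

Each input string in the dictionary is sent as its fixed-length index, so a corrupted bit cannot shift the codeword boundaries of later output, which relates to the error-resilience argument made in the abstract.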
TL;DR: It is proved that the maximum data expansion of Huffman coding is at most 0.83485 bits per symbol, improving on the previous best known bound of 1.268 bits per symbol.
Abstract: We prove that the maximum data expansion of Huffman coding is at most 0.83485 bits per symbol, improving on the previous best known bound of 1.268 bits per symbol. The bound is very close to the 0.8 bits per symbol conjectured by Cheng et al. (1995).