TL;DR: In this article, a sliding window data compression algorithm is combined with Huffman encoding on the strings and raw bytes, and the Huffman table, in a compressed form, is prepended to the encoded output data.
Abstract: An apparatus and method for converting an input data character stream into a variable length encoded data stream in a data compression system. A sliding window data compression algorithm is combined with Huffman encoding on the strings and raw bytes. The Huffman table, in a compressed form, is prepended to the encoded output data. The Huffman codes representing the shortest strings encode both the string length and part of the string offset. Assigning Huffman codes to represent the combined length and offset allows the use of a smaller sliding window size without sacrificing compression ratio. The smaller window size allows implementations in software and hardware to minimize memory usage, thus reducing cost.
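As a rough illustration of the sliding-window stage described above, here is a minimal sketch in Python. It is not the patented method: the window size, the greedy match search, and the helper names `lz77_tokens`, `lz77_decode`, and `combined_symbol` are all assumptions made for this example. The abstract does not specify how the length and the offset are packed, so `combined_symbol` shows only one hypothetical split, in which the Huffman symbol carries the length plus the high offset bits.

```python
def lz77_tokens(data, window=16, min_len=2, max_len=8):
    """Greedy sliding-window parse of `data` (bytes) into raw bytes and matches."""
    i, out = 0, []
    while i < len(data):
        best_len, best_off = 0, 0
        # Search every start position inside the window for the longest match.
        for j in range(max(0, i - window), i):
            k = 0
            while k < max_len and i + k < len(data) and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_off = k, i - j
        if best_len >= min_len:
            out.append(("match", best_len, best_off))
            i += best_len
        else:
            out.append(("raw", data[i]))
            i += 1
    return out


def lz77_decode(tokens):
    """Rebuild the byte string from the tokens produced by lz77_tokens."""
    out = bytearray()
    for token in tokens:
        if token[0] == "raw":
            out.append(token[1])
        else:
            _, length, offset = token
            for _ in range(length):
                out.append(out[-offset])  # copy byte by byte so overlaps work
    return bytes(out)


def combined_symbol(length, offset, low_bits=2):
    """Hypothetical packing: the Huffman symbol carries the length and the
    high offset bits; the low offset bits would be sent verbatim."""
    return ("len", length, offset >> low_bits), offset & ((1 << low_bits) - 1)


sample = b"abcabcabcabcx"
tokens = lz77_tokens(sample)
restored = lz77_decode(tokens)
```

In a full coder, the raw bytes and the combined symbols would share one Huffman alphabet, and the code table would be sent ahead of the encoded data, as the abstract describes.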
TL;DR: In this article, a histogram of the set of signals is used to determine the number of bits needed to encode the signals with each predefined Huffman table, and the table that requires the fewest bits is selected for encoding the signals into the encoded bitstream.
Abstract: Selecting a Huffman table to encode a set of signals, such as video signals, from a set of predefined Huffman tables. A histogram is generated for the set of signals and used to determine the number of bits to encode the set of signals for each of the predefined Huffman tables. The Huffman table that provides the smallest number of bits to encode is selected for encoding the set of signals into the encoded bitstream. In a preferred embodiment, a dynamic Huffman table is also generated for the set of signals and used to estimate the number of bits to encode the set of signals including the number of bits to encode the dynamic Huffman table. If using the dynamic Huffman table results in fewer bits in the encoded bitstream, then the dynamic Huffman table is used (and explicitly encoded in the bitstream) instead of the selected predefined Huffman table.
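The selection step can be sketched in a few lines of Python. This assumes each predefined table is represented only by its code lengths, as a `{symbol: length}` mapping. The table contents, the names `bits_needed` and `select_table`, and the sample signals are made up for illustration and do not come from the patent.

```python
from collections import Counter


def bits_needed(histogram, code_lengths):
    """Bits required to code the counted symbols with one table.

    Returns None when the table has no code for some symbol.
    """
    total = 0
    for symbol, count in histogram.items():
        if symbol not in code_lengths:
            return None
        total += count * code_lengths[symbol]
    return total


def select_table(signals, tables):
    """Return (table index, total bits) for the cheapest usable table."""
    histogram = Counter(signals)
    best = None
    for index, code_lengths in enumerate(tables):
        bits = bits_needed(histogram, code_lengths)
        if bits is not None and (best is None or bits < best[1]):
            best = (index, bits)
    return best


# Two hypothetical predefined tables over the symbols 0..3 (code lengths in bits).
TABLES = [
    {0: 1, 1: 2, 2: 3, 3: 3},  # favours symbol 0
    {0: 2, 1: 2, 2: 2, 3: 2},  # flat
]

skewed = [0, 0, 0, 0, 1, 2, 0, 3]
flat = [0, 1, 2, 3, 2, 3, 2, 3]
```

The histogram is built once and reused for every table, so the cost of trying an extra table is one pass over the distinct symbols rather than over the signals. Comparing against a dynamic table, as in the preferred embodiment, would additionally require adding the bits needed to transmit that table.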
TL;DR: A tree clustering and a pattern matching algorithm to avoid high sparsity of the tree is proposed and the method is shown to be very efficient in memory size and fast searching for the symbol.
Abstract: The fast Huffman decoding algorithm has been used in JPEG, MPEG, and other image data compression standards. Code compression is a key element in high-speed digital data transport. A major part of the compression is performed by converting fixed-length codes to variable-length codes through an entropy coding scheme. Huffman coding combined with run-length coding is shown to be a very efficient coding scheme. To speed up the search for a symbol in a Huffman tree and to reduce the memory size, we propose a tree clustering and pattern matching algorithm that avoids high sparsity of the tree. The method is shown to be very efficient in memory size and fast in searching for the symbol. For experimental video data with Huffman codes extended up to 16 bits in length, as used in standard JPEG, the experimental results show that the proposed algorithm has very high speed and performance. The decoder is designed using a silicon-gate CMOS process.
TL;DR: A new approach to adaptive Huffman coding of 2-D DCT coefficients for image sequence compression based on the popular motion-compensated interframe coding, which employs self-switching multiple Huffman codebooks for entropy coding of quantized transform coefficients.
Abstract: This paper presents a new approach to adaptive Huffman coding of 2-D DCT coefficients for image sequence compression. Based on the popular motion-compensated interframe coding, the proposed method employs self-switching multiple Huffman codebooks for entropy coding of quantized transform coefficients. Unlike the existing multiple codebook approaches where the type of block (intra/inter or luminance/chrominance) selects a codebook, the proposed method jointly utilizes the type of block, the quantizer step size, and the zigzag scan position for the purpose of codebook selection. In addition, as another utilization of the quantizer step size and the scan position, the proposed method uses a variable-length “Escape” sequence for encoding rare symbols. Experimental results show that the proposed method with two codebooks provides a 0.1–0.4 dB improvement over the single-codebook scheme, and this margin turns out to be substantially larger than the one that the MPEG-2 two-codebook approach has over the single-codebook approach.
TL;DR: The programmable scheme can be easily integrated into data paths of video processors to support different Huffman tables used in image/video applications.
Abstract: Huffman coding, a variable-length entropy coding scheme, is an integral component of international standards on image and video compression, including high-definition television (HDTV). High-bandwidth HDTV systems with data rates in excess of 100 Mpixels/s present a challenge for designing a fast and economical circuit for the intrinsically sequential Huffman decoding operations. This paper presents an algorithm and a circuit implementation for parallel decoding of programmable Huffman codes by using the numerical properties of Huffman codes. The 1.2 μm CMOS implementation for a single JPEG AC table of 256 codewords with codeword lengths of up to 16 bits is estimated to run at 10 MHz with a chip area of 11 mm², decoding one codeword per cycle. The design can be pipelined to deliver a throughput of 80 MHz for decoding input streams of consecutive Huffman codes. Furthermore, our programmable scheme can be easily integrated into data paths of video processors to support different Huffman tables used in image/video applications.
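The abstract does not say which numerical properties the circuit exploits. One well-known property of this kind is that a canonical Huffman code is fully determined by its code lengths: codes of the same length are consecutive integers, so a decoder only needs the first code and the symbol list for each length. The sketch below is a sequential, bit-by-bit software illustration of that property only. It is an assumption chosen for this example and not the paper's parallel algorithm; the function names are hypothetical.

```python
def canonical_codes(lengths):
    """Assign canonical codes from a {symbol: code length} mapping."""
    code, prev_len, codes = 0, 0, {}
    for symbol, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= length - prev_len
        codes[symbol] = format(code, "b").zfill(length)
        code += 1
        prev_len = length
    return codes


def canonical_decode(bits, lengths):
    """Decode a bit string using only per-length first codes and symbol lists."""
    by_len = {}
    for symbol, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        by_len.setdefault(length, []).append(symbol)
    first, code, prev_len = {}, 0, 0
    for length in sorted(by_len):
        code <<= length - prev_len
        first[length] = code          # numeric value of the first code of this length
        code += len(by_len[length])
        prev_len = length
    out, value, n = [], 0, 0
    for bit in bits:
        value = (value << 1) | int(bit)
        n += 1
        # A codeword of length n is recognised by a simple range check.
        if n in by_len and 0 <= value - first[n] < len(by_len[n]):
            out.append(by_len[n][value - first[n]])
            value, n = 0, 0
    return out


LENGTHS = {"a": 1, "b": 2, "c": 3, "d": 3}
CODES = canonical_codes(LENGTHS)
```

Because recognition reduces to comparisons against per-length constants, the same checks could in principle be evaluated for all lengths at once, which is the kind of structure a parallel hardware decoder can use.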
TL;DR: In this article, the authors provided a coding apparatus including an input unit for inputting image data, a coding unit for coding the image data input by the input unit, and an error detection/correction coding unit to perform error detection coding or correction coding of the image coded by the coding unit.
Abstract: According to the present invention, there is provided a coding apparatus including an input unit for inputting image data, a coding unit for coding the image data input by the input unit, and an error detection/correction coding unit for performing error detection coding or correction coding of the image data coded by the coding unit, the coding unit coding the image data by selectively using first and second coding modes based on different coding methods, and the error detection/correction coding unit performing different error correction and correction coding operations depending on the coding mode. In addition, there is provided a decoding apparatus for inputting coded data which is coded by selectively using first and second coding modes based on different coding methods and has undergone different error correction coding operations depending on the first and second coding modes, and coding mode data indicating a specific one of the first and second coding modes by which the coded data has been coded, and decoding the coded data, including a detection unit for detecting the coding mode data, and an error code correction unit for performing error code correction of the coded data in accordance with an output from the detection unit.
TL;DR: A scheme that uses two alternating Huffman codes to encode a discrete independent and identically distributed source with a dominant symbol, and allows the most likely symbol to be encoded using less than one bit per sample is examined.
Abstract: In this article we examine a scheme that uses two alternating Huffman codes to encode a discrete independent and identically distributed source with a dominant symbol. One Huffman code encodes the length of runs of the dominant symbol, the other encodes the remaining symbols. We call this combined strategy alternating runlength Huffman (ARH) coding. This is a popular scheme, used for example in the efficient pyramid image coder (EPIC) subband coding algorithm. Since the runlengths of the dominant symbol are geometrically distributed, they can be encoded using the Huffman codes identified by Golomb (1966) and later generalized by Gallager and Van Voorhis (1975). This runlength encoding allows the most likely symbol to be encoded using less than one bit per sample, providing a simple method for overcoming a drawback of prefix codes, namely that the redundancy approaches one as the largest symbol probability P approaches one. For ARH coding, the redundancy approaches zero as P approaches one. Comparing the average code rate of ARH with direct Huffman coding, we find that if P > 0.618, ARH is more efficient than Huffman coding. We give examples of applying ARH coding to some specific sources.
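The runlength half of the scheme can be illustrated with a short Python sketch. The helper names `run_lengths` and `golomb_encode`, the parameter value m = 8, and the sample strings are assumptions for this example. The second code, for the non-dominant symbols, is omitted.

```python
def run_lengths(samples, dominant):
    """Split samples into (run length of dominant symbol, terminating symbol).

    A trailing run with no terminator is reported with terminator None.
    """
    runs, count = [], 0
    for s in samples:
        if s == dominant:
            count += 1
        else:
            runs.append((count, s))
            count = 0
    if count:
        runs.append((count, None))
    return runs


def golomb_encode(n, m):
    """Golomb code of a non-negative integer n with parameter m >= 1."""
    q, r = divmod(n, m)
    out = "1" * q + "0"               # quotient in unary
    if m == 1:
        return out                    # remainder is always 0, nothing to send
    b = (m - 1).bit_length()          # ceil(log2(m))
    cutoff = (1 << b) - m
    if r < cutoff:                    # truncated binary: short remainders
        return out + format(r, "b").zfill(b - 1)
    return out + format(r + cutoff, "b").zfill(b)


runs = run_lengths("aaabaabaa", "a")
# A run of 12 dominant symbols costs 5 bits with m = 8, i.e. under one bit each.
long_run_code = golomb_encode(12, 8)
```

An ARH coder would alternate between a code like this for the run lengths and an ordinary Huffman code for the terminating symbols. The best choice of m depends on P.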
TL;DR: In the method, the global-level dependencies are handled by block coding and the local pixel-to-pixel dependencies by Huffman coding, which is used to encode the different bit patterns at the lowest level of the hierarchy.
TL;DR: In this article, a code length calculation circuit has an AC code length table for prestoring variable-length codes and their code lengths in corresponding relationship, and the circuit computes the code length of an input variable-length code according to the AC code length table.
Abstract: In a digital coding apparatus or a digital coding and decoding apparatus for image data compression and expansion by means of Huffman coding, a Huffman coding circuit converts a combination of ZERO RUN and VALUE into a variable-length code. A code length calculation circuit has an AC code length table for prestoring variable-length codes and their code lengths in corresponding relationship. The code length calculation circuit takes as input not a ZERO RUN/VALUE combination but a variable-length code from the Huffman coding circuit, and calculates the code length of that variable-length code according to the AC code length table.
TL;DR: Novel tools from finite group theory are applied to derive a compact form of representation for permutation, called permutation-cyclic-representation (PCR) vectors, with which various regularities and constraints in the structure of positional information are displayed, whereby the coding is made very easy using a runlength and Huffman method.
Abstract: We present the theory and practice of permutation coding as a new tool for very low-bit-rate image compression. Conventional source coding deals with the data information of signals, while permutation coding achieves compression by efficiently representing the positional information (i.e., position permutation) caused by ordering the data information into order statistics. A set of four theorems is presented. The first one reveals the information-theoretic relationship between data and permutation information, and the rest solve the efficient coding problem. For this, novel tools from finite group theory are applied to derive a compact form of representation for permutation, called permutation-cyclic-representation (PCR) vectors, with which various regularities and constraints in the structure of positional information are displayed, whereby the coding is made very easy using a runlength and Huffman method. A block DCT-based permutation coding algorithm (the BCPC) is developed, attempting to combine the DCT's excellent features of energy packing and magnitude ordering, which are found to be amenable to permutation coding. This mutually beneficial characteristic significantly reduces the coding bit-rate. Simulation results are provided for real images, showing an improvement of 3-4 dB in the peak-SNR index as compared to those representing the state of the art.
TL;DR: This algorithm should also be applicable to other ideogram-based or oriental-language texts and has the potential to reduce the dictionary size in a bigram- or trigram-based semi-adaptive compression scheme for English texts.
TL;DR: A two-stage lossless image compression scheme is presented, the first stage of which decorrelates the raw image data by using a two-dimensional, causal linear predictor, resulting in significant entropy reduction.
Abstract: A two-stage lossless image compression scheme is presented, the first stage of which decorrelates the raw image data by using a two-dimensional, causal linear predictor, resulting in significant entropy reduction. The second stage uses standard coding techniques. Different two-stage schemes are compared with respect to their performances on several images, using bilevel coding, arithmetic coding, and adaptive Huffman coding.
TL;DR: The authors' aim is to obtain a simple and practical statistical algorithm in order to improve the processing speed while maintaining a high compression ratio.
Abstract: Summary form only given. Dynamic Huffman coding uses a binary code tree data structure to encode the relative frequency counts of the symbols being coded. The authors' aim is to obtain a simple and practical statistical algorithm in order to improve the processing speed while maintaining a high compression ratio. The proposed algorithm uses a self-organizing rule (the transpose heuristic) to reconstruct the code tree. It renews the code tree by only switching the ordered positions of corresponding symbols. This method is called self-organized dynamic Huffman coding. To achieve a higher compression ratio, they employ context modelling.
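The transpose heuristic can be sketched as follows, under strong simplifying assumptions: the shape of the code tree is fixed, so a fixed list of prefix codewords is assigned to positions, and after each coded symbol that symbol swaps places with its predecessor. The codeword list, the function names, and the absence of context modelling are all choices made for this illustration. The summary does not give enough detail to reproduce the authors' actual algorithm.

```python
def _transpose(order, i):
    """Move the symbol at position i one step towards the short codewords."""
    if i > 0:
        order[i - 1], order[i] = order[i], order[i - 1]


def transpose_encode(symbols, alphabet, codewords):
    """Encode symbols; codewords[i] is the code for the symbol at position i."""
    order = list(alphabet)
    out = []
    for s in symbols:
        i = order.index(s)
        out.append(codewords[i])
        _transpose(order, i)
    return "".join(out)


def transpose_decode(bits, alphabet, codewords):
    """Decode by mirroring the encoder's position swaps."""
    order = list(alphabet)
    position = {code: i for i, code in enumerate(codewords)}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in position:       # valid because the codewords are prefix-free
            i = position[current]
            out.append(order[i])
            _transpose(order, i)
            current = ""
    return out


CODEWORDS = ["0", "10", "110", "111"]   # hypothetical fixed tree shape
message = list("ddddabcd")
encoded = transpose_encode(message, "abcd", CODEWORDS)
decoded = transpose_decode(encoded, "abcd", CODEWORDS)
```

Each update is a single swap, with no frequency counts and no tree rebuilding, so frequently used symbols drift towards the short codewords at very little cost per symbol.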
TL;DR: Two new algorithms that are based on the 16-bit or 32-bit sampling character set and on the unique features of languages with a large number of distinct characters to improve the data compression ratios for multilingual text documents are proposed.
Abstract: Summary form only given. We propose two new algorithms that are based on the 16-bit or 32-bit sampling character set and on the unique features of languages with a large number of distinct characters to improve the data compression ratios for multilingual text documents. We choose Chinese, using 16-bit character sampling, as the representative language in our study. The first approach, called static Chinese Huffman coding, introduces the concept of a single Chinese character in the Huffman tree. Experimental results showed an improvement in compression ratio. The second approach, called dictionary-based Chinese Huffman coding, includes the concept of Chinese words in the Huffman coding.
TL;DR: This algorithm has the potential to reduce the dictionary size in a bigram or trigram-based semi-adaptive compression scheme for English texts and should also be applicable to other ideogram-based or oriental language texts.
Abstract: This paper presents a data compression scheme for Chinese text files. Due to the skewness of the distribution of Chinese ideograms, the Huffman coding method is adopted. By storing the Huffman tree in the coding table and representing the Huffman tree using the Zaks sequence, the algorithm produces significant improvement on the compression results. The proposed method is evaluated by comparing its performance with three well-known compression algorithms and an algorithm specially designed to compress the coding table. This algorithm should also be applicable to other ideogram-based or oriental language texts. Also, it has the potential to reduce the dictionary size in a bigram or trigram-based semi-adaptive compression scheme for English texts.
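To make the tree representation concrete, here is a minimal Python sketch of a Zaks sequence as it is commonly defined: a preorder walk of a full binary tree that emits 1 for each internal node and 0 for each leaf. The nested-tuple tree format and the function names are assumptions for this example. How the paper stores the leaf symbols alongside the sequence is not stated in the abstract, so they are kept here as a plain list.

```python
def to_zaks(tree):
    """Return (bit string, leaf symbols in preorder) for a full binary tree.

    Internal nodes are 2-tuples (left, right); anything else is a leaf symbol,
    so leaf symbols themselves must not be tuples.
    """
    if isinstance(tree, tuple):
        left_bits, left_leaves = to_zaks(tree[0])
        right_bits, right_leaves = to_zaks(tree[1])
        return "1" + left_bits + right_bits, left_leaves + right_leaves
    return "0", [tree]


def from_zaks(bits, leaves):
    """Rebuild the tree from its Zaks sequence and leaf list."""
    bit_iter, leaf_iter = iter(bits), iter(leaves)

    def build():
        if next(bit_iter) == "1":
            left = build()
            right = build()
            return (left, right)
        return next(leaf_iter)

    return build()


# Hypothetical Huffman tree giving codes a=0, b=100, c=101, d=11.
TREE = ("a", (("b", "c"), "d"))
bits, leaves = to_zaks(TREE)
```

A tree with n leaves has n - 1 internal nodes, so its shape costs 2n - 1 bits in this form, which is what makes the representation attractive for large alphabets such as Chinese ideograms.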
TL;DR: A Huffman decoder for decoding data words encoded according to the Huffman coding can be found in this paper, where the decoder can decode data words according to either H.261 or MPEG standards.
Abstract: A Huffman decoder for decoding data words encoded according to the Huffman coding provisions of either H.261 or MPEG standards, the data words including an identifier that identifies the Huffman code standard under which the data words were coded, comprising: means for receiving the Huffman coded data words, including means for reading the identifier to determine which standard governed the Huffman coding of the received data words, and means for converting the data words to JPEG Huffman coded data words, if necessary, in response to reading the identifier that identifies the Huffman coded data words as H.261 or MPEG Huffman coded; means, operably connected to the Huffman coded data words receiving means, for generating an index number associated with each JPEG Huffman coded data word receiving an index number from the index number generating means, and including an output that is a decoded data word corresponding to the index number.
TL;DR: A tree clustering algorithm is proposed to speed up the process of search for a symbol in a Huffman tree and to reduce the memory size to avoid high sparsity of the tree.
Abstract: Code compression is a key element in high-speed digital data transport. A major compression is performed by converting the fixed-length codes to variable-length codes through a (semi-)entropy coding scheme. Huffman coding is shown to be a very efficient coding scheme. To speed up the process of search for a symbol in a Huffman tree and to reduce the memory size we have proposed a tree clustering algorithm to avoid high sparsity of the tree. The method is shown to be extremely efficient in memory requirement, and fast in searching for the symbol. For an experimental video data with Huffman codes extended up to 13 bits in length, the entire memory space is shown to be 122 words, compared to 2^13 = 8192 words in a normal situation.
TL;DR: The requirements on causality and computational complexity implied by arithmetic and zerotree coding will be studied and other schemes proposed for the choice of the predictive coefficient contexts that are suggested by image analysis.
Abstract: Image coding requires an effective representation of images to provide dimensionality reduction, a quantization strategy to maintain quality, and finally the error-free encoding of quantized coefficients. In the coding of quantized coefficients, Huffman coding and arithmetic coding have been used most commonly and are suggested as alternatives in the JPEG standard. In some recent work, zerotree coding has been proposed as an alternate method that considers the dependence of quantized coefficients from subband to subband, and thus appears as a generalization of the context-based approach often used with arithmetic coding. In this paper, we propose to review these approaches and discuss them as special cases of an analysis-based approach to the coding of coefficients. The requirements on causality and computational complexity implied by arithmetic and zerotree coding will be studied, and other schemes will be proposed for the choice of the predictive coefficient contexts that are suggested by image analysis.