TL;DR: Two different cost functions are used for Support Vectors, an ε-insensitive loss and Huber's robust loss function, and the choice of the regularization parameters in these models is discussed.
Abstract: Support Vector Machines are used for time series prediction and compared to radial basis function networks. We make use of two different cost functions for Support Vectors: training with (i) an ε-insensitive loss and (ii) Huber's robust loss function, and discuss how to choose the regularization parameters in these models. Two applications are considered: data from (a) a noisy (normal and uniform noise) Mackey-Glass equation and (b) the Santa Fe competition (set D). In both cases Support Vector Machines show an excellent performance. In case (b) the Support Vector approach improves the best known result on the benchmark by 29%.
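The two cost functions named in this abstract can be sketched as follows. This is a minimal illustration using the standard textbook forms; the parameter names `eps` and `delta` are assumptions introduced here, and the paper may scale Huber's loss differently.

```python
def eps_insensitive_loss(residual, eps):
    """Zero inside the tube |residual| <= eps, linear outside it."""
    return max(0.0, abs(residual) - eps)


def huber_loss(residual, delta):
    """Quadratic for |residual| <= delta, linear beyond it (textbook scaling)."""
    r = abs(residual)
    if r <= delta:
        return 0.5 * r * r
    return delta * (r - 0.5 * delta)


if __name__ == "__main__":
    # Compare both losses on a few hypothetical prediction residuals.
    for r in (-2.0, -0.05, 0.0, 0.5, 3.0):
        print(r, eps_insensitive_loss(r, eps=0.1), huber_loss(r, delta=1.0))
```

Both losses grow only linearly for large residuals, which is what makes them less sensitive to outliers than a purely quadratic cost.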
TL;DR: A self-organized neural network performs two tasks: vector quantization of the submanifold in the data set (input space) and nonlinear projection of these quantizing vectors toward an output space, providing a revealing unfolding of the submanifold.
Abstract: We present a new strategy called "curvilinear component analysis" (CCA) for dimensionality reduction and representation of multidimensional data sets. The principle of CCA is a self-organized neural network performing two tasks: vector quantization (VQ) of the submanifold in the data set (input space); and nonlinear projection (P) of these quantizing vectors toward an output space, providing a revealing unfolding of the submanifold. After learning, the network has the ability to continuously map any new point from one space into another: forward mapping of new points in the input space, or backward mapping of an arbitrary position in the output space.
TL;DR: In this paper, a multimedia compression system is presented for generating frame rate scaleable data in the case of video and, more generally, universally scaleable data, which is composed of multiple additive layers for each characteristic across which the data is scaleable.
Abstract: A multimedia compression system for generating frame rate scaleable data in the case of video, and, more generally, universally scaleable data. Universally scaleable data is scaleable across all of the relevant characteristics of the data. In the case of video, these characteristics include frame rate, resolution, and quality. The scaleable data generated by the compression system is comprised of multiple additive layers for each characteristic across which the data is scaleable. In the case of video, the frame rate layers are additive temporal layers, the resolution layers are additive base and enhancement layers, and the quality layers are additive index planes of embedded codes. Various techniques can be used for generating each of these layers (e.g., Laplacian pyramid decomposition or wavelet decomposition for generating the resolution layers; tree structured vector quantization or tree structured scalar quantization for generating the quality layers). The compression system further provides for embedded inter-frame compression in the context of frame rate scalability, and non-redundant layered multicast network delivery of the scaleable data.
TL;DR: In this article, a hierarchical vector quantization table that outputs embedded code is proposed for image compression; table design is divided into codebook design and fill-in procedures for each stage, with the preliminary stages using a splitting generalized Lloyd algorithm (LBG/GLA) with a perceptually weighted distortion measure.
Abstract: An image compression system includes a vectorizer and a hierarchical vector quantization table that outputs embedded code. The vectorizer converts an image into image vectors representing respective blocks of image pixels. The table provides computation-free transformation and compression of the image vectors. Table design can be divided into codebook design and fill-in procedures for each stage. Codebook design for the preliminary stages uses a splitting generalized Lloyd algorithm (LBG/GLA) using a perceptually weighted distortion measure. Codebook design for the final stage uses a greedily-grown and then entropy-pruned tree-structure variation of GLA with an entropy-constrained distortion measure. Table fill-in for all stages uses an unweighted proximity measure for assigning inputs to codebook vectors. Transformations and compression are fast because they are computation free. The hierarchical, multi-stage character of the table allows it to operate with low memory requirements. The embedded output allows convenient scalability suitable for collaborative video applications over heterogeneous networks.
TL;DR: A new technique, mean-normalized vector quantization (M-NVQ), is proposed which produces compression performances approaching the theoretical minimum compressed image entropy of 5 bits/pixel.
Abstract: The structure of hyperspectral images reveals spectral responses that would seem ideal candidates for compression by vector quantization. This paper outlines the results of an investigation of lossless vector quantization of 224-band Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) images. Various vector formation techniques are identified and suitable quantization parameters are investigated. A new technique, mean-normalized vector quantization (M-NVQ), is proposed which produces compression performances approaching the theoretical minimum compressed image entropy of 5 bits/pixel. Images are compressed from original image entropies of between 8.28 and 10.89 bits/pixel to between 4.83 and 5.90 bits/pixel.
TL;DR: In this paper, a method and apparatus for adaptive bit allocation and hybrid lossless entropy encoding is presented, which includes three components: (1) a transform stage, (2) a quantization stage, and (3) a lossless entropy coder stage.
Abstract: A method and apparatus for adaptive bit allocation and hybrid lossless entropy encoding. The system includes three components: (1) a transform stage, (2) a quantization stage, and (3) a lossless entropy coder stage. The transform stage (1) uses a wavelet transform algorithm. The quantization stage (2) adaptively estimates values for parameters defining an approximation between quantization size and the logarithm of quantization error, and recursively calculates the optimal quantization size for each band to achieve a desired bit rate. The baseband and subbands are transformed into quantization matrices using the corresponding quantization sizes. The lossless entropy coder stage (3) uses the observation that the entropy property of run lengths of zero index values in the subband quantization matrices is different from the entropy property of non-zero indices. Each quantization matrix is parsed so that each non-zero index is extracted into a separate stream, and the remaining position information is parsed into an odd stream of run length values for "0" and an even stream of run length values for "1". These three streams are Huffman coded separately in conventional fashion.
TL;DR: Predictive rate-distortion (RD) optimized motion estimation techniques are studied and developed for very low bit-rate video coding and indicate that they yield very good computation-performance tradeoffs.
Abstract: Predictive rate-distortion (RD) optimized motion estimation techniques are studied and developed for very low bit-rate video coding. Four types of predictors are studied: mean, weighted mean, median, and statistical mean. The weighted mean is obtained using conventional linear prediction techniques. The statistical mean is obtained using a finite-state machine modeling method based on dynamic vector quantization. By employing prediction, the motion vector search can then be constrained to a small area. The effective search area is reduced further by varying its size based on the local statistics of the motion field, through using a Lagrangian as the search matching measure and imposing probabilistic models during the search process. The proposed motion estimation techniques are analyzed within a simple DCT-based video coding framework, where an RD criterion is used for alternating among three coding modes for each 8×8 block: motion only, motion-compensated prediction and DCT, and intra-DCT. Experimental results indicate that our techniques yield very good computation-performance tradeoffs. When such techniques are applied to an RD optimized H.263 framework at very low bit rates, the resulting H.263 compliant video coder is shown to outperform the H.263 TMN5 coder in terms of compression performance and computations simultaneously.
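The per-block choice among three coding modes under an RD criterion can be illustrated with a Lagrangian cost J = D + λR. The sketch below is only an assumption about how such a selection could look; the mode names, the distortion and rate numbers, and the function `select_mode` are hypothetical and not taken from the paper.

```python
def select_mode(candidates, lam):
    """Return the mode whose Lagrangian cost D + lam * R is smallest.

    candidates maps a mode name to a (distortion, rate_in_bits) pair.
    """
    return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])


if __name__ == "__main__":
    # Hypothetical measurements for one 8x8 block.
    block = {
        "motion_only": (120.0, 10),
        "mc_dct": (40.0, 60),
        "intra_dct": (30.0, 150),
    }
    for lam in (0.05, 0.5, 5.0):
        print(lam, select_mode(block, lam))
```

A larger λ penalizes rate more heavily, so the cheapest mode in bits wins; a smaller λ favors the mode with the lowest distortion.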
TL;DR: In this article, first utterances from a database of known spoken words are classified and segmented into three broad phonetic classes (BPC): voiced, unvoiced, and silence.
Abstract: For machine segmenting of speech, first utterances from a database of known spoken words are classified and segmented into three broad phonetic classes (BPC): voiced, unvoiced, and silence. Next, using preliminary segmentation positions as anchor points, sequence-constrained vector quantization is used for further segmentation into phoneme-like units. Finally, exact tuning to the segmented phonemes is done through Hidden-Markov Modelling, and after training a diphone set is composed for further usage.
TL;DR: An image-adaptive JPEG encoding algorithm that jointly optimizes quantizer selection, coefficient "thresholding", and Huffman coding within a rate-distortion (R-D) framework is developed.
Abstract: Striving to maximize baseline (Joint Photographic Experts Group, JPEG) image quality without compromising compatibility with current JPEG decoders, we develop an image-adaptive JPEG encoding algorithm that jointly optimizes quantizer selection, coefficient "thresholding", and Huffman coding within a rate-distortion (R-D) framework. Practically speaking, our algorithm unifies two previous approaches to image-adaptive JPEG encoding: R-D optimized quantizer selection and R-D optimal thresholding. Conceptually speaking, our algorithm is a logical consequence of entropy-constrained vector quantization (ECVQ) design principles in the severely constrained instance of JPEG-compatible encoding. We explore both viewpoints: the practical, to concretely derive our algorithm, and the conceptual, to justify the claim that our algorithm approaches the best performance that a JPEG encoder can achieve. This performance includes significant objective peak signal-to-noise ratio (PSNR) improvement over previous work and at high rates gives results comparable to state-of-the-art image coders. For example, coding the Lena image at 1.0 b/pixel, our JPEG encoder achieves a PSNR performance of 39.6 dB that slightly exceeds the quoted PSNR results of Shapiro's wavelet-based zero-tree coder. Using a visually based distortion metric, we can achieve noticeable subjective improvement as well. Furthermore, our algorithm may be applied to other systems that use run-length encoding, including intraframe MPEG and subband or wavelet coding.
TL;DR: The resulting asymptotic performance in terms of distortion increase in decibels is shown to be linear in the relative entropy between the true and estimated probability densities.
Abstract: Gersho's (1979) bounds on the asymptotic performance of vector quantizers are valid for vector distortions which are powers of the Euclidean norm. Yamada, Tazaki, and Gray (1980) generalized the results to distortion measures that are increasing functions of the norm of their argument. In both cases, the distortion is uniquely determined by the vector quantization error, i.e., the Euclidean difference between the original vector and the codeword into which it is quantized. We generalize these asymptotic bounds to input-weighted quadratic distortion measures and measures that are approximately output-weighted-quadratic when the distortion is small, a class of distortion measures often claimed to be perceptually meaningful. An approximation of the asymptotic distortion based on Gersho's conjecture is derived as well. We also consider the problem of source mismatch, where the quantizer is designed using a probability density different from the true source density. The resulting asymptotic performance in terms of distortion increase in decibels is shown to be linear in the relative entropy between the true and estimated probability densities.
TL;DR: It is shown that getting the smallest clusters under a formal notion of minimizing the maximum intercluster distance does not guarantee an optimal solution for the quantization criterion. Nevertheless, the use of an efficient clustering algorithm by Teofilo F. Gonzalez, which is optimal with respect to the approximation bound of the clustering problem, has resulted in a fast and effective quantizer.
Abstract: One of the numerical criteria for color image quantization is to minimize the maximum discrepancy between original pixel colors and the corresponding quantized colors. This is typically carried out by first grouping color points into tight clusters and then finding a representative for each cluster. In this article we show that getting the smallest clusters under a formal notion of minimizing the maximum intercluster distance does not guarantee an optimal solution for the quantization criterion. Nevertheless our use of an efficient clustering algorithm by Teofilo F. Gonzalez, which is optimal with respect to the approximation bound of the clustering problem, has resulted in a fast and effective quantizer. This new quantizer is highly competitive and excels when quantization errors need to be well capped and when the performance of other quantizers may be hindered by such factors as low number of quantized colors or unfavorable pixel population distribution. Both computer-synthesized and photographic images are used in experimental comparison with several existing quantization methods.
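Gonzalez's clustering algorithm is commonly described as farthest-point traversal: start from an arbitrary point and repeatedly add the point that lies farthest from all centers chosen so far. The sketch below applies that idea to RGB triples. It is an illustration under assumptions, not the authors' quantizer: it uses the cluster heads directly as the palette, whereas the abstract describes a separate step that finds a representative for each cluster, and the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two color tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_centers(points, k):
    """Pick up to k centers by farthest-point traversal."""
    centers = [points[0]]  # arbitrary starting point
    # dist[i] = squared distance from points[i] to its nearest center so far
    dist = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        idx = max(range(len(points)), key=dist.__getitem__)
        if dist[idx] == 0:
            break  # fewer than k distinct colors
        centers.append(points[idx])
        dist = [min(d, sq_dist(p, points[idx])) for d, p in zip(dist, points)]
    return centers


def quantize(points, centers):
    """Map every color to its nearest center."""
    return [min(centers, key=lambda c: sq_dist(p, c)) for p in points]


if __name__ == "__main__":
    pixels = [(0, 0, 0), (10, 0, 0), (255, 255, 255), (250, 250, 250), (0, 255, 0)]
    palette = farthest_point_centers(pixels, 3)
    print(palette)
    print(quantize(pixels, palette))
```

Because each new center is the currently worst-served color, the traversal directly targets the maximum error, which matches the criterion stated in the abstract.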
TL;DR: For a real-valued random variable whose distribution is the classical Cantor probability, the n-quantization error and the n-optimal quantization rules are calculated for every natural number n.
Abstract: For a real-valued random variable whose distribution is the classical Cantor probability, the n-quantization error and the n-optimal quantization rules are calculated for every natural number n. Moreover, the connection between the rate of convergence of the logarithms of the quantization errors for n going to infinity and the Hausdorff dimension of the Cantor set is indicated.
TL;DR: A fast encoding algorithm for vector quantization that uses two characteristics of a vector, its mean and variance, simultaneously to further reduce computation time.
Abstract: In this letter, we present a fast encoding algorithm for vector quantization that uses two characteristics of a vector, its mean and variance. Although a similar method using these features was already proposed, it handles these features separately. The proposed algorithm, on the other hand, utilizes these features simultaneously to further reduce computation time. Since the proposed algorithm rejects only those codewords that cannot be the nearest codeword, it produces the same output as the conventional full search algorithm. The simulation results confirm the effectiveness of the proposed algorithm.
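One way to reject codewords using mean and variance together is the lower bound ||x - c||² ≥ k(m_x - m_c)² + (d_x - d_c)², where k is the dimension, m is the component mean, and d is the Euclidean norm of the mean-removed vector. The sketch below uses that bound; it is an assumption about the general idea and not necessarily the exact test used in the letter, and all function names are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_and_dev(v):
    """Component mean and norm of the mean-removed vector."""
    m = sum(v) / len(v)
    return m, math.sqrt(sum((x - m) ** 2 for x in v))


def full_search(x, codebook):
    """Index of the nearest codeword by exhaustive search."""
    return min(range(len(codebook)), key=lambda i: sq_dist(x, codebook[i]))


def fast_search(x, codebook, stats):
    """Nearest codeword, skipping codewords whose lower bound is too large.

    stats[i] must equal mean_and_dev(codebook[i]), computed once offline.
    """
    k = len(x)
    mx, dx = mean_and_dev(x)
    best_i, best_d = -1, math.inf
    for i, c in enumerate(codebook):
        mc, dc = stats[i]
        bound = k * (mx - mc) ** 2 + (dx - dc) ** 2
        # Small slack keeps rounding error from rejecting a true winner.
        if bound > best_d + 1e-9 * (1.0 + best_d):
            continue
        d = sq_dist(x, c)
        if d < best_d:
            best_i, best_d = i, d
    return best_i


if __name__ == "__main__":
    # Deterministic toy codebook of 64 codewords in 16 dimensions.
    codebook = [[(i * 37 + j * 91) % 256 for j in range(16)] for i in range(64)]
    stats = [mean_and_dev(c) for c in codebook]
    x = [(5 * j + 11) % 256 for j in range(16)]
    print(fast_search(x, codebook, stats), full_search(x, codebook))
```

Since the bound never exceeds the true squared distance, a rejected codeword cannot be closer than the current best, so the result agrees with the full search.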
TL;DR: By combining the concepts of self-organization and topographic mapping with those of multiscale image segmentation, the HSOM alleviates the shortcomings of the conventional SOM in the context of image segmentation.
TL;DR: This paper outlines a project on image coding for mobile wireless environments and investigates hierarchical data structures and related algorithms; preliminary results are encouraging.
TL;DR: A fully-parallel VQ processor chip for real-time encoding of motion pictures minimizes clock cycles for a single VQ operation and has a versatile code book.
Abstract: A vector-quantization (VQ) processor system has been developed aiming at real-time compression of motion pictures using a 0.6-μm triple-metal CMOS technology. The chip employs a fully parallel single-instruction, multiple-data architecture having a two-stage pipeline. Each pipeline segment consists of 19 cycles, thus enabling the execution of a single VQ operation in only 19 clock cycles. As a result, it has become possible to encode a full-color picture of 640×480 pixels in less than 33 ms, i.e., the real-time compression of moving pictures has become available. The chip is scalable up to eight-chip master-slave configuration in conducting fully parallel search for 2-K template vectors. The chip operates at 17 MHz with a power dissipation of 0.29 W under a power-supply voltage of 3.3 V.
TL;DR: Two quantitative measures are introduced which establish a relationship between the formulation that led to FALVQ algorithms and the competition between the prototypes during the learning process and are tested and evaluated using the IRIS data set.
Abstract: This paper presents a general methodology for the development of fuzzy algorithms for learning vector quantization (FALVQ). The design of specific FALVQ algorithms according to existing approaches reduces to the selection of the membership function assigned to the weight vectors of an LVQ competitive neural network, which represent the prototypes. The development of a broad variety of FALVQ algorithms can be accomplished by selecting the form of the interference function that determines the effect of the nonwinning prototypes on the attraction between the winning prototype and the input of the network. The proposed methodology provides the basis for extending the existing FALVQ 1, FALVQ 2, and FALVQ 3 families of algorithms. This paper also introduces two quantitative measures which establish a relationship between the formulation that led to FALVQ algorithms and the competition between the prototypes during the learning process. The proposed algorithms and competition measures are tested and evaluated using the IRIS data set. The significance of the proposed competition measure is illustrated using FALVQ algorithms to perform segmentation of magnetic resonance images of the brain.
TL;DR: In this article, the authors discuss the applicability of existing distortion measures in video coding to the compression of hyperspectral images and propose a distortion measure called the percentage maximum absolute distortion (PMAD) measure.
TL;DR: This paper has chosen the similarity to a particular variant of vector quantization as the most direct approach to fractal image compression and surveys some of the advanced concepts such as fast decoding, hybrid methods, and adaptive partitionings.
Abstract: Fractal image compression is a new technique for encoding images compactly. It builds on local self-similarities within images. Image blocks are seen as rescaled and intensity transformed approximate copies of blocks found elsewhere in the image. This yields a self-referential description of image data, which --- when decoded --- shows a typical fractal structure. This paper provides an elementary introduction to this compression technique. We have chosen the similarity to a particular variant of vector quantization as the most direct approach to fractal image compression. We discuss the hierarchical quadtree scheme and vital complexity reduction methods. Furthermore, we survey some of the advanced concepts such as fast decoding, hybrid methods, and adaptive partitionings. We conclude with a list of relevant WEB resources including complete public domain C implementations of the method and a comprehensive list of up-to-date references.
TL;DR: A new quantization method for color images that uses a local error optimization strategy to generate near optimal quantization levels and produces results superior to those of other popular image quantization algorithms.
Abstract: This paper presents a new quantization method for color images. It uses a local error optimization strategy to generate near optimal quantization levels. The algorithm is simple to implement and produces results that are superior to those of other popular image quantization algorithms.
TL;DR: In this article, the degree of similarity between an input vector and all code vectors stored in the codebook is found by approximation for pre-selecting a smaller plural number of code vectors.
Abstract: The processing volume for codebook search for vector quantization is to be diminished. In sending data representing an envelope of spectral components of the harmonics from a spectrum evaluation unit 148 of a sinusoidal analytic encoder 114 to a vector quantizer 116 for vector quantization, the degree of similarity between an input vector and all code vectors stored in the codebook is found by approximation for pre-selecting a smaller plural number of code vectors. From these plural pre-selected code vectors, such a code vector minimizing an error with respect to the input vector is ultimately selected. In this manner, a smaller number of candidate code vectors are pre-selected by pre-selection involving simplified processing and subsequently subjected to ultimate selection with high precision.
TL;DR: In this article, the authors propose an approach for processing acoustic features extracted from a sample of speech data forming a feature vector signal every frame period including a first linear prediction analyzer, a vector quantizer, at least one partitioned quantizer and a scalar quantizer.
Abstract: Apparatus for processing acoustic features extracted from a sample of speech data forming a feature vector signal every frame period includes a first linear prediction analyzer, a vector quantizer, at least one partitioned vector quantizer and a scalar quantizer. The first linear prediction analyzer performs a linear prediction analysis on the feature vector signal to generate a first error vector signal. Next, the vector quantizer performs a vector quantization on the first error signal thereby generating a first index corresponding to a first prestored vector signal which is an approximation of the first error vector signal. The vector quantizer also generates a residual vector signal which is the difference between the first error vector signal and the first prestored approximation vector signal. Next, the at least one partitioned vector quantizer performs a partitioned vector quantization on a first portion of the residual vector signal thereby generating at least one second index corresponding to a second prestored vector signal which is an approximation of the first portion of the residual vector signal. Next, the scalar quantizer performs a scalar quantization on a second portion of the residual vector signal thereby generating a third index corresponding to a prestored scalar signal which is an approximation of the second portion of the residual vector signal. The first, second and third indices are combined to form an encoded vector signal which is a compressed representation of the feature vector signal. The encoded vector signal may be transmitted and/or stored as desired. The feature vector signal may be reconstructed from the encoded vector signal by adding the corresponding prestored signals to the encoded vector signal to form a decompressed representation of the feature vector signal.
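The cascade of a vector quantizer, a partitioned vector quantizer on the first portion of the residual, and a scalar quantizer on the second portion can be sketched as below. The sketch assumes the error vector from the linear prediction analyzer is already available, that the second portion is a single component, and that decoding sums the prestored signals selected by the three indices; the codebooks, the `split` parameter, and all names are hypothetical.

```python
def nearest(v, codebook):
    """Index of the codebook entry closest to v in squared error."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(v, codebook[i])),
    )


def encode(err, cb1, cb2, scalars, split):
    """Return the three indices (first VQ, partitioned VQ, scalar)."""
    i1 = nearest(err, cb1)
    residual = [e - c for e, c in zip(err, cb1[i1])]
    i2 = nearest(residual[:split], cb2)  # first portion of the residual
    # Second portion: one component, quantized to the closest stored scalar.
    i3 = min(range(len(scalars)), key=lambda i: abs(residual[split] - scalars[i]))
    return i1, i2, i3


def decode(indices, cb1, cb2, scalars):
    """Rebuild an approximation of the error vector from the indices."""
    i1, i2, i3 = indices
    detail = list(cb2[i2]) + [scalars[i3]]
    return [c + d for c, d in zip(cb1[i1], detail)]


if __name__ == "__main__":
    cb1 = [(0.0, 0.0, 0.0), (4.0, 4.0, 4.0)]
    cb2 = [(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]
    scalars = [-1.0, 0.0, 1.0]
    code = encode((5.2, 4.9, 3.1), cb1, cb2, scalars, split=2)
    print(code, decode(code, cb1, cb2, scalars))
```

Quantizing the residual in later stages lets each codebook stay small while the combined indices still give a finer approximation than the first stage alone.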
TL;DR: In this article, a high efficiency encoding method is presented for encoding data on the frequency axis obtained by dividing an input audio signal on a block-by-block basis and converting the signal onto the frequency axis, in which the voiced (V)/unvoiced (UV) decision data for each band are replaced by a single V/UV boundary point to reduce data volume and bit rate.
Abstract: A high efficiency encoding method for encoding data on frequency axis obtained by dividing an input audio signal on block-by-block basis and converting the signal onto the frequency axis, wherein V bands are searched for a band BVH with the highest center frequency if it is decided that there are one or more shift points of voiced (V)/unvoiced (UV) decision data of all bands on the frequency axis, and wherein the number of V bands NV up to the band BVH is found, so as to decide whether proportion of the V bands is equal to or higher than a predetermined threshold Nth, thereby deciding one V/UV boundary point. Thus, it is possible to replace the V/UV decision data for each band by information on one demarcation in all bands, thereby to reduce data volume and to reduce bit rate. Also, by using two-stage hierarchical vector quantization in quantizing the data on the frequency axis, operation volume for codebook search and memory capacity of the codebook are reduced.
TL;DR: In this paper, audio source data is subjected to a pre-emphasis step for gross decorrelation, followed by adaptive linear prediction for further decorrelation; a transform is then performed on the residual of the linear prediction to obtain transform coefficients representing the residual in the frequency domain.
Abstract: Audio source data is subjected to a pre-emphasis step (302) to perform gross decorrelation, followed by an adaptive linear prediction (306) to perform further decorrelation. A transform is performed on the residual of the linear prediction, to obtain transform coefficients representing the residual in the frequency domain. A number of tonal components are identified (310), subtracted from the transform coefficients and encoded by vector quantization. The transform coefficients are then grouped into sub-bands, and each sub-band encoded in the frequency domain by vector quantization. The sub-bands are of uniform width on an auditory scale, so that each vector may comprise a different number of transform coefficients.
TL;DR: The experiments show that the 3D-DCT video compression using the proposed quantization values produce high compression ratios with good visual quality for the reconstructed video frames.
TL;DR: Experimental results show that the proposed method can produce good visual quality videos and can achieve an attractive compression ratio of 16:1 by quantization alone.
Abstract: The paper proposes a novel method to generate quantization values for 3D-DCT coefficients. Such quantization values are important as they would affect significantly the extent of the compression achieved by any 3D-DCT based video compression scheme. It was found that the distribution of the dominant AC coefficients spread along the major axes of the 3D-DCT cube. The coefficient distribution can be modeled by dividing the cube into two regions, one for most of the significant coefficients and the other for the high frequency coefficients which can be discarded. The quantization values are computed by using an exponential function. The same function is also used to determine a scan order of the 3D-DCT coefficients, which is better than a three dimensional zig-zag scan method in general. Experimental results show that the proposed method can produce good visual quality videos and can achieve an attractive compression ratio of 16:1 by quantization alone.
TL;DR: Results indicate that good visual quality can be achieved for very low bit-rate coding of underwater video with the proposed wavelet-based hybrid video encoder which employs entropy-constrained vector quantization (ECVQ) with overlapped block-based motion compensation.
Abstract: Recent advances in autonomous underwater vehicle (AUV) and underwater communication technology have promoted a surge of research activity within the area of signal and information processing. A new application is proposed herein for capturing and processing underwater video onboard an untethered AUV, then transmitting it to a remote platform using acoustic telemetry. Since video communication requires a considerably larger bandwidth than that provided by an underwater acoustic channel, the data must be massively compressed prior to transmission from the AUV. Past research has shown that the low contrast and low-detailed nature of underwater imagery allows for low-bit-rate coding of the data by wavelet-based image-coding algorithms. In this work, these findings have been extended to the design of a wavelet-based hybrid video encoder which employs entropy-constrained vector quantization (ECVQ) with overlapped block-based motion compensation. The ECVQ codebooks were designed from a statistical source model which describes the distribution of high subband wavelet coefficients in both intraframe and prediction error images. Results indicate that good visual quality can be achieved for very low bit-rate coding of underwater video with our algorithm.
TL;DR: In this article, a block based signal compression system employing quantization of codewords or transform coefficients includes circuitry for adaptively controlling the quantization as a function of several parameters, such as coding cost or bandwidth, wherein coding cost is generated on a macroblock basis but averaged over a window of macroblocks centered on the macroblock currently to be quantized.
Abstract: A block based signal compression system, as for example for coding a video signal, and employing quantization of codewords or transform coefficients includes circuitry for adaptively controlling the quantization. Adaptivity of quantization is made a function of several parameters. One parameter is coding cost or bandwidth, wherein coding cost is generated on a macroblock basis but averaged over a window of macroblocks centered on a macroblock currently to be quantized. Another parameter is block or macroblock motion attributes, wherein block motion attribute values are used to modify the quantizing function.
TL;DR: A new side-match vector quantizer, NewSMVQ, is presented, which outperforms SMVQ and CSMVQ in terms of bit rate versus image quality tradeoffs.
Abstract: A new side-match vector quantizer, NewSMVQ, is presented in this paper. Three techniques are incorporated to improve the image quality, encoding speed, and bit rate for compressing images. The experimental results show that: i) encoding with NewSMVQ is almost 7 times faster than with SMVQ (ordinary fixed-rate side-match vector quantizer) and CSMVQ (variable-rate SMVQ) and ii) NewSMVQ outperforms SMVQ and CSMVQ in terms of bit rate versus image quality tradeoffs.
TL;DR: In this article, the quantization noise produced during signal compression is made independent from and non-orthogonal (i.e., uncorrelated) to the original signal.
Abstract: A data compression system 200, method, and apparatus 214 employs an encoder 210 optimized to decorrelate and make independent from the original signal 202, the quantization noise produced during signal compression. The proposed system 200, method, and apparatus 214 supports high degrees of signal compression, which in turn leads to lower computational complexity and improved performance. Because the quantization noise produced during signal compression is made independent from and non-orthogonal (i.e., uncorrelated) to the original signal 202, enhanced filtering is achievable, which in turn leads to improvements in the signal-to-noise ratio (SNR) of the decoder 220.