Top 376 papers published in the topic of Vector quantization in 1998

Showing papers on "Vector quantization" published in 1998

Continuous probabilistic transform for voice conversion

Yannis Stylianou¹, Olivier Cappé², Eric Moulines²•Institutions (2)

01 Mar 1998-IEEE Transactions on Speech and Audio Processing

TL;DR: A new methodology is designed for representing the relationship between two sets of spectral envelopes, and the proposed transform greatly improves the quality and naturalness of the converted speech signals compared with previously proposed conversion methods.

Abstract: Voice conversion, as considered in this paper, is defined as modifying the speech signal of one speaker (source speaker) so that it sounds as if it had been pronounced by a different speaker (target speaker). Our contribution includes the design of a new methodology for representing the relationship between two sets of spectral envelopes. The proposed method is based on the use of a Gaussian mixture model of the source speaker spectral envelopes. The conversion itself is represented by a continuous parametric function which takes into account the probabilistic classification provided by the mixture model. The parameters of the conversion function are estimated by least squares optimization on the training data. This conversion method is implemented in the context of the HNM (harmonic+noise model) system, which allows high-quality modifications of speech signals. Compared to earlier methods based on vector quantization, the proposed conversion scheme results in a much better match between the converted envelopes and the target envelopes. Evaluation by objective tests and formal listening tests shows that the proposed transform greatly improves the quality and naturalness of the converted speech signals compared with previously proposed conversion methods.

1,188 citations

Journal Article•10.1109/5.726788•

Deterministic annealing for clustering, compression, classification, regression, and related optimization problems

Kenneth Rose¹•Institutions (1)

University of California, Santa Barbara¹

1 Nov 1998

TL;DR: The deterministic annealing approach to clustering and its extensions has demonstrated substantial performance improvement over standard supervised and unsupervised learning methods in a variety of important applications including compression, estimation, pattern recognition and classification, and statistical regression.

Abstract: The deterministic annealing approach to clustering and its extensions has demonstrated substantial performance improvement over standard supervised and unsupervised learning methods in a variety of important applications including compression, estimation, pattern recognition and classification, and statistical regression. The application-specific cost is minimized subject to a constraint on the randomness of the solution, which is gradually lowered. We emphasize the intuition gained from analogy to statistical physics. Alternatively the method is derived within rate-distortion theory, where the annealing process is equivalent to computation of Shannon's rate-distortion function, and the annealing temperature is inversely proportional to the slope of the curve. The basic algorithm is extended by incorporating structural constraints to allow optimization of numerous popular structures including vector quantizers, decision trees, multilayer perceptrons, radial basis functions, and mixtures of experts.
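
For the squared-error clustering case, the idea of minimizing the cost under a gradually lowered randomness constraint can be sketched as below. This is a minimal illustration and not the full published algorithm: the function names, the geometric cooling schedule, and all parameter values are assumptions, and the mass-constrained variant and explicit phase-transition handling are omitted.

```python
import math

def soft_assign(x, centers, T):
    """Gibbs association probabilities p(j|x) proportional to exp(-d(x, c_j) / T)."""
    d = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]
    m = min(d)  # subtract the minimum distance for numerical stability
    w = [math.exp(-(dj - m) / T) for dj in d]
    s = sum(w)
    return [wj / s for wj in w]

def anneal(data, centers, T_start=10.0, T_min=0.01, cooling=0.8, inner=20):
    """Lower the temperature T step by step; at each T, iterate soft centroid updates.

    The initial centers must differ slightly, otherwise they never separate.
    The schedule parameters are illustrative assumptions.
    """
    centers = [list(c) for c in centers]
    dim = len(data[0])
    T = T_start
    while T > T_min:
        for _ in range(inner):
            probs = [soft_assign(x, centers, T) for x in data]
            for j in range(len(centers)):
                mass = sum(p[j] for p in probs)
                if mass > 0:
                    # Each center becomes the probability-weighted mean of the data.
                    centers[j] = [
                        sum(p[j] * x[i] for p, x in zip(probs, data)) / mass
                        for i in range(dim)
                    ]
        T *= cooling
    return centers
```

At high temperature every point is shared almost equally among the centers; as `T` falls the associations harden and the update approaches the ordinary nearest-centroid rule.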

1,032 citations

Book Chapter•10.1007/978-3-642-56927-2_6•

Learning vector quantization

Teuvo Kohonen¹•Institutions (1)

Helsinki University of Technology¹

1 Oct 1998

TL;DR: While VQ and the basic SOM are unsupervised clustering and learning methods, LVQ describes supervised learning. Unlike in SOM, no neighborhoods around the “winner” are defined during learning in the basic LVQ, so no spatial order of the codebook vectors is expected to ensue.

Abstract: Closely related to VQ and SOM is Learning Vector Quantization (LVQ). This name signifies a class of related algorithms, such as LVQ1, LVQ2, LVQ3, and OLVQ1. While VQ and the basic SOM are unsupervised clustering and learning methods, LVQ describes supervised learning. On the other hand, unlike in SOM, no neighborhoods around the “winner” are defined during learning in the basic LVQ, whereby also no spatial order of the codebook vectors is expected to ensue.
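
The supervised, neighborhood-free update described above can be illustrated with a single step of the basic LVQ1 rule. This is a minimal sketch: the function name `lvq1_step`, the fixed learning rate `lr`, and the list-based data layout are assumptions made for illustration, and the LVQ2, LVQ3, and OLVQ1 variants are not covered.

```python
def lvq1_step(codebook, labels, x, y, lr=0.1):
    """Apply one LVQ1 update in place and return the index of the winner.

    Only the winning codebook vector moves: toward x when its class label
    matches y, away from x otherwise. No neighbors of the winner are updated.
    """
    dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in codebook]
    w = dists.index(min(dists))
    sign = 1.0 if labels[w] == y else -1.0
    codebook[w] = [c + sign * lr * (a - c) for a, c in zip(x, codebook[w])]
    return w
```

Because only the winner is touched, the codebook vectors acquire no spatial ordering, in contrast to a SOM update that also moves the winner's neighbors.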

547 citations

Journal Article•10.1109/34.655653•

Reduced multidimensional co-occurrence histograms in texture classification

K. Valkealahti¹, Erkki Oja¹•Institutions (1)

Helsinki University of Technology¹

01 Jan 1998-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Experiments with natural textures showed that multidimensional histograms reduced with the new method provided higher classification accuracies than the channel histograms and the wavelet packet signatures, and that the new method was significantly faster than the authors' previous one.

Abstract: Textures are frequently described using co-occurrence histograms of gray levels at two pixels in a given relative position. Analysis of several co-occurring pixel values may benefit texture description but is impeded by the exponential growth of histogram size. To make use of multidimensional histograms, we have developed methods for their reduction. The method described here uses linear compression, dimension optimization, and vector quantization. Experiments with natural textures showed that multidimensional histograms reduced with the new method provided higher classification accuracies than the channel histograms and the wavelet packet signatures. The new method was significantly faster than our previous one.

139 citations

Proceedings Article•10.1109/ICASSP.1998.675430•

Compression of acoustic features for speech recognition in network environments

Ganesh N. Ramaswamy¹, P.S. Gopalakrishnan•Institutions (1)

IBM¹

12 May 1998

TL;DR: The proposed compression algorithm uses a combination of simple techniques, such as linear prediction and multi-stage vector quantization, and the current version of the algorithm encodes the acoustic features at a fixed rate of 4.0 kbit/s.

Abstract: In this paper, we describe a new compression algorithm for encoding acoustic features used in typical speech recognition systems. The proposed algorithm uses a combination of simple techniques, such as linear prediction and multi-stage vector quantization, and the current version of the algorithm encodes the acoustic features at a fixed rate of 4.0 kbit/s. The compression algorithm can be used very effectively for speech recognition in network environments, such as those employing a client-server model, or to reduce storage in general speech recognition applications. The algorithm has also been tuned for practical implementations, so that the computational complexity and memory requirements are modest. We have successfully tested the compression algorithm against many test sets from several different languages, and the algorithm performed very well, with no significant change in the recognition accuracy due to compression.
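
Multi-stage vector quantization, one of the techniques named above, can be sketched as follows. This is only an illustration of the general technique: the function names and toy codebooks are assumptions, the linear prediction step is omitted, and nothing here reflects the paper's actual codebook sizes or bit rate.

```python
def nearest(x, codebook):
    """Index of the codeword with the smallest squared Euclidean distance to x."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])),
    )

def msvq_encode(x, stages):
    """Quantize x stage by stage; each stage codes the residual left by the previous one."""
    residual = list(x)
    indices = []
    for codebook in stages:
        i = nearest(residual, codebook)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices

def msvq_decode(indices, stages):
    """Reconstruct the vector as the sum of the selected codewords."""
    out = [0.0] * len(stages[0][0])
    for i, codebook in zip(indices, stages):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out
```

Several small codebooks searched in sequence cost far less memory and computation than one codebook with the same total number of bits, which is consistent with the modest complexity the abstract reports.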

92 citations

Proceedings Article•

Recognizing emotions in speech using short-term and long-term features.

Yang Li, Yunxin Zhao

1 Jan 1998

TL;DR: This study attempted to recognize the emotional status of individual speakers by using speech features extracted from short-time analysis frames as well as speech features that represented entire utterances; principal component analysis was used to analyze the importance of individual features in representing emotional categories.

Abstract: The acoustic characteristics of speech are influenced by speakers’ emotional status. In this study, we attempted to recognize the emotional status of individual speakers by using speech features that were extracted from short-time analysis frames as well as speech features that represented entire utterances. Principal component analysis was used to analyze the importance of individual features in representing emotional categories. Three classification methods including vector quantization, artificial neural networks and Gaussian mixture density model were used. Classifications using short-term features only, long-term features only and both short-term and long-term features were conducted. The best recognition performance of 62% accuracy was achieved by using the Gaussian mixture density method with both short-term and long-term features.

92 citations

Journal Article•10.1109/26.725308•

Joint design of fixed-rate source codes and multiresolution channel codes

Andrea Goldsmith¹, Michelle Effros¹•Institutions (1)

California Institute of Technology¹

01 Oct 1998-IEEE Transactions on Communications

TL;DR: Three algorithms for jointly optimizing source and channel codes are proposed. All three code designs have roughly the same performance when their bit allocations are optimized, and the fixed transmission rate constraint is finally relaxed by jointly optimizing the transmission rate, source code, and channel code.

Abstract: We propose three new design algorithms for jointly optimizing source and channel codes. Our optimality criterion is to minimize the average end-to-end distortion. For a given channel SNR and transmission rate, our joint source and channel code designs achieve an optimal allocation of bits between the source and channel coders. Our three techniques include a source-optimized channel code, a channel-optimized source code, and an iterative descent technique combining the design strategies of the other two codes. The joint designs use channel-optimized vector quantization (COVQ) for the source code and rate compatible punctured convolutional (RCPC) coding for the channel code. The optimal bit allocation reduces distortion by up to 6 dB over suboptimal allocations and by up to 4 dB relative to standard COVQ for the source data set considered. We find that all three code designs have roughly the same performance when their bit allocations are optimized. This result follows from the fact that at the optimal bit allocation the channel code removes most of the channel errors, in which case the three design techniques are roughly equivalent. We also compare the robustness of the three techniques to channel mismatch. We conclude the paper by relaxing the fixed transmission rate constraint and jointly optimizing the transmission rate, source code, and channel code.

79 citations

Journal Article•10.1109/30.735818•

A fast LBG codebook training algorithm for vector quantization

Chin-Chen Chang¹, Yu-Chen Hu•Institutions (1)

National Chung Cheng University¹

01 Nov 1998-IEEE Transactions on Consumer Electronics

TL;DR: A fast codebook training algorithm based on the Linde, Buzo and Gray (1980) LBG algorithm is proposed; it significantly reduces computation cost and provides a flexible way of selecting the test conditions to accommodate different image training sets.

Abstract: A fast codebook training algorithm based on the Linde, Buzo and Gray (1980) LBG algorithm is proposed. The fundamental goal of this method is to reduce the computation cost in the codebook training process. In this method, a kind of mean-sorted partial codebook search algorithm is applied to the closest codeword search. At the same time, a generalized integral projection model is developed for the generation of test conditions, which are used to speed up the search process in finding the closest codeword for each training vector. With this proposed method, a significant time reduction can be achieved by avoiding the computation of unnecessary codewords. Our simulation results show that a significant reduction in computation cost is obtained with this proposed method. Besides, this method provides a flexible way of selecting the test conditions to accommodate the different image training sets.
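
The general idea of a mean-sorted partial codebook search can be sketched as below. This is not the paper's algorithm: it uses the standard bound that, for k-dimensional vectors, k times the squared difference of the component means never exceeds the squared Euclidean distance, and it does not reproduce the generalized integral projection test conditions. The function names are assumptions.

```python
import bisect

def build_index(codebook):
    """Sort codeword indices by component mean; return (order, sorted means)."""
    order = sorted(range(len(codebook)),
                   key=lambda i: sum(codebook[i]) / len(codebook[i]))
    means = [sum(codebook[i]) / len(codebook[i]) for i in order]
    return order, means

def nearest_codeword(x, codebook, order, means):
    """Exact nearest-codeword search that skips codewords rejected by the mean bound."""
    k = len(x)
    mx = sum(x) / k
    start = bisect.bisect_left(means, mx)
    lo, hi = start - 1, start
    best_i, best_d = -1, float("inf")
    while lo >= 0 or hi < len(order):
        # Visit the unvisited codeword whose mean is closest to the mean of x.
        if hi >= len(order) or (lo >= 0 and mx - means[lo] <= means[hi] - mx):
            pos = lo
            lo -= 1
        else:
            pos = hi
            hi += 1
        # Lower bound on the distance; every remaining codeword has a larger bound.
        if k * (means[pos] - mx) ** 2 >= best_d:
            break
        c = codebook[order[pos]]
        d = sum((a - b) ** 2 for a, b in zip(x, c))
        if d < best_d:
            best_i, best_d = order[pos], d
    return best_i, best_d
```

In an LBG iteration this search would replace the exhaustive scan in the nearest-codeword step, with `build_index` rerun whenever the codebook is updated.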

74 citations

Journal Article•10.1109/83.718479•

Error-resilient pyramid vector quantization for image compression

A.C. Hung, E.K. Tsern, Teresa H. Meng

01 Oct 1998-IEEE Transactions on Image Processing

TL;DR: This paper proposes a new method of deriving the indices of the lattice points of the multidimensional pyramid and describes how these techniques can also improve the channel noise immunity of general symmetric lattice quantizers.

Abstract: Pyramid vector quantization (PVQ) uses the lattice points of a pyramidal shape in multidimensional space as the quantizer codebook. It is a fixed-rate quantization technique that can be used for the compression of Laplacian-like sources arising from transform and subband image coding, where its performance approaches the optimal entropy-coded scalar quantizer without the necessity of variable length codes. In this paper, we investigate the use of PVQ for compressed image transmission over noisy channels, where the fixed-rate quantization reduces the susceptibility to bit-error corruption. We propose a new method of deriving the indices of the lattice points of the multidimensional pyramid and describe how these techniques can also improve the channel noise immunity of general symmetric lattice quantizers. Our new indexing scheme improves channel robustness by up to 3 dB over previous indexing methods, and can be performed with similar computational cost. The final fixed-rate coding algorithm surpasses the performance of typical Joint Photographic Experts Group (JPEG) implementations and exhibits much greater error resilience.

72 citations

Patent•

Quantization matrix for still and moving picture coding

Sheng Mei Shen¹, Thiow Keng Tan¹•Institutions (1)

Panasonic¹

5 Feb 1998

TL;DR: An encoder and a decoder for still and moving pictures are disclosed; the encoder has a memory for storing a default quantization matrix including a plurality of quantization elements having predetermined values.

Abstract: An encoder and a decoder for still and moving pictures are disclosed. The encoder has a memory for storing a default quantization matrix including a plurality of quantization elements having predetermined values. Also, a generator is provided for producing a particular quantization matrix after a number of frames. The particular quantization matrix is read in a predetermined zigzag pattern, and the reading is terminated at a selected position which is in the middle of the zigzag pattern. An end code is added after the read quantization elements of a former portion of the particular quantization matrix. The quantization elements in the default quantization matrix are read in the same zigzag pattern from a position immediately after the selected position, to produce a latter portion of the default quantization matrix. The former portion of the particular quantization matrix and the latter portion of the default quantization matrix are synthesized to form a synthesized quantization matrix.
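
The truncate-and-synthesize scheme described above can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, the zigzag order is assumed to be the conventional JPEG-style scan, and the end code and bitstream syntax are left out.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in conventional zigzag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Sort by anti-diagonal, alternating the direction on odd and even diagonals.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def truncate(particular, cut, n=8):
    """Encoder side: read the first `cut` elements of the particular matrix in zigzag order."""
    return [particular[r][c] for r, c in zigzag_order(n)[:cut]]

def synthesize(prefix, default, n=8):
    """Decoder side: former portion from the prefix, latter portion from the default matrix."""
    out = [row[:] for row in default]
    for (r, c), v in zip(zigzag_order(n), prefix):
        out[r][c] = v
    return out
```

Only the first `cut` elements need to be transmitted, because the receiver already holds the default matrix and fills in every position after the cut from it.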

71 citations

Book Chapter•10.1007/978-1-4471-1599-1_133•

A Neural Network Approach to Functional MRI Pattern Analysis — Clustering of Time-Series by Hierarchical Vector Quantization

Axel Wismüller¹, Dominik R. Dersch, Bernadette Lipinski², Klaus Hahn¹, Dorothee P. Auer²•Institutions (2)

Ludwig Maximilian University of Munich¹, Max Planck Society²

2 Sep 1998

TL;DR: As minimal free energy VQ represents a hierarchical data analysis strategy implying repetitive cluster splitting, it can provide a natural approach to the subclassification task of activated brain regions on different scales of resolution with respect to fine-grained differences in pixel dynamics.

Abstract: In this paper, we present a neural network approach to hierarchical unsupervised clustering of functional magnetic resonance imaging (fMRI) time-sequences of the human brain by self-organized fuzzy minimal free energy vector quantization (VQ). In contrast to conventional model-based fMRI data analysis techniques, this deterministic annealing procedure does not imply presumptive knowledge of expected stimulus-response patterns, and, thus, may be applied to fMRI experiments in which the time course of the stimulus is unknown like in spontaneously occurring events, e.g. hallucinations, epileptic fits, or sleep. Moreover, as minimal free energy VQ represents a hierarchical data analysis strategy implying repetitive cluster splitting, it can provide a natural approach to the subclassification task of activated brain regions on different scales of resolution with respect to fine-grained differences in pixel dynamics.

Patent•10.1121/1.429549•

Soft-clipping postprocessor scaling decoded audio signal frame saturation regions to approximate original waveform shape and maintain continuity

Shuwu Wu, John Mantegna

13 Oct 1998-Journal of the Acoustical Society of America

TL;DR: An audio coder/decoder suitable for real-time applications due to reduced computational complexity, together with a novel adaptive sparse vector quantization (ASVQ) scheme and algorithms for general purpose data quantization; the codec provides low bit-rate compression for music and speech while being applicable to higher bit-rate audio compression.

Abstract: An audio coder/decoder ("codec") that is suitable for real-time applications due to reduced computational complexity, and a novel adaptive sparse vector quantization (ASVQ) scheme and algorithms for general purpose data quantization. The codec provides low bit-rate compression for music and speech, while being applicable to higher bit-rate audio compression. The codec includes an in-path implementation of psychoacoustic spectral masking, and frequency domain quantization using the novel ASVQ scheme and algorithms specific to audio compression. More particularly, the inventive audio codec employs frequency domain quantization with critically sampled subband filter banks to maintain time domain continuity across frame boundaries. The input audio signal is transformed into the frequency domain in which in-path spectral masking can be directly applied. This in-path spectral masking usually results in sparse vectors. The ASVQ scheme is a vector quantization algorithm that is particularly effective for quantizing sparse signal vectors. In the preferred embodiment, ASVQ adaptively classifies signal vectors into six different types of sparse vector quantization, and performs quantization accordingly. The ASVQ technique applies to general purpose data quantization as well as to quantization in the context of audio compression. The invention also includes a "soft clipping" algorithm in the decoder as a post-processing stage. The soft clipping algorithm preserves the waveform shapes of the reconstructed time domain audio signal in a frame- or block-oriented stateless manner while maintaining continuity across frame or block boundaries. The invention includes related methods, apparatus, and computer programs.

Journal Article•10.1117/1.601905•

Unsupervised interference rejection approach to target detection and classification for hyperspectral imagery

Chein-I Chang¹, Tzu-Lung Sun¹, Mark L.G. Althouse²•Institutions (2)

University of Maryland, Baltimore County¹, United States Department of the Army²

01 Mar 1998-Optical Engineering

TL;DR: To find and reject interference, an unsupervised vector quantization-based interference rejection (UIR) approach is proposed in conjunction with either an orthogonal subspace projection (OSP) or an oblique subspace projection (OBSP) to simultaneously project a pixel into signature space as well as to null out interference.

Abstract: A widely used approach to hyperspectral image classification is to model a mixed-pixel vector as a linear superposition of substances resident in a pixel with additive Gaussian noise. Using this linear mixture model many image processing techniques can be applied, such as linear unmixing or orthogonal subspace projection. However, a third source not considered in this model, called interference (clutter or structured noise), may sometimes give rise to more serious signal deterioration than the additive noise. We address this issue by introducing the interference into the linear mixture model. Including interference in the model enables us to treat the interference as another undesired source, like a passive jammer, so that it can be eliminated prior to detection and classification. This is particularly useful for hyperspectral images, which tend to have a high SNR but a low signal-to-interference ratio with the interference difficult to identify. To find and reject interference, we propose an unsupervised vector quantization-based interference rejection (UIR) approach in conjunction with either an orthogonal subspace projection (OSP) or an oblique subspace projection (OBSP) to simultaneously project a pixel into signature space as well as to null out interference. Since there is no prior knowledge about the interference, the UIR is implemented in an unsupervised manner to generate the desired interference clusters so that they can be annihilated by the OSP or OBSP. The proposed approach is shown by evaluation with Hyperspectral Digital Imagery Collection Experiment (HYDICE) data to exhibit considerable improvement in comparison to linear unmixing or the OSP where interference is not considered. © 1998 Society of Photo-Optical Instrumentation Engineers. (S0091-3286(98)00103-2)

Patent•

System for variable quantization in JPEG for compound documents

Konstantinos Konstantinides¹, Daniel R. Tretter¹•Institutions (1)

Hewlett-Packard¹

21 Jul 1998

TL;DR: A discrete cosine transformer connected to a quantizer drawing lossy quantization factors from quantization tables is used to compress images containing both text and pictures, with a high frequency of changes in a block being indicative of text and a low frequency of changes being indicative of pictures.

Abstract: An image compression system for compound images containing both text and pictures. The system is capable of receiving the images on a non-overlapping 8 by 8 pixel block basis and includes a discrete cosine transformer connected to a quantizer drawing lossy quantization factors from quantization tables. The lossy quantization factors are modified by a variable quantization subsystem based on the frequency of changes in the block to provide low lossy quantization factors for high frequency of changes and high lossy quantization factors for low frequency of changes, the high frequency of changes being indicative of text and the low frequency of changes being indicative of pictures. The quantizer is connected to an entropy coder using lossless entropy encoding factors from Huffman tables to provide JPEG compliant files.

Patent•

Method and apparatus for automatic recognition using features encoded with product-space vector quantization

Vassilios Digalakis¹, Leonardo Neumeyer¹, Stavros Tsakalidis¹, Manolis Perakakis¹•Institutions (1)

SRI International¹

8 Sep 1998

TL;DR: In this article, an automatic recognition system and method divides observation vectors into subvectors and determines a quantization index for each subvector, which can then be transmitted or stored and used to perform recognition.

Abstract: An automatic recognition system and method divides observation vectors into subvectors and determines a quantization index for the subvectors. Subvector indices can then be transmitted or otherwise stored and used to perform recognition. In a further embodiment, recognition probabilities are determined for subvectors separately and these probabilities are combined to generate probabilities for the observed vectors. An automatic system for assigning bits to subvector indices can be used to improve recognition.
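
The subvector indexing step described above corresponds to product-space (split) vector quantization, sketched below. This is a minimal illustration with hypothetical function names and toy codebooks; the per-subvector recognition probabilities and the automatic bit assignment mentioned in the abstract are not modeled.

```python
def nearest(x, codebook):
    """Index of the codeword with the smallest squared Euclidean distance to x."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])),
    )

def product_vq_encode(x, sizes, codebooks):
    """Split x into consecutive subvectors of the given sizes and index each separately."""
    indices, start = [], 0
    for size, codebook in zip(sizes, codebooks):
        indices.append(nearest(x[start:start + size], codebook))
        start += size
    return indices

def product_vq_decode(indices, codebooks):
    """Concatenate the selected subvector codewords to rebuild the full vector."""
    out = []
    for i, codebook in zip(indices, codebooks):
        out.extend(codebook[i])
    return out
```

The effective codebook is the Cartesian product of the per-subvector codebooks, so a handful of small tables can represent a very large number of full-vector codewords.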

Journal Article•10.1109/34.713366•

Off-line handwritten Chinese character recognition as a compound Bayes decision problem

Pak-Kwong Wong¹, Chorkin Chan¹•Institutions (1)

University of Hong Kong¹

01 Sep 1998-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A handwritten Chinese character off-line recognizer based on contextual vector quantization (CVQ) of every pixel of an unknown character image has been constructed and the CVQ-based language model is the most effective one upgrading the recognition rate by 10.4 percent on the average.

Abstract: A handwritten Chinese character off-line recognizer based on contextual vector quantization (CVQ) of every pixel of an unknown character image has been constructed. Each template character is represented by a codebook. When an unknown image is matched against a template character, each pixel of the image is quantized according to the associated codebook by considering not just the feature vector observed at each pixel, but those observed at its neighbors and their quantization as well. Structural information such as stroke counts observed at each pixel are captured to form a cellular feature vector. Supporting a vocabulary of 4616 simplified Chinese characters and alphanumeric and punctuation symbols, the writer-independent recognizer has an average recognition rate of 77.2 percent. Three statistical language models for postprocessing have been studied for their effectiveness in upgrading the recognition rate of the system. Among them, the CVQ-based language model is the most effective one upgrading the recognition rate by 10.4 percent on the average.

Journal Article•10.1109/82.664257•

A fast Linde-Buzo-Gray algorithm in image vector quantization

Yih-chuan Lin¹, Shen-Chuan Tai¹•Institutions (1)

National Cheng Kung University¹

01 Mar 1998-IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing

TL;DR: A novel algorithm for speeding up the codebook design in image vector quantization that exploits the correlation among the pixels in an image block to compress the computational complexity of calculating the squared Euclidean distortion measures.

Abstract: This paper presents a novel algorithm for speeding up the codebook design in image vector quantization. It exploits the correlation among the pixels in an image block to reduce the computational complexity of calculating the squared Euclidean distortion measures, and uses the similarity between the codevectors in consecutive codebooks during the iterative clustering process to reduce the number of codevectors that must be checked in one codebook search. Verified test results have shown that the proposed algorithm can provide almost 98% reduction of the execution time when compared to the conventional Linde-Buzo-Gray (LBG) algorithm.

Journal Article•10.1109/83.668024•

Kronecker-product gain-shape vector quantization for multispectral and hyperspectral image coding

G.R. Canta¹, G. Poggi•Institutions (1)

Accenture¹

01 May 1998-IEEE Transactions on Image Processing

TL;DR: A new vector quantization based (VQ-based) technique for very low bit rate encoding of multispectral images based on the assumption that the shape of a generic spatial block does not change significantly from band to band, which is the case for high spectral-resolution imagery.

Abstract: This paper proposes a new vector quantization based (VQ-based) technique for very low bit rate encoding of multispectral images. We rely on the assumption that the shape of a generic spatial block does not change significantly from band to band, as is the case for high spectral-resolution imagery. Under this hypothesis, it is possible to accurately quantize a three-dimensional (3-D) block, composed of homologous two-dimensional (2-D) blocks drawn from several bands, as the Kronecker product of a spatial-shape codevector and a spectral-gain codevector, with significant computation saving with respect to straight VQ. An even higher complexity reduction is obtained by representing each 3-D block in its minimum-square-error Kronecker-product form and by quantizing the component shape and gain vectors. For the block sizes considered, this encoding strategy is over 100 times more computationally efficient than unconstrained VQ, and over ten times more computationally efficient than direct gain-shape VQ. The proposed technique is obviously suboptimal with respect to VQ, but the huge complexity reduction allows one to use much larger blocks than usual and to better exploit both the statistical and psychovisual redundancy of the image. Numerical experiments show fully satisfactory results whenever the shape-invariance hypothesis turns out to be accurate enough, as in the case of hyperspectral images. In particular, for a given level of complexity and image quality, the compression ratio is up to five times larger than that provided by ordinary VQ, and also larger than that provided by other techniques specifically designed for multispectral image coding.

Journal Article•10.1109/83.704313•

Image compression based on fuzzy algorithms for learning vector quantization and wavelet image decomposition

Nicolaos B. Karayiannis¹, P. Pai, H. Zervos•Institutions (1)

University of Houston¹

01 Aug 1998-IEEE Transactions on Image Processing

TL;DR: This work evaluates the performance of an image compression system based on wavelet-based subband decomposition and vector quantization using the Linde-Buzo-Gray (LBG) algorithm and various fuzzy algorithms for learning vector quantization (FALVQ).

Abstract: This paper evaluates the performance of an image compression system based on wavelet-based subband decomposition and vector quantization. The images are decomposed using wavelet filters into a set of subbands with different resolutions corresponding to different frequency bands. The resulting subbands are vector quantized using the Linde-Buzo-Gray (1980) algorithm and various fuzzy algorithms for learning vector quantization (FALVQ). These algorithms perform vector quantization by updating all prototypes of a competitive neural network through an unsupervised learning process. The quality of the multiresolution codebooks designed by these algorithms is measured on the reconstructed images belonging to the training set used for multiresolution codebook design and the reconstructed images from a testing set.

Patent•10.1121/1.1554222•

Quantization using frequency and mean compensated frequency input data for robust speech recognition

Safdar M. Asghar, Lin Cong

05 Oct 1998-Journal of the Acoustical Society of America

TL;DR: A speech recognition system utilizes multiple quantizers to process frequency parameters and mean compensated frequency parameters derived from an input signal; the quantizer pairs may also function as front ends to second stage speech classifiers such as hidden Markov models (HMMs), and neural network postprocessing may be used to improve speech recognition performance.

Abstract: A speech recognition system utilizes multiple quantizers to process frequency parameters and mean compensated frequency parameters derived from an input signal. The quantizers may be matrix and vector quantizer pairs, and such quantizer pairs may also function as front ends to second stage speech classifiers such as hidden Markov models (HMMs) and/or utilize neural network postprocessing to, for example, improve speech recognition performance. Mean compensating the frequency parameters can remove noise frequency components that remain approximately constant during the duration of the input signal. HMM initial state and state transition probabilities derived from common quantizer types and the same input signal may be consolidated to improve recognition system performance and efficiency. Matrix quantization exploits the “evolution” of the speech short-term spectral envelopes as well as frequency domain information, and vector quantization (VQ) primarily operates on frequency domain information. Time domain information may be substantially limited, which may introduce error into the matrix quantization, and the VQ may provide error compensation. The matrix and vector quantizers may split spectral subbands to target selected frequencies for enhanced processing and may use fuzzy associations to develop fuzzy observation sequence data. A mixer may provide a variety of input data to the neural network for classification determination. Fuzzy operators may be utilized to reduce quantization error. Multiple codebooks may also be combined to form single respective codebooks for split matrix and split vector quantization to reduce processing resources demand.

Patent•

Method for switched-predictive quantization

Alan V. McCree¹•Institutions (1)

Texas Instruments¹

15 Aug 1998

TL;DR: An improved form of switched predictive multi-stage vector quantization is proposed for quantizing the LPC coefficients in a speech coder.

Abstract: A new method for quantization of the LPC coefficients in a speech coder includes an improved form of switched predictive multi-stage vector quantization. The switched predictive quantization includes at least a pair of codebook sets in an MSVQ quantizer and first and second prediction matrices 24a and 24b, with prediction matrix 1 used with codebook set 1 and prediction matrix 2 used with codebook set 2. The encoder determines which prediction matrix/codebook set produces the minimum quantization error at detector 35, and control 29 gates the indices with the minimum error out of the speech coder.
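
The selection rule described in the abstract (try each prediction matrix with its codebook set and keep the one with the minimum quantization error) can be sketched as follows. This is an illustration under assumptions, not the patented encoder: the greedy stage-by-stage search, the squared-error measure, prediction from the previous quantized vector, and all names and values are hypothetical.

```python
def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def sq_err(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def msvq_encode(residual, stages):
    """Greedy multi-stage search: each stage quantizes what earlier stages left over."""
    indices, approx = [], [0.0] * len(residual)
    for codebook in stages:
        target = [r - a for r, a in zip(residual, approx)]
        best = min(range(len(codebook)), key=lambda i: sq_err(target, codebook[i]))
        indices.append(best)
        approx = [a + c for a, c in zip(approx, codebook[best])]
    return indices, approx

def switched_predictive_encode(x, prev_quantized, sets):
    """Try every (prediction matrix, codebook set) pair; keep the lowest-error one."""
    best = None
    for switch, (pred_matrix, stages) in enumerate(sets):
        predicted = matvec(pred_matrix, prev_quantized)
        residual = [a - b for a, b in zip(x, predicted)]
        indices, approx = msvq_encode(residual, stages)
        reconstructed = [p + a for p, a in zip(predicted, approx)]
        err = sq_err(x, reconstructed)
        if best is None or err < best[0]:
            best = (err, switch, indices, reconstructed)
    return best

# Hypothetical two-dimensional example with two stages per codebook set.
sets = [
    ([[0.9, 0.0], [0.0, 0.9]],                      # strong prediction
     [[[0.0, 0.0], [0.5, 0.5]], [[0.0, 0.0], [0.1, -0.1]]]),
    ([[0.0, 0.0], [0.0, 0.0]],                      # no prediction
     [[[1.0, 1.0], [2.0, 2.0]], [[0.0, 0.0], [0.25, 0.25]]]),
]
err, switch, indices, reconstructed = switched_predictive_encode(
    [1.0, 0.8], [1.0, 1.0], sets)
```

Here the input is close to the previous vector, so the strongly predictive set wins; the encoder would transmit the switch value together with the stage indices.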

Proceedings Article•10.1109/ICASSP.1998.675337•

Segmental vocoder-going beyond the phonetic approach

Jan Cernocky, Genevieve Baudoin, Gérard Chollet

12 May 1998

TL;DR: The problem of very low bit rate segmental speech coding is addressed and future extensions of the scheme (diphone-like synthesis and speaker adaptation) as well as possible use of automatically derived units in recognition are discussed.

Abstract: The problem of very low bit rate segmental speech coding is addressed. The basic units are found automatically in the training database using temporal decomposition, vector quantization and multigrams. They are modelled by HMMs. The coding is based on recognition and synthesis. In single speaker tests, we obtained intelligible and natural-sounding speech at a mean rate of 211.2 b/s. Finally, future extensions of our scheme (diphone-like synthesis and speaker adaptation) as well as possible use of automatically derived units in recognition are discussed.

Journal Article•10.1016/S0031-3203(97)00127-1•

Tabu search algorithm for codebook generation in vector quantization

Pasi Fränti¹, Juha Kivijärvi², Olli Nevalainen²•Institutions (2)

University of Eastern Finland¹, Turku Centre for Computer Science²

01 Aug 1998-Pattern Recognition

TL;DR: The proposed algorithm first makes non-local changes to the codebook which is then fine-tuned by the generalized Lloyd algorithm, and for a set of gray-scale images, the new algorithm was better than GLA alone, and its results were comparable to simulated annealing.

Proceedings Article•10.1109/MELCON.1998.699349•

Design of signal constellations for Gaussian channel by using iterative polar quantization

Zoran Peric, Ivan B. Djordjevic, S.M. Bogosavljevic, Mihajlo Stefanovic¹•Institutions (1)

University of Niš¹

18 May 1998

TL;DR: A new iterative nonuniform polar quantization method is presented, together with an exact method for determining the error probability per symbol of the signal constellation obtained with it.

Abstract: A new iterative nonuniform polar quantization method is presented in this paper. The iterative method determines the decision levels, the reconstruction levels, and the number of points on each level. The quantization mean-squared error (MSE) is used as the optimization criterion. We also present an exact method for determining the error probability per symbol of the signal constellation obtained by this quantization method. The error probability for transmission of nonequiprobable symbols through a channel with additive Gaussian noise is also computed.
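
To make the roles of magnitude levels and points per level concrete, the sketch below quantizes a two-dimensional sample with an already designed polar quantizer. It does not reproduce the paper's iterative design: the level values and point counts are hypothetical, the magnitude is mapped to the nearest reconstruction level rather than through optimized decision levels, and the phase cells on each level are assumed uniform.

```python
import math

def polar_quantize(x, y, levels, points_per_level):
    """Map (x, y) to a magnitude level index, a phase cell index, and a constellation point."""
    r = math.hypot(x, y)
    phi = math.atan2(y, x) % (2 * math.pi)
    # Assumption: nearest reconstruction level instead of designed decision levels.
    i = min(range(len(levels)), key=lambda k: abs(r - levels[k]))
    n = points_per_level[i]
    step = 2 * math.pi / n
    j = int(phi // step) % n              # uniform phase cells on this level
    phi_hat = (j + 0.5) * step            # cell midpoint as reconstruction phase
    point = (levels[i] * math.cos(phi_hat), levels[i] * math.sin(phi_hat))
    return i, j, point

# Hypothetical quantizer: 4 points on the inner level, 8 on the outer one.
levels = [0.5, 1.5]
points_per_level = [4, 8]
i, j, point = polar_quantize(1.4, 0.1, levels, points_per_level)
constellation_size = sum(points_per_level)
```

Placing more points on outer levels than on inner ones is what makes the resulting constellation nonuniform.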

Patent•

Multistage positive product vector quantization for line spectral frequencies in low rate speech coding

Peter Warren Moo¹•Institutions (1)

General Electric¹

20 Feb 1998

TL;DR: A method of encoding takes advantage of the spherical symmetry of error vectors associated with encoding Line Spectral Frequency (LSF) coefficients to reduce the information transmitted; the receiver decodes the signal into an audio signal closely representing the original.

Abstract: A digital transmitter/receiver communications system transmits audio voice signals over a channel with increased quality for a specified bit rate. The method of encoding takes advantage of spherical symmetry of error vectors associated with encoding Line Spectral Frequency (LSF) coefficients, to reduce the information transmitted. Errors in encoding the LSF coefficient sets, vectors J, are modeled by a number of vectors J_p having all positive components, and a sign vector s indicating the polarity of each component of the vector. Each LSF vector J intended to be transmitted is approximated by a positive vector J_p and a sign vector s. An index I_p of the positive vector J_p and the sign vector corresponding to vector J are transmitted, along with other audio information to a receiver/decoder where the signal is decoded into an audio signal closely representing the original signal intended to be transmitted.
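
The positive-vector-plus-sign representation can be sketched in a few lines. This is a single-stage illustration under assumptions: the codebook contents, the squared-error search, and the function names are hypothetical, and the multistage structure named in the title is omitted.

```python
def encode_signed(vector, positive_codebook):
    """Return the index of the nearest all-positive codevector and the sign vector."""
    signs = [1 if c >= 0 else -1 for c in vector]
    magnitudes = [abs(c) for c in vector]
    index = min(
        range(len(positive_codebook)),
        key=lambda i: sum((m - p) ** 2
                          for m, p in zip(magnitudes, positive_codebook[i])),
    )
    return index, signs

def decode_signed(index, signs, positive_codebook):
    """Reapply the transmitted signs to the selected positive codevector."""
    return [s * p for s, p in zip(signs, positive_codebook[index])]

# Hypothetical codebook of vectors with all-positive components.
positive_codebook = [[0.1, 0.1, 0.1], [0.5, 0.2, 0.1], [0.2, 0.5, 0.4]]
index, signs = encode_signed([-0.45, 0.25, -0.1], positive_codebook)
decoded = decode_signed(index, signs, positive_codebook)
```

Because the sign pattern is sent separately (one bit per component), a codebook that covers only the positive orthant can represent vectors in every orthant.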

Journal Article•10.1117/1.601810•

Iterative split-and-merge algorithm for vector quantization codebook generation

Timo Kaukoranta¹, Pasi Fränti², Olli Nevalainen¹•Institutions (2)

University of Turku¹, University of Eastern Finland²

01 Oct 1998-Optical Engineering

TL;DR: Experimental results show that the proposed method performs well in comparison to other tested methods, including the generalized Lloyd algorithm (GLA) and two hierarchical methods.

Abstract: We propose a new iterative algorithm for the generation of a codebook in vector quantization. The algorithm starts with an initial codebook that is improved by a combination of merge and split operations. By merging small neighboring clusters, additional resources (codevectors) are released. These extra codevectors can be reallocated by splitting large clusters. This process can be iterated until no further improvement is achieved in the distortion of the codebook. Experimental results show that the proposed method performs well in comparison to other tested methods, including the generalized Lloyd algorithm (GLA) and two hierarchical methods. © 1998 Society of Photo-Optical Instrumentation Engineers. (S0091-3286(98)01110-6)

Journal Article•10.1109/83.730390•

Inverse error-diffusion using classified vector quantization

Jim Z. C. Lai¹, J.Y. Yen•Institutions (1)

Feng Chia University¹

01 Dec 1998-IEEE Transactions on Image Processing

TL;DR: This correspondence extends and modifies classified vector quantization (CVQ) to solve the problem of inverse halftoning and reconstructs a gray-scale image from a set of codeword-indices.

Abstract: This correspondence extends and modifies classified vector quantization (CVQ) to solve the problem of inverse halftoning. The proposed process consists of two phases: the encoding phase and decoding phase. The encoding procedure needs a codebook for the encoder which transforms a halftoned image to a set of codeword-indices. The decoding process also requires a different codebook for the decoder which reconstructs a gray-scale image from a set of codeword-indices. Using CVQ, the reconstructed gray-scale image is stored in compressed form and no further compression may be required. This is different from the existing algorithms, which reconstructed a halftoned image in an uncompressed form. The bit rate of encoding a reconstructed image is about 0.51 b/pixel.
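
The two-codebook arrangement described in the abstract (one codebook for the encoder, a different one for the decoder, linked by codeword indices) can be illustrated with a toy sketch. The classification step of CVQ is left out, and the block size, codebook entries, and Hamming-distance search are assumptions made only for this example.

```python
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def encode_halftone(blocks, encoder_codebook):
    """Map each binary halftone block to the index of its nearest encoder codeword."""
    return [
        min(range(len(encoder_codebook)),
            key=lambda i: hamming(block, encoder_codebook[i]))
        for block in blocks
    ]

def decode_gray(indices, decoder_codebook):
    """Look up the gray-scale block paired with each transmitted index."""
    return [decoder_codebook[i] for i in indices]

# Hypothetical paired codebooks for flattened 2x2 blocks.
encoder_codebook = [(0, 0, 0, 0), (1, 0, 0, 1), (1, 1, 1, 1)]
decoder_codebook = [(0, 0, 0, 0), (128, 128, 128, 128), (255, 255, 255, 255)]

halftone_blocks = [(0, 0, 0, 0), (1, 0, 0, 1), (1, 1, 1, 0)]
indices = encode_halftone(halftone_blocks, encoder_codebook)
gray_blocks = decode_gray(indices, decoder_codebook)
```

The list of indices is itself the compressed representation, which is why no separate compression pass is needed after reconstruction.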

Patent•10.1121/1.1507016•

Matrix quantization with vector quantization error compensation and neural network postprocessing for robust speech recognition

Safdar M. Asghar¹, Lin Cong•Institutions (1)

Advanced Micro Devices¹

05 Oct 1998-Journal of the Acoustical Society of America

TL;DR: A speech recognition system utilizes both matrix and vector quantizers as front ends to a second stage speech classifier such as hidden Markov models (HMMs) and utilizes neural network postprocessing to, for example, improve speech recognition performance.

Abstract: A speech recognition system utilizes both matrix and vector quantizers as front ends to a second stage speech classifier such as hidden Markov models (HMMs) and utilizes neural network postprocessing to, for example, improve speech recognition performance. Matrix quantization exploits the “evolution” of the speech short-term spectral envelopes as well as frequency domain information, and vector quantization (VQ) primarily operates on frequency domain information. Time domain information may be substantially limited which may introduce error into the matrix quantization, and the VQ may provide error compensation. The matrix and vector quantizers may split spectral subbands to target selected frequencies for enhanced processing and may use fuzzy associations to develop fuzzy observation sequence data. A mixer provides a variety of input data to the neural network for classification determination. The neural network's ability to analyze the input data generally enhances recognition accuracy. Fuzzy operators may be utilized to reduce quantization error. Multiple codebooks may also be combined to form single respective codebooks for split matrix and split vector quantization to reduce processing resources demand.

Journal Article•10.1109/78.651227•

An improved lattice vector quantization scheme for wavelet compression

J. Knipe, Xiaobo Li¹, Bin Han¹•Institutions (1)

University of Alberta¹

01 Jan 1998-IEEE Transactions on Signal Processing

TL;DR: An image compression scheme ModLVQ is presented, which is based on set partitioning in hierarchical trees (SPIHT) and includes a modified lattice vector codebook and more effective timing of the vector coding.

Abstract: We present an image compression scheme ModLVQ, which is based on set partitioning in hierarchical trees (SPIHT) and includes a modified lattice vector codebook and more effective timing of the vector coding. The experimental results are encouraging, especially on busy images (which are more difficult to compress) and complex portions of other images both numerically and subjectively.

Patent•

Speech coding system and method including spectral quantizer

Mark Lewis Grabb¹, Steven Robert Koch¹, Glen William Brooksby¹, Richard Louis Zinser¹•Institutions (1)

Lockheed Martin Corporation¹

13 Jul 1998

TL;DR: A speech coding system and associated method rely on a speech encoder and a speech decoder; the encoder includes a spectral quantizer that computes line spectral frequencies (LSFs) for respective frames of speech and quantizes the LSFs to obtain a minimum bit representation of the spectral envelope of each frame.

Abstract: A speech coding system and associated method relies on a speech encoder and a speech decoder. The encoder includes a spectral quantizer for computing line spectral frequencies (LSFs) for respective frames of speech and for quantizing the LSFs to obtain a minimum bit representation of a spectral envelope of each respective frame of speech. For even numbered frames of speech the LSFs are quantized using a vector quantization technique. For odd numbered frames of speech samples the LSFs are quantized using a dynamic bit allocation (DBA) method. The dynamic bit allocation method determines an interpolation factor for interpolating between the LSFs of the previous and next frames. According to the dynamic bit allocation method the most perceptually important LSFs are represented by relatively more bits, while the least perceptually important LSFs are represented by relatively fewer bits. The system and associated method thereby reduces an overall bit rate required to represent, transmit or store the speech samples.
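
The interpolation step for odd numbered frames can be sketched as below. The abstract does not say how the interpolation factor is chosen, so this example assumes a small set of candidate factors and a squared-error criterion; the function name, candidate set, and LSF values are hypothetical, and the perceptually weighted bit allocation is not shown.

```python
def best_interpolation_factor(prev_lsf, next_lsf, current_lsf, candidates):
    """Pick the factor a for which a*prev + (1-a)*next best matches the current LSFs."""
    def error(a):
        return sum(
            (c - (a * p + (1 - a) * n)) ** 2
            for p, n, c in zip(prev_lsf, next_lsf, current_lsf)
        )
    return min(candidates, key=error)

# Hypothetical LSF vectors for the previous, next, and current (odd) frame.
prev_lsf = [0.10, 0.30, 0.60]
next_lsf = [0.20, 0.50, 0.70]
current_lsf = [0.125, 0.35, 0.625]

factor = best_interpolation_factor(
    prev_lsf, next_lsf, current_lsf, [0.0, 0.25, 0.5, 0.75, 1.0])
```

With a well chosen factor, only the factor and a small correction would need to be coded for the odd frame, which is consistent with the bit rate reduction the abstract claims.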
