Scispace (Formerly Typeset)
Showing papers on "Vector quantization" published in 2004
Proceedings Article•
Visual categorization with bags of keypoints

Gabriela Csurka
1 Jan 2004
TL;DR: A bag-of-keypoints method for generic visual categorization, based on vector quantization of affine-invariant descriptors of image patches, that is simple, computationally efficient and intrinsically invariant.
Abstract: We present a novel method for generic visual categorization: the problem of identifying the object content of natural images while generalizing across variations inherent to the object class. This bag of keypoints method is based on vector quantization of affine invariant descriptors of image patches. We propose and compare two alternative implementations using different classifiers: Naive Bayes and SVM. The main advantages of the method are that it is simple, computationally efficient and intrinsically invariant. We present results for simultaneously classifying seven semantic visual categories. These results clearly demonstrate that the method is robust to background clutter and produces good categorization accuracy even without exploiting geometric information.

5,369 citations
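
The vector quantization step described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes descriptors are plain tuples of floats, that a codebook already exists (for example from K-means), and that squared Euclidean distance is the matching criterion. The names `nearest_codeword` and `bag_of_keypoints` are hypothetical.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (assumed metric)."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def bag_of_keypoints(descriptors, codebook):
    """Count how many descriptors of one image fall on each codeword."""
    counts = Counter(nearest_codeword(d, codebook) for d in descriptors)
    return [counts.get(i, 0) for i in range(len(codebook))]


# Toy data: three codewords and four patch descriptors from one image.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
descriptors = [(0.1, -0.2), (0.9, 1.2), (1.1, 0.8), (3.7, 0.3)]
histogram = bag_of_keypoints(descriptors, codebook)
print(histogram)
```

The resulting histogram discards where each patch was found, which is why such a representation uses no geometric information; it would then be passed to a classifier such as Naive Bayes or an SVM.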

Journal Article•10.1007/S10044-004-0218-1•
A new cluster validity measure and its application to image compression

Chien-Hsing Chou1, Mu-Chun Su2, Edmund M-K. Lai3•
Academia Sinica1, National Central University2, Tamkang University3
01 Jul 2004-Pattern Analysis and Applications
TL;DR: This paper proposes a new validity measure that copes with clusters of different densities and sizes, together with a modified K-means algorithm that assigns more cluster centres to low-density areas; both are applied to reduce edge degradation in vector quantisation for image compression.
Abstract: Many validity measures have been proposed for evaluating clustering results. Most of these popular validity measures do not work well for clusters with different densities and/or sizes. They usually have a tendency of ignoring clusters with low densities. In this paper, we propose a new validity measure that can deal with this situation. In addition, we also propose a modified K-means algorithm that can assign more cluster centres to areas with low densities of data than the conventional K-means algorithm does. First, several artificial data sets are used to test the performance of the proposed measure. Then the proposed measure and the modified K-means algorithm are applied to reduce the edge degradation in vector quantisation of image compression.

357 citations
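
For reference, the conventional K-means baseline that the abstract above compares against can be sketched as below. This is the standard algorithm, not the modified version proposed in the paper; the function name `kmeans` and the choice to pass initial centres explicitly (so the run is deterministic) are assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centres, max_iterations=20):
    """Conventional K-means: alternate assignment and centroid update."""
    centres = [tuple(float(x) for x in c) for c in initial_centres]
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centre.
        clusters = [[] for _ in centres]
        for p in points:
            j = min(range(len(centres)),
                    key=lambda i: squared_distance(p, centres[i]))
            clusters[j].append(p)
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for centre, members in zip(centres, clusters):
            if members:
                dim = len(centre)
                new_centres.append(tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(dim)))
            else:
                new_centres.append(centre)  # keep an empty cluster's centre
        if new_centres == centres:
            break  # converged
        centres = new_centres
    return centres


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
centres = kmeans(points, initial_centres=[(0, 0), (10, 0)])
print(centres)
```

Because each centre is pulled toward the mean of the points assigned to it, dense regions tend to attract most centres, which is the behaviour the modified algorithm is said to counteract.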

Proceedings Article•10.1109/IJCNN.2004.1381036•
Sparse coding and NMF

Julian Eggert1, Edgar Körner1•
Honda1
25 Jul 2004
TL;DR: This work shows how to merge the concepts of non-negative factorization with sparsity conditions; the result is a multiplicative algorithm that is comparable in efficiency to standard NMF but can also yield sensible solutions in overcomplete cases.
Abstract: Non-negative matrix factorization (NMF) is a very efficient parameter-free method for decomposing multivariate data into strictly positive activations and basis vectors. However, the method is not suited for overcomplete representations, where usually sparse coding paradigms apply. We show how to merge the concepts of non-negative factorization with sparsity conditions. The result is a multiplicative algorithm that is comparable in efficiency to standard NMF, but that can be used to gain sensible solutions in the overcomplete cases. This is of interest e.g. for the case of learning and modeling of arrays of receptive fields arranged in a visual processing map, where an overcomplete representation is unavoidable.

327 citations

Speaker identification using mel frequency cepstral coefficients

Rashidul Hasan, Saifur Rahman
1 Jan 2004
TL;DR: This paper presents a security system based on speaker identification, in which Mel frequency cepstral coefficients (MFCCs) are used for feature extraction and vector quantization is used to minimize the amount of data to be handled.
Abstract: This paper presents a security system based on speaker identification. Mel frequency cepstral coefficients (MFCCs) have been used for feature extraction, and a vector quantization technique is used to minimize the amount of data to be handled.

326 citations

Journal Article•10.1109/TIP.2004.829779•
MAP estimation for hyperspectral image resolution enhancement using an auxiliary sensor

Russell C. Hardie1, Michael T. Eismann2, G.L. Wilson•
University of Dayton1, Air Force Research Laboratory2
01 Sep 2004-IEEE Transactions on Image Processing
TL;DR: A novel maximum a posteriori estimator for enhancing the spatial resolution of an image using co-registered high spatial-resolution imagery from an auxiliary sensor, focusing on the use of high-resolution panchromatic data to enhance hyperspectral imagery.
Abstract: This paper presents a novel maximum a posteriori estimator for enhancing the spatial resolution of an image using co-registered high spatial-resolution imagery from an auxiliary sensor. Here, we focus on the use of high-resolution panchromatic data to enhance hyperspectral imagery. However, the estimation framework developed allows for any number of spectral bands in the primary and auxiliary image. The proposed technique is suitable for applications where some correlation, either localized or global, exists between the auxiliary image and the image being enhanced. To exploit localized correlations, a spatially varying statistical model, based on vector quantization, is used. Another important aspect of the proposed algorithm is that it allows for the use of an accurate observation model relating the "true" scene with the low-resolution observations. Experimental results with hyperspectral data derived from the airborne visible-infrared imaging spectrometer are presented to demonstrate the efficacy of the proposed estimator.

324 citations

Patent•
Representation and retrieval of images using context vectors derived from image information elements

William R. Caid, Robert Hecht-Nielsen
14 Jun 2004
TL;DR: A prototypical set of feature vectors, or atoms, is derived from image feature vectors by vector quantization to form an atomic vocabulary; the vocabulary is used to define new images, and context vectors are trained on the proximity and co-occurrence of atoms so that images can be retrieved by meaning.
Abstract: Image features are generated by performing wavelet transformations at sample points on images stored in electronic form. Multiple wavelet transformations at a point are combined to form an image feature vector. A prototypical set of feature vectors, or atoms, is derived from the set of feature vectors to form an “atomic vocabulary.” The prototypical feature vectors are derived using a vector quantization method, e.g., using neural network self-organization techniques, in which a vector quantization network is also generated. The atomic vocabulary is used to define new images. Meaning is established between atoms in the atomic vocabulary. High-dimensional context vectors are assigned to each atom. The context vectors are then trained as a function of the proximity and co-occurrence of each atom to other atoms in the image. After training, the context vectors associated with the atoms that comprise an image are combined to form a summary vector for the image. Images are retrieved using a number of query methods, e.g., images, image portions, vocabulary atoms, index terms. The user's query is converted into a query context vector. A dot product is calculated between the query vector and the summary vectors to locate images having the closest meaning. The invention is also applicable to video or temporally related images, and can also be used in conjunction with other context vector data domains such as text or audio, thereby linking images to such data domains.

239 citations

Journal Article•10.1109/TIP.2004.826125•
An efficient and effective region-based image retrieval framework

Feng Jing, Mingjing Li1, Hong-Jiang Zhang1, Bo Zhang•
Microsoft1
01 May 2004-IEEE Transactions on Image Processing
TL;DR: An image retrieval framework is proposed that integrates a region-based representation that is efficient in storage and complexity with an effective on-line learning capability; a region weighting strategy optimally weights the regions and enables the system to self-improve.
Abstract: An image retrieval framework that integrates efficient region-based representation in terms of storage and complexity and effective on-line learning capability is proposed. The framework consists of methods for region-based image representation and comparison, indexing using modified inverted files, relevance feedback, and learning region weighting. By exploiting a vector quantization method, both compact and sparse (vector) region-based image representations are achieved. Using the compact representation, an indexing scheme similar to the inverted file technology and an image similarity measure based on Earth Mover's Distance are presented. Moreover, the vector representation facilitates a weighted query point movement algorithm and the compact representation enables a classification-based algorithm for relevance feedback. Based on users' feedback information, a region weighting strategy is also introduced to optimally weight the regions and enable the system to self-improve. Experimental results on a database of 10,000 general-purpose images demonstrate the efficiency and effectiveness of the proposed framework.

208 citations

Categorizing Nine Visual Classes using Local Appearance Descriptors

Jutta Willamowski, Damian Arregui, Gabriella Csurka, Christopher R. Dance, Lixin Fan 
1 Jan 2004
TL;DR: A thorough evaluation clearly demonstrates that the bag of keypoints method is robust to background clutter and produces good categorization accuracy even without exploiting geometric information.
Abstract: We present a novel method for generic visual categorization: the problem of identifying the object content of natural images while generalizing across variations inherent to the object class. This bag of keypoints method is based on vector quantization of affine invariant descriptors of image patches. We propose and compare two alternative implementations using different classifiers: Naive Bayes and SVM. The main advantages of the method are that it is simple, computationally efficient and intrinsically invariant. We present results for classifying nine semantic visual categories and comment on results obtained by Fergus et al. using a different method on the same data set. We obtain excellent results for multi-class categorization as well as for object detection. A thorough evaluation clearly demonstrates that our method is robust to background clutter and produces good categorization accuracy even without exploiting geometric information.

205 citations

Proceedings Article•10.1109/VETECF.2004.1400315•
Design and analysis of transmit-beamforming based on limited-rate feedback

Pengfei Xia1, Georgios B. Giannakis1•
University of Minnesota1
1 Dec 2004
TL;DR: This work first considers multi-antenna beamformed transmissions through independent and identically distributed (i.i.d.) Rayleigh fading channels, upper-bounds the rate distortion function of the vector source, and lower-bounds the operational rate distortion performance achieved by the generalized Lloyd's algorithm.
Abstract: We deal with the design and performance analysis of transmit-beamformers for multi-input multi-output (MIMO) systems, based on bandwidth-limited information that is fed back from the receiver to the transmitter. By casting the design of transmit-beamforming based on limited-rate feedback as an equivalent sphere vector quantization (SVQ) problem, we first consider multi-antenna beamformed transmissions through independent and identically distributed (i.i.d.) Rayleigh fading channels. We upper-bound the rate distortion function of the vector source, and also lower-bound the operational rate distortion performance achieved by the generalized Lloyd's algorithm. A simple and valuable relationship emerges between the theoretical distortion limit and the achievable performance, and the average signal to noise ratio (SNR) performance is accurately quantified. Finally, we study beamformer codebook designs for correlated Rayleigh fading channels, and derive a low-complexity codebook design that achieves near optimal performance.

179 citations

Journal Article•10.1109/TIP.2004.837557•
Robust image-adaptive data hiding using erasure and error correction

Kaushal Solanki1, N. Jacobsen1, Upamanyu Madhow1, B.S. Manjunath1, Shivkumar Chandrasekaran1 •
University of California, Santa Barbara1
01 Dec 2004-IEEE Transactions on Image Processing
TL;DR: This work proposes practical realizations of this prescription for data hiding in images, with a view to hiding large volumes of data with low perceptual degradation, and shows that scalar quantization-based hiding incurs approximately only a 2-dB penalty in terms of resilience to attack.
Abstract: Information-theoretic analyses for data hiding prescribe embedding the hidden data in the choice of quantizer for the host data. We propose practical realizations of this prescription for data hiding in images, with a view to hiding large volumes of data with low perceptual degradation. The hidden data can be recovered reliably under attacks, such as compression and limited amounts of image tampering and image resizing. The three main findings are as follows. 1) In order to limit perceivable distortion while hiding large amounts of data, hiding schemes must use image-adaptive criteria in addition to statistical criteria based on information theory. 2) The use of local criteria to choose where to hide data can potentially cause desynchronization of the encoder and decoder. This synchronization problem is solved by the use of powerful, but simple-to-implement, erasures and errors correcting codes, which also provide robustness against a variety of attacks. 3) For simplicity, scalar quantization-based hiding is employed, even though information-theoretic guidelines prescribe vector quantization-based methods. However, an information-theoretic analysis for an idealized model is provided to show that scalar quantization-based hiding incurs approximately only a 2-dB penalty in terms of resilience to attack.

155 citations

Journal Article•10.1016/J.CVIU.2003.10.015•
Image retrieval using color histograms generated by Gauss mixture vector quantization

Sangoh Jeong1, Chee Sun Won2, Robert M. Gray1•
Stanford University1, Dongguk University2
01 Apr 2004-Computer Vision and Image Understanding
TL;DR: Results show that the histograms made by GMVQ with a penalized log-likelihood (LL) distortion yield better retrieval performance for color images than the conventional methods of uniform quantization and VQ with squared error distortion.
Patent•
Recursive reduction of channel state feedback

Qinghua Li1, Xintian E. Lin2•
Intel1, Apple Inc.2
8 Sep 2004
TL;DR: Feedback bandwidth in a closed-loop MIMO system may be reduced by Householder transformations and vector quantization using codebooks.
Abstract: Feedback bandwidth may be reduced in a closed loop MIMO system by Householder transformations and vector quantization using codebooks.
Journal Article•10.1016/J.PATREC.2004.08.017•
Fragile image watermarking using a gradient image for improved localization and security

Shan Suthaharan1•
University of North Carolina at Greensboro1
01 Dec 2004-Pattern Recognition Letters
TL;DR: A fragile watermarking algorithm for image authentication and tamper detection is proposed that provides superior localization with greater security against many attacks including vector quantization attack.
Journal Article•10.1109/TIT.2004.831832•
Network vector quantization

M. Fleming1, Qian Zhao2, Michelle Effros1•
California Institute of Technology1, Oracle Corporation2
01 Aug 2004-IEEE Transactions on Information Theory
TL;DR: An algorithm for designing locally optimal vector quantizers for general networks is presented; it includes existing solutions such as multiresolution and multiple description codes as special cases and provides solutions to previously unsolved examples.
Abstract: We present an algorithm for designing locally optimal vector quantizers for general networks. We discuss the algorithm's implementation and compare the performance of the resulting "network vector quantizers" to traditional vector quantizers (VQs) and to rate-distortion (R-D) bounds where available. While some special cases of network codes (e.g., multiresolution (MR) and multiple description (MD) codes) have been studied in the literature, we here present a unifying approach that both includes these existing solutions as special cases and provides solutions to previously unsolved examples.
Journal Article•10.1109/TIT.2004.834746•
Quantized feedback information in orthogonal space-time block coding

G. Jongren, Mikael Skoglund1•
Royal Institute of Technology1
01 Oct 2004-IEEE Transactions on Information Theory
TL;DR: This work considers how the presence of quantized channel information obtained from a feedback link may be utilized for determining a transmit weighting matrix that improves the performance of a predetermined orthogonal space-time block (OSTB) code.
Abstract: This work considers how the presence of quantized channel information obtained from a feedback link may be utilized for determining a transmit weighting matrix that improves the performance of a predetermined orthogonal space-time block (OSTB) code. To reduce the effects of feedback delay, quantization errors and feedback channel bit errors, methods based on vector quantization for noisy channels are used in the design of the feedback link. The resulting transmission scheme and feedback link take the imperfect nature of the channel information into account while combining the benefits of conventional beamforming with those provided by OSTB coding.
Journal Article•10.1016/J.NEUCOM.2004.01.008•
A general framework for unsupervised processing of structured data

Barbara Hammer1, Alessio Micheli2, Alessandro Sperduti3, Marc Strickert1•
University of Osnabrück1, University of Pisa2, University of Padua3
1 Mar 2004
TL;DR: The authors define a general recursive dynamic that processes complex data structures by recursively computing internal representations for the given context; within this framework, Hebbian learning can be interpreted as an approximation of a stochastic gradient descent on general cost functions.
Abstract: Self-organization constitutes an important paradigm in machine learning with successful applications e.g. in data- and web-mining. Most approaches, however, have been proposed for processing data contained in a fixed and finite dimensional vector space. In this article, we will focus on extensions to more general data structures like sequences and tree structures. Various modifications of the standard self-organizing map (SOM) to sequences or tree structures have been proposed in the literature, some of which are the temporal Kohonen map, the recursive SOM, and SOM for structured data. These methods enhance the standard SOM by utilizing recursive connections. We define a general recursive dynamic in this article which provides recursive processing of complex data structures by recursive computation of internal representations for the given context. The above mentioned mechanisms of SOMs for structures are special cases of the proposed general dynamic. Furthermore, the dynamic covers the supervised case of recurrent and recursive networks. The general framework offers a uniform notation for training mechanisms such as Hebbian learning. Moreover, the transfer of computational alternatives such as vector quantization or the neural gas algorithm to structure processing networks can be easily achieved. One can formulate general cost functions corresponding to vector quantization, neural gas, and a modification of SOM. The cost functions can be compared to Hebbian learning which can be interpreted as an approximation of a stochastic gradient descent. For comparison, we derive the exact gradients for general cost functions.
Journal Article•10.1016/J.PATREC.2004.04.003•
Information hiding based on search-order coding for VQ indices

Chin-Chen Chang1, Guei-Mei Chen1, Min-Hui Lin2•
National Chung Cheng University1, Providence College2
01 Aug 2004-Pattern Recognition Letters
TL;DR: A steganographic scheme based on the search-order coding (SOC) compression method for vector quantization (VQ) indices embeds secret data into the compression codes of the host image so that interceptors will not notice the existence of the secret data.
Patent•
Systems and methods for image pattern recognition

Ole Eichhorn, Dirk G. Soenksen1•
AmeriCorps VISTA1
13 Feb 2004
TL;DR: A vocabulary of vectors is built by segmenting images into kernels and creating a vector for each kernel; images are encoded as a vector index file pointing into the vocabulary, from which an image can be reconstructed by looking up the stored vectors.
Abstract: Systems and methods for image pattern recognition comprise digital image capture and encoding using vector quantization ('VQ') of the image. A vocabulary of vectors is built by segmenting images into kernels and creating vectors corresponding to each kernel. Images are encoded by creating a vector index file having indices that point to the vectors stored in the vocabulary. The vector index file can be used to reconstruct an image by looking up vectors stored in the vocabulary. Pattern recognition of candidate regions of images can be accomplished by correlating image vectors to a pre-trained vocabulary of vector sets comprising vectors that correlate with particular image characteristics. In virtual microscopy, the systems and methods are suitable for rare-event finding, such as detection of micrometastasis clusters, tissue identification, such as locating regions of analysis for immunohistochemical assays, and rapid screening of tissue samples, such as histology sections arranged as tissue microarrays (TMAs).
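
The encode and reconstruct steps described in the abstract above can be sketched as follows. This is a toy illustration under stated assumptions, not the patented system: the image is a small grayscale list of rows whose dimensions are multiples of the kernel size, the vocabulary is given rather than trained, and the function names `split_into_kernels`, `encode` and `decode` are hypothetical.

```python
def split_into_kernels(image, size):
    """Cut a 2-D image (list of rows) into size x size kernels, flattened row-major."""
    kernels = []
    for r in range(0, len(image), size):
        for c in range(0, len(image[0]), size):
            kernels.append([image[r + i][c + j]
                            for i in range(size) for j in range(size)])
    return kernels


def encode(image, vocabulary, size):
    """Build the index list: one vocabulary index per kernel (nearest vector)."""
    def nearest(vector):
        return min(range(len(vocabulary)),
                   key=lambda k: sum((a - b) ** 2
                                     for a, b in zip(vector, vocabulary[k])))
    return [nearest(kernel) for kernel in split_into_kernels(image, size)]


def decode(indices, vocabulary, size, width):
    """Reconstruct an image by looking up each index in the vocabulary."""
    per_row = width // size
    height = (len(indices) // per_row) * size
    image = [[0] * width for _ in range(height)]
    for n, index in enumerate(indices):
        r0, c0 = (n // per_row) * size, (n % per_row) * size
        for i in range(size):
            for j in range(size):
                image[r0 + i][c0 + j] = vocabulary[index][i * size + j]
    return image


vocabulary = [[0, 0, 0, 0], [9, 9, 9, 9]]   # two 2x2 kernels: dark and bright
image = [[1, 0, 9, 8],
         [0, 1, 9, 9]]
indices = encode(image, vocabulary, size=2)
reconstructed = decode(indices, vocabulary, size=2, width=4)
print(indices, reconstructed)
```

Only the index list needs to be stored per image, and the reconstruction is lossy: each kernel is replaced by its closest vocabulary vector.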
Proceedings Article•10.1109/PIMRC.2004.1373811•
Channel feedback quantization methods for MISO and MIMO systems

June Chul Roh1, Bhaskar D. Rao1•
University of California, Berkeley1
5 Sep 2004
TL;DR: Results show that the quantization bit allocation over multiple spatial channels has a critical effect on the performance, and that the optimum bit allocation depends on the operating transmit power of the system.
Abstract: We investigate the quantization of multiple antenna channel to feed back through a low-rate feedback channel. Specifically, for multiple-input single-output (MISO) systems, we propose a new design criterion and the corresponding design algorithm for quantization of the random beamforming vector. For multiple-input multiple-output (MIMO) channels, which have multiple orthonormal vectors as channel spatial information for quantization, a matrix factorization method is proposed which provides a way to exploit the geometrical structure of orthonormality while quantizing the spatial information matrix. Results show that the quantization bit allocation over multiple spatial channels has a critical effect on the performance, and that the optimum bit allocation depends on the operating transmit power of the system.
Patent•
Interpolation in channel state feedback

Qinghua Li1, Xintian E. Lin2•
Intel1, Apple Inc.2
10 Sep 2004
TL;DR: A column of a beamforming matrix is quantized using a codebook, a Householder reflection is applied to the beamforming matrix to reduce its dimensionality, and both steps are repeated recursively on the dimension-reduced matrix; this is done for a subset of OFDM carriers only.
Abstract: Feedback bandwidth may be reduced in a closed loop MIMO system by Householder transformations, vector quantization using codebooks, and down-sampling in the frequency domain. A column of a beamforming matrix is quantized using a codebook, a Householder reflection is performed on the beamforming matrix to reduce the dimensionality of the beamforming matrix, and the quantizing and performing of Householder reflection on the previously dimensionality reduced beamforming matrix is recursively repeated to obtain a further reduction of dimensionality of the beamforming matrix. These actions are performed for a subset of orthogonal frequency division multiplexing (OFDM) carriers, and quantized column vectors for the subset of OFDM carriers are transmitted.
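
The dimensionality reduction by Householder reflection mentioned in the abstract above can be illustrated with a small real-valued sketch. It is a simplification made for illustration only: actual beamforming matrices are complex, the codebook quantization step is omitted, and the helper names `householder_to_e1` and `matvec` are hypothetical. The sketch shows one step: a reflection that maps a unit column onto the first basis vector leaves any orthogonal column with a zero first entry, so the remaining problem has one dimension fewer.

```python
def householder_to_e1(v):
    """Reflection matrix H with H v = e1, for a real unit vector v."""
    n = len(v)
    w = [v[i] - (1.0 if i == 0 else 0.0) for i in range(n)]  # w = v - e1
    norm_sq = sum(x * x for x in w)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    if norm_sq < 1e-12:
        return identity  # v is already e1, nothing to reflect
    # H = I - 2 w w^T / (w^T w)
    return [[identity[i][j] - 2.0 * w[i] * w[j] / norm_sq for j in range(n)]
            for i in range(n)]


def matvec(matrix, x):
    """Matrix-vector product for lists of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


# Two orthonormal columns of a toy 3 x 2 "beamforming" matrix.
first_column = [0.6, 0.8, 0.0]
second_column = [-0.8, 0.6, 0.0]

H = householder_to_e1(first_column)
reflected_first = matvec(H, first_column)    # close to [1, 0, 0]
reflected_second = matvec(H, second_column)  # first entry close to 0
reduced_second = reflected_second[1:]        # the remaining 2-D unit vector
print(reflected_first, reduced_second)
```

In a recursive scheme, the same step would then be applied to the reduced column, which is one dimension smaller than the original.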
Journal Article•10.1051/0004-6361:20040141•
Automated clustering algorithms for classification of astronomical objects

Yanxia Zhang, Yong-Heng Zhao
01 Aug 2004-Astronomy and Astrophysics
TL;DR: This study concludes that in the situation of fewer features, LVQ and SLP show better performance, in contrast to SVM, which shows better performance when considering more features.
Abstract: Data mining is an important and challenging problem for the efficient analysis of large astronomical databases and will become even more important with the development of the Global Virtual Observatory. In this study, learning vector quantization (LVQ), single-layer perceptron (SLP) and support vector machines (SVM) were used for multi-wavelength data classification. A feature selection technique was used to evaluate the significance of the considered features for the results of classification. We conclude that in the situation of fewer features, LVQ and SLP show better performance. In contrast, SVM shows better performance when considering more features. The focus of the automatic classification is on the development of an efficient feature-based classifier. The classifiers trained by these methods can be used to preselect AGN candidates.
Journal Article•10.1109/TCSVT.2004.828329•
Dictionary design for matching pursuit and application to motion-compensated video coding

Philippe Schmid-Saugeon, A. Zakhor
01 Jun 2004-IEEE Transactions on Circuits and Systems for Video Technology
TL;DR: This work presents a new algorithm for matching pursuit (MP) dictionary design that uses existing vector-quantization design techniques and an inner product-based distortion measure to learn functions from a set of training patterns.
Abstract: We present a new algorithm for matching pursuit (MP) dictionary design. This technique uses existing vector-quantization design techniques and an inner product-based distortion measure to learn functions from a set of training patterns. While this scheme can be applied to many MP applications, we focus on motion-compensated video coding. Given a set of training sequences, data are extracted from the high-energy packets of the motion-compensated frames. Dictionaries with different regions of support are trained, pruned, and finally evaluated on MPEG test sequences. We find that for high bit-rate QCIF sequences we can achieve improvements of up to 0.66 dB with respect to conventional MP with separable Gabor functions.
Proceedings Article•10.1109/CVPR.2004.1315273•
Hidden semantic concept discovery in region based image retrieval

Ruofei Zhang1, Zhongfei Zhang1•
Binghamton University1
27 Jun 2004
TL;DR: This work addresses content based image retrieval (CBIR), focusing on developing a hidden semantic concept discovery methodology to address effective semantics-intensive image retrieval.
Abstract: This work addresses content-based image retrieval (CBIR), focusing on developing a hidden semantic concept discovery methodology to address effective semantics-intensive image retrieval. In our approach, each image in the database is segmented into regions associated with homogeneous color, texture, and shape features. By exploiting regional statistical information in each image and employing a vector quantization method, a uniform and sparse region-based representation is achieved. With this representation a probabilistic model based on statistical-hidden-class assumptions of the image database is obtained, to which the expectation-maximization (EM) technique is applied to analyze semantic concepts hidden in the database. An elaborated retrieval algorithm is designed to support the probabilistic model. The semantic similarity is measured through integrating the posterior probabilities of the transformed query image, as well as a constructed negative example, to the discovered semantic concepts. The proposed approach has a solid statistical foundation, and the experimental evaluations on a database of 10,000 general-purpose images demonstrate its promise and effectiveness.
Journal Article•10.1109/TIP.2004.833107•
Design of vector quantizer for image compression using self-organizing feature map and surface fitting

Arijit Laha, Nikhil R. Pal, Bhabatosh Chanda
01 Oct 2004-IEEE Transactions on Image Processing
TL;DR: The proposed scheme can produce reconstructed images of good quality while achieving compression at low bit rates and two indices for quantitative assessment of the psychovisual quality (blocking effect) of the reconstructed image are proposed.
Abstract: We propose a new scheme of designing a vector quantizer for image compression. First, a set of codevectors is generated using the self-organizing feature map algorithm. Then, the set of blocks associated with each code vector is modeled by a cubic surface for better perceptual fidelity of the reconstructed images. Mean-removed vectors from a set of training images are used for the construction of a generic codebook. Further, Huffman coding of the indices generated by the encoder and the difference-coded mean values of the blocks are used to achieve a better compression ratio. We propose two indices for quantitative assessment of the psychovisual quality (blocking effect) of the reconstructed image. Our experiments on several training and test images demonstrate that the proposed scheme can produce reconstructed images of good quality while achieving compression at low bit rates.
Patent•10.1121/1.2097083•
Sound source separation using convolutional mixing and a priori sound source knowledge

Alejandro Acero1, Steven J. Altschuler1, Lani Fang Wu1•
Microsoft1
18 Nov 2004-Journal of the Acoustical Society of America
TL;DR: Sound source separation using convolutional mixing independent component analysis based on a priori knowledge of the target sound source is disclosed; a vector quantization codebook of vectors representing typical sound source patterns is used to determine whether proper separation has occurred.
Abstract: Sound source separation, without permutation, using convolutional mixing independent component analysis based on a priori knowledge of the target sound source is disclosed. The target sound source can be a human speaker. The reconstruction filters used in the sound source separation take into account the a priori knowledge of the target sound source, such as an estimate of the spectra of the target sound source. The filters may be generally constructed based on a speech recognition system. Matching the words of the dictionary of the speech recognition system to a reconstructed signal indicates whether proper separation has occurred. More specifically, the filters may be constructed based on a vector quantization codebook of vectors representing typical sound source patterns. Matching the vectors of the codebook to a reconstructed signal indicates whether proper separation has occurred. The vectors may be linear prediction vectors, among others.
Proceedings Article•10.1109/ICME.2004.1394379•
Efficient motion-vector-based video search using query by clip

[...]

Chuan-Yu Cho1, Ya-Ting Chuang1, Pei-Chi Chu1, Shih-Yu Huang, Jia-Shung Wang •
National Tsing Hua University1
30 Jun 2004
TL;DR: A technique to filter and reconstruct noisy motion vectors to improve the fidelity of the motion activity and a new motion descriptor, named representative spectrum, extracted by a vector quantization expressing the spatial-temporal distribution of motion activity, is introduced for video indexing.
Abstract: In this paper, a simple and efficient motion-vector-based search engine supporting query by clip is presented for indexing and retrieving videos. We propose a technique to filter and reconstruct noisy motion vectors to improve the fidelity of the motion activity. In addition to the motion features of intensity and direction, a new motion descriptor, named representative spectrum, extracted by a vector quantization expressing the spatial-temporal distribution of motion activity, is introduced for video indexing. A dynamic programming scheme is also used to measure the similarity of two videos based on the concept of common subsequence. Furthermore, a 2-step search for similar videos is employed to reduce the computational complexity. The experimental results indicate that the proposed search engine performs well and that the obtained recall and precision values are high.
Proceedings Article•10.1109/ICASSP.2004.1326032•
Low-complexity multi-rate lattice vector quantization with application to wideband TCX speech coding at 32 kbit/s

[...]

Stéphane Ragot1, B. Bessette1, Roch Lefebvre1•
Université de Sherbrooke1
17 May 2004
TL;DR: A practical multi-rate quantization system based on Voronoi extension and derived from the lattice RE_8.
Abstract: We present a new method, called Voronoi extension, for the design of low-complexity multi-rate lattice vector quantization (VQ). With this technique, lattice codebooks of arbitrarily large bit rates can be generated algorithmically and the problem of lattice codebook overload can be bypassed. We describe a practical multi-rate quantization system based on Voronoi extension and derived from the lattice RE_8. This system is applied to the TCX coding model using pitch prediction, so as to extend AMR-WB speech coding at high bit rates (in particular 32 kbit/s).
Proceedings Article•10.21437/INTERSPEECH.2004-520•
Real-time speaker identification.

[...]

Pasi Fränti, Evgeny Karpov, Tomi Kinnunen
4 Oct 2004
TL;DR: Vector quantization (VQ) based speaker identification is sped up by pre-quantizing the test sequence prior to matching, which reduces the number of test vectors, and by pruning out unlikely speakers during the identification process, which reduces the number of speakers.
Abstract: In speaker identification, most of the computation originates from distance or likelihood computations between the feature vectors of the unknown speaker and the models in the database. The identification time depends on the number of feature vectors, their dimensionality, the complexity of the speaker models and the number of speakers. In this paper, we focus on optimizing vector quantization (VQ) based speaker identification. We reduce the number of test vectors by pre-quantizing the test sequence prior to matching, and the number of speakers by pruning out unlikely speakers during the identification process. The best variants are then generalized to Gaussian mixture model (GMM) based modeling as well. We obtain a speed-up factor of 16:1 with the VQ-based system, and 34:1 with the GMM-based system, with a minor degradation in the identification error rate.
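The two speed-ups named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: averaging consecutive vectors stands in for pre-quantization, and dropping the single worst-scoring speaker at fixed intervals stands in for pruning. The function names and the `block` and `prune_every` parameters are hypothetical and chosen only for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def min_distortion(vec, codebook):
    """Distortion of vec against its nearest codevector in one speaker's codebook."""
    return min(sq_dist(vec, c) for c in codebook)

def prequantize(vectors, block):
    """Assumed pre-quantizer: replace each run of `block` vectors by their mean."""
    out = []
    for i in range(0, len(vectors), block):
        chunk = vectors[i:i + block]
        out.append([sum(col) / len(chunk) for col in zip(*chunk)])
    return out

def identify(test_vectors, codebooks, block=2, prune_every=2):
    """Return the speaker whose codebook gives the lowest accumulated distortion."""
    vectors = prequantize(test_vectors, block)   # fewer test vectors to match
    scores = {spk: 0.0 for spk in codebooks}     # speakers still in the race
    for n, vec in enumerate(vectors, start=1):
        for spk in scores:
            scores[spk] += min_distortion(vec, codebooks[spk])
        # Assumed pruning rule: periodically drop the worst remaining speaker.
        if n % prune_every == 0 and len(scores) > 1:
            del scores[max(scores, key=scores.get)]
    return min(scores, key=scores.get)

codebooks = {"a": [[0, 0], [1, 1]], "b": [[5, 5], [6, 6]], "c": [[10, 10]]}
test = [[0.1, 0.0], [0.9, 1.1], [0.0, 0.2], [1.0, 0.8]]
print(identify(test, codebooks))
```

Both steps trade a small amount of accuracy for fewer distance computations, which is the trade-off the abstract reports.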
Patent•
Method and device for gain quantization in variable bit rate wideband speech coding

[...]

Milan Jelinek1, Redwan Salami1•
Nokia1
12 Mar 2004
TL;DR: A gain quantization method and device for use in a technique for coding a sampled sound signal that is processed, during coding, in successive frames of L samples, where each frame is divided into a number of subframes and each subframe comprises a number N of samples, with N < L.
Abstract: The present invention relates to a gain quantization method and device for implementation in a technique for coding a sampled sound signal processed, during coding, by successive frames of L samples, wherein each frame is divided into a number of subframes and each subframe comprises a number N of samples, where N < L [...]
Proceedings Article•10.1109/SSP.2003.1289429•
Nested quantization and Slepian-Wolf coding: a Wyner-Ziv coding paradigm for i.i.d. sources

[...]

Zixiang Xiong1, A.D. Liveris1, Samuel Cheng1, Zhixin Liu1•
Texas A&M University1
4 May 2004
TL;DR: In this article, a new paradigm for Wyner-Ziv coding of i.i.d. sources is proposed that consists of nested quantization and Slepian-Wolf coding, where the former plays the role of quantization with side information (at the decoder) and the latter lossless coding with side information.
Abstract: A new paradigm for Wyner-Ziv coding of i.i.d. sources is proposed that consists of nested quantization and Slepian-Wolf coding. The former plays the role of quantization with side information (at the decoder) and the latter lossless coding with side information. The proposed Slepian-Wolf coded nested quantization (SWC-NQ) framework generalizes the classic source coding approach of quantization and lossless/entropy coding. The main thrust is to treat Wyner-Ziv coding as a source-channel coding problem in which the side information is taken into account in the channel coding component via binning. For Gaussian sources with MSE measure, assuming nested lattice quantization with ideal Slepian-Wolf coding and high rate, we establish system performance bounds of SWC-NQ similar to those in classic source coding, showing that 1-D/2-D nested lattice quantization performs 1.53/1.36 dB worse than the Wyner-Ziv distortion-rate function D_WZ(R). Using nested lattices in higher dimensions or nested trellis-coded quantization (TCQ) could possibly approach D_WZ(R) even further. We implement 1-D and 2-D nested lattice quantization, together with irregular low-density parity-check (LDPC) codes for Slepian-Wolf coding, obtaining performance close to the corresponding theoretical limits.
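As a rough illustration of the 1-D nested quantization idea in the abstract above, the sketch below quantizes a source sample on a fine grid and transmits only its coset index relative to a coarser grid; the decoder resolves the ambiguity with its side information. It is a simplified sketch under stated assumptions, not the paper's system: the Slepian-Wolf coding stage and any optimal estimation at the decoder are omitted, and the names `encode`, `decode`, `q` (fine step) and `n` (nesting ratio) are hypothetical.

```python
import math

def nearest_int(v):
    """Round to the nearest integer, with halves rounded up."""
    return math.floor(v + 0.5)

def encode(x, q, n):
    """Fine-quantize x with step q and keep only the coset index in 0..n-1."""
    return nearest_int(x / q) % n

def decode(coset, y, q, n):
    """Pick the point of the received coset that lies closest to side information y."""
    m = nearest_int((y / q - coset) / n)   # which coarse cell y falls in
    return q * (coset + n * m)

# Source sample x, and side information y available only at the decoder.
x, y, q, n = 3.2, 3.0, 0.5, 8
index = encode(x, q, n)        # only log2(n) = 3 bits are sent
x_hat = decode(index, y, q, n)
print(index, x_hat)
```

Decoding recovers the fine-grid point nearest to x as long as y lies within half a coarse step (n * q / 2) of that point; larger deviations cause the decoder to pick a point in the wrong coarse cell.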