Sparse coding based features for speech units classification.

doi:10.21437/INTERSPEECH.2015-246

Proceedings Article•10.21437/INTERSPEECH.2015-246

Pulkit Sharma, +3 more

- 06 Sep 2015

- pp 712-715

12

TL;DR: The training data for each class is partitioned into multiple clusters and a principal component analysis (PCA) based dictionary is learnt for each cluster; the coefficients corresponding to the middle principal components are observed to discriminate effectively among different speech units.

Abstract: In this work, we propose sparse representation based features for speech unit classification tasks. To effectively capture the variations in a speech unit, the proposed method employs multiple class-specific dictionaries. The training data belonging to each class is partitioned into multiple clusters, and a principal component analysis (PCA) based dictionary is learnt for each cluster. It has been observed that the coefficients corresponding to the middle principal components can effectively discriminate among different speech units. Exploiting this observation, we propose a transformation function known as weighted decomposition (WD) of principal components, which emphasizes the discriminative information present in the PCA-based dictionary. Both raw speech samples and mel frequency cepstral coefficients (MFCC) are used as initial representations for feature extraction. For comparison, popular dictionary learning techniques such as K-singular value decomposition (KSVD), simultaneous codeword optimization (SimCO) and greedy adaptive dictionary (GAD) are also employed in the proposed framework. The effectiveness of the proposed features is demonstrated using continuous density hidden Markov model (CDHMM) based classifiers for (i) classification of isolated utterances of the E-set of the English alphabet, (ii) classification of consonant-vowel (CV) segments in the Hindi language and (iii) classification of phonemes from the TIMIT phonetic corpus.
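The pipeline described in the abstract (cluster each class, learn one PCA dictionary per cluster, keep the middle-component coefficients) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the plain k-means routine, the synthetic data and the `lo`/`hi` band are all assumptions made for illustration. The coefficients are obtained by simple projection onto the orthonormal PCA atoms, and the band selection is only a crude stand-in for the weighted decomposition (WD), whose exact form the abstract does not give.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (illustrative); returns a cluster index per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def pca_dictionary(X):
    """Return the cluster mean and a dictionary whose columns are principal directions."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt.T

def learn_dictionaries(data_by_class, n_clusters=2):
    """One PCA dictionary per cluster of each class: list of (label, mean, D)."""
    dicts = []
    for label, X in data_by_class.items():
        assign = kmeans(X, n_clusters)
        for j in range(n_clusters):
            members = X[assign == j]
            if len(members) > 1:  # skip degenerate clusters
                dicts.append((label, *pca_dictionary(members)))
    return dicts

def features(x, dicts, lo, hi):
    """Concatenate the coefficients lo..hi-1 (the assumed 'middle' band) from every dictionary."""
    return np.concatenate([(D.T @ (x - mean))[lo:hi] for _, mean, D in dicts])

# Synthetic stand-in for frame-level features (assumption: 8-dimensional vectors,
# two speech-unit classes, each made of two well-separated variants).
rng = np.random.default_rng(0)

def blobs(c1, c2):
    return np.vstack([rng.normal(c1, 1.0, (20, 8)), rng.normal(c2, 1.0, (20, 8))])

data = {"unit_a": blobs(0.0, 5.0), "unit_b": blobs(-5.0, 10.0)}
dicts = learn_dictionaries(data, n_clusters=2)
feat = features(data["unit_a"][0], dicts, lo=2, hi=6)
print(feat.shape)
```

In a setup like the one the abstract describes, the resulting feature vectors would then be passed to a classifier such as a CDHMM; that stage is omitted here.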

Citations

Journal Article•10.1016/J.PATREC.2016.04.014

Greedy dictionary learning for kernel sparse representation based classifier

Vinayak Abrol, +2 more

- 15 Jul 2016

- Pattern Recognition Letters

TL;DR: Compared to existing state-of-the-art methods, the proposed method has much lower computational complexity but performs similarly on various pattern classification tasks.

21

Journal Article•10.1016/J.SPECOM.2015.06.001

Voiced/nonvoiced detection in compressively sensed speech signals

Vinayak Abrol, +2 more

- 01 Sep 2015

- Speech Communication

TL;DR: The proposed unsupervised voiced/nonvoiced (V/NV) detection method exploits the fact that there is significant glottal activity during the production of voiced speech but not during nonvoiced speech, and the results provide compelling evidence of the effectiveness of the sparse feature vector for V/NV detection.

21

Journal Article•10.1016/J.CSL.2017.08.004

Sparse coding based features for speech units classification

Pulkit Sharma, +3 more

- 01 Jan 2018

- Computer Speech & Language

TL;DR: Both raw speech samples and mel frequency cepstral coefficients are used as an initial representation for feature extraction and a transformation function known as weighted decomposition (WD) of principal components is used to emphasize the discriminative information present in the PCA-based dictionary.

17

Journal Article•10.1016/J.SPECOM.2016.09.004

Greedy double sparse dictionary learning for sparse representation of speech signals

Vinayak Abrol, +2 more

- 01 Dec 2016

- Speech Communication

TL;DR: Proposes a greedy double sparse (DS) dictionary learning algorithm for speech signals, in which the dictionary is the product of a predefined base dictionary and a sparse matrix, and shows that the dictionary can be learned efficiently in the coefficient domain rather than the signal domain.

14

Journal Article•10.1016/J.CSL.2018.05.003

Reducing footprint of unit selection based text-to-speech system using compressed sensing and sparse representation

Pulkit Sharma, +3 more

- 01 Nov 2018

- Computer Speech & Language

TL;DR: Experimental studies on two Indian languages suggest that compressed sensing and sparse representation (CS/SR) based footprint reduction methods can be used as an alternative to the existing compression methods employed in USS systems.

9

Related Papers (5)

Sparse coding based features for speech units classification

Voiced/nonvoiced detection in compressively sensed speech signals

Speech enhancement using compressed sensing.

Fast Dictionary Learning for Sparse Representations of Speech Signals

Dictionary Learning