Robust Features for Speech Recognition using Temporal Filtering Technique in the Presence of Impulsive Noise
TL;DR: A robust feature extractor, dubbed Modified Function Cepstral Coefficients (MODFCC), based on a gammachirp filterbank and on Relative Spectral (RASTA) and Autoregressive Moving-Average (ARMA) filters, is introduced to improve the robustness of speech recognition systems in additive noise and real-time reverberant environments.
Abstract: In this paper we introduce a robust feature extractor, dubbed Modified Function Cepstral Coefficients (MODFCC), based on a gammachirp filterbank and on Relative Spectral (RASTA) and Autoregressive Moving-Average (ARMA) filters. The goal of this work is to improve the robustness of speech recognition systems in additive noise and real-time reverberant environments. In speech recognition systems, Mel-Frequency Cepstral Coefficients (MFCC) and their RASTA- and ARMA-filtered variants (RASTA-MFCC and ARMA-MFCC) are the three main techniques used. The proposed method introduces some modifications to the original MFCC method. The effectiveness of the proposed changes to MFCC was tested and compared against the original RASTA-MFCC and ARMA-MFCC features. Prosodic features such as jitter and shimmer are added to the baseline spectral features. The above-mentioned techniques were tested with impulsive signals under various noisy conditions within the AURORA databases. Index Terms: Auditory filter, impulsive noise, MFCC, RASTA filter, ARMA filter, HMM/GMM.
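To make the temporal filtering step more concrete, the sketch below applies the classic RASTA band-pass filter to the time trajectory of each log-spectral band. It is only an illustration, not the MODFCC pipeline itself: the abstract does not give the filter coefficients used in this work, so the transfer function (with its 0.98 pole) is assumed to be the commonly cited RASTA form, and the function names `rasta_filter` and `rasta_filter_bands` are hypothetical.

```python
from typing import List

# Assumed coefficients: the commonly cited RASTA band-pass filter,
#   H(z) = 0.1 * (2 + z^-1 - z^-3 - 2 z^-4) / (1 - 0.98 z^-1),
# written in causal form (the pure 4-frame advance z^4 is dropped).
RASTA_NUMERATOR = [0.2, 0.1, 0.0, -0.1, -0.2]
RASTA_POLE = 0.98


def rasta_filter(trajectory: List[float], pole: float = RASTA_POLE) -> List[float]:
    """Filter one log-spectral (or cepstral) trajectory along the time axis."""
    output: List[float] = []
    previous = 0.0  # recursive state y[t-1], starting from zero
    for t in range(len(trajectory)):
        # FIR part: weighted sum of the current and the four previous frames.
        fir = 0.0
        for k, b in enumerate(RASTA_NUMERATOR):
            if t - k >= 0:
                fir += b * trajectory[t - k]
        # IIR part: single real pole that sets the low cut-off frequency.
        previous = pole * previous + fir
        output.append(previous)
    return output


def rasta_filter_bands(frames: List[List[float]]) -> List[List[float]]:
    """Apply the filter independently to every band of a [time][band] matrix."""
    if not frames:
        return []
    n_bands = len(frames[0])
    filtered = [rasta_filter([frame[b] for frame in frames]) for b in range(n_bands)]
    return [[filtered[b][t] for b in range(n_bands)] for t in range(len(frames))]
```

Because the numerator coefficients sum to zero, a constant offset in the log-spectral domain (for example, a fixed channel response) is suppressed once the start-up transient has decayed, which is the property that makes this kind of filter attractive for convolutional distortion.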