Journal Article · DOI: 10.1023/A:1023410018862
Efficient Noise Robust Feature Extraction Algorithms for Distributed Speech Recognition (DSR) Systems
16
TL;DR: Two front-end processing techniques for noise-robust speech recognition are presented and compared. They combine different forms of frame attenuation, an improved spectral subtraction based on minimum statistics, and a mel-cepstrum feature extraction procedure.
Abstract: The development of robust speech recognition systems that maintain a high level of recognition accuracy in difficult and dynamically varying acoustic environments is becoming increasingly important as speech recognition technology becomes a more integral part of mobile applications. In a distributed speech recognition (DSR) architecture, the recogniser's front-end is located in the terminal and is connected over a data network to a remote back-end recognition server. The terminal performs the feature parameter extraction, that is, the front-end of the speech recognition system, and the resulting features are transmitted over a data channel to the remote back-end recogniser. DSR provides particular benefits for mobile-device applications, such as improved recognition performance compared to using the voice channel, and ubiquitous access from different networks with a guaranteed level of recognition performance. A feature extraction algorithm integrated into a DSR system is required to operate in real time and at the lowest possible computational cost.
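The TL;DR mentions spectral subtraction based on minimum statistics as one building block of the front-end. The following is a minimal sketch of the general idea only, not the algorithm from the paper: the noise power in each frequency bin is approximated by the minimum of a recursively smoothed power spectrum over a short sliding window, and that estimate is subtracted from each frame with a spectral floor. The function names and the parameters `alpha`, `window`, `oversub`, and `floor` are illustrative assumptions, and the bias compensation used in full minimum-statistics estimators is deliberately omitted.

```python
from collections import deque


def smooth_power(frames, alpha=0.9):
    """First-order recursive smoothing of per-bin power across frames."""
    smoothed = []
    prev = None
    for frame in frames:
        if prev is None:
            cur = list(frame)
        else:
            cur = [alpha * p + (1.0 - alpha) * x for p, x in zip(prev, frame)]
        smoothed.append(cur)
        prev = cur
    return smoothed


def min_tracking_noise(smoothed, window=8):
    """Per-bin noise estimate: minimum smoothed power over the last `window` frames."""
    history = deque(maxlen=window)
    estimates = []
    for frame in smoothed:
        history.append(frame)
        # zip(*history) groups the values of each frequency bin across frames.
        estimates.append([min(bin_values) for bin_values in zip(*history)])
    return estimates


def spectral_subtract(frames, alpha=0.9, window=8, oversub=1.0, floor=0.01):
    """Subtract the tracked noise power from each frame, keeping a spectral floor.

    `frames` is a list of power spectra (one list of floats per frame).
    All parameter values are assumed defaults, not values from the paper.
    """
    noise = min_tracking_noise(smooth_power(frames, alpha), window)
    cleaned = []
    for frame, noise_frame in zip(frames, noise):
        cleaned.append(
            [max(x - oversub * n, floor * x) for x, n in zip(frame, noise_frame)]
        )
    return cleaned


if __name__ == "__main__":
    # Four noise-only frames followed by one frame with a louder event in bin 0.
    power_frames = [[1.0, 2.0]] * 4 + [[10.0, 2.0]]
    for row in spectral_subtract(power_frames):
        print(row)
```

Tracking the minimum avoids an explicit voice activity detector, since speech energy rarely fills a bin for the whole window. This is also why such estimators are attractive for terminals with a tight computational budget.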
Citations
Bee Swarm Activity Acoustic Classification for an IoT-Based Farm Service.
TL;DR: The evaluation results showed that good acoustic classification performance can be achieved with the proposed IoT-based bee activity acoustic classification system, and the objective was to successfully classify sound between the normal and swarming conditions in a beehive.
81
•Dissertation
Making music through real-time voice timbre analysis: machine learning and timbral control
Dan Stowell
- 01 Jan 2010
TL;DR: This thesis develops approaches that can be used with a wide variety of musical instruments by applying machine learning techniques to automatically derive the mappings between expressive audio input and control output, with a focus on timbral control.
40
A noise robust feature extraction algorithm using joint wavelet packet subband decomposition and AR modeling of speech signals
Bojan Kotnik, Zdravko Kacic +1 more
TL;DR: This paper presents a noise robust feature extraction algorithm NRFE using joint wavelet packet decomposition (WPD) and autoregressive (AR) modeling of a speech signal to improve noise robustness and performance.
35
Online speech/music segmentation based on the variance mean of filter bank energy
TL;DR: The proposed VMFBE feature as a stand-alone speech/music discriminator in a segmentation system achieves an overall accuracy of over 94% on radio broadcast material and it outperforms other features used for comparison, by more than 8%.
Voice activity detection algorithm using nonlinear spectral weights, hangover and hangbefore criteria
TL;DR: A nonlinear function is introduced into the frequency spectrum to improve the detection of vowels, diphthongs, and semivowels within the speech signal, and a procedure is presented for faster definition of the optimal constants used by the hangover and hangbefore criteria.
13
References
Suppression of acoustic noise in speech using spectral subtraction
TL;DR: A stand-alone noise suppression algorithm that resynthesizes a speech waveform and can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.
5.3K
•Book
Discrete-Time Processing of Speech Signals
J. R. Deller, John G. Proakis, John H. L. Hansen +2 more
- 01 Mar 1993
TL;DR: The preface to the IEEE Edition explains the background to speech production, coding, and quality assessment and introduces the Hidden Markov Model, the Artificial Neural Network, and Speech Enhancement.
3.1K
•Proceedings Article
The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
David Pearce, Hans-Günter Hirsch +1 more
- 01 Jan 2000
TL;DR: A database designed to evaluate the performance of speech recognition algorithms in noisy conditions and recognition results are presented for the first standard DSR feature extraction scheme that is based on a cepstral analysis.
Spectral Subtraction Based on Minimum Statistics
Rainer Martin
- 01 Jan 2001
TL;DR: An unbiased noise power estimator based on minimum statistics is derived and its statistical properties and its performance in the context of spectral subtraction are discussed.
680
Hidden Markov model decomposition of speech and noise
Andrew Varga, Roger K. Moore +1 more
- 03 Apr 1990
TL;DR: A technique of signal decomposition using hidden Markov models is described that provides an optimal method of decomposing simultaneous processes and has wide implications for signal separation in general and improved speech modeling in particular.
577