Robust Bayesian estimation for context-based speech enhancement
TL;DR: A Bayesian framework that automatically provides the benefits of both context-dependent and context-independent models under varying contexts, where context-dependent models are trained on speech data with one or more aspects of context, such as speaker, acoustic environment, or speaking style.
Abstract: Model-based speech enhancement algorithms that employ trained models, such as codebooks, hidden Markov models, Gaussian mixture models, etc., containing representations of speech such as linear predictive coefficients, mel-frequency cepstrum coefficients, etc., have been found to be successful in enhancing noisy speech corrupted by nonstationary noise. However, these models are typically trained on speech data from multiple speakers under controlled acoustic conditions. In this paper, we introduce the notion of context-dependent models that are trained on speech data with one or more aspects of context, such as speaker, acoustic environment, speaking style, etc. In scenarios where the modeled and observed contexts match, context-dependent models can be expected to result in better performance, whereas context-independent models are preferred otherwise. In this paper, we present a Bayesian framework that automatically provides the benefits of both models under varying contexts. As several aspects of the context remain constant over an extended period during usage, a memory-based approach that exploits information from past data is employed. We use a codebook-based speech enhancement technique that employs trained models of speech and noise linear predictive coefficients as an example model-based approach. Using speaker, acoustic environment, and speaking style as aspects of context, we demonstrate the robustness of the proposed framework for different context scenarios, input signal-to-noise ratios, and number of contexts modeled.
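The idea of weighting several models by how well they explain the observed speech, while carrying information over from past frames, can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes that each model (several context-dependent ones plus one context-independent one) supplies a per-frame log-likelihood and a parameter estimate, and it keeps a running posterior over the models as a simple form of memory. The function names and the `forgetting` parameter are hypothetical.

```python
import numpy as np


def update_model_posterior(prior, log_likelihoods, forgetting=0.9):
    """One recursive Bayesian update of the weights over candidate models.

    prior           -- weights carried over from past frames (sum to 1)
    log_likelihoods -- log p(current frame | model), one value per model
    forgetting      -- hypothetical exponent in (0, 1]; values below 1
                       flatten the prior so the weights can follow a
                       change of context
    """
    prior = np.maximum(np.asarray(prior, dtype=float), 1e-12)  # avoid log(0)
    log_likelihoods = np.asarray(log_likelihoods, dtype=float)
    log_post = forgetting * np.log(prior) + log_likelihoods
    log_post -= log_post.max()  # stabilise the exponentiation
    post = np.exp(log_post)
    return post / post.sum()


def combine_estimates(posterior, estimates):
    """Posterior-weighted average of per-model estimates (one row per model)."""
    return np.asarray(posterior, dtype=float) @ np.asarray(estimates, dtype=float)


# Usage: three models, the first matches the observed context best.
weights = np.full(3, 1.0 / 3.0)
for _ in range(10):
    weights = update_model_posterior(weights, [-1.0, -3.0, -2.0])
```

When one model repeatedly explains the frames better than the others, its weight grows toward 1, so the combined estimate behaves like that model. When no context-dependent model fits, the weights stay spread out or shift to the context-independent model.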
Citations
Codebook-based speech enhancement with Bayesian LP parameters estimation
Qing Wang, Changchun Bao +1 more
- 01 Dec 2015
TL;DR: A codebook-based Bayesian linear predictive (LP) parameters estimation for speech enhancement, in which the LP parameters are estimated based on the current and past frames of noisy speech, is proposed.
1
Speaker Adapted Codebooks for Speech Enhancement
D. Hanumantha Rao Naidu
- 11 Jun 2023
TL;DR: The authors investigate adapting a speaker-independent (SI) codebook of spectral representations of speech to a specific target speaker using the Vector Quantization Maximum a Posteriori (VQ-MAP) algorithm, and study its effect on speech enhancement performance.
1
References
A tutorial on hidden Markov models and selected applications in speech recognition
Lawrence R. Rabiner
- 01 Feb 1989
TL;DR: In this paper, the authors provide an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and give practical details on methods of implementation of the theory along with a description of selected applications of HMMs to distinct problems in speech recognition.
An Algorithm for Vector Quantizer Design
Y. Linde, A. Buzo, Robert M. Gray +2 more
TL;DR: An efficient and intuitive algorithm is presented for the design of vector quantizers based either on a known probabilistic model or on a long training sequence of data.
Suppression of acoustic noise in speech using spectral subtraction
TL;DR: A stand-alone noise suppression algorithm that resynthesizes a speech waveform and can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.
5.3K
Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator
Yariv Ephraim, David Malah +1 more
TL;DR: In this article, a system which utilizes a minimum mean square error (MMSE) estimator is proposed and then compared with other widely used systems which are based on Wiener filtering and the "spectral subtraction" algorithm.
4.2K
•Journal Article
Speech enhancement using a minimum mean square error short-time spectral amplitude estimator
TL;DR: This paper derives a minimum mean-square error STSA estimator, based on modeling speech and noise spectral components as statistically independent Gaussian random variables, which results in a significant reduction of the noise, and provides enhanced speech with colorless residual noise.
3.5K