Journal Article, DOI: 10.1109/TASLP.2014.2364452
A regression approach to speech enhancement based on deep neural networks
1.5K
TL;DR: The proposed DNN approach effectively suppresses highly nonstationary noise, which is generally difficult to handle, and works well on noisy speech recorded in real-world scenarios, without generating the annoying musical artifacts commonly observed in conventional enhancement methods.
Abstract: In contrast to the conventional minimum mean square error (MMSE)-based noise reduction techniques, we propose a supervised method that enhances speech by finding a mapping function between noisy and clean speech signals based on deep neural networks (DNNs). To handle a wide range of additive noises in real-world situations, we first design a large training set that encompasses many possible combinations of speech and noise types. A DNN architecture is then employed as a nonlinear regression function to ensure a powerful modeling capability. We also propose several techniques to improve the DNN-based speech enhancement system, including global variance equalization to alleviate the over-smoothing problem of the regression model, and dropout and noise-aware training strategies to further improve the generalization capability of DNNs to unseen noise conditions. Experimental results demonstrate that the proposed framework achieves significant improvements in both objective and subjective measures over the conventional MMSE-based technique. The proposed DNN approach also effectively suppresses highly nonstationary noise, which is generally difficult to handle. Furthermore, the resulting DNN model, trained with artificially synthesized data, is also effective on noisy speech recorded in real-world scenarios, without generating the annoying musical artifacts commonly observed in conventional enhancement methods.
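Two steps named in the abstract, synthesizing noisy training data and global variance equalization, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `mix_at_snr` and `global_variance_factor` are hypothetical, the mixing assumes purely additive noise at a chosen signal-to-noise ratio, and the equalization assumes a single global factor applied to mean-normalized features.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add noise to clean speech so the mixture has the requested SNR (dB).

    Hypothetical helper: assumes 1-D waveforms and additive noise.
    """
    # Repeat or truncate the noise so it matches the speech length.
    noise = np.resize(noise, speech.shape)
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Pick the gain so that
    # speech_power / (gain**2 * noise_power) == 10 ** (snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise


def global_variance_factor(estimated, reference):
    """Return one scalar that rescales over-smoothed estimates.

    Assumption: both arrays hold mean-normalized features, and a single
    factor is shared by all feature dimensions. Multiplying `estimated`
    by the factor matches its overall variance to that of `reference`.
    """
    return np.sqrt(np.var(reference) / np.var(estimated))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.standard_normal(16000)   # stand-in for a clean utterance
    babble = rng.standard_normal(4000)   # stand-in for a noise recording
    noisy = mix_at_snr(clean, babble, snr_db=5.0)

    # Stand-in for over-smoothed network output: variance is too small.
    reference = rng.standard_normal((100, 40))
    estimated = 0.6 * reference
    sharpened = global_variance_factor(estimated, reference) * estimated
    print(noisy.shape, np.var(estimated), np.var(sharpened))
```

Looping `mix_at_snr` over many speech files, noise types, and SNR levels would give the kind of multi-condition training set the abstract describes; the actual noise types and SNR levels used are not stated here.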
Citations
Audio Noise Filter using Cycle Consistent Adversarial Network - CycleGAN ANF
Nam Son Nguyen,Tengpeng Li,Xiaoqian Zhang,Bo Sheng,Teng Wang,Jiayin Wang +5 more
- 01 Dec 2019
TL;DR: In this paper, the authors propose CycleGAN ANF, a fully unsupervised neural network approach that learns to reduce both stationary and non-stationary noise by reading a raw audio sample from a set X (speech mixed with noise) and transforming it so that it sounds as if it belongs to a set Y (clean speech).
Regression-based speech enhancement by convolutional neural network
Mustafa Erseven,Bulent Bolat +1 more
- 02 May 2018
TL;DR: A regression-based convolutional neural network model is proposed for speech enhancement to remove noise from conversations, and the results are evaluated by perceptual evaluation of speech quality and short-time objective intelligibility.
A speech enhancement algorithm based on a non-negative hidden Markov model and Kullback-Leibler divergence
TL;DR: In this paper, the authors propose a supervised single-channel speech enhancement method that combines Kullback-Leibler divergence-based non-negative matrix factorization (NMF) and a hidden Markov model (HMM).
Joint Noise Reduction and Listening Enhancement for Full-End Speech Enhancement
Haoyu Li,Yun Liu,Junichi Yamagishi +2 more
- 22 Mar 2022
TL;DR: In this paper, a deep learning-based joint framework integrating noise reduction (NR) with listening enhancement (LE) is proposed, in which the NR module first suppresses noise and the LE module then modifies the denoised speech to further improve speech intelligibility.
Targeted Voice Enhancement by Bandpass Filter and Composite Deep Denoising Autoencoder
Raghad Yaseen Lazim AL-Taai,Wu Xiaojun,Zhu Y +2 more
- 14 Dec 2020
TL;DR: In this article, a hybrid system for hearing-aid applications is proposed, which separates the target voice from the noisy signal and then enhances the speech based on the user's hearing loss.
References
Reducing the Dimensionality of Data with Neural Networks
TL;DR: In this article, an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data is described.
A fast learning algorithm for deep belief nets
TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
•Book
Learning Deep Architectures for AI
Yoshua Bengio
- 01 Jan 2009
TL;DR: The motivations and principles regarding learning algorithms for deep architectures are discussed, in particular those that use unsupervised learning of single-layer models, such as Restricted Boltzmann Machines, as building blocks to construct deeper models such as Deep Belief Networks.
•Posted Content
Improving neural networks by preventing co-adaptation of feature detectors
TL;DR: This work randomly omits half of the feature detectors on each training case to prevent complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors.
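The idea of randomly omitting half of the feature detectors can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the cited work: the function name `dropout` and its parameters are assumptions, and scaling the activations by the keep probability at test time is used here as a stand-in for halving the outgoing weights of the trained network.

```python
import numpy as np


def dropout(activations, drop_prob=0.5, training=True, rng=None):
    """Randomly zero hidden units during training (hypothetical helper).

    At test time every unit is kept and the activations are scaled by the
    keep probability so their expected value matches training.
    """
    if not training:
        return activations * (1.0 - drop_prob)
    rng = np.random.default_rng() if rng is None else rng
    # Each unit is kept independently with probability 1 - drop_prob.
    keep_mask = rng.random(activations.shape) >= drop_prob
    return activations * keep_mask


if __name__ == "__main__":
    hidden = np.ones((4, 1000))
    train_out = dropout(hidden, rng=np.random.default_rng(0))
    test_out = dropout(hidden, training=False)
    # Roughly half the units are zeroed in training; none at test time.
    print(train_out.mean(), test_out.mean())
```

A fresh mask is drawn on every call, so each training case sees a different subset of units, which is what discourages units from relying on specific partners.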
Supporting Online Material for Reducing the Dimensionality of Data with Neural Networks
Geoffrey E. Hinton,Ruslan Salakhutdinov +1 more
- 01 Jan 2006
TL;DR: This work describes an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.