Multiresolution convolutional neural network for robust speech recognition

doi:10.1109/IRANIANCEE.2017.7985272

Proceedings Article•10.1109/IRANIANCEE.2017.7985272

Navid Naderi, +1 more

- 02 May 2017

- pp 1459-1464

13

TL;DR: Recognition accuracy results on the Aurora 2 database show that an MRCNN with two CNNs, using 1×6 and 1×20 convolution filter sizes respectively, outperforms CNNs and other MRCNN settings in extracting robust features.

Citations

•Journal Article•10.3390/APP11083603

A Deep Neural Network Model for Speaker Identification

Feng Ye, +1 more

- 16 Apr 2021

- Applied Sciences

TL;DR: A deep neural network (DNN) model based on a two-dimensional convolutional neural network and gated recurrent unit (GRU) for speaker identification is proposed and the experimental results showed that the proposed DNN model, which is called deep GRU, achieved a high recognition accuracy of 98.96%.

97

•Book Chapter•10.1007/978-3-319-93764-9_32

Multi-Resolution Fully Convolutional Neural Networks for Monaural Audio Source Separation

Emad M. Grais, +3 more

- 02 Jul 2018

TL;DR: The proposed MR-FCN is applied to separate the singing voice from mixtures of music sources, and it improves performance on the audio source separation problem compared to feedforward deep neural networks (DNNs) and single-resolution deep fully convolutional neural networks (FCNs).

29

•Proceedings Article•10.1109/PRIA.2019.8786010

A Convolutional Neural Network model based on Neutrosophy for Noisy Speech Recognition

Elyas Rashno, +2 more

- 06 Mar 2019

TL;DR: A Neutrosophic Convolutional Neural Network (NCNN) is proposed for a classification task in which speech signals are used as input data and their noise is modeled as uncertainty.

17

•Posted Content

Multi-Resolution Fully Convolutional Neural Networks for Monaural Audio Source Separation

Emad M. Grais, +3 more

- 28 Oct 2017

- arXiv: Sound

TL;DR: In this paper, a multi-resolution fully convolutional neural network (MR-FCNN) is proposed to separate a target audio source from a mixture of many audio sources.

15

•Journal Article•10.1109/jbhi.2023.3248281

Advancing Stuttering Detection via Data Augmentation, Class-Balanced Loss and Multi-Contextual Deep Learning

- 01 May 2023

- IEEE Journal of Biomedical and Health In...

TL;DR: In this paper, the class imbalance problem in the stuttering detection (SD) domain is addressed via a multi-branching (MB) training scheme and by weighting the contribution of classes in the overall loss function, resulting in a huge improvement in stuttering classes.

10

References

•Book

Speech Enhancement: Theory and Practice

Philipos C. Loizou

- 07 Jun 2007

TL;DR: Clear and concise, this book explores how human listeners compensate for acoustic noise in noisy environments and suggests steps that can be taken to realize the full potential of these algorithms under realistic conditions.

2.5K

•Journal Article•10.1109/TASLP.2014.2339736

Convolutional neural networks for speech recognition

Ossama Abdel-Hamid, +5 more

- 01 Oct 2014

- IEEE Transactions on Audio, Speech, and ...

TL;DR: It is shown that further error rate reduction can be obtained by using convolutional neural networks (CNNs), and a limited-weight-sharing scheme is proposed that can better model speech features.

2.5K

•Journal Article•10.1109/TASL.2011.2109382

Acoustic Modeling Using Deep Belief Networks

Abdelrahman Mohamed, +2 more

- 01 Jan 2012

- IEEE Transactions on Audio, Speech, and ...

TL;DR: It is shown that better phone recognition on the TIMIT dataset can be achieved by replacing Gaussian mixture models by deep neural networks that contain many layers of features and a very large number of parameters.

2K

•Proceedings Article

The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions

David Pearce, +1 more

- 01 Jan 2000

TL;DR: A database designed to evaluate the performance of speech recognition algorithms in noisy conditions and recognition results are presented for the first standard DSR feature extraction scheme that is based on a cepstral analysis.

2K

•Proceedings Article•10.1109/ICASSP.2015.7178838

Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks

Tara N. Sainath, +3 more

- 08 Sep 2015

TL;DR: This paper takes advantage of the complementarity of CNNs, LSTMs and DNNs by combining them into one unified architecture, and finds that the CLDNN provides a 4-6% relative improvement in WER over an LSTM, the strongest of the three individual models.

1.9K