Lexicon-Free Conversational Speech Recognition with Neural Networks

doi:10.3115/V1/N15-1038

Open Access • Proceedings Article

Andrew L. Maas, +3 more

- 01 Jan 2015

- pp 345-354

180

TL;DR: An approach to speech recognition that uses only a neural network to map acoustic input to characters, a character-level language model, and a beam search decoding procedure, making it possible to directly train a speech recognizer using errors generated by spoken language understanding tasks.
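
The decoding step named in the summary above can be illustrated with a minimal sketch. It assumes the network emits per-frame probabilities over a blank symbol plus the characters (a CTC-style output), and it combines them with a character language model during a prefix beam search. The function name `ctc_prefix_beam_search`, the `lm(prefix, ch)` callback, and the `alpha` weight are hypothetical names chosen for illustration. This is a simplified textbook variant, not necessarily the authors' exact procedure.

```python
from collections import defaultdict


def ctc_prefix_beam_search(frames, alphabet, lm=None, beam_width=8, alpha=1.0):
    """Decode per-frame character probabilities with a prefix beam search.

    frames:   list of frames; frames[t][0] is the blank probability and
              frames[t][i + 1] is the probability of alphabet[i].
    alphabet: string of output characters (assumed layout, see above).
    lm:       optional hypothetical callback lm(prefix, ch) -> P(ch | prefix).
    alpha:    exponent weighting the language model (assumption).

    Works in the probability domain for readability; a practical decoder
    would use log probabilities to avoid underflow on long utterances.
    """
    # Each prefix tracks (prob. of paths ending in blank, ending in non-blank).
    beams = {"": (1.0, 0.0)}
    for frame in frames:
        next_beams = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beams.items():
            # Emitting blank keeps the prefix unchanged.
            next_beams[prefix][0] += frame[0] * (p_b + p_nb)
            for i, ch in enumerate(alphabet):
                p = frame[i + 1]
                lm_p = lm(prefix, ch) ** alpha if lm else 1.0
                if prefix and prefix[-1] == ch:
                    # A repeat only extends the prefix after a blank ...
                    next_beams[prefix + ch][1] += p * lm_p * p_b
                    # ... otherwise it collapses onto the same prefix.
                    next_beams[prefix][1] += p * p_nb
                else:
                    next_beams[prefix + ch][1] += p * lm_p * (p_b + p_nb)
        # Keep only the most probable prefixes.
        ranked = sorted(next_beams.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beams = {prefix: tuple(scores) for prefix, scores in ranked[:beam_width]}
    return max(beams, key=lambda prefix: sum(beams[prefix]))


if __name__ == "__main__":
    # Toy input: four frames over the alphabet "ab" (index 0 is blank).
    toy_frames = [
        [0.1, 0.8, 0.1],
        [0.1, 0.8, 0.1],
        [0.8, 0.1, 0.1],
        [0.1, 0.1, 0.8],
    ]
    print(ctc_prefix_beam_search(toy_frames, "ab"))
```

Because the language model is applied only when a prefix is extended by a new character, no word lexicon is needed: any character string the acoustic model and character language model jointly favor can be produced.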

Citations

•Proceedings Article•10.21437/INTERSPEECH.2017-546

Direct Acoustics-to-Word Models for English Conversational Speech Recognition

Kartik Audhkhasi, +4 more

- 22 Mar 2017

TL;DR: This paper presents the first results employing direct acoustics-to-word CTC models on two well-known public benchmark tasks, Switchboard and CallHome. It also presents rescoring results on CTC word model lattices to quantify the performance benefits of an LM, and contrasts the performance of word and phone CTC models.

10.1109/icecube53880.2021.9628208

An On-chip Reconfigurable Multi-layer Perceptron and Recurrent Neural Network Processor for Speech Recognition

Junaid Hussain Muzamal, +2 more

TL;DR: Researchers propose an on-chip reconfigurable MLP-RNN processor for speech recognition, achieving 75% weight reduction and 5x power improvement through LUT-powered multiplication, optimized for performance, cost, and power consumption in a 65nm CMOS process.

Journal Article•10.48550/arXiv.2305.07034

Quran Recitation Recognition using End-to-End Deep Learning

Khloud Al Jallad

- 10 May 2023

- arXiv.org

TL;DR: In this article, a CNN-Bidirectional GRU encoder and a character-based decoder were used to recognize the recitation of the Holy Quran in a public dataset.

10.1145/3483446

Improving Deep Learning Based Automatic Speech Recognition for Gujarati

Deepang Raval, +3 more

TL;DR: This study improves Gujarati automatic speech recognition using a deep learning-based approach, incorporating novel techniques such as prefix decoding and post-processing, achieving a 5.87% decrease in Word Error Rate on the Microsoft Speech Corpus.

•Proceedings Article•10.1145/3411764.3445565

LipType: A Silent Speech Recognizer Augmented with an Independent Repair Model

Laxmi Pandey, +1 more

- 06 May 2021

TL;DR: In this article, an optimized version of LipNet is developed for improved speed and accuracy, along with an independent repair model that processes video input for poor lighting conditions, when applicable, and corrects potential errors in the output for increased accuracy.

References

•Proceedings Article

Deep Sparse Rectifier Neural Networks

Xavier Glorot, +2 more

- 14 Jun 2011

TL;DR: This paper shows that rectifying neurons are an even better model of biological neurons and yield equal or better performance than hyperbolic tangent networks in spite of the hard non-linearity and non-differentiability.

8.7K

•Proceedings Article

The Kaldi Speech Recognition Toolkit

Daniel Povey, +12 more

- 01 Jan 2011

TL;DR: The design of Kaldi is described, a free, open-source toolkit for speech recognition research that provides a speech recognition system based on finite-state automata together with detailed documentation and a comprehensive set of scripts for building complete recognition systems.

6.8K

•Proceedings Article

Recurrent neural network based language model

Tomas Mikolov, +4 more

- 01 Jan 2010

TL;DR: Results indicate that it is possible to obtain around a 50% reduction in perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model.

6.8K

Proceedings Article•10.1145/1143844.1143891

Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks

Alex Graves, +3 more

- 25 Jun 2006

TL;DR: This paper presents a novel method for training RNNs to label unsegmented sequences directly, thereby solving both problems of sequence learning and post-processing.

6.8K

•Journal Article

Deep Neural Networks for Acoustic Modeling in Speech Recognition

Geoffrey E. Hinton, +10 more

- 01 Nov 2012

- IEEE Signal Processing Magazine

TL;DR: This paper provides an overview of this progress and represents the shared views of four research groups who have had recent successes in using deep neural networks for acoustic modeling in speech recognition.

2.7K