Journal Article (DOI: 10.1109/29.103088)
Automatic recognition of keywords in unconstrained speech using hidden Markov models
498
Abstract: This paper describes modifications to a connected word speech recognition algorithm based on hidden Markov models (HMMs) that allow it to recognize words from a predefined vocabulary list spoken in an unconstrained fashion. The novelty of the approach is that statistical models are created for both the actual vocabulary words and the extraneous speech and background. An HMM-based connected word recognition system is then used to find the best sequence of background, extraneous speech, and vocabulary word models for matching the actual input. Using the proposed recognition algorithm on a five-word vocabulary, word recognition accuracy was 99.3% on purely isolated speech (i.e., only vocabulary items and background noise were present) and 95.1% when the vocabulary word was embedded in unconstrained extraneous speech.
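The decoding idea in the abstract (finding the best sequence of background, extraneous speech, and vocabulary word models) can be illustrated with a minimal sketch. It is not the paper's system: the state names (`bg`, `filler`, `kw1`, `kw2`), the four discrete symbols, and every probability below are invented toy values, and a real recognizer would use continuous acoustic features and trained models rather than hand-set tables.

```python
import math

def lg(p):
    """Log of a probability, mapping 0 to -inf."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(obs, states, start, trans, emit):
    """Most likely state path for a discrete observation sequence."""
    # best[t][s]: best log score of any path ending in state s at time t
    best = [{s: lg(start[s]) + lg(emit[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: best[t - 1][p] + lg(trans[p].get(s, 0.0)))
            best[t][s] = best[t - 1][prev] + lg(trans[prev].get(s, 0.0)) + lg(emit[s][obs[t]])
            back[t][s] = prev
    # Trace back from the best final state
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]

def keyword_spans(path, keyword_states):
    """Inclusive (start, end) frame ranges decoded as keyword states."""
    spans, start = [], None
    for t, s in enumerate(path):
        if s in keyword_states and start is None:
            start = t
        elif s not in keyword_states and start is not None:
            spans.append((start, t - 1))
            start = None
    if start is not None:
        spans.append((start, len(path) - 1))
    return spans

# Toy network (all values hypothetical): background, a filler model for
# extraneous speech, and a two-state keyword model that emits "a" then "b".
STATES = ["bg", "filler", "kw1", "kw2"]
START = {"bg": 0.4, "filler": 0.4, "kw1": 0.2, "kw2": 0.0}
TRANS = {
    "bg": {"bg": 0.6, "filler": 0.2, "kw1": 0.2},
    "filler": {"filler": 0.6, "bg": 0.2, "kw1": 0.2},
    "kw1": {"kw1": 0.5, "kw2": 0.5},
    "kw2": {"kw2": 0.5, "bg": 0.25, "filler": 0.25},
}
EMIT = {
    "bg": {"sil": 0.91, "a": 0.03, "b": 0.03, "x": 0.03},
    "filler": {"sil": 0.10, "a": 0.20, "b": 0.20, "x": 0.50},
    "kw1": {"sil": 0.03, "a": 0.90, "b": 0.04, "x": 0.03},
    "kw2": {"sil": 0.03, "a": 0.04, "b": 0.90, "x": 0.03},
}

# Keyword frames ("a a b b") embedded in extraneous speech ("x") and silence.
obs = ["sil", "x", "x", "a", "a", "b", "b", "x", "sil"]
path = viterbi(obs, STATES, START, TRANS, EMIT)
spans = keyword_spans(path, {"kw1", "kw2"})
print(path)
print(spans)
```

Because the filler and background states compete with the keyword states in a single decoding pass, frames outside the keyword are absorbed by the non-keyword models instead of being forced onto a vocabulary word.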
Citations
Patent
Machine learning based upon feedback from contact center analysis
Christopher Douglas Blair, Roger Louis Keenan
- 24 Aug 2006
TL;DR: A signal monitoring apparatus and method are described, involving devices for monitoring signals representing communications traffic, devices for identifying at least one predetermined parameter by analyzing the context of the monitoring signal, a device for recording the occurrence of the identified parameter, a device for identifying the traffic stream associated with the identified parameter, and a device, responsive to analysis of the recorded data, for controlling the handling of communications traffic within the apparatus.
2
Dissertation
Audio pattern discovery and retrieval
Lei Wang
- 01 Jan 2012
TL;DR: This thesis explores unsupervised algorithms for pattern discovery and retrieval in audio and speech data, along with techniques for searching for audio patterns in broadcast audio, which consists of diverse content such as speech, music and songs, commercials, sound effects, and background noise.
2
Multimodal analysis of free-standing conversational groups
Xavier Alameda-Pineda, Elisa Ricci, Nicu Sebe
- 19 Dec 2017
TL;DR: A multimodal joint head and body pose estimator is described and compared to other recent approaches for head and body pose estimation and F-formation detection, in particular for the detection of conversational groups or F-formations.
2
Max-Pooling Loss Training of Long Short-Term Memory Networks for Small-Footprint Keyword Spotting
Ming Sun, Anirudh Raju, George Tucker, Sankaran Panchapagesan, Geng-Shen Fu, Arindam Mandal, Spyros Matsoukas, Nikko Strom, Shiv Naga Prasad Vitaladevuni
TL;DR: In this paper, a max-pooling based loss function was proposed for training Long Short-Term Memory (LSTM) networks for small-footprint keyword spotting, with low CPU, memory, and latency requirements.
2
Garbage Modeling for On-device Speech Recognition
Christophe Van Gysel, Leonid Velikovich, Ian McGraw, Francoise Beaufays
- 06 Sep 2015
TL;DR: This work presents a data-driven methodology for mining tens of thousands of target phrases from an existing corpus, and examines a deficiency of the sub-word modeling approach and introduces a novel modification that makes use of common prefixes between targeted phrases and non-targeted phrases.
References
Minimum prediction residual principle applied to speech recognition
TL;DR: A computer system is described in which isolated words, spoken by a designated talker, are recognized through calculation of a minimum prediction residual through optimally registering the reference LPC onto the input autocorrelation coefficients using the dynamic programming algorithm.
1.7K
Continuous speech recognition by statistical methods
Frederick Jelinek
- 01 Apr 1976
TL;DR: Experimental results are presented that indicate the power of the methods and concern modeling of a speaker and of an acoustic processor, extraction of the models' statistical parameters and hypothesis search procedures and likelihood computations of linguistic decoding.
1.1K
Recognition of isolated digits using hidden Markov models with continuous mixture densities
TL;DR: This paper extends previous work on isolated-word recognition based on hidden Markov models by replacing the discrete symbol representation of the speech signal with a continuous Gaussian mixture density, thereby eliminating the inherent quantization error introduced by the discrete representation.
296
A segmental k-means training procedure for connected word recognition
TL;DR: In this paper, a segmental k-means training procedure was used to extract whole-word patterns from naturally spoken word strings, which were then used to create a set of word reference patterns for recognition.
257