Automatic recognition of keywords in unconstrained speech using hidden Markov models

doi:10.1109/29.103088

Journal Article•10.1109/29.103088

Jay G. Wilpon, +3 more

- 01 Nov 1990

- IEEE Transactions on Acoustics, Speech, ...

- Vol. 38, Iss: 11, pp 1870-1878

498

TL;DR: This paper describes modifications to a connected-word speech recognition algorithm based on hidden Markov models that allow it to recognize words from a predefined vocabulary list when they are spoken in an unconstrained fashion.

Citations

Patent

Machine learning based upon feedback from contact center analysis

Christopher Douglas Blair, +1 more

- 24 Aug 2006

TL;DR: A signal monitoring apparatus and method are described, involving devices for monitoring signals representing communications traffic; devices for identifying at least one predetermined parameter by analyzing the context of the monitoring signal; a device for recording the occurrence of the identified parameter; a device for identifying the traffic stream associated with the identified parameter; and a device, responsive to the analysis of the recorded data, for controlling the handling of communications traffic within the apparatus.

2

Dissertation

Audio pattern discovery and retrieval

Lei Wang

- 01 Jan 2012

TL;DR: This thesis explores unsupervised algorithms for pattern discovery and retrieval in audio and speech data, and techniques for searching for audio patterns in broadcast audio, which consists of diverse content such as speech, music and songs, commercials, sound effects, and background noise.

2

Book Chapter•10.1145/3122865.3122869

Multimodal analysis of free-standing conversational groups

Xavier Alameda-Pineda, +2 more

- 19 Dec 2017

TL;DR: A multimodal joint head and body pose estimator is described and compared with other recent approaches to head and body pose estimation and to the detection of conversational groups, or F-formations.

2

Proceedings Article•10.1109/SLT.2016.7846306

Max-Pooling Loss Training of Long Short-Term Memory Networks for Small-Footprint Keyword Spotting

Ming Sun, +8 more

- 05 May 2017

- arXiv: Computation and Language

TL;DR: A max-pooling-based loss function is proposed for training Long Short-Term Memory (LSTM) networks for small-footprint keyword spotting, with low CPU, memory, and latency requirements.

2

Proceedings Article•10.21437/INTERSPEECH.2015-480

Garbage Modeling for On-device Speech Recognition

Christophe Van Gysel, +3 more

- 06 Sep 2015

TL;DR: This work presents a data-driven methodology for mining tens of thousands of target phrases from an existing corpus, examines a deficiency of the sub-word modeling approach, and introduces a novel modification that makes use of common prefixes shared by targeted and non-targeted phrases.

2

References

Journal Article•10.1109/TASSP.1975.1162641

Minimum prediction residual principle applied to speech recognition

F. Itakura

- 01 Feb 1975

- IEEE Transactions on Acoustics, Speech, ...

TL;DR: A computer system is described in which isolated words, spoken by a designated talker, are recognized by computing a minimum prediction residual, obtained by optimally registering the reference LPC onto the input autocorrelation coefficients using the dynamic programming algorithm.

1.7K

Journal Article•10.1109/PROC.1976.10159

Continuous speech recognition by statistical methods

Frederick Jelinek

- 01 Apr 1976

TL;DR: The methods discussed concern modeling of a speaker and of an acoustic processor, extraction of the models' statistical parameters, and the hypothesis search procedures and likelihood computations of linguistic decoding; experimental results are presented that indicate the power of the methods.

1.1K

Large-vocabulary speaker-independent continuous speech recognition: the sphinx system

Raj Reddy, +1 more

- 01 Jan 1988

436

Journal Article•10.1002/J.1538-7305.1985.TB00272.X

Recognition of isolated digits using hidden Markov models with continuous mixture densities

Lawrence R. Rabiner, +3 more

- 08 Jul 1985

- AT&T technical journal

TL;DR: This paper extends previous work on isolated-word recognition based on hidden Markov models by replacing the discrete symbol representation of the speech signal with a continuous Gaussian mixture density, thereby eliminating the inherent quantization error introduced by the discrete representation.

296

Journal Article•10.1002/J.1538-7305.1986.TB00368.X

A segmental k-means training procedure for connected word recognition

Lawrence R. Rabiner, +2 more

- 06 May 1986

- AT&T technical journal

TL;DR: In this paper, a segmental k-means training procedure was used to extract whole-word patterns from naturally spoken word strings, which were then used to create a set of word reference patterns for recognition.

257