Journal Article • DOI: 10.1109/29.103088
Automatic recognition of keywords in unconstrained speech using hidden Markov models
498
TL;DR: The modifications made to a connected word speech recognition algorithm based on hidden Markov models which allow it to recognize words from a predefined vocabulary list spoken in an unconstrained fashion are described.
Abstract: The modifications made to a connected word speech recognition algorithm based on hidden Markov models (HMMs) which allow it to recognize words from a predefined vocabulary list spoken in an unconstrained fashion are described. The novelty of this approach is that statistical models of both the actual vocabulary word and the extraneous speech and background are created. An HMM-based connected word recognition system is then used to find the best sequence of background, extraneous speech, and vocabulary word models for matching the actual input. Word recognition accuracies of 99.3% on purely isolated speech (i.e., only vocabulary items and background noise were present) and 95.1% when the vocabulary word was embedded in unconstrained extraneous speech were obtained for the five-word vocabulary using the proposed recognition algorithm.
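The idea in the abstract (find the best sequence of background, extraneous-speech, and vocabulary-word models for an utterance) can be illustrated with a toy Viterbi decoder. This is a minimal sketch, not the paper's system: the state names (`bg`, `filler`, `kw1` to `kw3`), the one-dimensional Gaussian emissions, the transition probabilities, and the observation values are all made up for illustration.

```python
import math

NEG_INF = float("-inf")

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian (stand-in for a real acoustic model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def viterbi(obs, states, log_start, log_trans, emit):
    """Return the most likely state sequence for obs (log-domain Viterbi)."""
    # delta[s]: best log score of any path that ends in state s at the current frame
    delta = {s: log_start.get(s, NEG_INF) + log_gauss(obs[0], *emit[s]) for s in states}
    backptrs = []
    for x in obs[1:]:
        new_delta, ptr = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: delta[p] + log_trans.get((p, s), NEG_INF))
            score = delta[best_prev] + log_trans.get((best_prev, s), NEG_INF)
            new_delta[s] = score + log_gauss(x, *emit[s])
            ptr[s] = best_prev
        delta = new_delta
        backptrs.append(ptr)
    # Trace back from the best final state
    state = max(states, key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

def keyword_frames(path):
    """Indices of frames that the decoder assigned to keyword states."""
    return [i for i, s in enumerate(path) if s.startswith("kw")]

# Hypothetical composite model: background, one filler (extraneous speech)
# state, and a three-state left-to-right keyword model.
states = ["bg", "filler", "kw1", "kw2", "kw3"]
emit = {  # (mean, variance) per state, illustrative values only
    "bg": (0.0, 1.0), "filler": (3.0, 4.0),
    "kw1": (6.0, 0.5), "kw2": (9.0, 0.5), "kw3": (12.0, 0.5),
}
trans = {
    "bg": {"bg": 0.6, "filler": 0.2, "kw1": 0.2},
    "filler": {"filler": 0.6, "bg": 0.2, "kw1": 0.2},
    "kw1": {"kw1": 0.5, "kw2": 0.5},
    "kw2": {"kw2": 0.5, "kw3": 0.5},
    "kw3": {"kw3": 0.5, "bg": 0.25, "filler": 0.25},
}
log_trans = {(p, s): math.log(pr) for p, row in trans.items() for s, pr in row.items()}
log_start = {"bg": math.log(0.5), "filler": math.log(0.5)}

# Synthetic 1-D "feature" sequence: background, extraneous speech, keyword, tail
obs = [0.1, -0.2, 3.5, 2.0, 6.1, 5.9, 9.2, 8.8, 12.1, 11.9, 2.5, 0.0]
path = viterbi(obs, states, log_start, log_trans, emit)
print(path)
print(keyword_frames(path))
```

Because the filler and background states compete with the keyword states on every frame, the keyword is only hypothesized where its model explains the input better than the alternatives, which is the role the abstract assigns to the extraneous-speech and background models.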
Citations
Is ASR ready for wireless primetime: measuring the core technology for selected applications
TL;DR: A set of benchmark tasks designed to evaluate the state-of-the-art ASR technologies from a wireless perspective are described and the results of these benchmark tests on two commercially available software-based ASR systems that represent the best core ASR technology on the market are presented.
6
Unsupervised Learning of a Chinese Spontaneous and Colloquial Speech Lexicon with Content and Filler Phrase Classification
Cheung Chi-Shun,Pascale Fung +1 more
TL;DR: This work proposes an unsupervised learning method to find colloquial terms and classify filler and content phrases in spontaneous and colloquial Chinese, including Cantonese, and adapts a language model trained from written texts with the Hong Kong Newsgroup corpus that outperforms both the standard Chinese language model and the Cantonese language model.
6
Towards Visually Prompted Keyword Localisation for Zero-Resource Spoken Languages
09 Jan 2023
TL;DR: This article proposes a speech-vision model with a novel localising attention mechanism, trained with a new keyword sampling scheme, and shows that these innovations give improvements in VPKL over an existing speech-vision model.
5
•Posted Content
Learning acoustic word embeddings with phonetically associated triplet network
TL;DR: A novel architecture, the phonetically associated triplet network (PATN), is proposed; it aims to increase the discriminative power of acoustic word embeddings by utilizing phonetic information as well as word identity.
5
References
Minimum prediction residual principle applied to speech recognition
TL;DR: A computer system is described in which isolated words, spoken by a designated talker, are recognized by calculating a minimum prediction residual, obtained by optimally registering the reference LPC onto the input autocorrelation coefficients using the dynamic programming algorithm.
1.7K
Continuous speech recognition by statistical methods
Frederick Jelinek
- 01 Apr 1976
TL;DR: The methods concern modeling of a speaker and of an acoustic processor, extraction of the models' statistical parameters, and hypothesis search procedures and likelihood computations of linguistic decoding; experimental results are presented that indicate the power of the methods.
1.1K
Recognition of isolated digits using hidden Markov models with continuous mixture densities
TL;DR: This paper extends previous work on isolated-word recognition based on hidden Markov models by replacing the discrete symbol representation of the speech signal with a continuous Gaussian mixture density, thereby eliminating the inherent quantization error introduced by the discrete representation.
296
A segmental k-means training procedure for connected word recognition
TL;DR: In this paper, a segmental k-means training procedure was used to extract whole-word patterns from naturally spoken word strings, which were then used to create a set of word reference patterns for recognition.
257
Related Papers (5)
Richard Rose,D.B. Paul +1 more
- 03 Apr 1990
Lawrence R. Rabiner,Biing-Hwang Juang +1 more
- 01 Jan 1993
Guoguo Chen,Carolina Parada,Georg Heigold +2 more
- 04 May 2014