iROVER: Improving System Combination with Classification
Dustin Hillard,Bjoern Hoffmeister,Mari Ostendorf,Ralf Schlueter,Hermann Ney +4 more
- 22 Apr 2007
- pp 65-68
TL;DR: An improved system combination technique, iROVER, is presented that obtains significant improvements over ROVER, and is consistently better across varying numbers of component systems.
Abstract: We present an improved system combination technique, iROVER. Our approach obtains significant improvements over ROVER and is consistently better across varying numbers of component systems. A classifier is trained on features from the system lattices and selects the final word hypothesis by learning cues to choose the system that is most likely to be correct at each word location. This approach achieves the best result published to date on the TC-STAR 2006 English speech recognition evaluation set.
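The contrast the abstract draws, voting across systems versus letting a classifier pick the system most likely to be correct at each word location, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the slot representation (one `(word, confidence)` pair per system), the two features in `slot_features`, and the hand-set `toy_score` function standing in for a trained classifier. None of it is taken from the paper's actual feature set or model.

```python
from collections import defaultdict

def rover_vote(slot):
    """Baseline voting: pick the word with the highest summed confidence."""
    scores = defaultdict(float)
    for word, conf in slot:
        scores[word] += conf
    return max(scores, key=scores.get)

def slot_features(slot):
    """Hypothetical per-system features: own confidence and agreement rate."""
    feats = []
    for word, conf in slot:
        agree = sum(1 for other, _ in slot if other == word)
        feats.append((conf, agree / len(slot)))
    return feats

def classifier_select(slot, score_system):
    """Output the word of the system that the scorer rates most likely correct."""
    feats = slot_features(slot)
    best = max(range(len(slot)), key=lambda i: score_system(i, feats[i]))
    return slot[best][0]

# Assumed stand-in for a trained classifier: system 0 is treated as more
# reliable than the others. A real model would learn this from data.
RELIABILITY = [2.0, 0.5, 0.5]

def toy_score(system_index, features):
    conf, agreement = features
    return RELIABILITY[system_index] * conf + agreement

# One aligned word location with hypotheses from three systems.
slot = [("their", 0.6), ("there", 0.5), ("there", 0.3)]
voted = rover_vote(slot)
selected = classifier_select(slot, toy_score)
```

In this toy slot the two weaker systems outvote the first one, so the voting baseline returns their word, while the selection step follows the system that the scorer trusts most. The sketch only shows where a learned decision would replace the vote; it makes no claim about which choice is correct in practice.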
Citations
Handbook of Natural Language Processing and Machine Translation
Joseph Olive,Caitlin Christianson,John McCary +2 more
- 01 Jan 2011
TL;DR: This comprehensive handbook, written by leading experts in the field, details the groundbreaking research conducted under the Global Autonomous Language Exploitation (GALE) program of the Defense Advanced Research Projects Agency (DARPA), while placing it in the context of previous research in the fields of natural language and signal processing, artificial intelligence and machine translation.
150
Combining Spectral Representations for Large-Vocabulary Continuous Speech Recognition
Giulia Garau,Steve Renals +1 more
TL;DR: The results indicate that combining conventional and pitch-synchronous acoustic feature sets using HLDA results in a consistent, significant decrease in word error rate across all three LVCSR tasks.
Investigations on speech recognition systems for low-resource dialectal Arabic–English code-switching speech
01 Mar 2022
TL;DR: In this article, the authors present their work on code-switched Egyptian Arabic-English ASR using DNN-based hybrid and Transformer-based end-to-end models.
Patent
Recognition using re-recognition and statistical classification
Shuangyu Chang,Michael Levit,Bruce Melvin Buntschuh +2 more
- 01 Jun 2010
TL;DR: In this article, an overall grammar is used as a set of context-specific grammars for recognition of an input, each responsible for a specific context, such as subtask category, geographic region, etc.
34
Speech and Audio Processing for Coding, Enhancement and Recognition
Tokunbo Ogunfunmi,Roberto Togneri,Madihally Narasimha +2 more
- 15 Oct 2014
TL;DR: The basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals are described, including advanced statistical and machine learning techniques for speech and speaker recognition with an overview of the key innovations in these areas.
References
BoosTexter: A Boosting-based System for Text Categorization
Robert E. Schapire,Yoram Singer +1 more
TL;DR: A new and improved family of boosting algorithms for text categorization tasks is described, implemented in a system called BoosTexter, which learns from examples to perform multiclass text and speech categorization.
A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)
Jonathan G. Fiscus
- 14 Dec 1997
TL;DR: The Recognizer Output Voting Error Reduction (ROVER) system was developed at NIST to produce a composite automatic speech recognition (ASR) output when the outputs of multiple ASR systems are available, with the composite output having a lower error rate than any of the individual systems.
1.3K
The SRI March 2000 Hub-5 Conversational Speech Transcription System
Andreas Stolcke,Harry Bratt,John Butzberger,Horacio Franco,V. R. Rao Gadde,C. Richey,E. Shriberg,Fuliang Weng,Jing Zheng +8 more
- 01 Jan 2000
TL;DR: SRI’s large vocabulary conversational speech recognition system as used in the March 2000 NIST Hub-5E evaluation is described and a generalized ROVER algorithm is applied to combine the N-best hypotheses from several systems based on different acoustic models.
Using word probabilities as confidence measures
Frank Wessel,K. Macherey,Ralf Schlüter +2 more
- 12 May 1998
TL;DR: An approach is presented for estimating the confidence in a hypothesized word as its posterior probability, given all acoustic feature vectors of the speaker utterance, computed as the sum of all word hypothesis probabilities which represent the occurrence of the same word in more or less the same segment of time.
Proceedings Article
Frame Based System Combination and a Comparison with Weighted ROVER and CNC
Bjorn Hoffmeister,Tobias Klein,Ralf Schlüter,Hermann Ney +3 more
- 01 Jan 2006
TL;DR: A novel ASR system combination technique able to combine systems producing word graphs of different structure and with different segmentations based on the definition of a time frame-wise word error cost function in a minimum Bayes risk framework is presented.