Hynek Hermansky
Johns Hopkins University
328 Papers
4.5K Citations
Hynek Hermansky is an academic researcher from Johns Hopkins University. The author has contributed to research on topics including speech processing and computer science. The author has an h-index of 51 and has co-authored 317 publications. Previous affiliations of Hynek Hermansky include Oregon Health & Science University and State Street Corporation.
Papers
Weak top-down constraints for unsupervised acoustic model training
Aren Jansen,Samuel Thomas,Hynek Hermansky +2 more
- 26 May 2013
TL;DR: A much weaker form of top-down supervision for use in place of transcripts and dictionaries in the zero resource setting is investigated, capable of improving model speaker independence by up to 57% relative over bottom-up training alone.
On the relative importance of various components of the modulation spectrum for automatic speech recognition
TL;DR: Most of the useful linguistic information is in modulation frequency components in the range between 1 and 16 Hz, with the dominant component at around 4 Hz. In some realistic environments, using components from the range below 2 Hz or above 16 Hz can degrade recognition accuracy.
•Proceedings Article
Robust ASR front-end using spectral-based and discriminant features: experiments on the Aurora tasks
M. Carmen Benítez,Lukas Burget,Barry Y. Chen,Stéphane Dupont,Harinath Garudadri,Hynek Hermansky,Pratibha Jain,Sachin S. Kajarekar,Nelson Morgan,Sunil Sivadas +9 more
- 01 Jan 2001
TL;DR: An automatic speech recognition front-end that combines low-level robust ASR feature extraction techniques with higher-level linear and non-linear feature transformations is described.
Posterior-based Out of Vocabulary Word Detection in Telephone Speech
Stefan Kombrink,Lukas Burget,Pavel Matejka,Martin Karafiat,Hynek Hermansky +4 more
- 06 Sep 2009
TL;DR: An approach based on phone posteriors created by a Large Vocabulary Continuous Speech Recognition system and an additional phone recognizer that allows detection of OOV and misrecognized words is presented.
A new efficient measure for accuracy prediction and its application to multistream-based unsupervised adaptation
Tetsuji Ogawa,Sri Harish Mallidi,Emmanuel Dupoux,Jordan Cohen,Naomi H. Feldman,Hynek Hermansky +5 more
- 01 Jan 2016
TL;DR: The M-measure was extended to take latent phoneme information into account, which improved its reliability, and was successfully applied to multistream-based unsupervised adaptation of ASR systems to address data uncertainty when the ground truth is unknown.