Hynek Hermansky

Johns Hopkins University

328 Papers

4.5K Citations

Hynek Hermansky is an academic researcher from Johns Hopkins University. The author has contributed to research in the topics of speech processing and computer science, has an h-index of 51, and has co-authored 317 publications. Previous affiliations of Hynek Hermansky include Oregon Health & Science University and State Street Corporation.

Papers

Proceedings Article•10.1109/ICASSP.2013.6639241

Weak top-down constraints for unsupervised acoustic model training

Aren Jansen, +2 more

- 26 May 2013

TL;DR: A much weaker form of top-down supervision for use in place of transcripts and dictionaries in the zero resource setting is investigated, capable of improving model speaker independence by up to 57% relative over bottom-up training alone.

Journal Article•10.1016/S0167-6393(99)00002-3

On the relative importance of various components of the modulation spectrum for automatic speech recognition

Noboru Kanedera, +5 more

- 01 May 1999

- Speech Communication

TL;DR: Most of the useful linguistic information is in modulation frequency components from the range between 1 and 16 Hz, with the dominant component at around 4 Hz, and in some realistic environments, the use of components from the range below 2 Hz or above 16 Hz can degrade the recognition accuracy.

Proceedings Article

Robust ASR front-end using spectral-based and discriminant features: experiments on the Aurora tasks

M. Carmen Benítez, +9 more

- 01 Jan 2001

TL;DR: An automatic speech recognition front-end that combines low-level robust ASR feature extraction techniques and higher-level linear and non-linear feature transformations is described.

Proceedings Article•10.21437/INTERSPEECH.2009-18

Posterior-based Out of Vocabulary Word Detection in Telephone Speech

Stefan Kombrink, +4 more

- 06 Sep 2009

TL;DR: An approach based on phone posteriors created by a Large Vocabulary Continuous Speech Recognition system and an additional phone recognizer that allows detection of OOV and misrecognized words is presented.

Proceedings Article•10.1109/ICPR.2016.7899966

A new efficient measure for accuracy prediction and its application to multistream-based unsupervised adaptation

Tetsuji Ogawa, +5 more

- 01 Jan 2016

TL;DR: The M-measure was extended by considering the latent phoneme information, resulting in improved reliability, and was successfully applied to multistream-based unsupervised adaptation of ASR systems to address data uncertainty when the ground truth is unknown.
