Shinji Watanabe

Johns Hopkins University

543 Papers

2.9K Citations

Shinji Watanabe is an academic researcher at Johns Hopkins University. The author has contributed to research in the topics Computer science & Engineering, has an h-index of 53, and has co-authored 383 publications. Previous affiliations of Shinji Watanabe include Mitsubishi Electric Research Laboratories & Mitsubishi.

Papers

Journal Article•10.1109/TASLP.2020.2988423

Improving End-to-End Single-Channel Multi-Talker Speech Recognition

Wangyou Zhang, +3 more

- 20 Apr 2020

- IEEE Transactions on Audio, Speech, and ...

TL;DR: An enhanced end-to-end monaural multi-talker ASR architecture and training strategy are proposed for recognizing overlapped speech, and experiments demonstrate that the proposed architecture can significantly improve multi-talker mixed speech recognition.

Proceedings Article•10.1109/SLT48900.2021.9383517

Streaming Transformer ASR with Blockwise Synchronous Beam Search

Emiru Tsunoo, +2 more

- 19 Jan 2021

TL;DR: A blockwise synchronous beam search algorithm, based on blockwise processing of the encoder, is proposed for streaming E2E Transformer ASR. Encoded feature blocks are synchronously aligned using a block boundary detection technique, in which a reliability score for each predicted hypothesis is evaluated based on the end-of-sequence and repeated tokens in the hypothesis.

Posted Content

ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet

Siddhant Arora, +12 more

- 29 Nov 2021

- arXiv: Computation and Language

TL;DR: ESPnet-SLU is a project within ESPnet, an open-source toolkit for speech processing tasks such as automatic speech recognition (ASR), text-to-speech (TTS), and speech translation (ST). It can be used to generate reproducible results on different spoken language understanding (SLU) benchmarks.

Posted Content

End-to-End Neural Speaker Diarization with Permutation-Free Objectives

Yusuke Fujita, +5 more

- 12 Sep 2019

- arXiv: Audio and Speech Processing

TL;DR: In this paper, a single neural network is proposed that directly outputs speaker diarization results, and a permutation-free objective function is introduced so that training is not affected by the speaker-label permutation problem.
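To illustrate what a permutation-free objective means in practice, here is a minimal sketch, not the authors' implementation. It assumes the network outputs a per-frame activity probability for each speaker and uses binary cross-entropy as the per-frame loss; the function names `bce` and `permutation_free_loss` are hypothetical. The loss is computed for every assignment of output channels to reference speakers, and the smallest value is kept, so the order of the reference labels does not matter.

```python
import itertools
import math


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one probability p and one 0/1 label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def permutation_free_loss(probs, labels):
    """Return (loss, perm) minimising the mean BCE over speaker permutations.

    probs, labels: lists of T frames, each a list of S per-speaker values.
    perm[s] is the reference speaker matched to output channel s.
    """
    num_speakers = len(labels[0])
    best_loss, best_perm = float("inf"), None
    for perm in itertools.permutations(range(num_speakers)):
        total = 0.0
        for frame_probs, frame_labels in zip(probs, labels):
            for s in range(num_speakers):
                total += bce(frame_probs[s], frame_labels[perm[s]])
        loss = total / (len(labels) * num_speakers)
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy example: the outputs are good, but the reference columns are swapped.
probs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
labels = [[0, 1], [0, 1], [1, 0]]
loss, perm = permutation_free_loss(probs, labels)
print(perm, round(loss, 4))
```

Enumerating all permutations costs S! loss evaluations, which is only practical for a small number of speakers.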

Proceedings Article•10.21437/INTERSPEECH.2015-706

Uncertainty propagation through deep neural networks

Ahmed Hussen Abdelaziz, +4 more

- 06 Sep 2015

TL;DR: In this paper, the authors study the propagation of observation uncertainties through the layers of a DNN-based acoustic model and employ approximate propagation methods, including Monte Carlo sampling, the unscented transform, and the piecewise exponential approximation of the activation function, to estimate the distribution of acoustic scores.
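Of the approximation methods named above, Monte Carlo sampling is the simplest to sketch. The example below is illustrative only and is not taken from the paper: it assumes the uncertain input is a Gaussian with a diagonal covariance and propagates it through a single sigmoid layer; the names `layer` and `mc_propagate` and the toy weights are hypothetical. Samples are drawn from the input distribution, each is passed through the layer, and the mean and variance of the outputs are estimated from the results.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(x, weights, biases):
    """One fully connected layer with a sigmoid activation."""
    return [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def mc_propagate(mean, var, weights, biases, num_samples=2000, seed=0):
    """Estimate the output mean and variance for x ~ N(mean, diag(var))."""
    rng = random.Random(seed)
    n_out = len(biases)
    sums = [0.0] * n_out
    sq_sums = [0.0] * n_out
    for _ in range(num_samples):
        # Draw one input sample and push it through the layer.
        x = [rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, var)]
        for j, yj in enumerate(layer(x, weights, biases)):
            sums[j] += yj
            sq_sums[j] += yj * yj
    out_mean = [s / num_samples for s in sums]
    out_var = [max(q / num_samples - m * m, 0.0) for q, m in zip(sq_sums, out_mean)]
    return out_mean, out_var


# Toy layer: 2 inputs, 2 outputs (weights chosen arbitrarily for illustration).
weights = [[1.0, -0.5], [0.3, 0.8]]
biases = [0.0, -0.2]
out_mean, out_var = mc_propagate([0.5, -1.0], [0.2, 0.1], weights, biases)
print(out_mean, out_var)
```

For a deeper network, the same sampling loop would wrap the full forward pass. The estimate improves with more samples, at a proportional cost in computation.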
