Shinji Watanabe

Johns Hopkins University

543 Papers

2.9K Citations

Shinji Watanabe is an academic researcher from Johns Hopkins University. The author has contributed to research in topics: Computer science & Engineering. The author has an h-index of 53 and has co-authored 383 publications. Previous affiliations of Shinji Watanabe include Mitsubishi Electric Research Laboratories and Mitsubishi.

Papers

Journal Article•10.48550/arXiv.2305.13331

A New Benchmark of Aphasia Speech Recognition and Detection Based on E-Branchformer and Multi-task Learning

Jiyang Tang, +4 more

- 19 May 2023

- arXiv.org

TL;DR: This article introduces two multi-task learning methods based on the CTC/Attention architecture that perform both tasks, aphasia speech recognition and detection, simultaneously. They achieve state-of-the-art speaker-level detection accuracy (97.3%) and a relative WER reduction of 11% for moderate aphasia patients.

Tensor decomposition for minimization of E2E SLU model toward on-device processing

Yosuke Kashiwagi, +8 more

- 02 Jun 2023

TL;DR: In this paper, tensor decomposition is applied to the Conformer and E-Branchformer architectures in an end-to-end (E2E) SLU model to reduce the model size.

Proceedings Article•10.1109/ICASSP.2012.6288845

Bag Of ARCS: New representation of speech segment features based on finite state machines

Shinji Watanabe, +4 more

- 25 Mar 2012

TL;DR: The proposed approach is shown to be effective for some ASR post-processing applications in utterance classification experiments and in speaker adaptation experiments, achieving an absolute 1% improvement in WER over baseline results.

Posted Content

A practical two-stage training strategy for multi-stream end-to-end speech recognition

Ruizhi Li, +4 more

- 23 Oct 2019

- arXiv: Computation and Language

TL;DR: In this article, a two-stage training scheme is proposed to train a Universal Feature Extractor (UFE), where encoder outputs are produced from a single-stream model trained with all data.

Posted Content

Multi-mode Transformer Transducer with Stochastic Future Context

Kwangyoun Kim, +4 more

- 17 Jun 2021

- arXiv: Audio and Speech Processing

TL;DR: In this article, the authors propose a multi-mode ASR model trained with stochastic future context, a simple training procedure that samples one streaming configuration in each iteration; the resulting model can fulfill various latency requirements during inference.
