Hongbin Suo
Alibaba Group
5 Papers
23 Citations
Hongbin Suo is an academic researcher at Alibaba Group. The author has contributed to research on the topics of computer science and voice activity detection, has an h-index of 2, and has co-authored 5 publications.
Papers
A Real-Time Speaker Diarization System Based on Spatial Spectrum
Siqi Zheng,Weilong Huang,Xianliang Wang,Hongbin Suo,Jinwei Feng,Zhi-Jie Yan +5 more
- 06 Jun 2021
TL;DR: This paper presents a speaker diarization system that localizes and identifies all speakers present in a conversation or meeting. The system is, however, limited to short text-independent utterances.
31
Phonetically-Aware Coupled Network For Short Duration Text-Independent Speaker Verification
Siqi Zheng,Yun Lei,Hongbin Suo +2 more
- 25 Oct 2020
TL;DR: An end-to-end phonetically-aware coupled network for short-duration speaker verification that directly compares the speech content of two utterances, thereby enabling phonetic-based normalization.
16
Investigation of Spatial-Acoustic Features for Overlapping Speech Detection in Multiparty Meetings
Shiliang Zhang,Siqi Zheng,Weilong Huang,Ming Lei,Hongbin Suo,Jinwei Feng,Zhi-Jie Yan +6 more
- 30 Aug 2021
TL;DR: Experimental results show that a two-stream DFSMN with attention pooling can effectively model acoustic-spatial features and significantly boost the performance of overlapping speech detection (OSD), resulting in a 3.5% absolute improvement in detection accuracy over the baseline system.
7
Posted Content
BeamTransformer: Microphone Array-based Overlapping Speech Detection
Siqi Zheng,Shiliang Zhang,Weilong Huang,Qian Chen,Hongbin Suo,Ming Lei,Jinwei Feng,Zhi-Jie Yan +7 more
TL;DR: This paper proposes an efficient architecture that combines the beamformer's strength in spatial filtering with the transformer's capability in context sequence modeling, in order to optimize the modeling of sequential relationships among signals from different spatial directions.
CAM: Context-Aware Masking for Robust Speaker Verification
Ya-Qi Yu,Siqi Zheng,Hongbin Suo,Yun Lei,Wu-Jun Li +4 more
- 06 Jun 2021
TL;DR: The authors propose context-aware masking (CAM), which enables the speaker embedding network to focus on the speaker of interest and blur unrelated noise by dynamically controlling the masking threshold.