Efficient blind speech separation suitable for embedded devices
Kazunobu Kondo, Yu Takahashi, Seiichi Hashimoto, Hiroshi Saruwatari, Takanori Nishino, Kazuya Takeda
- 29 Aug 2011
- pp 2319-2323
Abstract: A blind speech separation method with low computational complexity is proposed. The method combines independent component analysis (ICA) with frequency band selection and a frame-wise spectral softmask based on the inter-channel power ratio of tentative separated signals in the frequency domain. The softmask cancels the transfer function between the sources and the separated signals. A theoretical analysis is given. Performance and effectiveness are evaluated through source separation simulations and a computational cost estimate, and the experimental results show significantly improved performance. The segmental signal-to-noise ratio reaches 7 dB and 3 dB, and the cepstral distortion reaches 1 dB and 2.5 dB, in anechoic and reverberant conditions, respectively. Moreover, computational complexity can be reduced by over 80% compared with unmodified frequency-domain ICA (FDICA).
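The abstract does not state the softmask formula, so the following is only a minimal sketch of the general idea of a soft mask built from an inter-channel power ratio. It assumes a Wiener-like ratio computed per time-frequency bin from the tentative separated spectra; the paper's actual frame-wise mask may differ. The function name `power_ratio_softmask`, the array layout, and the `eps` regularizer are hypothetical choices made for illustration.

```python
import numpy as np


def power_ratio_softmask(Y, eps=1e-12):
    """Soft mask from the inter-channel power ratio (illustrative assumption).

    Y   : complex array, shape (n_channels, n_bins, n_frames), holding the
          tentative separated spectra (for example, ICA outputs after an STFT).
    eps : small constant that avoids division by zero in silent bins.

    Returns a real array of the same shape with values in [0, 1].
    """
    power = np.abs(Y) ** 2                       # per-channel power in each bin
    total = power.sum(axis=0, keepdims=True)     # power summed over channels
    return power / (total + eps)                 # share of power held by each channel


# Toy spectra: two channels, four frequency bins, three frames.
Y = np.ones((2, 4, 3), dtype=complex)
Y[0, :, 0] = 3.0   # channel 0 dominates frame 0
Y[1, :, 1] = 3.0   # channel 1 dominates frame 1

masks = power_ratio_softmask(Y)
masked = masks * Y  # attenuate each channel where the other channel dominates
```

In this toy input, the dominant channel in a frame keeps most of its energy, while frames with equal power in both channels receive a mask of 0.5 on each channel.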
Citations
Improved Method of Blind Speech Separation with Low Computational Complexity
Kazunobu Kondo, Yu Takahashi, Seiichi Hashimoto, Hiroshi Saruwatari, Takanori Nishino, Kazuya Takeda
TL;DR: A blind speech separation method with low computational complexity that consists of a combination of independent component analysis with frequency band selection, and a frame-wise spectral soft mask method based on an interchannel power ratio of tentative separated signals in the frequency domain.
•Proceedings Article
Fast source separation based on selection of effective temporal frames
Yusuke Mizuno, Kazunobu Kondo, Takanori Nishino, Norihide Kitaoka, Kazuya Takeda
- 18 Oct 2012
TL;DR: A method for selecting the temporal frames that are effective for training the separation filters is proposed and evaluated; it achieves faster computation with lower computational complexity, and the evaluation confirms its effectiveness.
Practically Efficient Blind Speech Separation Using Frequency Band Selection Based on Magnitude Squared Coherence and a Small Dodecahedral Microphone Array
TL;DR: A band selection method based on magnitude squared coherence is proposed for small agglomerative microphone array systems, and it improves performance compared with the use of uniformly spaced frequency bands.
•Dissertation
Blind source separation with the low computational costs for the mobile and portable speech equipment
Kazunobu Kondo (多伸 近藤)
- 31 Jul 2014
TL;DR: This dissertation proposes frequency bin selection to reduce the computational cost of blind source separation (BSS) based on frequency-domain independent component analysis (FDICA) by reducing the number of frequency bins.
References
•Book
Fundamentals of speech recognition
Lawrence R. Rabiner, Biing-Hwang Juang
- 01 Jan 1993
TL;DR: A textbook on the theory and practice of automatic speech recognition, covering speech signal analysis, pattern comparison techniques, and hidden Markov models.
•Book
Independent Component Analysis
Aapo Hyvärinen, Juha Karhunen, Erkki Oja
- 18 May 2001
TL;DR: Independent component analysis is presented as a statistical generative model that gives a proper probabilistic formulation of the ideas underpinning sparse coding and can be interpreted as providing a Bayesian prior.
•Book
Discrete-Time Processing of Speech Signals
J. R. Deller, John G. Proakis, John H. L. Hansen
- 01 Mar 1993
TL;DR: The preface to the IEEE Edition explains the background to speech production, coding, and quality assessment and introduces the Hidden Markov Model, the Artificial Neural Network, and Speech Enhancement.
Blind separation of speech mixtures via time-frequency masking
Ozgur Yilmaz, Scott Rickard
TL;DR: The results demonstrate that there exist ideal binary time-frequency masks that can separate several speech signals from one mixture, and show that these ideal masks can be approximated when two anechoic mixtures are provided, relying on the approximate W-disjoint orthogonality of speech.
Blind separation of convolved mixtures in the frequency domain
TL;DR: It is observed that convolved mixing in the time domain corresponds to instantaneous mixing in the frequency domain, so convolved mixing can be inverted using simpler and more robust algorithms than the ones recently developed.