Efficient blind speech separation suitable for embedded devices

doi:10.5281/ZENODO.42625

Open Access • Proceedings Article • 10.5281/ZENODO.42625

Kazunobu Kondo, +5 more

- 29 Aug 2011

- pp 2319-2323

5

Abstract: A blind speech separation method with low computational complexity is proposed. The method combines independent component analysis (ICA) with frequency band selection and a frame-wise spectral softmask based on the inter-channel power ratio of tentative separated signals in the frequency domain. The softmask cancels the transfer function between sources and separated signals. A theoretical analysis is given. Performance and effectiveness are evaluated through source separation simulations and an estimate of computational cost, and the experimental results show significantly improved performance for the proposed method. The segmental signal-to-noise ratio reaches 7 dB and 3 dB, and the cepstral distortion reaches 1 dB and 2.5 dB, in anechoic and reverberant conditions, respectively. Moreover, computational complexity can be reduced by over 80% compared with unmodified frequency-domain ICA (FDICA).
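To make the softmask idea in the abstract concrete, the following is a minimal sketch of a soft mask derived from the inter-channel power ratio of two tentative separated signals within one frame. The abstract does not give the exact mask formula, so the ratio used here (each channel's power divided by the summed power, per frequency bin) is an assumption, and the names `power_ratio_softmask`, `apply_mask`, and `eps` are hypothetical and not taken from the paper.

```python
def power_ratio_softmask(y1, y2, eps=1e-12):
    """Soft masks for one frame from an inter-channel power ratio.

    y1, y2: sequences of complex spectral coefficients (one per frequency
    bin) of two tentative separated signals for the same frame.
    ASSUMPTION: the mask is p_i / (p1 + p2) per bin; the paper's actual
    formula may differ.
    """
    if len(y1) != len(y2):
        raise ValueError("both channels need the same number of bins")
    m1, m2 = [], []
    for a, b in zip(y1, y2):
        p1 = abs(a) ** 2            # power of channel 1 in this bin
        p2 = abs(b) ** 2            # power of channel 2 in this bin
        total = p1 + p2 + eps       # eps avoids division by zero in silent bins
        m1.append(p1 / total)
        m2.append(p2 / total)
    return m1, m2


def apply_mask(frame, mask):
    """Scale each complex bin of a frame by its real-valued mask gain."""
    return [g * x for g, x in zip(mask, frame)]


if __name__ == "__main__":
    frame1 = [2 + 0j, 0.1j, 1 + 1j]
    frame2 = [0.2 + 0j, 1j, 1 - 1j]
    mask1, mask2 = power_ratio_softmask(frame1, frame2)
    print(apply_mask(frame1, mask1))
    print(apply_mask(frame2, mask2))
```

In this sketch a bin dominated by one channel keeps a gain close to 1 for that channel and close to 0 for the other, while bins with equal power are split evenly. A real implementation would run this over every frame of a short-time Fourier transform.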

Citations

•Journal Article•10.1155/2011/765429

Improved Method of Blind Speech Separation with Low Computational Complexity

Kazunobu Kondo, +5 more

- 05 Oct 2011

- Advances in Acoustics and Vibration

TL;DR: A blind speech separation method with low computational complexity that consists of a combination of independent component analysis with frequency band selection, and a frame-wise spectral soft mask method based on an interchannel power ratio of tentative separated signals in the frequency domain.

7

•Proceedings Article

Fast source separation based on selection of effective temporal frames

Yusuke Mizuno, +4 more

- 18 Oct 2012

TL;DR: A method for selecting the temporal frames that are effective for training the separation filters is proposed and evaluated; the results confirm that it achieves faster computation with lower computational complexity.

3

•Journal Article•10.1155/2012/324398

Practically Efficient Blind Speech Separation Using Frequency Band Selection Based on Magnitude Squared Coherence and a Small Dodecahedral Microphone Array

Kazunobu Kondo, +3 more

- 02 Oct 2012

- Journal of Electrical and Computer Engin...

TL;DR: A band selection method based on magnitude squared coherence is proposed for small agglomerative microphone array systems, and it improves performance compared with the use of uniformly spaced frequency bands.

2

•Journal Article•10.1587/TRANSFUN.E97.A.784

Effective Frame Selection for Blind Source Separation Based on Frequency Domain Independent Component Analysis

Yusuke Mizuno, +4 more

- 01 Mar 2014

- IEICE Transactions on Fundamentals of El...

1

•Dissertation

Blind source separation with the low computational costs for the mobile and portable speech equipment

Kazunobu Kondo, +1 more

- 31 Jul 2014

TL;DR: In this dissertation, frequency bin selection is proposed as a method to reduce the computational cost of blind source separation (BSS) based on frequency domain independent component analysis (FDICA) by reducing the number of frequency bins that are processed.

1

References

•Book

Fundamentals of speech recognition

Lawrence R. Rabiner, +1 more

- 01 Jan 1993

TL;DR: This book presents a meta-modelling framework for speech recognition that automates the very labor-intensive, and therefore time-consuming and expensive, process of manually modeling speech.

9.4K

•Book

Independent Component Analysis

Aapo Hyvärinen, +2 more

- 18 May 2001

TL;DR: Independent component analysis is presented as a statistical generative model based on sparse coding; it is essentially a proper probabilistic formulation of the ideas underpinning sparse coding and can be interpreted as providing a Bayesian prior.

8.4K

•Book

Discrete-Time Processing of Speech Signals

J. R. Deller, +2 more

- 01 Mar 1993

TL;DR: The preface to the IEEE Edition explains the background to speech production, coding, and quality assessment and introduces the Hidden Markov Model, the Artificial Neural Network, and Speech Enhancement.

3.1K

•Journal Article•10.1109/TSP.2004.828896

Blind separation of speech mixtures via time-frequency masking

Ozgur Yilmaz, +1 more

- 01 Jul 2004

- IEEE Transactions on Signal Processing

TL;DR: The results demonstrate that there exist ideal binary time-frequency masks that can separate several speech signals from one mixture and show that the W-disjoint orthogonality of speech holds approximately in the case where two anechoic mixtures are provided.

1.7K

•Journal Article•10.1016/S0925-2312(98)00047-2

Blind separation of convolved mixtures in the frequency domain

Paris Smaragdis

- 20 Nov 1998

- Neurocomputing

TL;DR: It is observed that convolved mixing in the time domain corresponds to instantaneous mixing in the frequency domain, so convolved mixing can be inverted using simpler and more robust algorithms than the ones recently developed.

872