A variational EM algorithm for learning eigenvoice parameters in mixed signals

doi:10.1109/ICASSP.2009.4959533

Open Access•Proceedings Article•10.1109/ICASSP.2009.4959533

Ron Weiss, +1 more

- 19 Apr 2009

- pp 113-116

10

TL;DR: An efficient learning algorithm is derived for model-based source separation of single-channel speech mixtures where the precise source characteristics are not known a priori.

Citations

Journal Article•10.1109/MSP.2011.942737

Modeling Dynamical Influence in Human Interaction: Using data to make better inferences about influence within social systems

Wei Pan, +5 more

- 21 Feb 2012

- IEEE Signal Processing Magazine

TL;DR: The model can recover known estimates of influence, it generates results that are consistent with other measures of social networks, and it allows us to uncover important shifts in the way states may be transmitted between actors at different points in time.

98

Posted Content

Modeling Dynamical Influence in Human Interaction Patterns

Wei Pan, +4 more

- 01 Sep 2010

- arXiv: Social and Information Networks

TL;DR: The model can recover known estimates of influence, it generates results that are consistent with other measures of social networks, and it allows us to uncover important shifts in the way states may be transmitted between actors at different points in time.

24

Proceedings Article•10.1109/ICASSP.2019.8683244

Speaker Agnostic Foreground Speech Detection from Audio Recordings in Workplace Settings from Wearable Recorders

Amrutha Nadarajan, +2 more

- 12 May 2019

TL;DR: A convolutional neural network model is proposed to predict foreground regions using a limited set of audio features, and it is shown that these models generalize across the proxy corpora collected in-house to approximately match the deployment environment.

15

Journal Article•10.1109/TASL.2013.2260744

Model-Based Multiple Pitch Tracking Using Factorial HMMs: Model Adaptation and Inference

Michael Wohlmayr, +1 more

- 01 Aug 2013

- IEEE Transactions on Audio, Speech, and ...

TL;DR: An EM-like iterative adaptation framework is developed that can adapt the model parameters to the specific situation using only speech mixture data, along with efficient approaches based on observation likelihood pruning.

11

Proceedings Article•10.1109/ICASSP.2010.5496273

Single channel source separation based on sparse source observation model with harmonic constraint

Tomohiro Nakatani, +1 more

- 14 Mar 2010

TL;DR: A harmonicity-based source separation method is implemented using a robust fundamental frequency (F0) estimation algorithm, and the experimental results confirm the effectiveness of the proposed method.

5

References

Journal Article•10.1023/A:1007425814087

Factorial Hidden Markov Models

Zoubin Ghahramani, +1 more

- 27 Nov 1995

TL;DR: A generalization of HMMs in which the hidden state is factored into multiple state variables and is therefore represented in a distributed manner, and a structured approximation in which the state variables are decoupled, yielding a tractable algorithm for learning the parameters of the model.

1.6K

Journal Article•10.1109/TASL.2008.925147

A Study of Interspeaker Variability in Speaker Verification

Patrick Kenny, +4 more

- 01 Jul 2008

- IEEE Transactions on Audio, Speech, and ...

TL;DR: It is shown that when a large joint factor analysis model is trained in this way and tested on the core condition, the extended data condition and the cross-channel condition, it is capable of performing at least as well as fusions of multiple systems of other types.

760

Proceedings Article•10.1109/ICASSP.1990.115970

Hidden Markov model decomposition of speech and noise

Andrew Varga, +1 more

- 03 Apr 1990

TL;DR: A technique of signal decomposition using hidden Markov models is described that provides an optimal method of decomposing simultaneous processes and has wide implications for signal separation in general and improved speech modeling in particular.

577

Proceedings Article

Super-Human Multi-Talker Speech Recognition: The IBM 2006 Speech Separation Challenge System

Trausti Kristjansson, +4 more

- 01 Jan 2006

TL;DR: A system for model-based speech separation is described that achieves super-human recognition performance when two talkers speak at similar levels and that incorporates a novel method for performing two-talker speaker identification and gain estimation.

127

Journal Article•10.1016/J.CSL.2008.03.003

Speech separation using speaker-adapted eigenvoice speech models

Ron Weiss, +1 more

- 01 Jan 2010

- Computer Speech & Language

TL;DR: An algorithm to infer the characteristics of the sources present in a mixture is presented, allowing for significantly improved separation performance over that obtained using unadapted source models.

88