Polynomial Eigenvalue Decomposition-Based Target Speaker Voice Activity Detection in the Presence of Competing Talkers

doi:10.1109/iwaenc53105.2022.9914796

Open Access•Proceedings Article•10.1109/iwaenc53105.2022.9914796

05 Sep 2022

10

TL;DR: This article proposes a polynomial eigenvalue decomposition-based target-speaker VAD algorithm that detects unseen target speakers in the presence of competing talkers.

Citations

•Journal Article•10.1109/taslp.2023.3313441

Signal Compaction Using Polynomial EVD for Spherical Array Processing with Applications

IEEE/ACM transactions on audio, speech, ...

TL;DR: This work proposes a framework for signal representation that improves the diagonality factor over the microphone signal representation with a significantly lower computation cost, and improves metrics known as short-time objective intelligibility (STOI) and source-to-distortion ratio (SDR) by up to 0.2 and 20 dB, respectively.

8

•Proceedings Article•10.1109/sspd54131.2022.9896222

A Polynomial Subspace Projection Approach for the Detection of Weak Voice Activity

01 Sep 2022

TL;DR: This article proposes a polynomial subspace projection pre-processor to improve the performance of a voice activity detection (VAD) algorithm. The pre-processor projects the microphone signals onto a lower-dimensional subspace to remove the interferer components, which eases detection of the target speech.

7

•Proceedings Article•10.1109/raeecs56511.2022.9954500

Support Estimation of Analytic Eigenvectors of Parahermitian Matrices

18 Oct 2022

TL;DR: This paper proposes a method to estimate the time-domain support of eigenvectors of parahermitian matrices. The method is validated on an ensemble with known support, which the estimated support accurately matches.

5

•Proceedings Article•10.1109/sspd54131.2022.9896183

Enhanced Space-Time Covariance Estimation Based on a System Identification Approach

01 Sep 2022

TL;DR: This paper shows that a significantly more accurate space-time covariance estimate can be obtained if the source signals driving the signal model are also accessible, such that a system identification approach for the source model becomes viable.

4

•Proceedings Article•10.1109/iwaenc53105.2022.9914789

Frame-Based Space-Time Covariance Matrix Estimation for Polynomial Eigenvalue Decomposition-Based Speech Enhancement

05 Sep 2022

TL;DR: This article proposes a frame-based procedure for estimating space-time covariance matrices, which was found to yield spatial filters and speech enhancement improvements comparable to those of the batch method in [1], showing potential for real-time processing.

References

•Book

Fundamentals of speech recognition

Lawrence R. Rabiner, +1 more

- 01 Jan 1993

TL;DR: This book presents a meta-modelling framework for speech recognition that automates the very labor-intensive, and therefore time-heavy and expensive, process of manually modeling speech.

9.4K

•Journal Article•10.1109/TSA.2005.858005

Performance measurement in blind audio source separation

Emmanuel Vincent, +2 more

- 01 Jul 2006

- IEEE Transactions on Audio, Speech, and ...

TL;DR: This paper considers four different sets of allowed distortions in blind audio source separation algorithms, from time-invariant gains to time-varying filters, and derives a global performance measure using an energy ratio, plus a separate performance measure for each error term.

3.4K

•Journal Article•10.1016/J.ACI.2018.08.003

Classification assessment methods

Alaa Tharwat

- 04 Jan 2021

- Applied Computing and Informatics

TL;DR: A detailed overview of classification assessment measures is presented, covering the basics of these measures and showing how they work, to serve as a comprehensive source for researchers interested in this field.

2K

•Journal Article•10.1109/97.736233

A statistical model-based voice activity detection

Jongseo Sohn, +2 more

- 01 Jan 1999

- IEEE Signal Processing Letters

TL;DR: An effective hang-over scheme is proposed that considers previous observations through first-order Markov process modeling of speech occurrences; it shows significantly better performance than the G.729B VAD in low signal-to-noise ratio (SNR) and vehicular noise environments.

1.4K

•Journal Article•10.1109/TSA.2003.811544

Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging

Israel Cohen

- 26 Aug 2003

- IEEE Transactions on Speech and Audio Pr...

TL;DR: In this article, an improved minima controlled recursive averaging (IMCRA) approach is proposed for noise estimation in adverse environments involving nonstationary noise, weak speech components, and low input signal-to-noise ratio (SNR).

1K