Deep feature for text-dependent speaker verification

doi:10.1016/J.SPECOM.2015.07.003

Journal Article•10.1016/J.SPECOM.2015.07.003

Yuan Liu, +5 more

- 01 Oct 2015

- Speech Communication

- Vol. 73, pp 1-13

209

TL;DR: Experiments showed that deep-feature-based methods obtain significant performance improvements over the traditional baselines, whether they are applied directly in the GMM-UBM system or used as identity vectors.

Citations

Journal Article•10.1016/J.NEUNET.2021.03.004

Speaker recognition based on deep learning: An overview

Zhongxin Bai, +1 more

- 17 Mar 2021

- Neural Networks

TL;DR: In this article, the authors review several major subtasks of speaker recognition, including speaker verification, identification, diarization, and robust speaker recognition with a focus on deep learning-based methods.

275

Posted Content

Speaker Recognition Based on Deep Learning: An Overview

Zhongxin Bai, +1 more

- 02 Dec 2020

- arXiv: Audio and Speech Processing

TL;DR: Several major subtasks of speaker recognition are reviewed, including speaker verification, identification, diarization, and robust speaker recognition, with a focus on deep-learning-based methods.

229

Proceedings Article•10.21437/INTERSPEECH.2018-1545

Angular Softmax for Short-Duration Text-independent Speaker Verification.

Zili Huang, +2 more

- 02 Sep 2018

TL;DR: In this article, the angular softmax (A-softmax) loss is introduced to improve speaker embedding quality in an end-to-end speaker verification system, where deep discriminant analysis is used for channel compensation.

123

Journal Article•10.1142/S0219622019300052

A Review on Human-Computer Interaction and Intelligent Robots

Fuji Ren, +2 more

- 17 Feb 2020

- International Journal of Information Tec...

TL;DR: This research highlights the existing technologies of listening, speaking, reading, writing, and other senses, which are widely used in human interaction, and introduces some intelligent robot systems and platforms.

121

Journal Article•10.1631/FITEE.1700814

Past review, current progress, and challenges ahead on the cocktail party problem

Yanmin Qian, +4 more

- 23 Apr 2018

- Journal of Zhejiang University Science C

TL;DR: This overview paper focuses on the speech separation problem, given its central role in the cocktail party environment. It describes conventional single-channel techniques such as computational auditory scene analysis (CASA), non-negative matrix factorization (NMF), and generative models, as well as the newly developed deep-learning-based techniques.

106

References

Journal Article•10.1162/NECO.2006.18.7.1527

A fast learning algorithm for deep belief nets

Geoffrey E. Hinton, +2 more

- 01 Jul 2006

- Neural Computation

TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.

18.3K

Journal Article•10.1109/MSP.2012.2205597

Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups

Geoffrey E. Hinton, +10 more

- 18 Oct 2012

- IEEE Signal Processing Magazine

TL;DR: This article provides an overview of progress and represents the shared views of four research groups that have had recent successes in using DNNs for acoustic modeling in speech recognition.

11.4K

Journal Article•10.1006/DSPR.1999.0361

Speaker Verification Using Adapted Gaussian Mixture Models

Douglas A. Reynolds, +2 more

- 01 Jan 2000

- Digital Signal Processing

TL;DR: The major elements of MIT Lincoln Laboratory's Gaussian mixture model (GMM)-based speaker verification system used successfully in several NIST Speaker Recognition Evaluations (SREs) are described.

5K

Journal Article•10.1109/TASL.2010.2064307

Front-End Factor Analysis for Speaker Verification

Najim Dehak, +4 more

- 01 May 2011

- IEEE Transactions on Audio, Speech, and ...

TL;DR: This extension of previous work proposes a new speaker representation for speaker verification: a low-dimensional speaker- and channel-dependent space, defined using simple factor analysis and named the total variability space because it models both speaker and channel variabilities.

4.4K

Journal Article•10.1109/TASL.2011.2134090

Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition

George E. Dahl, +3 more

- 01 Jan 2012

- IEEE Transactions on Audio, Speech, and ...

TL;DR: A pre-trained deep neural network hidden Markov model (DNN-HMM) hybrid architecture is described that trains the DNN to produce a distribution over senones (tied triphone states) as its output; it can significantly outperform conventional context-dependent Gaussian mixture model (GMM)-HMMs.

3.6K