Journal Article, DOI: 10.1121/1.399423
Perceptual linear predictive (PLP) analysis of speech
3.1K
TL;DR: The perceptual linear predictive (PLP) technique is a new method for the analysis of speech that uses three concepts from the psychophysics of hearing to derive an estimate of the auditory spectrum and yields a low-dimensional representation of speech.
Abstract: A new technique for the analysis of speech, the perceptual linear predictive (PLP) technique, is presented and examined. This technique uses three concepts from the psychophysics of hearing to derive an estimate of the auditory spectrum: (1) the critical-band spectral resolution, (2) the equal-loudness curve, and (3) the intensity-loudness power law. The auditory spectrum is then approximated by an autoregressive all-pole model. A 5th-order all-pole model is effective in suppressing speaker-dependent details of the auditory spectrum. In comparison with conventional linear predictive (LP) analysis, PLP analysis is more consistent with human hearing. The effective second formant F2' and the 3.5-Bark spectral-peak integration theories of vowel perception are well accounted for. PLP analysis is computationally efficient and yields a low-dimensional representation of speech. These properties are found to be useful in speaker-independent automatic-speech recognition.
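The processing chain named in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's reference implementation: the abstract gives no formulas, so the Bark warping, the equal-loudness approximation, the triangular critical-band weights (a simplified stand-in for a proper masking curve), the handling of the edge bands, and the names `plp_frame`, `hz_to_bark`, `equal_loudness`, and `levinson_durbin` are all choices made here for illustration. Only the three psychophysical steps, the all-pole fit, and the model order of 5 come from the abstract.

```python
import numpy as np

def hz_to_bark(f_hz):
    # Assumed Bark warping (not stated in the abstract).
    return 6.0 * np.arcsinh(f_hz / 600.0)

def equal_loudness(f_hz):
    # Assumed approximation of the equal-loudness curve (not in the abstract).
    w2 = (2.0 * np.pi * f_hz) ** 2
    return ((w2 + 56.8e6) * w2 ** 2) / ((w2 + 6.3e6) ** 2 * (w2 + 0.38e9))

def levinson_durbin(r, order):
    # Solve the autocorrelation normal equations for all-pole coefficients.
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction error
    return a, err

def plp_frame(frame, sample_rate, order=5, n_bands=17):
    """Return (all-pole coefficients, prediction error) for one speech frame."""
    n_fft = len(frame)
    power = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bark = hz_to_bark(freqs)

    # (1) Critical-band resolution: integrate power in bands spaced evenly
    # on the Bark axis. Triangular weights are a simplification.
    centers = np.linspace(0.0, bark[-1], n_bands)
    spacing = centers[1] - centers[0]
    dist = np.abs(bark[None, :] - centers[:, None]) / spacing
    weights = np.clip(1.0 - dist, 0.0, None)
    band_power = weights @ power

    # (2) Equal-loudness weighting at each band's center frequency.
    center_hz = 600.0 * np.sinh(centers / 6.0)
    band_power = band_power * equal_loudness(center_hz)
    # Edge bands are poorly defined (the 0 Hz weight is zero), so they are
    # copied from their neighbors. This is an assumption of this sketch.
    band_power[0] = band_power[1]
    band_power[-1] = band_power[-2]

    # (3) Intensity-loudness power law: cube-root amplitude compression.
    loudness = np.maximum(band_power, 1e-12) ** (1.0 / 3.0)

    # The inverse DFT of the (real, even) auditory spectrum gives an
    # autocorrelation sequence, which is fitted by an all-pole model.
    autocorr = np.fft.irfft(loudness)[: order + 1]
    return levinson_durbin(autocorr, order)

# Usage on a synthetic frame (random noise stands in for speech).
rng = np.random.default_rng(0)
coeffs, pred_err = plp_frame(rng.standard_normal(256), sample_rate=8000)
```

With `order=5` the frame is reduced to six coefficients (the leading one is fixed at 1), which illustrates the low-dimensional representation the abstract refers to. A practical front end would typically also frame the signal with overlap and convert the coefficients to cepstra, neither of which is shown here.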
Citations
Patent
Intelligent digital assistant in a multi-tasking environment
Aram David Kudurshian,Bronwyn A. Jones,Elizabeth Caroline Furches Cranfill,Harry J. Saddler +3 more
- 21 Sep 2016
TL;DR: A system and processes for operating a digital assistant are described, including a method for determining whether the user intent is to perform a task using a searching process or an object managing process.
82
Auditory teager energy cepstrum coefficients for robust speech recognition
Dimitrios Dimitriadis,Petros Maragos,Alexandros Potamianos +2 more
- 04 Sep 2005
TL;DR: Error analysis and speech recognition experiments show that the TECCs and the mel frequency cepstrum coefficients (MFCCs) perform similarly for clean recording conditions, while the TECCs perform significantly better than the MFCCs for noisy recognition tasks.
Patent
Personalized prediction of responses for instant messaging
Vivek Kumar Rangarajan Sridhar
- 04 Sep 2015
TL;DR: A message transcript that includes an incoming message is presented, and it is determined whether a frequently inputted response is synonymous with one or more suggested characters.
82
Text-independent speaker identification based on selection of the most similar feature vectors
TL;DR: Two methods are proposed for finding the MFCC feature vectors with the highest similarity, applied to a text-independent speaker identification system. Experimental results indicate that the performance of the speaker identification system improves in terms of both accuracy and time consumption.
81
A Survey on Signal Processing Based Pathological Voice Detection Techniques
TL;DR: The motivation of the work is to address the need for non-invasive signal processing techniques to detect voice disability in the general population by addressing the issues and challenges related to the selection of voice feature and classifier algorithms.
References
Effect of glottal pulse shape on the quality of natural vowels.
TL;DR: A male speaker recorded monosyllabic words and a continuous sentence, and a pitch-synchronous analysis was carried out by a digital computer on the vowel portions of these samples. For every pitch period, the analysis provided formant frequencies, the waveform of the glottal excitation function, and an accurate pitch-period measurement.
Prediction of perceived phonetic distance from critical-band spectra: A first step
Dennis H. Klatt
- 03 May 1982
TL;DR: Judgements of phonetic distance between pairs of static synthetic vowels and fricatives have been collected in which the stimulus ensemble included formant frequency changes and a number of acoustic changes that turn out to have little phonetic relevance.
349