Journal Article, DOI: 10.1016/J.KNOSYS.2014.03.019
Speech emotion recognition using amplitude modulation parameters and a combined feature selection procedure
Arianna Mencattini,Eugenio Martinelli,Giovanni Costantini,Massimiliano Todisco,Barbara Basile,Marco Bozzali,Corrado Di Natale +6 more
89
TL;DR: This study proposes a PLS regression model, optimized with specific feature selection procedures and trained on the Italian speech corpus EMOVO, and suggests a way to automatically label the corpus in terms of arousal and valence.
Abstract: Speech emotion recognition (SER) is a challenging task in demanding human-machine interaction systems. Standard approaches based on the categorical model of emotions reach low performance, probably because they model emotions as distinct and independent affective states. Starting from the recently investigated dimensional circumplex model of emotions, SER systems are instead structured as the prediction of valence and arousal on a continuous scale in a two-dimensional domain. In this study, we propose the use of a partial least squares (PLS) regression model, optimized according to specific feature selection procedures and trained on the Italian speech corpus EMOVO, suggesting a way to automatically label the corpus in terms of arousal and valence. New speech features related to the speech amplitude modulation, which is caused by slowly varying articulatory motion, and standard features extracted from the pitch contour have been included in the regression model. Over the seven primary emotions (including the neutral state), the female model obtains an average coefficient of determination R² of 0.72 (maximum of 0.95 for fear, minimum of 0.60 for sadness), and the male model obtains an average R² of 0.81 (maximum of 0.89 for anger, minimum of 0.71 for joy).
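To make the regression setup in the abstract more concrete, the following is a minimal sketch of single-response PLS regression (the NIPALS variant) scored with the coefficient of determination R². It is not the authors' pipeline: the data are synthetic, the four feature columns and the arousal target are hypothetical stand-ins for amplitude-modulation and pitch features, no feature selection step is included, and the names `make_synthetic_corpus`, `fit_pls1`, `predict_pls1`, and `r2_score` are invented for illustration.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def make_synthetic_corpus(n, seed=0):
    """Hypothetical data: 4 acoustic features and a noisy linear 'arousal' score."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(n)]
    y = [0.8 * x[0] - 0.5 * x[1] + rng.gauss(0.0, 0.05) for x in X]
    return X, y


def fit_pls1(X, y, n_components):
    """Fit PLS1 by NIPALS: extract latent components that covary with y."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xr = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    yr = [v - y_mean for v in y]                                # centered y
    comps = []
    for _ in range(n_components):
        # Weight vector: direction in feature space most covariant with y.
        w = [dot([row[j] for row in Xr], yr) for j in range(p)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            break  # nothing left in X that explains y
        w = [v / norm for v in w]
        t = [dot(row, w) for row in Xr]          # scores
        tt = dot(t, t)
        load = [dot([row[j] for row in Xr], t) / tt for j in range(p)]
        q = dot(yr, t) / tt                      # regression of y on scores
        # Deflate X and y before extracting the next component.
        Xr = [[row[j] - t[i] * load[j] for j in range(p)]
              for i, row in enumerate(Xr)]
        yr = [yr[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps


def predict_pls1(model, x):
    """Predict one sample by repeating the same deflation steps."""
    x_mean, y_mean, comps = model
    xr = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = dot(xr, w)
        y_hat += q * t
        xr = [v - t * l for v, l in zip(xr, load)]
    return y_hat


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


X, y = make_synthetic_corpus(40, seed=0)
model = fit_pls1(X, y, n_components=2)
predictions = [predict_pls1(model, x) for x in X]
print("training R^2 with 2 components:", round(r2_score(y, predictions), 3))
```

The score printed here is computed on the training data, so it is optimistic by construction. A realistic evaluation would hold out speakers or utterances and would fit separate models for valence and arousal.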
Citations
Databases, features and classifiers for speech emotion recognition: a review
TL;DR: This study reviews the available literature on the databases, features, and classifiers used for speech emotion recognition across assorted languages.
333
A comprehensive survey on feature selection in the various fields of machine learning
Pradip Dhal,Chandrashekhar Azad +1 more
TL;DR: A descriptive survey on FS with the associated area of real-world problem domains to understand the main idea of FS work and identify the core idea of how FS will be applicable in various problem domains.
276
Speech emotion recognition based on feature selection and extreme learning machine decision tree
TL;DR: A feature selection method based on correlation analysis and Fisher is proposed, which can remove the redundant features that have close correlations with each other, which would make it possible to realize the interaction between speaker-independent and computer/robot in the future.
261
A novel feature selection method for speech emotion recognition
TL;DR: A new statistical feature selection method is proposed based on the changes in emotions on acoustic features that provides a significant reduction in the number of features, as well as increasing the classification success.
157
Preserving privacy in speaker and speech characterisation
Andreas Nautsch,Abelino Jiménez,Amos Treiber,Jascha Kolberg,Catherine Jasserand,Els Kindt,Héctor Delgado,Massimiliano Todisco,Mohamed Amine Hmani,Aymen Mtibaa,Mohammed Ahmed Abdelraheem,Alberto Abad,Francisco Teixeira,Driss Matrouf,Marta Gomez-Barrero,Dijana Petrovska-Delacrétaz,Gérard Chollet,Nicholas Evans,Thomas Schneider,Jean-François Bonastre,Bhiksha Raj,Isabel Trancoso,Christoph Busch +23 more
TL;DR: The requirements for effective privacy preservation are established, generic cryptography-based solutions are reviewed, followed by specific techniques that are applicable to speaker characterisation and speech characterisation (biometrics and non-biometric applications), and common, empirical evaluation metrics for the assessment of privacy-preserving technologies for speech data are outlined.
144
References
•Book
The Nature of Statistical Learning Theory
Vladimir Vapnik
- 01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
46K
•Book
Applied Regression Analysis
Norman R. Draper,Harry Smith +1 more
- 01 Jan 1966
TL;DR: In this article, the Straight Line Case is used to fit a straight line by least squares, and the Durbin-Watson Test is used for checking the straight line fit.
19K
•Book
Handbook of mathematical functions : with formulas, graphs, and mathematical tables
Milton Abramowitz,Irene A. Stegun +1 more
- 01 Jan 1970
18K