TL;DR: Although the proposed i-vectors yield inferior performance compared to the standard ones, they are capable of attaining 16% relative improvement when fused with them, meaning that they carry useful complementary information about the speaker’s identity.
Abstract: We examine the use of Deep Neural Networks (DNN) in extracting Baum-Welch statistics for i-vector-based text-independent speaker recognition. Instead of training the universal background model using the standard EM algorithm, the components are predefined and correspond to the set of triphone states, the posterior occupancy probabilities of which are modeled by a DNN. Those assignments are then combined with the standard 60-dim MFCC features to calculate first-order Baum-Welch statistics in order to train the i-vector extractor and extract i-vectors. The DNN-based assignments force the i-vectors to capture the idiosyncratic way in which each speaker pronounces each particular triphone state, which can enrich the standard short-term spectral representation of the standard i-vectors. After experimenting with Switchboard data and a baseline PLDA classifier, our results showed that although the proposed i-vectors yield inferior performance compared to the standard ones, they are capable of attaining 16% relative improvement when fused with them, meaning that they carry useful complementary information about the speaker’s identity. A further experiment with a different DNN configuration attained comparable performance with the baseline i-vectors on NIST 2012 (condition C2, female).
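The accumulation of Baum-Welch statistics described in the abstract above can be illustrated with a minimal sketch. It assumes the per-frame posteriors are already available (in the paper they would come from the DNN over triphone states) and uses toy 2-dimensional features in place of 60-dim MFCCs. The function name `baum_welch_stats` and the toy data are hypothetical, and the zeroth-order counts are included only because i-vector extractors typically need them alongside the first-order sums.

```python
def baum_welch_stats(posteriors, features):
    """Accumulate zeroth- and first-order Baum-Welch statistics.

    posteriors: T rows, each with C occupancy probabilities (one per component)
    features:   T rows, each with D feature values (e.g. MFCCs)
    Returns (zeroth, first) where zeroth[c] = sum_t gamma_t(c) and
    first[c][d] = sum_t gamma_t(c) * x_t[d].
    """
    num_components = len(posteriors[0])
    dim = len(features[0])
    zeroth = [0.0] * num_components
    first = [[0.0] * dim for _ in range(num_components)]
    for gamma_t, x_t in zip(posteriors, features):
        for c, g in enumerate(gamma_t):
            zeroth[c] += g
            for d, x in enumerate(x_t):
                first[c][d] += g * x
    return zeroth, first


# Toy data (assumption): 3 frames, 2 components, 2-dim features.
posteriors = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
features = [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]]
zeroth, first = baum_welch_stats(posteriors, features)
```

Replacing a GMM-based universal background model with DNN posteriors changes only where `posteriors` comes from; the accumulation itself stays the same.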
TL;DR: A textbook treatment that runs from prerequisites in probability calculus through the EM algorithm, Markov chains, and an overview of Hidden Markov Models to the Baum-Welch learning algorithm, with applications to DNA sequences.
Abstract: Foreword. 1. Prerequisites in probability calculus. 2. Information and the Kullback Distance. 3. Probabilistic Models and Learning. 4. EM Algorithm. 5. Alignment and Scoring. 6. Mixture Models and Profiles. 7. Markov Chains. 8. Learning of Markov Chains. 9. Markovian Models for DNA sequences. 10. Hidden Markov Models: an Overview. 11. HMM for DNA Sequences. 12. Left to Right HMM for Sequences. 13. Derin's Algorithm. 14. Forward-Backward Algorithm. 15. Baum-Welch Learning Algorithm. 16. Limit Points of Baum-Welch. 17. Asymptotics of Learning. 18. Full Probabilistic HMM. Index.
TL;DR: Two experiments designed to determine how much manual training information is needed for part-of-speech tagging by Hidden Markov Model suggest that initial biasing of either lexical or transition probabilities is essential to achieve good accuracy, and reveal three distinct patterns of Baum-Welch re-estimation.
Abstract: In part-of-speech tagging by Hidden Markov Model, a statistical model is used to assign grammatical categories to words in a text. Early work in the field relied on a corpus which had been tagged by a human annotator to train the model. More recently, Cutting et al. (1992) suggest that training can be achieved with a minimal lexicon and a limited amount of a priori information about probabilities, by using Baum-Welch re-estimation to automatically refine the model. In this paper, I report two experiments designed to determine how much manual training information is needed. The first experiment suggests that initial biasing of either lexical or transition probabilities is essential to achieve good accuracy. The second experiment reveals that there are three distinct patterns of Baum-Welch re-estimation. In two of the patterns, the re-estimation ultimately reduces the accuracy of the tagging rather than improving it. The pattern which is applicable can be predicted from the quality of the initial model and the similarity between the tagged training corpus (if any) and the corpus to be tagged. Heuristics for deciding how to use re-estimation in an effective manner are given. The conclusions are broadly in agreement with those of Merialdo (1994), but give greater detail about the contributions of different parts of the model.
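For readers unfamiliar with the re-estimation step the abstract above refers to, the following is a minimal sketch of a single Baum-Welch iteration for a discrete HMM, where hidden states play the role of tags and observation symbols the role of words. It is not the paper's implementation: the function names, the toy parameters, and the unscaled forward-backward recursions (adequate only for short sequences with strictly positive probabilities) are assumptions made for illustration.

```python
def forward(pi, A, B, obs):
    """alpha[t][i] = P(o_1..o_t, state_t = i)."""
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(n)) * B[j][o]
                      for j in range(n)])
    return alpha


def backward(A, B, obs):
    """beta[t][i] = P(o_{t+1}..o_T | state_t = i)."""
    n = len(A)
    beta = [[1.0] * n]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(n))
                        for i in range(n)])
    return beta


def baum_welch_step(pi, A, B, obs):
    """One re-estimation pass; returns new parameters and the old likelihood."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta = forward(pi, A, B, obs), backward(A, B, obs)
    likelihood = sum(alpha[-1])
    # gamma[t][i]: posterior probability of being in state i at time t.
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)]
             for t in range(T)]
    new_pi = gamma[0][:]
    new_A = []
    for i in range(n):
        occupancy = sum(gamma[t][i] for t in range(T - 1))
        row = []
        for j in range(n):
            # Expected number of i -> j transitions.
            xi = sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                     for t in range(T - 1)) / likelihood
            row.append(xi / occupancy)
        new_A.append(row)
    new_B = []
    for i in range(n):
        total = sum(gamma[t][i] for t in range(T))
        new_B.append([sum(gamma[t][i] for t in range(T) if obs[t] == k) / total
                      for k in range(m)])
    return new_pi, new_A, new_B, likelihood


# Toy model (assumption): 2 tags, 2 word types, one short sentence.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
obs = [0, 1, 1, 0, 1]
pi1, A1, B1, lik0 = baum_welch_step(pi, A, B, obs)
_, _, _, lik1 = baum_welch_step(pi1, A1, B1, obs)
```

Each iteration cannot decrease the likelihood of the training text, but likelihood is not tagging accuracy, which is consistent with the abstract's observation that re-estimation can make the tagger worse.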
TL;DR: The proposed HMM framework has been applied to find the most probable series of activity states at a low data transmission rate, which makes it highly suitable for daily activity classification applications.
Abstract: This paper presents a hidden Markov model (HMM) approach for real-time activity classification using signals from wearable wireless sensor networks. A wearable wireless sensor network can be used to continuously monitor the daily activities of a subject in real time. However, the wireless sensor nodes are constrained by limited battery and computing resources. The proposed HMM framework has been applied to find the most probable series of activity states at a low data transmission rate, which makes it highly suitable for daily activity classification applications. The performance was evaluated using a small sensor network consisting of three accelerometers. The activity detection rate is 95.82%, using a test set of 5 subjects with 11 activity series.
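Finding the most probable state series in an HMM is commonly done with the Viterbi algorithm; the abstract above does not say which decoder the authors used, so the sketch below is only an illustration of the general idea. It assumes the accelerometer signals have already been quantized into discrete symbols, and the two-state model (0 = resting, 1 = walking), its probabilities, and the function name `viterbi` are hypothetical.

```python
import math


def viterbi(pi, A, B, obs):
    """Return the most probable hidden state sequence for obs (log domain)."""
    n = len(pi)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # score[i]: best log-probability of any path ending in state i.
    score = [log(pi[i]) + log(B[i][obs[0]]) for i in range(n)]
    back = []  # back[t][j]: best predecessor of state j at step t + 1
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] + log(A[i][j]))
            ptr.append(best)
            score.append(prev[best] + log(A[best][j]) + log(B[j][o]))
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Toy model (assumption): states 0 = resting, 1 = walking;
# symbols 0 = low motion, 1 = high motion.
pi = [0.5, 0.5]
A = [[0.9, 0.1], [0.1, 0.9]]
B = [[0.9, 0.1], [0.1, 0.9]]
path = viterbi(pi, A, B, [0, 0, 1, 1])
```

Because the decoder needs only one symbol per time step, a node could transmit quantized symbols rather than raw samples, which fits the low-transmission-rate setting the abstract describes.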