1. What have the authors contributed in "Overlapping speaker segmentation using multiple hypothesis tracking of fundamental frequency"?
This paper demonstrates how the harmonic structure of voiced speech can be exploited to segment multiple overlapping speakers in a speaker diarization task. The authors show that voiced harmonics can be useful in detecting when more than one speaker is talking, such as during overlapping speaker activity. The system is benchmarked against a segmentation system from the literature that employs a bidirectional long short-term memory (BLSTM) network and requires training. The authors also show that the estimated pitch tracks of their system can be used as input features to the BLSTM to achieve further improvements of 1.21% in coverage and 2.45% in purity. They also explore how a change in speaker can be inferred from a change in pitch.
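The idea that a speaker change can be inferred from a change in pitch can be illustrated with a small sketch. This is not the authors' multiple hypothesis tracking system. It is a deliberately simple stand-in that compares the median pitch in a short window before and after each frame and flags large jumps. The function name `speaker_change_onsets`, the window length, and the 4-semitone threshold are all assumptions made for illustration, and the input is assumed to be a per-frame pitch track in Hz with `0` marking unvoiced frames.

```python
import math
from statistics import median


def speaker_change_onsets(pitch_hz, window=5, min_jump_semitones=4.0):
    """Flag frames where the median pitch jumps between adjacent windows.

    pitch_hz: per-frame pitch estimates in Hz, 0 for unvoiced frames (assumed format).
    window: number of frames on each side of the candidate frame (hypothetical default).
    min_jump_semitones: minimum pitch jump treated as a change (hypothetical default).
    """
    flagged = []
    for t in range(window, len(pitch_hz) - window + 1):
        # Keep only voiced frames on each side of the candidate boundary.
        before = [f for f in pitch_hz[t - window:t] if f > 0]
        after = [f for f in pitch_hz[t:t + window] if f > 0]
        if not before or not after:
            continue
        # Pitch distance in semitones between the two window medians.
        jump = abs(12.0 * math.log2(median(after) / median(before)))
        if jump >= min_jump_semitones:
            flagged.append(t)

    # Collapse each run of consecutive flagged frames to its middle frame.
    onsets, run = [], []
    for t in flagged:
        if run and t != run[-1] + 1:
            onsets.append(run[len(run) // 2])
            run = []
        run.append(t)
    if run:
        onsets.append(run[len(run) // 2])
    return onsets


# Toy track: 10 frames near 120 Hz followed by 10 frames near 210 Hz.
track = [120.0] * 10 + [210.0] * 10
print(speaker_change_onsets(track))  # prints [10]
```

A threshold on a single pitch track cannot represent two speakers talking at once, which is the case the paper targets by tracking several pitch hypotheses in parallel.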


![Fig. 6. Baseline-1 system architecture presented in [49] with st: input signal, Φ̂t: peak detections, Ψ̂t: detection reliabilities, Zt: generated observations, Ti: selected track hypotheses, ot: overlapping speech onsets, Bt: strongest candidate track and ct: speaker change onsets.](/figures/fig-6-baseline-1-system-architecture-presented-in-49-with-st-1ns25a8s.png)


