1. What are the contributions in "High-dimensional linear representations for robust speech recognition"?
Phoneme classification is investigated in linear feature domains with the aim of improving robustness to additive noise. The authors first show results for a representation consisting of concatenated frames from the centre of the phoneme, each containing f frames. Next, the authors improve results by including information from the entire phoneme. In the presence of additive noise, classification in this framework performs better below 18 dB SNR than an analogous PLP classifier adapted to noise using cepstral mean and variance normalisation. As no single f is optimal for all phonemes, the authors further average over models with a range of values of f.
![Table 1. Phoneme duration [ms] in the training data grouped by broad phonetic class.](/figures/table-1-phomene-duration-ms-in-the-training-data-grouped-by-9b4z1py9.png)
![Fig. 2. Comparison of existing phoneme representations. Top: Division described in [12] resulting in five sectors, three covering the duration of the phoneme and two of 40 ms over the transitions. Bottom: f frames closest to the five points A, B, C, D and E (that correspond to the centres of the regions above) are selected to map the phoneme segment to five feature vectors xA, xB, xC, xD and xE.](/figures/fig-2-comparison-of-existing-phoneme-representations-top-152hxa9o.png)
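
The mapping in the Fig. 2 caption, from a phoneme segment to five feature vectors built from the f frames nearest to points A to E, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the array layout (one row per frame), and above all the relative positions of the five points are assumptions made for illustration; the summary does not state the sector proportions used in [12].

```python
import numpy as np

def select_frames(utterance, centre, f):
    """Return the f consecutive frames closest to index `centre`, flattened.

    `utterance` is assumed to be an array of shape (n_frames, frame_dim).
    """
    n = utterance.shape[0]
    assert 1 <= f <= n, "f must not exceed the number of available frames"
    start = int(round(centre - (f - 1) / 2))
    start = max(0, min(start, n - f))  # keep the window inside the utterance
    return utterance[start:start + f].reshape(-1)

def five_point_features(utterance, seg_start, seg_end, f,
                        positions=(0.0, 1 / 6, 0.5, 5 / 6, 1.0)):
    """Map a phoneme segment to five concatenated-frame vectors.

    `positions` are hypothetical relative locations of points A..E within
    the segment [seg_start, seg_end); the defaults place A and E on the
    phoneme boundaries (transitions) and B, C, D at the centres of three
    equal sectors. The actual proportions in the paper may differ.
    """
    span = seg_end - seg_start
    return [select_frames(utterance, seg_start + p * span, f)
            for p in positions]

# Toy usage: 100 frames of dimension 3, phoneme spanning frames 40..59.
utterance = np.arange(300, dtype=float).reshape(100, 3)
features = five_point_features(utterance, seg_start=40, seg_end=60, f=5)
```

Each returned vector has length f times the frame dimension. Frames are taken from the surrounding utterance rather than only from the segment, so that windows around the boundary points can extend into the neighbouring phonemes.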


