Scispace (Formerly Typeset)
Showing papers in "Computer Speech & Language in 2018"
Journal Article•10.1016/J.CSL.2017.10.004•
Predicting speech intelligibility with deep neural networks

Constantin Spille1, Stephan D. Ewert1, Birger Kollmeier1, Bernd Meyer1•
University of Oldenburg1
01 Mar 2018-Computer Speech & Language
TL;DR: For both training schemes, ASR-based predictions outperform established measures such as the extended speech intelligibility index (ESII), the multi-resolution speech envelope power spectrum model (mr-sEPSM) and others.

101 citations

Journal Article•10.1016/J.CSL.2018.01.001•
Automatic speaker, age-group and gender identification from children’s speech

Saeid Safavi1, Martin J. Russell1, Peter Jancovic1•
University of Birmingham1
01 Jul 2018-Computer Speech & Language
TL;DR: The performances of several classification methods are compared, including Gaussian Mixture Model–Universal Background Model (GMM–UBM), GMM–Support Vector Machine (GMM–SVM) and i-vector based approaches, and the utility of different frequency bands for speaker, age-group and gender recognition from children’s speech is assessed.

91 citations

Journal Article•10.1016/J.CSL.2017.10.001•
Synthetic speech detection using fundamental frequency variation and spectral features

Monisankha Pal1, Dipjyoti Paul1, Goutam Saha1•
Indian Institute of Technology Kharagpur1
01 Mar 2018-Computer Speech & Language
TL;DR: This paper proposes a new approach to detecting synthetic speech using score-level fusion of front-end features, namely constant Q cepstral coefficients (CQCCs), all-pole group delay function (APGDF) and fundamental frequency variation (FFV), which outperforms all existing baseline features for both known and unknown attacks.

66 citations

Journal Article•10.1016/J.CSL.2017.11.004•
Neural versus Phrase-Based MT Quality: an In-Depth Analysis on English-German and English-French

Luisa Bentivogli1, Arianna Bisazza2, Mauro Cettolo1, Marcello Federico1•
Fondazione Bruno Kessler1, Leiden University2
01 May 2018-Computer Speech & Language
TL;DR: A detailed analysis of neural versus phrase-based statistical machine translation outputs, leveraging high-quality post-edits performed by professional translators on the IWSLT data, provides useful insights into which linguistic phenomena are best modelled by neural models.

61 citations

Journal Article•10.1016/J.CSL.2018.04.003•
Comparing human and automatic speech recognition in simple and complex acoustic scenes

Constantin Spille1, Birger Kollmeier1, Bernd Meyer1•
University of Oldenburg1
01 Nov 2018-Computer Speech & Language
TL;DR: It is found that DNN-based ASR reaches human performance for single-channel, small-vocabulary tasks in the presence of speech-shaped noise and in multi-talker babble noise, which is an important difference to previous human-machine comparisons.

46 citations

Journal Article•10.1016/J.CSL.2017.07.002•
Combining sentence similarities measures to identify paraphrases

Rafael Ferreira1, George D. C. Cavalcanti2, Fred Freitas2, Rafael Dueire Lins1, Steven J. Simske3, Marcelo Riss3 •
Universidade Federal Rural de Pernambuco1, Federal University of Pernambuco2, Hewlett-Packard3
01 Jan 2018-Computer Speech & Language
TL;DR: A paraphrase identification system is proposed that represents each pair of sentences as a combination of different similarity measures, which extract lexical, syntactic and semantic components of the sentences encompassed in a graph.

42 citations

Journal Article•10.1016/J.CSL.2017.06.007•
Restricted Boltzmann machines for vector representation of speech in speaker recognition

Omid Ghahabi1, Javier Hernando1•
Polytechnic University of Catalonia1
01 Jan 2018-Computer Speech & Language
TL;DR: Experiments on the core test condition 5 of NIST SRE 2010 show that comparable results with conventional i-vectors are achieved with a clearly lower computational load in the vector extraction process.

41 citations

Journal Article•10.1016/J.CSL.2017.06.005•
Uncertainty weighting and propagation in DNN–HMM-based speech recognition

José Novoa1, Josué Fredes1, Víctor Poblete2, Néstor Becerra Yoma1•
University of Chile1, Austral University of Chile2
01 Jan 2018-Computer Speech & Language
TL;DR: The results presented here suggest that a substantial reduction in WER is achieved with clean training, and that the uncertainty weighting method reduces the gap between clean and multi-noise/multi-condition training.

36 citations

Journal Article•10.1016/J.CSL.2017.07.009•
On the Effects of Using word2vec Representations in Neural Networks for Dialogue Act Recognition

Christophe Cerisara, Pavel Král1, Ladislav Lenc1•
University of West Bohemia1
01 Jan 2018-Computer Speech & Language
TL;DR: This paper proposes a new deep neural network that explores recurrent models to capture word sequences within sentences, and further studies the impact of pretrained word embeddings on the performance of the proposed approach.

32 citations

Journal Article•10.1016/J.CSL.2017.11.003•
Rank-1 constrained Multichannel Wiener Filter for speech recognition in noisy environments

Ziteng Wang1, Emmanuel Vincent2, Romain Serizel2, Yonghong Yan1•
Chinese Academy of Sciences1, French Institute for Research in Computer Science and Automation2
01 May 2018-Computer Speech & Language
TL;DR: In this paper, the rank-1 constrained multichannel Wiener filter is employed for noise reduction and a new constant residual noise power constraint is derived which enhances the recognition performance.

32 citations

Journal Article•10.1016/J.CSL.2017.08.001•
Improving PLDA speaker verification performance using domain mismatch compensation techniques

Hafizur Rahman1, Ahilan Kanagasundaram1, Ivan Himawan1, David Dean1, Sridha Sridharan1 •
Queensland University of Technology1
01 Jan 2018-Computer Speech & Language
TL;DR: In this article, a domain-invariant linear discriminant analysis (DI-LDA) technique is proposed to compensate for domain mismatch in both the LDA and PLDA subspaces.
Journal Article•10.1016/J.CSL.2018.02.003•
Reward estimation for dialogue policy optimisation

Pei-Hao Su1, Milica Gasic1, Steve Young1•
University of Cambridge1
01 Sep 2018-Computer Speech & Language
TL;DR: Two approaches to tackling dialogue management as a reinforcement learning task are presented, whereby a recurrent neural network is utilised as a task success predictor which is pre-trained from off-line data to estimate task success during subsequent on-line dialogue policy learning.
Journal Article•10.1016/J.CSL.2017.12.004•
An empirical study on POS tagging for Vietnamese social media text

Ngo Xuan Bach1, Nguyen Dieu Linh1, Tu Minh Phuong1•
Posts and Telecommunications Institute of Technology1
01 Jul 2018-Computer Speech & Language
TL;DR: An empirical study on POS tagging for Vietnamese social media text is presented, which shows several challenges compared with tagging general text; the semi-supervised model outperformed, in terms of accuracy, the version of vnTagger trained on the same Facebook dataset, showing the usefulness of word cluster features.
Journal Article•10.1016/J.CSL.2018.04.001•
A novel rule based machine translation scheme from Greek to Greek Sign Language: Production of different types of large corpora and Language Models evaluation

Dimitris Kouremenos1, Klimis Ntalianis1, Stefanos Kollias1•
National Technical University of Athens1
01 Sep 2018-Computer Speech & Language
TL;DR: This work presents a novel prototype Rule Based Machine Translation (RBMT) system for the creation of large, high-quality written Greek Sign Language (GSL) glossed corpora from Greek text, and stresses that Language Models for written GSL gloss are missing from the scientific literature, making this work a pioneer in the field.
Journal Article•10.1016/J.CSL.2018.02.002•
Using a PCA-based dataset similarity measure to improve cross-corpus emotion recognition

Ingo Siegert1, Ronald Böck1, Andreas Wendemuth1•
Otto-von-Guericke University Magdeburg1
01 Sep 2018-Computer Speech & Language
TL;DR: A corpus similarity measure based on PCA-ranked features answers the question of which corpora should be included in joint training and outperforms all other combinations of corpora.
Journal Article•10.1016/J.CSL.2017.12.010•
Computer based speech prosody teaching system

Dávid Sztahó1, Gabor Kiss1, Klára Vicsi1•
Budapest University of Technology and Economics1
01 Jul 2018-Computer Speech & Language
TL;DR: A novel prosody teaching system is introduced in which intensity (accent), intonation and rhythm are presented to students as visual feedback, and automatic assessment scores are given jointly and separately for the goodness of intonation and rhythm.
Journal Article•10.1016/J.CSL.2017.11.001•
Exploiting automatic speech recognition errors to enhance partial and synchronized caption for facilitating second language listening

Maryam Sadat Mirzaei1, Kourosh Meshgi1, Tatsuya Kawahara1•
Kyoto University1
01 May 2018-Computer Speech & Language
TL;DR: This paper addresses the viability of using Automatic Speech Recognition (ASR) errors as a predictor of difficulties in speech segments, thereby exploiting them to improve Partial and Synchronized Caption (PSC). It proposes the use of ASR systems as a model of L2 listeners and hypothesizes that ASR errors can predict challenging speech segments for these learners.
Journal Article•10.1016/J.CSL.2017.08.004•
Sparse coding based features for speech units classification

Pulkit Sharma1, Vinayak Abrol1, A. D. Dileep1, Anil Kumar Sao1•
Indian Institute of Technology Mandi1
01 Jan 2018-Computer Speech & Language
TL;DR: Both raw speech samples and mel frequency cepstral coefficients are used as an initial representation for feature extraction and a transformation function known as weighted decomposition (WD) of principal components is used to emphasize the discriminative information present in the PCA-based dictionary.
Journal Article•10.1016/J.CSL.2017.11.005•
Conversational telephone speech recognition for Lithuanian

Rasa Lileikytė1, Lori Lamel1, Jean-Luc Gauvain1, Arseniy Gorin1•
Université Paris-Saclay1
01 May 2018-Computer Speech & Language
TL;DR: The use of Web texts for language modeling is shown to significantly improve both speech recognition and keyword spotting performance, and combining full-word and subword units leads to the best keyword spotting results.
Journal Article•10.1016/J.CSL.2017.10.003•
Learning static spectral weightings for speech intelligibility enhancement in noise

Yan Tang1, Martin Cooke2,3•
University of Salford1, Ikerbasque2, University of the Basque Country3
01 May 2018-Computer Speech & Language
TL;DR: Speech modification strategies based on reallocating energy statically across the spectrum using masker-specific spectral weightings are investigated, indicating that energy-neutral spectral weighting is a highly-effective near-end speech enhancement approach that places minimal demands on detailed masker estimation.
Journal Article•10.1016/J.CSL.2017.07.004•
RankUp: Enhancing graph-based keyphrase extraction methods with error-feedback propagation

Gerardo Figueroa1, Po-Chi Chen1, Yi-Shin Chen1•
National Tsing Hua University1
01 Jan 2018-Computer Speech & Language
TL;DR: This work proposes an unsupervised method, RankUp, that enhances graph-based keyphrase extraction approaches by applying an error-feedback mechanism similar to the concept of backpropagation, and shows that error-feedback propagation can boost the quality of keyphrases in graph-based keyphrase extraction techniques.
Journal Article•10.1016/J.CSL.2017.12.009•
Nonparametrically trained PLDA for short duration i-vector speaker verification

Abbas Khosravani1, Mohammad Mehdi Homayounpour1•
Amirkabir University of Technology1
01 Nov 2018-Computer Speech & Language
TL;DR: This paper provides further analysis of the proposed nonparametrically trained PLDA and introduces a duration variability modeling technique in the estimation of the within-speaker scatter matrix to compensate for the effect of limited speech data.
Journal Article•10.1016/J.CSL.2017.07.007•
Using Chinese radical parts for sentiment analysis and domain-dependent seed set extraction

August F.Y. Chao, Heng-Li Yang
01 Jan 2018-Computer Speech & Language
TL;DR: This study adopted Chinese radical information for sentiment feature extraction and confirmed that radical information could be adopted as a feature unit in sentiment analysis and that domain-dependent radicals could be reused in different corpora.
Journal Article•10.1016/J.CSL.2017.12.008•
Sequential use of spectral models to reduce deletion and insertion errors in vowel detection

Hamidreza Baradaran Kashani1, Abolghasem Sayadiyan1•
Amirkabir University of Technology1
01 Jul 2018-Computer Speech & Language
TL;DR: A new framework is proposed that directly addresses three possible errors in the vowel detection problem, namely vowel deletion, consonant insertion, and vowel insertion; it outperforms the existing well-known methods in terms of both total error and F-measure.
Journal Article•10.1016/J.CSL.2018.01.002•
One-on-one and small group conversations with an intelligent virtual science tutor

Ronald A. Cole, Cindy Buchenroth-Martin, Timothy J. Weston, Liam Devine, Jeannine Myatt, Brandon Helding, Sameer Pradhan, Margaret G. McKeown1, Samantha Messier, Jennifer Borum, Wayne H. Ward •
University of Pittsburgh1
01 Jul 2018-Computer Speech & Language
TL;DR: This study investigated students' conversations with a virtual science tutor (Marni), either individually or in small groups, to see if students receiving tutoring using the virtual tutor in groups would demonstrate learning gains equivalent to those of students receiving one-on-one tutoring.
Journal Article•10.1016/J.CSL.2018.02.001•
A multilinear tongue model derived from speech related MRI data of the human vocal tract

Alexander Hewer1,2, Stefanie Wuhrer, Ingmar Steiner1,2, Korin Richmond •
Saarland University1, German Research Centre for Artificial Intelligence2
01 Sep 2018-Computer Speech & Language
TL;DR: A multilinear statistical model of the human tongue that captures anatomical and tongue pose related shape variations separately is presented and it is shown that it can be used to generate plausible tongue animation by tracking sparse motion capture data.
Journal Article•10.1016/J.CSL.2017.08.005•
Improvements to harmonic model for extracting better speech features in clinical applications

Meysam Asgari1, Izhak Shafran1•
Oregon Health & Science University1
01 Jan 2018-Computer Speech & Language
TL;DR: An improved version of the harmonic model (HM) is introduced that leads to accurate and reliable estimation of voiced segments, fundamental frequency, HNR, jitter, and shimmer, and the utility of the developed measures is examined for the speech-based assessment of cognitive impairments including clinical depression and autism spectrum disorder.
Journal Article•10.1016/J.CSL.2017.07.006•
A methodology for turn-taking capabilities enhancement in Spoken Dialogue Systems using Reinforcement Learning

Hatim Khouzaimi, Romain Laroche, Fabrice Lefèvre
01 Jan 2018-Computer Speech & Language
TL;DR: A new approach for transforming the traditional dialogue architecture into an incremental one at a low cost is presented: a new turn-taking decision module called the Scheduler is inserted between the Client and the Service.
Journal Article•10.1016/J.CSL.2018.05.003•
Reducing footprint of unit selection based text-to-speech system using compressed sensing and sparse representation

Pulkit Sharma1, Vinayak Abrol1, Nivedita1, Anil Kumar Sao1•
Indian Institute of Technology Mandi1
01 Nov 2018-Computer Speech & Language
TL;DR: Experimental studies on two different Indian languages suggest that CS/SR-based footprint reduction methods can be used as an alternative to the existing compression methods employed in USS systems.
Journal Article•10.1016/J.CSL.2017.08.003•
Cepstral distance based channel selection for distant speech recognition

Cristina Maritza Guerrero Flores1,2, Georgia Tryfou1,2, Maurizio Omologo2 •
University of Trento1, Fondazione Bruno Kessler2
01 Jan 2018-Computer Speech & Language
TL;DR: This work investigates the application of cepstral distance as a distortion measure that turns out to be closely related to properties of the room acoustics, such as reverberation time and direct-to-reverberant ratio.
