POLQA

Papers published on a yearly basis

Papers

Proceedings Article • DOI: 10.1109/ICASSP.2001.941023

Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs

Antony William Rix, John G. Beerends, Michael Peter Hollier, Andries Pieter Hekstra

7 May 2001

TL;DR: A new model, known as perceptual evaluation of speech quality (PESQ), has been developed for use across a wider range of network conditions, including analogue connections, codecs, packet loss and variable delay.

Abstract: Previous objective speech quality assessment models, such as bark spectral distortion (BSD), the perceptual speech quality measure (PSQM), and measuring normalizing blocks (MNB), have been found to be suitable for assessing only a limited range of distortions. A new model has therefore been developed for use across a wider range of network conditions, including analogue connections, codecs, packet loss and variable delay. Known as perceptual evaluation of speech quality (PESQ), it is the result of integration of the perceptual analysis measurement system (PAMS) and PSQM99, an enhanced version of PSQM. PESQ is expected to become a new ITU-T recommendation P.862, replacing P.861 which specified PSQM and MNB.

2,818 citations

Perceptual evaluation of speech quality (PESQ)-A new method for speech quality assessment of telephone networks and codecs

A. W. Rix

1 Jan 2001

TL;DR: A new model, perceptual evaluation of speech quality (PESQ), has been developed for use across a wider range of network conditions, including analogue connections, codecs, packet loss and variable delay.

1,933 citations

Journal Article

Perceptual evaluation of speech quality (PESQ) the new ITU standard for end-to-end speech quality assessment: Part I: Time-delay compensation

Antony William Rix, Michael Peter Hollier, Andries Pieter Hekstra, John G. Beerends

15 Oct 2002 - Journal of the Audio Engineering Society

TL;DR: A new model for the perceptual evaluation of speech quality (PESQ), recently standardized by the International Telecommunication Union as Recommendation P.862, predicts subjective quality with good correlation across a very wide range of conditions, which may include coding distortions, errors, noise, filtering, delay, and variable delay.

Abstract: A new model for the perceptual evaluation of speech quality (PESQ) was recently standardized by the International Telecommunication Union as Recommendation P.862. Unlike previous codec assessment models, such as PSQM and MNB (ITU-T P.861), PESQ is able to predict subjective quality with good correlation in a very wide range of conditions, which may include coding distortions, errors, noise, filtering, delay, and variable delay. In Part I, time-delay identification techniques are introduced and some causes of variable delay are outlined before the processes that are integrated into PESQ and specified in P.862 are described. More information on the structure of PESQ as well as performance results will be given in Part II.

187 citations

Proceedings Article • DOI: 10.21437/INTERSPEECH.2019-3087

A Scalable Noisy Speech Dataset and Online Subjective Test Framework.

Chandan K A Reddy¹, Ebrahim Beyrami¹, Jamie Pool¹, Ross Cutler¹, Sriram Srinivasan¹, Johannes Gehrke¹

Microsoft¹

15 Sep 2019

TL;DR: Presents a noisy speech dataset (MS-SNSD) that can scale to arbitrary sizes depending on the number of speakers, noise types, and Speech to Noise Ratio (SNR) levels desired, together with an open-source methodology for evaluating results subjectively at scale using crowdsourcing.

Abstract: Background noise is a major source of quality impairments in Voice over Internet Protocol (VoIP) and Public Switched Telephone Network (PSTN) calls. Recent work shows the efficacy of deep learning for noise suppression, but the datasets have been relatively small compared to those used in other domains (e.g., ImageNet) and the associated evaluations have been more focused. In order to better facilitate deep learning research in Speech Enhancement, we present a noisy speech dataset (MS-SNSD) that can scale to arbitrary sizes depending on the number of speakers, noise types, and Speech to Noise Ratio (SNR) levels desired. We show that increasing dataset sizes increases noise suppression performance as expected. In addition, we provide an open-source evaluation methodology to evaluate the results subjectively at scale using crowdsourcing, with a reference algorithm to normalize the results. To demonstrate the dataset and evaluation framework we apply it to several noise suppressors and compare the subjective Mean Opinion Score (MOS) with objective quality measures such as SNR, PESQ, POLQA, and ViSQOL and show why MOS is still required. Our subjective MOS evaluation is the first large scale evaluation of Speech Enhancement algorithms that we are aware of.
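
Editorial illustration: the abstract's idea of generating noisy speech at a desired SNR level can be sketched in a few lines. This is a minimal sketch, assuming the standard power-ratio definition of SNR and equal-length lists of samples. The function names (`signal_power`, `mix_at_snr`, `measured_snr_db`) are hypothetical and are not taken from the MS-SNSD scripts.

```python
import math


def signal_power(samples):
    """Return the mean power (average of squared samples) of a signal."""
    if not samples:
        raise ValueError("signal must contain at least one sample")
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power matches `snr_db`,
    then add it to `speech`.

    Returns (noisy_speech, scaled_noise). Hypothetical helper for
    illustration only, not the dataset's own mixing code.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = signal_power(speech)
    p_noise = signal_power(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power and cannot be scaled")
    # Target noise power is p_speech / 10^(snr_db / 10); the amplitude
    # gain is the square root of the power ratio.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    scaled_noise = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled_noise)]
    return noisy, scaled_noise


def measured_snr_db(speech, noise):
    """Return the speech-to-noise ratio in decibels."""
    return 10 * math.log10(signal_power(speech) / signal_power(noise))
```

Calling `mix_at_snr(speech, noise, snr_db)` once per combination of clean clip, noise clip, and SNR level is one way a dataset of this kind could grow with each of those three factors. Real pipelines typically also handle resampling, length matching, and clipping, which this sketch omits.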

183 citations

Journal Article • DOI: 10.1186/S13636-015-0054-9

ViSQOL: an objective speech quality model

Andrew Hines¹,², Jan Skoglund³, Anil Kokaram³, Naomi Harte²

Dublin Institute of Technology¹, Trinity College, Dublin², Google³

17 May 2015 - EURASIP Journal on Audio, Speech, and Music Processing

TL;DR: ViSQOL is shown to offer a useful alternative to POLQA in predicting speech quality in VoIP scenarios and has a wider application and robustness to conditions than PESQ or more trivial distance metrics.

Abstract: This paper presents an objective speech quality model, ViSQOL, the Virtual Speech Quality Objective Listener. It is a signal-based, full-reference, intrusive metric that models human speech quality perception using a spectro-temporal measure of similarity between a reference and a test speech signal. The metric has been particularly designed to be robust for quality issues associated with Voice over IP (VoIP) transmission. This paper describes the algorithm and compares the quality predictions with the ITU-T standard metrics PESQ and POLQA for common problems in VoIP: clock drift, associated time warping, and playout delays. The results indicate that ViSQOL and POLQA significantly outperform PESQ, with ViSQOL competing well with POLQA. An extensive benchmarking against PESQ, POLQA, and simpler distance metrics using three speech corpora (NOIZEUS, E4, and the ITU-T P.Sup. 23 database) is also presented. These experiments benchmark the performance for a wide range of quality impairments, including VoIP degradations, a variety of background noise types, speech enhancement methods, and SNR levels. The results and subsequent analysis show that both ViSQOL and POLQA have some performance weaknesses and under-predict perceived quality in certain VoIP conditions. Both have a wider application and robustness to conditions than PESQ or more trivial distance metrics. ViSQOL is shown to offer a useful alternative to POLQA in predicting speech quality in VoIP scenarios.

172 citations

Year	Papers
2021	10
2020	10
2019	10
2018	5
2017	8
2016	5
