P.862 Perceptual Evaluation of Speech Quality does not always perceive degraded speech well – what to do?
PESQ stands for Perceptual Evaluation of Speech Quality, an ITU standard (ITU-T P.862) that specifies how to test speech signals in order to obtain an objective measurement of voice quality and to predict the results of subjective listening tests on network and telephony systems. Relax, even this method is not perfect…
As an objective method, PESQ uses a sensory model to compare the original, unprocessed signal with the degraded signal coming out of the network or network element. The resulting quality score is comparable to the subjective “Mean Opinion Score” (MOS) obtained from panel tests according to ITU-T P.800. PESQ scores are calibrated against a large database of subjective tests. The method takes into account coding distortions, errors, packet loss, delay and variable delay, and filtering in analogue network components.
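To make the relation between the raw PESQ score and a MOS-like value more tangible, here is a minimal sketch of the mapping published in ITU-T P.862.1, which converts a raw P.862 score (roughly -0.5 to 4.5) into MOS-LQO. The function name and the sample scores are my own illustration, not part of any reference implementation, so treat it as a sketch rather than a certified tool.

```python
import math


def raw_pesq_to_mos_lqo(raw_score):
    """Map a raw P.862 score to MOS-LQO using the logistic
    function given in ITU-T P.862.1 (function name is hypothetical)."""
    return 0.999 + 4.0 / (1.0 + math.exp(-1.4945 * raw_score + 4.6607))


# Illustrative raw scores only, not measurements from a real test.
for raw in (-0.5, 1.0, 2.5, 3.5, 4.5):
    print(f"raw PESQ {raw:+.1f} -> MOS-LQO {raw_pesq_to_mos_lqo(raw):.2f}")
```

The mapping is monotonic, so it never changes the ranking of two systems; it only stretches the raw scale so that the numbers line up better with what listening panels report.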
And again… although it is one of the most popular tools, PESQ has a number of disadvantages. It requires speech-like test signals, because many systems are optimized for speech and respond in an unrepresentative way to non-speech signals (e.g. tones, noise, ITU-T P.50). The test signal is chosen by the tester, so a vendor's results may differ from those of the end customer. The approach also performs signal level equalization, which is questionable in theory, because speech produced at different volumes may have different spectra. Finally, PESQ can miss a significant quality loss that occurs when the voice is equalized so that it carries far less low-frequency and high-frequency energy than the original voice file (this case is described in the Microtronix article devoted to PESQ: http://www.microtronix.ca/pesq-disc.html).
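The level equalization point is easier to see in code. The sketch below is not the P.862 level alignment itself (which works on a filtered version of the signal); it is a simplified RMS-based stand-in, and the target level of 0.1 as well as the function names are assumptions made for the illustration. A synthetic tone is used only to keep the example self-contained.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def equalize_level(samples, target_rms=0.1):
    """Scale samples with a single gain so their RMS equals target_rms.
    target_rms is a hypothetical value chosen for this sketch."""
    level = rms(samples)
    if level == 0.0:
        return list(samples)  # silence cannot be scaled to a target level
    gain = target_rms / level
    return [s * gain for s in samples]


fs = 8000  # narrowband telephony sampling rate, in Hz
tone = [math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]
quiet = [0.05 * s for s in tone]  # same waveform, low volume
loud = [0.8 * s for s in tone]    # same waveform, high volume

quiet_eq = equalize_level(quiet)
loud_eq = equalize_level(loud)

# Largest sample-by-sample difference after equalization.
max_diff = max(abs(a - b) for a, b in zip(quiet_eq, loud_eq))
print(f"RMS after equalization: {rms(quiet_eq):.3f} and {rms(loud_eq):.3f}")
print(f"largest remaining difference: {max_diff:.2e}")
```

Equalization here is a single gain factor: it removes a pure volume difference completely, but it cannot touch the shape of the spectrum. If loud speech really has a different spectrum than quiet speech, that difference survives the equalization step and is scored as if it were degradation.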
Hm… what should follow this post then? What is the outcome of all the previous posts? What are we actually talking about? You’ll learn that tomorrow. Stay tuned!

