Speech Quality Assessment in Crowdsourcing: Influence of Environmental Noise

Babak Naderi, Sebastian Möller, Gabriel Mittag

In: 44. Deutsche Jahrestagung für Akustik (DAGA). Deutsche Gesellschaft für Akustik DEGA e.V., Alte Jakobstraße 88, 10179 Berlin, pages 229-302, ISBN 978-3-939296-13-3, 2018.


Micro-task crowdsourcing opens up new possibilities for investigating how a variety of realistic environmental factors influence the quality of transmitted speech as perceived by the user. This paper reports the influence of environmental noise on speech quality assessment ratings collected using a crowdsourcing approach. In a two-phase experiment, subjects assessed the quality of speech stimuli from a standard dataset (SwissQual 501 speech database from the ITU-T Rec. P.863 competition) in different environments. Phase A was conducted in the laboratory, in either a silent environment or simulated environments with background noise. In phase B, the same group of participants completed the same task in different crowdsourcing environments. The Mean Opinion Score (MOS) values, representing perceived overall quality, were calculated for each degradation condition and compared to the scores reported from the standard laboratory test. The highest correlation with the standard laboratory test was achieved in the silent laboratory environment (r_s = .97). In the noisy (simulated) environments, a higher correlation was achieved when subjects were wearing in-ear headphones, and in the crowdsourcing condition when they performed the task in their living room. It was also found that the perceived loudness of the stimuli correlates negatively with the difference between the MOS values obtained in the test environmental conditions and the MOS values reported from the standard laboratory test.
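The comparison described in the abstract (MOS per degradation condition, then a rank correlation against the laboratory scores) can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the condition labels c1 to c3, and all ratings and reference values are hypothetical and chosen only for illustration. It assumes that ratings are given on a 1 to 5 scale and that r_s denotes Spearman's rank correlation, computed here as the Pearson correlation of the ranks, with tied values receiving their average rank.

```python
from collections import defaultdict
from statistics import mean


def mos_per_condition(ratings):
    """Average the individual ratings for each degradation condition."""
    by_condition = defaultdict(list)
    for condition, score in ratings:
        by_condition[condition].append(score)
    return {c: mean(s) for c, s in by_condition.items()}


def average_ranks(values):
    """Rank values from 1 to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical ratings (condition, score) from one test environment.
ratings = [("c1", 4), ("c1", 5), ("c2", 2), ("c2", 3), ("c3", 1), ("c3", 2)]
# Hypothetical MOS values from a reference laboratory test.
lab_mos = {"c1": 4.2, "c2": 2.9, "c3": 1.3}

env_mos = mos_per_condition(ratings)
conditions = sorted(lab_mos)
r_s = spearman([env_mos[c] for c in conditions],
               [lab_mos[c] for c in conditions])
print(env_mos, r_s)
```

With real data, each test environment would yield its own set of per-condition MOS values, and the same correlation would be computed once per environment against the reference laboratory scores.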

