DFKI-LT - Neural correlates of speech quality dimensions analyzed using electroencephalography (EEG)

Stefan Uhrig, Gabriel Mittag, Sebastian Möller, Jan-Niklas Voigt-Antons
Neural correlates of speech quality dimensions analyzed using electroencephalography (EEG)
Journal of Neural Engineering, volume 16, number 3, IOP Publishing, 2019
Objective. Using subjective psychophysical methods, the quality of transmitted speech has been decomposed into three perceptual dimensions: discontinuity, noisiness and coloration. Previous studies using electroencephalography (EEG) have reported effects of the perceived intensity of single quality dimensions on electrical brain activity. However, it has not yet been investigated whether the dimensions themselves are dissociable at a neurophysiological level of analysis.
Approach. To address this question, a high-quality (HQ) recording of a spoken word was degraded on one dimension at a time, resulting in three quality-impaired stimuli (F, N, C) that were, on average, described as equal in perceived degradation intensity. Participants performed a three-stimulus oddball task involving the serial presentation of different stimulus types: (1) HQ or degraded standard stimuli, which established sensory/perceptual quality references, and (2) degraded oddball stimuli, which caused random, infrequent deviations from those references. EEG was employed to examine the neuro-electrical correlates of speech quality perception.
Main results. Emphasis was placed on modulations in the temporal and morphological characteristics of the P300 component of the event-related brain potential (ERP), whose subcomponents P3a and P3b are commonly linked to attentional orienting and task relevance categorization, respectively. Electrophysiological data analysis revealed significant modulations of P300 amplitude and latency by the perceptual dimensions underlying both quality references and oddball stimuli.
Significance. The present study exemplifies the utility of physiological methods such as EEG for dissociating speech degradations not only by their perceived intensity level, but also by their distinctive quality dimension.