Publication

Evaluating Speech Enhancement Performance Across Demographics and Language

Jose Giraldo; Alex Peiró-Lilja; Carme Armentano-Oller; Rodolfo Zevallos; Cristina España-Bonet

In: Interspeech 2025. Conference in the Annual Series of Interspeech Events (INTERSPEECH-2025), Rotterdam, Netherlands, Pages 1353-1357, Interspeech, 2025.

Abstract

Speech enhancement models have traditionally relied on VoiceBank-DEMAND for training and evaluation. However, this dataset is constrained by its limited diversity and simulated noise conditions. As an alternative, we propose and demonstrate the usefulness of evaluating the generalization capabilities of recent speech enhancement models using CommonPhone, a multilingual and crowdsourced dataset. Since CommonPhone is derived from CommonVoice, it allows enhancement performance to be analyzed by demographic variables such as age and gender. Our experiments reveal significant performance variations across these variables. We also introduce a new benchmark dataset designed to challenge enhancement models with difficult and diverse speech samples, facilitating future research in universal speech enhancement.
