


AI-based Anonymisation of Personal Patient Data in Clinical Text and Speech Datasets

In the Medinym project, we aim to fully anonymise a speaker's identity, both at the voice level and at the level of what is said (the semantic content), without losing emotional or diagnostic information. From a data protection perspective, processing speech data in this way opens up enormous application potential.

The Speech and Language Technology (SLT) department contributes significant expertise in the following areas:

  • NLP/IE for anonymisation, e.g. recognising and obfuscating relevant entities and/or relations, and synthetic data generation for AI learning processes in the medical field.
  • Speech Synthesis, e.g. voice conversion (VC), text-to-speech (TTS), voice cloning, zero-shot learning.
  • Speech Recognition, e.g. Automatic Speech Recognition (ASR), Multi-Lingual Speech Recognition.
  • Speaker Recognition, e.g. Automatic Speaker Recognition and Verification (ASV), Multi-Lingual Speaker Recognition.
  • Emotion Recognition from Speech, Text, Video/Images and Multimodal Input, e.g. Transformer-based models; acoustic, linguistic (language models) and visual models (facial expressions, landmarks).
  • Crowd-based AI support, e.g. automated, online-orchestrated hybrid AI+human workflows combining crowdsourcing and expert sourcing for high-quality data acquisition.
  • AI in the area of pre-trained language models, transfer learning, cross-lingual learning, continuous learning, frugal AI, LLMs, RLHF.

Motivation

Advances in technologies based on artificial intelligence (AI) are opening up new potential for medical applications. In practice, however, the use of these technologies by a broad range of users, such as citizens, public authorities, health care workers and small and medium-sized enterprises, faces challenges of data security and data protection. Particularly in the automated processing of medical data, innovative technologies often cannot be used, because the sensitive content rightly makes the protection of identity a high priority. The protection of clinical data, and the resulting difficulty in accessing it, also means that machine learning (ML) for tasks such as clinical diagnosis, prognosis, therapy or decision support cannot be developed without major hurdles.

Aims and approach

The project "AI-based Anonymisation of Personal Patient Data in Clinical Text and Speech Datasets" (Medinym) investigates whether sensitive data can be made available for further use by removing the sensitive information through anonymisation. Two medical use cases are implemented as examples: text data from electronic patient records and speech data from diagnostic doctor-patient conversations. To this end, open technologies for anonymisation are being investigated, further developed and applied to real data. The researchers are also investigating how the informative value of such anonymised data can be preserved for further use. In addition, the project will consider methods that prevent or impede misuse of the technology outside the intended use case.
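To make the idea of information-preserving text anonymisation concrete, the following is a minimal sketch in Python. It is not the project's method: Medinym investigates AI-based approaches, whereas this example substitutes simple hand-written regular expressions as a stand-in for a trained entity recogniser. The pattern set, the placeholder format and the sample note are all hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical patterns for illustration only. A real system would rely on
# trained entity recognisers rather than hand-written rules like these.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}\.\d{1,2}\.\d{4}\b"),
    "NAME": re.compile(r"\b(?:Dr\.|Mr\.|Mrs\.|Ms\.)\s+[A-Z][a-z]+"),
}


def obfuscate(text: str) -> str:
    """Replace each matched span with a placeholder naming its entity type."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# Invented sample note; it contains no real patient data.
note = "Mrs. Example was seen by Dr. Sample on 03.05.2023 and reports a persistent cough."
print(obfuscate(note))
# prints: [NAME] was seen by [NAME] on [DATE] and reports a persistent cough.
```

In this toy case the identifying names and the date are replaced, while the diagnostically relevant phrase ("persistent cough") is left intact, which is the property the project description calls information-preserving.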

Innovations and perspectives

Information-preserving anonymisation should make it possible to process clinical data further, since the data can then no longer be de-anonymised. Such data sets can be used to train AI models on clinical data in a privacy-compliant manner, or be extended to other cohorts. This would make the cumulative collection of such data sets possible even for small and medium-sized enterprises: sensitive data could be combined across several application purposes and used for AI training, provided they are appropriately anonymised. The intended anonymisation should also increase patients' willingness to consent to participation in studies, data analyses and the donation of health data in general. Finally, information-preserving anonymisation allows the technology to be integrated into common development methods and diagnostic systems, and thus strengthens Germany as a location for science and business in diagnostics, treatment and health care in general.

Lead: Dr. Tim Polzehl

Dr. Tim Polzehl leads AI-based development for speech-based applications in the Speech and Language Technology department at DFKI. In addition, he leads the area of "Next Generation Crowdsourcing and Open Data" and is an active member of the "Speech Technology" group of the Quality and Usability Labs (QU-Labs) at the Technical University of Berlin.

Profile DFKI:

Profile QU-Labs TU-Berlin:


BMBF - Federal Ministry of Education, Science, Research and Technology


Publications about the project

Suhita Ghosh; Arnab Das; Yamini Sinha; Ingo Siegert; Tim Polzehl; Sebastian Stober

In: Proc. INTERSPEECH 2023. Conference in the Annual Series of Interspeech Events (INTERSPEECH-2023), Pages 2093-2097, ISCA-speech, 2023.


Arnab Das; Suhita Ghosh; Tim Polzehl; Ingo Siegert; Sebastian Stober (eds.)

ISCA Tutorial and Research Workshop on Speech Synthesis (SSW-2023), Grenoble, France, ISCA Speech Synthesis Workshop, 2023.


Carlos Franzreb; Tim Polzehl; Sebastian Möller

In: Proc. 3rd Symposium on Security and Privacy in Speech Communication. Symposium on Security and Privacy in Speech Communication (SPSC-2023), located at Interspeech 2023, August 21-24, Dublin, Ireland, ISCA, 2023.
