Seven Data Management Papers Presented at ACM SIGMOD 2021

The 2021 ACM International Conference on the Management of Data (SIGMOD), a top-ranked international conference on database systems and information management, accepted seven papers submitted by DFKI and TU Berlin researchers. Large amounts of high-quality data are the backbone of modern machine learning applications in research and industry, and in sectors such as medicine and mobility. To enable the next generation of Artificial Intelligence applications, a growing number of diverse data sources must be accessed and analyzed in less time, while reducing computation costs, maintaining fault tolerance, and achieving high data quality. The researchers in the Intelligent Analytics for Massive Data (IAM) group, led by Prof. Dr. Volker Markl, tackled some of these data management challenges and developed innovative solutions.

Six full research papers and one industrial paper by DFKI researchers on data management topics were accepted at SIGMOD 2021. “The acceptance of such a high number of papers from one German research group at SIGMOD is exceptional. I am very proud of this success and the international recognition of our research efforts,” states Volker Markl.

Two of the publications were the result of international research collaborations. One of these papers is the product of joint work with colleagues at East China Normal University in Shanghai. The authors propose HyMAC, a system that enables iterative machine learning algorithms to run more efficiently on distributed dataflow systems. The approach has the potential to speed up machine learning on datasets with billions of data points by reducing the communication cost in dataflow systems such as Apache Flink.

The other international collaboration resulted in a publication based on work conducted in the ExDRa (Exploratory Data Science on Raw Data) project, a joint effort of DFKI and TU Berlin researchers, Siemens AG, TU Graz, and Know-Center GmbH. This paper was accepted in the SIGMOD Industrial Track. The ExDRa system is designed to support the exploratory data science process over heterogeneous and distributed data. Typically, industrial data scientists propose and evaluate hypotheses, integrate the necessary data, and build and execute models to identify interesting patterns. To aid in this process, ExDRa investigates how to design and build a system that supports and helps optimize the analysis of problems arising in several Siemens use cases (e.g., chemical, pharmaceutical, water, oil, and gas). For this, the project leverages the NebulaStream data management system for the IoT.

Contact:

Martin Pagel

Researcher, DFKI Berlin

Press contact:

Andreas Schepers, M.A.

Corporate Communications, DFKI Berlin

German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz