Intelligent Analytics for Massive Data

Smart Data

Our mission is to empower data analysts and enable them to derive insight from massive, heterogeneous, and fast-evolving data sets.

Our team

  • conducts advanced research in methods, technologies, and tools for data analytics along the entire data value chain (from data acquisition and information extraction, through information integration and storage, to scalable data processing, interactive exploration, and visualization, including crowd computing and user feedback),
  • strives to advance data-driven decision making in order to arrive at high-quality actionable intelligence for science, industry, and society,
  • develops novel and enhanced systems, tools, and solutions for information management, data science, and smart data applications.

Our key areas of activity include scalable data management systems and tools, data mining, and open data/information marketplaces. 

The research group, led by Prof. Dr. Markl, cooperates closely with the Database Systems and Information Management Group (DIMA) at TU Berlin and participates in the Berlin Big Data Center (BBDC).

The BBDC, funded by the Federal Ministry of Education and Research (BMBF), is a competence center for big data under the management of TU Berlin.
 

 

Flink / Stratosphere

"Apache Flink" [1] is a stream-processing framework for distributed, high-performing, always-available, and accurate data streaming applications. It originated from the joint research project "Stratosphere" [2], funded by the Deutsche Forschungsgemeinschaft (DFG). After a successful incubator phase, Flink graduated to a top-level project of the Apache Software Foundation [3] and became one of the most important and promising projects within the Apache Big Data Stack. Flink has a large and active community, numerous well-known users such as Zalando, Alibaba, and Netflix, and its own annual conference, "Flink Forward" [4], held in Berlin and San Francisco.

[1] https://flink.apache.org

[2] http://stratosphere.eu/

[3] https://www.apache.org/

[4] https://flink-forward.org/

EMMA

Emma is a quotation-based Scala DSL that enables holistic optimizations of data flow programs for scalable data analysis on Apache Flink and Spark.

http://emma-language.org

Hawk

A Hardware Adaptive Query Compiler

The performance of modern processors is primarily bound by a fixed energy budget. This power wall forces processor vendors to specialize their processors for certain applications in order to provide the speedups users expect.

http://cogadb.dfki.de/

Myriad

The Myriad Toolkit facilitates the specification of scalable data generation programs with complex statistical constraints via a special XML data generator prototyping language.

The Myriad Toolkit uses advanced PRNG algorithms to implement offset-based access to the elements of the generated domain type sequences within a bounded time. This feature facilitates an efficient data-parallel execution mode. Data generation programs created with the Myriad Toolkit can therefore be scaled out in a massively parallel manner in order to quickly generate large synthetic datasets with complex statistical dependencies.
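
The idea of offset-based access can be illustrated with a minimal Python sketch. It does not reproduce Myriad's own generators or its XML language; it assumes SplitMix64 as a stand-in PRNG, and the function names (splitmix64_at, splitmix64_sequential, generate_partition) are hypothetical. Because the i-th value is computed directly from the seed and the offset i, each worker can generate its own slice of the sequence without coordinating with the others.

```python
MASK64 = (1 << 64) - 1
GOLDEN_GAMMA = 0x9E3779B97F4A7C15  # state increment used by SplitMix64


def _mix(z: int) -> int:
    """SplitMix64 output function: scrambles a 64-bit state into an output."""
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)


def splitmix64_at(seed: int, index: int) -> int:
    """Return the index-th (0-based) output directly, in constant time.

    The sequential generator adds GOLDEN_GAMMA to its state before each
    output, so the state behind output `index` is seed + (index + 1) * gamma.
    """
    return _mix((seed + (index + 1) * GOLDEN_GAMMA) & MASK64)


def splitmix64_sequential(seed: int, count: int) -> list[int]:
    """Reference implementation that walks the sequence one step at a time."""
    state = seed & MASK64
    out = []
    for _ in range(count):
        state = (state + GOLDEN_GAMMA) & MASK64
        out.append(_mix(state))
    return out


def generate_partition(seed: int, start: int, stop: int) -> list[tuple[int, int]]:
    """Generate records [start, stop) independently of all other partitions.

    Each record is (row_id, value); the value range 0..99 is an arbitrary
    choice for this illustration.
    """
    return [(i, splitmix64_at(seed, i) % 100) for i in range(start, stop)]


if __name__ == "__main__":
    # Two "workers" produce disjoint slices; together they equal one full run.
    left = generate_partition(seed=7, start=0, stop=5)
    right = generate_partition(seed=7, start=5, stop=10)
    print(left + right == generate_partition(seed=7, start=0, stop=10))
```

In this sketch, splitting the row range across workers yields exactly the same dataset as a single sequential run, which is the property that makes scale-out data generation deterministic and repeatable.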

http://www.myriad-toolkit.com

https://github.com/TU-Berlin-DIMA/myriad-toolkit

Peel

Peel is a framework that helps you define, execute, analyze, and share experiments for distributed systems and algorithms. A Peel package bundles together the configuration data, datasets, and workload applications required for the execution of a particular collection of experiments. Peel bundles can be largely decoupled from the underlying operational environment, easily migrated to new environments, and reproduced there.

http://peel-framework.org/

Contact

German Research Center for
Artificial Intelligence GmbH (DFKI)
Intelligent Analytics for Massive Data
DFKI Project Office Berlin
Alt-Moabit 91c
10559 Berlin
Germany

News from the Research Department

The European Institute of Innovation and Technology (EIT) has announced the winner of a Europe-wide competition on value creation in manufacturing...


On 6 December 2018, the kick-off event takes place in Dortmund for the project, funded by the Federal Ministry of Education and Research (BMBF), to establish a...


The European General Data Protection Regulation, which is in force as of today, highlights a dilemma for users and providers: users want...


German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz