DeepNotebooks: Deep Probabilistic Models Construct Python Notebooks for Reporting Datasets
Claas Völcker; Alejandro Molina; Johannes Neumann; Dirk Westermann; Kristian Kersting
In: Peggy Cellier; Kurt Driessens (Eds.). Machine Learning and Knowledge Discovery in Databases. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD-2019), International Workshops of ECML PKDD 2019, Proceedings, Part I, September 16-20, Würzburg, Germany, Pages 28-43, Communications in Computer and Information Science, Vol. 1167, Springer, 2019.
Machine learning is taking an increasingly relevant role in science, business, entertainment, and other fields. However, the most advanced techniques are still in the hands of well-educated and well-funded experts only. To help democratize machine learning, we propose DeepNotebooks as a novel way to empower a broad spectrum of users who are not machine learning experts but might have some basic programming skills and are interested in data science. Within the DeepNotebook framework, users feed a cleaned tabular dataset to the system. The system then automatically estimates a deep but tractable probabilistic model and compiles an interactive Python notebook out of it that already contains a preliminary yet comprehensive analysis of the dataset at hand. If users want to change the parameters of the interactive report or pose different queries to the underlying model, they can quickly do so within the DeepNotebook. This flexibility allows users to interact with the framework in a feedback loop: they can discover patterns and dig deeper into the data using targeted questions, even if they are not experts in machine learning.
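The step of compiling a notebook from a table can be illustrated with a minimal sketch. It is not the DeepNotebooks implementation: per-column means stand in for the probabilistic model, the helper names `make_cell` and `build_report_notebook` are hypothetical, and the two-row table is toy data. The sketch only assumes the standard Jupyter notebook JSON layout (nbformat 4) and the Python standard library.

```python
import json
import statistics


def make_cell(cell_type, source):
    """Return one cell in the Jupyter notebook (nbformat 4) JSON layout."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": source}
    if cell_type == "code":
        # Code cells additionally carry an execution count and an output list.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


def build_report_notebook(columns, rows):
    """Compile a tiny report notebook from a cleaned numeric table.

    Hypothetical stand-in: simple per-column means replace the
    probabilistic model described in the abstract.
    """
    cells = [make_cell("markdown", "# Automatic report")]
    for index, name in enumerate(columns):
        values = [row[index] for row in rows]
        mean = statistics.mean(values)
        cells.append(make_cell("markdown", f"## {name}\nMean: {mean:.2f}"))
    # An editable code cell lets the reader change a parameter and re-run it.
    cells.append(make_cell("code", "threshold = 0.5  # edit and re-run"))
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


# Toy table with two numeric columns (illustrative values only).
notebook = build_report_notebook(["age", "income"], [[30, 40.0], [50, 60.0]])
notebook_json = json.dumps(notebook, indent=1)
```

Writing `notebook_json` to a file with the `.ipynb` extension should yield a document that Jupyter can open, with the final code cell left for the user to edit.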