Skip to main content Skip to main navigation


Demonstrating CAT: Synthesizing Data-Aware Conversational Agents for Transactional Databases

Marius Gassen; Benjamin Hättasch; Benjamin Hilprecht; Nadja Geisler; Alexander Fraser; Carsten Binnig
In: Proceedings of the VLDB Endowment (PVLDB), Vol. 15, No. 12, Pages 3586-3589, Association for Computing Machinery (ACM), 2022.


Databases for OLTP are often the backbone for applications such as hotel room or cinema ticket booking applications. However, developing a conversational agent (i.e., a chatbot-like interface) to allow end-users to interact with an application using natural language requires both immense amounts of training data and NLP expertise. This motivates CAT, which can be used to easily create conversational agents for transactional databases. The main idea is that, for a given OLTP database, CAT uses weak supervision to synthesize the required training data to train a state-of-the-art conversational agent, allowing users to interact with the OLTP database. Furthermore, CAT provides an out-of-the-box integration of the resulting agent with the database. As a major difference to existing conversational agents, agents synthesized by CAT are data-aware. This means that the agent decides which information should be requested from the user based on the current data distributions in the database, which typically results in markedly more efficient dialogues compared with non-data-aware agents. We publish the code for CAT as open source.

Weitere Links