
Publication

ELEET: Efficient Learned Query Execution over Text and Tables

Matthias Urban; Carsten Binnig
In: Proceedings of the VLDB Endowment (PVLDB), Vol. 17, No. 13, Pages 4867-4880, VLDB, 2024.

Abstract

In this paper, we present ELEET, a novel execution engine that allows one to seamlessly query and process text as a first-class citizen along with tables. To enable such a seamless integration of text and tables, ELEET leverages learned multi-modal operators (MMOps) such as joins and unions that seamlessly combine structured with unstructured textual data. While large language models (LLM) such as GPT-4 are interesting candidates to enable such learned multi-modal operations, we deliberately do not follow this trend to enable MMOps, since it would result in high overhead at query runtime. Instead, to enable MMOps, ELEET comes with a more efficient small language model (SLM) that is targeted to extract structured data from text. Thanks to our novel architecture and pre-training procedure, the ELEET-model enables high-accuracy extraction with low overheads. In our evaluation, we compare query execution based on ELEET to baselines leveraging LLMs such as GPT-4 and show that ELEET can speed up multi-modal queries over tables and text by up to 575× without sacrificing accuracy.
