Following the earlier acceptance of a paper from the group on fast loading of CSVs using GPUs, this is the second paper from the Intelligent Analytics for Massive Data (IAM) research department and the Database Systems and Information Management (DIMA) group at TU Berlin that will be presented at BTW 2021.
BTW is the leading database conference in the German-speaking region. For more information about the conference, please visit https://sites.google.com/view/btw-2021-tud/.
Mobile devices have become ubiquitous; smartphones, tablets, and wearables are essential commodities for many people. The ubiquity of mobile devices, combined with their ever-increasing capabilities, opens new possibilities for Internet-of-Things (IoT) applications in which mobile devices act as both data generators and processing nodes. However, deploying a stream processing system (SPS) over mobile devices is particularly challenging, as mobile devices change their position within the network very frequently and are notoriously prone to transient disconnections. To deal with faults arising from disconnections and mobility, existing fault tolerance strategies in SPS are either checkpointing-based or replication-based. Checkpointing-based strategies are too heavyweight for mobile devices, as they save and broadcast state periodically, even when there are no failures. Replication-based strategies, on the other hand, cannot provide fault tolerance at the level of the data source, as the data source itself cannot always be replicated. Finally, existing systems exclude mobile devices from data processing upon a disconnection even when the disconnection is very short, thus failing to exploit the computing capabilities of the offline devices. This paper proposes a buffering-based reactive fault tolerance strategy to handle transient disconnections of mobile devices that both generate and process data, even in cases where the devices move through the network during the disconnection. The main components of our strategy are: (a) a circular buffer that stores the data that are generated and processed locally during a device disconnection, (b) a query-aware buffer replacement policy, and (c) a query restart process that ensures the correct forwarding of the buffered data upon reconnection, taking into account the new network topology.
We integrate our fault tolerance strategy with NebulaStream, a novel stream processing system specifically designed for the IoT. We evaluate our strategy using a custom benchmark based on real data and show a reduction in data loss and query latency compared to the baseline NebulaStream.
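To make components (a) and (b) of the abstract more concrete, the following is a minimal Python sketch of a fixed-capacity circular buffer with a simple query-aware hook. It is not the paper's implementation and does not use any NebulaStream API: the class name `DisconnectionBuffer`, the `is_relevant` predicate, and the "overwrite the oldest tuple" rule are all assumptions chosen for illustration. The actual replacement policy described in the paper may differ substantially.

```python
from collections import deque
from typing import Callable, Deque, Generic, List, Optional, TypeVar

T = TypeVar("T")


class DisconnectionBuffer(Generic[T]):
    """Hypothetical fixed-capacity buffer filled while a device is offline."""

    def __init__(
        self,
        capacity: int,
        is_relevant: Optional[Callable[[T], bool]] = None,
    ) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        # Assumed "query-aware" hook: tuples the running query would
        # discard anyway (e.g. by a filter) are never buffered.
        self.is_relevant = is_relevant or (lambda _item: True)
        self.items: Deque[T] = deque()
        self.dropped = 0  # number of buffered tuples that were overwritten

    def append(self, item: T) -> None:
        """Buffer one tuple produced during the disconnection."""
        if not self.is_relevant(item):
            return
        if len(self.items) == self.capacity:
            # Circular behaviour: the oldest tuple is overwritten.
            self.items.popleft()
            self.dropped += 1
        self.items.append(item)

    def drain(self) -> List[T]:
        """Return buffered tuples in arrival order and empty the buffer,
        as would happen when forwarding data after reconnection."""
        out = list(self.items)
        self.items.clear()
        return out


# Usage: a filter query "reading > 20" with room for three tuples.
buf = DisconnectionBuffer(capacity=3, is_relevant=lambda reading: reading > 20)
for reading in [18, 25, 30, 19, 22, 41]:
    buf.append(reading)
flushed = buf.drain()
```

In this sketch, filtering before buffering saves space for tuples that can still contribute to the query result, and the `dropped` counter gives a rough measure of data loss when the disconnection outlasts the buffer capacity. Component (c), the query restart that re-routes the drained tuples according to the new network topology, is outside the scope of the sketch.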
A preprint version of the paper (PDF) is available for download.