Project Details
Computing Foundations For Semantic Stream Processing
Applicant
Danh Le Phuoc, Ph.D.
Subject Area
Data Management, Data-Intensive Systems, Computer Science Methods in Business Informatics
Term
since 2020
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 453130567
The ability to process stream data is ubiquitous in modern information systems. The grand challenge in establishing a processing framework for powering such systems is how to strike the right balance between expressivity and computability in a highly dynamic setting. The expressivity of the framework reflects what kinds of input data and what types of processing operations it supports. The computability corresponds to its ability to process a given workload (e.g., processing workflow and data size) under a given execution setting (e.g., hardware and network bandwidth).

So far, various research communities have addressed this challenge independently by imposing their own application-specific trade-offs and assumptions. Such trade-offs and assumptions are driven by prior knowledge of data characteristics (e.g., format, modality, schema and distribution), processing workload and computation settings. However, recent developments in the Internet of Things and AI have brought completely new levels of expressivity to processing pipelines as well as dynamicity to computation settings. For instance, a typical processing pipeline of a connected vehicle includes not only multimodal stream elements generated by various types of sensors but also very complex processing workflows involving logical reasoning and probabilistic inference. Furthermore, such a pipeline can be executed in a highly dynamic distributed setting, e.g., combining in-car processing units with cloud/edge computing infrastructures. Processing pipelines and setups of this kind hence require a radical overhaul of the state of the art in several areas.

To this end, this project aims to discover new computing foundations that enable a novel processing framework to address this grand challenge. The targeted framework will propose a unified processing model for building a semantic stream processing engine with a standards-oriented graph data model and query language fragments. The project will therefore carry out a systematic study of the tractable classes of a wide range of processing operators, e.g., graph query patterns, logical reasoning, and probabilistic inference on stream data. These newly identified tractable classes of processing operations will pave the way for designing efficient classes of incremental evaluation algorithms. To address scalability, the project will also study how to elastically and robustly scale a highly expressive stream processing pipeline in a dynamic and distributed computing environment. Moreover, the project will investigate a novel optimisation mechanism that combines logical optimisation algorithms, which exploit rewriting rules and pruning constraints, with adaptive optimisation algorithms, which continuously optimise execution plans based on runtime statistics. The proposed algorithms and framework will be extensively and systematically evaluated in two application domains: connected vehicles and the Web of Things.
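For illustration only, the following minimal Python sketch conveys the idea behind incremental evaluation of a graph query pattern over a stream: instead of re-evaluating a join from scratch for every window, each arriving element is indexed and matched against the other pattern's current window contents, while expired elements are evicted. The class name, the example predicates ("speed", "locatedIn") and the windowing policy are hypothetical assumptions made for this sketch; they do not represent the project's actual data model, query language or algorithms.

from collections import deque

class IncrementalJoin:
    """Sketch: incrementally join two triple patterns over a time-based sliding window.

    Pattern A: (?v, "speed", ?s)    Pattern B: (?v, "locatedIn", ?area)
    Join variable: ?v. New results are emitted as elements arrive; bindings
    are evicted once they fall outside the window.
    """

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.a_index = {}          # ?v -> list of (speed, timestamp)
        self.b_index = {}          # ?v -> list of (area, timestamp)
        self.arrivals = deque()    # (timestamp, side, key) in arrival order

    def _evict(self, now):
        # Drop bindings that have fallen out of the window [now - window, now].
        while self.arrivals and self.arrivals[0][0] < now - self.window:
            ts, side, key = self.arrivals.popleft()
            index = self.a_index if side == "a" else self.b_index
            index[key] = [(v, t) for (v, t) in index.get(key, []) if t != ts]
            if not index[key]:
                del index[key]

    def push(self, triple):
        """Process one stream element (subject, predicate, object, timestamp)
        and return only the newly derived join results."""
        subj, pred, obj, ts = triple
        self._evict(ts)
        results = []
        if pred == "speed":
            self.a_index.setdefault(subj, []).append((obj, ts))
            self.arrivals.append((ts, "a", subj))
            # Probe only the other side's current window contents.
            for area, _ in self.b_index.get(subj, []):
                results.append((subj, obj, area))
        elif pred == "locatedIn":
            self.b_index.setdefault(subj, []).append((obj, ts))
            self.arrivals.append((ts, "b", subj))
            for speed, _ in self.a_index.get(subj, []):
                results.append((subj, speed, obj))
        return results

if __name__ == "__main__":
    join = IncrementalJoin(window_seconds=10)
    stream = [
        ("car1", "speed", 42, 1),
        ("car1", "locatedIn", "zone3", 3),   # joins with the speed reading at t=1
        ("car2", "speed", 80, 5),
        ("car1", "speed", 38, 20),           # the readings at t=1 and t=3 have expired
    ]
    for element in stream:
        for result in join.push(element):
            print("new result:", result)

In this sketch, pushing ("car1", "locatedIn", "zone3", 3) immediately yields the join result ("car1", 42, "zone3") against the speed reading received at t=1, without recomputing the whole window; the later element at t=20 produces no result because the earlier bindings have expired.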
DFG Programme
Research Grants