Project Details
Quantitative reasoning about database queries
Subject Area
Theoretical Computer Science
Term
since 2019
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 412400621
Traditional database concepts and systems have insisted on complete logical correctness in query answering. Today, however, data-centric applications often analyze datasets that are unreliable and noisy, or simply too large to allow complex queries to be answered exactly. Hence the management of modern data should incorporate uncertainty and imprecision in data modeling, sampling in data-access modeling, approximation in query semantics, and machine learning in query formulation. Yet, while these concepts are ubiquitous in the modern practice of data analytics, their foundational basis in database theory is sparse and fragmented. Our goal in this proposal is to embark on a systematic and integrated study of database management under these terms. To that end, a central subgoal is to establish the theoretical foundations of approximate query answering in a manner that is dynamic (data-driven) and quantitative. Being dynamic will enable better approximations, since we can leverage properties of the data at hand. Being quantitative will allow for approximation guarantees, either absolute or statistical, and will provide the flexibility to trade accuracy for performance. We will carry out the proposed research by pursuing several objectives. We plan to establish and explore the theoretical foundations of database distances and corresponding notions of approximate query answering, including their relationship to querying samples and lossy compressions of data. We will also investigate the application of the theory to more specific tasks, such as text analytics and the description of complex queries and functions.
DFG Programme
DIP Programme
International Connection
Israel