Project Details
Statistical foundations of decision making and learning in dynamical environments
Applicant
Professor Tobias Sutter, Ph.D.
Subject Area
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Theoretical Computer Science
Term
since 2023
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 521183802
Taking optimal sequential decisions in an unknown dynamical environment requires a balanced trade-off between learning the underlying system by interacting with it (exploration) and using the acquired knowledge to make optimal decisions (exploitation). In computer science, operations research, and statistics, such sequential decision-making problems are addressed by the framework of reinforcement learning (RL) and the underlying dynamic programming principle. This modelling framework led to a technological "miracle" in 2017, when an RL-based algorithm managed to play Go successfully without any human knowledge. The key factors enabling this success were a "skilful implementation of known ideas and awesome computational power". A mathematical explanation of why this amount of computational power is needed, and under which conditions such superior performance can be observed, remains largely open. Moreover, combining RL with the framework of causal inference (known as causal reinforcement learning) has only very recently gained traction with the advent of several works; a unifying theory, however, is far from being developed. This project investigates and develops data-driven methods for causal decision problems in unknown dynamic environments, focusing on the underlying statistical guarantees as well as the efficient computation of solutions to these sequential decision problems. In particular, the project aims to use recent developments from data-driven distributionally robust optimization and extend these to a dynamical (or causal) setting.
Our focus will be on deriving statistical bounds on the quality of solutions/decisions as well as deriving computationally efficient algorithms for solving the underlying optimization problems. Studying a statistically principled framework for (causal) RL will further provide important insights into the question of how a system's agency can be modelled and certified, and therefore also contribute towards the explainability of AI systems addressing these sequential decision-making problems.
DFG Programme
Research Grants