Project Details
Lifespan AI - Project C2: Causal Discovery across the Lifespan
Subject Area
Epidemiology and Medical Biometry/Statistics
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term
since 2022
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 459360854
The project will capitalise on AI methods to develop novel robust approaches for the discovery and characterisation of causal relations across the lifespan. Such methods are urgently needed because randomised trials, the gold standard for inferring causality, are almost never available for questions regarding long-term effects, e.g. of lifestyle in early childhood, across the whole lifespan. Causal discovery comprises a class of methods whereby a multivariate dataset is taken as input and, ideally, the output is a directed acyclic graph encoding the probabilistic causal structure among variables (nodes). While such methods are slowly being adopted in the biomedical and social sciences, they have never been applied to the case where several different datasets with temporal structure are combined to cover a long range of a person’s lifespan – our proposal aims to fill this gap.The project will begin with laying the foundation by delivering a theoretically sound and practically useful discovery algorithm which combines time-structured datasets allowing us to cover a long lifespan: the TIOD algorithm – “temporal integration of overlapping datasets”. To ensure that this algorithm can be applied to nonlinear / nonparametric settings we will exploit state-of-the-art machine learning methodologies, such as causal generative neural networks. A particular focus will also be given to the role of multi-cohort study designs for the performance of TIOD, as some designs may be more informative than others. Further, we will generalise methods for quantifying causal relations, such as those found by TIOD, so as to estimate long-term effects from separate data sources. To validate and demonstrate the practical usefulness of our methods we will elicit epidemiological research questions and prepare real datasets, combing for instance the IDEFICS/I.Family cohort study with the NAKO Health Study, where challenging issues of data harmonisation will need to be addressed. These applications to real data will ultimately deliver novel insights, for instance, into causal relations between early childhood factors and health outcomes in adulthood.
DFG Programme
Research Units