Project Details

Statistical theory on finite alphabet structures: inference, algorithms, and applications

Subject Area Mathematics
Term from 2018 to 2020
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 411042450
 
A large body of research in modern statistics is concerned with problems that are highly underdetermined, in the sense that the number of unknown parameters is (much) larger than the number of observations. This renders estimation of and inference about such parameters impossible per se, as the parameters are not identifiable. It is therefore necessary to include additional structural information. In a broad sense, this is achieved by some kind of sparsity: although the parameter of interest is complex (e.g., high-dimensional), it has a simple (e.g., low-dimensional) underlying structure. The focus of this proposal is on a type of sparsity that has received relatively little attention so far, namely sparsity in the function values of a signal, which are restricted to a given finite alphabet (FA). FA structures appear in many different fields, for example in cancer genetics, where DNA copy numbers can only take one of a few known integer values, and in digital communications with binary signals.
In the theoretical part of this project, we want to analyze, jointly with Prof. Martin Wainwright (UC Berkeley) and Prof. Bin Yu (UC Berkeley), how FA structures can enable meaningful inference in underdetermined statistical models, in place of and in combination with classical sparsity. Here, we want to focus on blind source separation and high-dimensional linear models. Although FA structures solve the problem of non-identifiability, their combinatorial nature leads to a computational burden. Therefore, a fundamental research objective of this proposal is to precisely quantify the gap between statistical minimax optimality and computational feasibility. In particular, we want to develop fast algorithms that, at the same time, yield adequate statistical efficiency.
On this basis, in the analytical part of this project, we want to consider a modification of FA structures: subgroup detection for clinical trials often leads to segmentation problems in which a specific FA is induced by phylogenetic trees. We want to tackle these problems with multiscale procedures. These provide not only minimax optimal estimates but also confidence statements, which can be particularly crucial in medical applications. In cooperation with Prof. Bin Yu (UC Berkeley) and the Wellcome Trust Center for Human Genetics (Oxford), we want to demonstrate with real data examples how FA procedures provide significant improvements in personalized medicine.
DFG Programme Research Fellowships
International Connection USA
 
 

