Project Details

Localized Statistical Learning with Kernels

Subject Area: Mathematics
Term: 2016 to 2022
Project identifier: Deutsche Forschungsgemeinschaft (DFG), project number 317622002
 
Statistical machine learning approaches, in particular regularized kernel methods, have proven successful in many data analysis applications and rest on a by now well-founded theory. In recent years, so-called big data applications, which typically require the statistical analysis of huge, high-dimensional data sets of often uncertain quality, have increasingly moved into the focus of research. Unfortunately, standard regularized kernel methods scale poorly with the sample size and are therefore not suited to big data scenarios. The goal of this project is to address this serious issue by establishing theoretically and empirically well-founded data-decomposition approaches that decrease the computational requirements of kernel-based learning methods. On the empirical side, we envision methods that can handle millions of samples in high dimensions on a single desktop within a reasonable period of time. On the theoretical side, we seek strong guarantees for these decomposition approaches, unlike most previous attempts to speed up kernel-based learning methods.

In a nutshell, our objectives consist of four parts:
(i) Identification of spatially oriented data decomposition strategies that reduce the computational requirements significantly without sacrificing generalization performance. This is in sharp contrast to several recent approaches in the literature that propose random subsampling from the original data set.
(ii) A rigorous mathematical analysis of successful decomposition strategies, which includes universal consistency, fast and, if possible, optimal learning rates, and statistical robustness.
(iii) A description of the trade-off between computational requirements and generalization performance.
(iv) Software prototypes to demonstrate that these methods are applicable in big data situations.
Because of the spatial nature of the considered decomposition strategies and the strong focus on their theoretical analysis, we speak of localized statistical learning with kernels (LSLK).
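
To make the idea of a spatial decomposition concrete, the following is a minimal sketch and not the project's actual method or software. It assumes kernel ridge regression with a Gaussian kernel, Voronoi cells around hand-picked centers as the spatial partition, and a direct linear solver; all function names, parameter values, and the toy data are hypothetical choices made for illustration.

```python
import math
import random

def gauss_kernel(x, y, gamma=0.2):
    """Gaussian kernel exp(-||x - y||^2 / gamma^2) on points given as tuples."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / gamma ** 2)

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def nearest(centers, x):
    """Index of the center closest to x; this defines the spatial cell of x."""
    return min(range(len(centers)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(centers[j], x)))

def fit_local_krr(X, y, centers, lam=1e-4, gamma=0.2):
    """Fit one kernel ridge regressor per Voronoi cell (assumed partition)."""
    cells = [[] for _ in centers]
    for i, x in enumerate(X):
        cells[nearest(centers, x)].append(i)
    models = []
    for idx in cells:
        # Regularized kernel matrix K + n_cell * lam * I for this cell only.
        K = [[gauss_kernel(X[i], X[j], gamma) + (lam * len(idx) if i == j else 0.0)
              for j in idx] for i in idx]
        alpha = solve(K, [y[i] for i in idx]) if idx else []
        models.append(([X[i] for i in idx], alpha))
    return models

def predict(models, centers, x, gamma=0.2):
    """Predict with the local model of the cell that contains x."""
    pts, alpha = models[nearest(centers, x)]
    return sum(a * gauss_kernel(p, x, gamma) for p, a in zip(pts, alpha))

# Toy data (hypothetical): noisy samples of sin(2*pi*x) on [0, 1).
random.seed(0)
X = [(random.random(),) for _ in range(200)]
y = [math.sin(2 * math.pi * x[0]) + random.gauss(0.0, 0.05) for x in X]
centers = [(0.125,), (0.375,), (0.625,), (0.875,)]
models = fit_local_krr(X, y, centers)

# Cost proxy for a direct solver: cubic in the number of samples solved at once.
global_cost = len(X) ** 3
local_cost = sum(len(pts) ** 3 for pts, _ in models)
```

In this sketch each cell is trained only on its own samples, so with a cubic-cost solver the total work is the sum of the cubed cell sizes rather than the cube of the full sample size. With m equally sized cells this is a reduction by a factor of about m squared. Whether generalization performance is preserved is exactly the kind of question objectives (ii) and (iii) address; the sketch makes no claim about it.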
DFG Programme: Research Grants
 
 
