Project Details
Stability Analysis for Clustering
Applicant
Professor Dr. Joachim M. Buhmann
Subject Area
Mathematics
Term
from 2008 to 2012
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 40095828
A new model validation principle based on information theory is developed and analyzed in this project. Discrete structures such as data partitions in clustering or graph cuts are inferred from noisy data according to an objective function. Because the measurements (data) are noisy, learning algorithms have to return a set of approximate partitionings that are considered statistically indistinguishable. The uncertainty in the data induces a quantization of the space of partitionings and thereby defines a coding scheme. An information-theoretic analysis of this code yields an approximation capacity of the underlying model, which is represented by an objective function. This selection criterion trades informativeness against stability and controls the model complexity through the approximation precision. Approximate solutions are sampled by Gibbs sampling at a finite temperature. This novel information-theoretic model selection principle will be applied to correlation clustering in the context of clustering protein interaction data. Furthermore, we will apply the principle to learning dynamical systems in systems biology and to inferring user roles in information security applications.
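To illustrate what sampling approximate solutions at a finite temperature can look like, here is a minimal sketch. It is not the project's implementation. It assumes a toy setting chosen for illustration only: one-dimensional points, fixed centroids, and a squared-distance cost. With the centroids held fixed, the Gibbs distribution over partitions factorizes across points, so each label can be drawn independently. The function names `gibbs_assignment_probs` and `sample_partition` are hypothetical.

```python
import math
import random


def gibbs_assignment_probs(x, centroids, temperature):
    """Boltzmann probabilities of assigning point x to each centroid.

    Assumed toy cost: squared distance to the centroid.
    """
    costs = [(x - mu) ** 2 for mu in centroids]
    min_cost = min(costs)  # shift costs for numerical stability
    weights = [math.exp(-(c - min_cost) / temperature) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


def sample_partition(points, centroids, temperature, rng):
    """Draw one partition (list of labels) from the Gibbs distribution.

    With fixed centroids the distribution factorizes over points,
    so each label is sampled independently.
    """
    labels = []
    for x in points:
        probs = gibbs_assignment_probs(x, centroids, temperature)
        label = rng.choices(range(len(centroids)), weights=probs)[0]
        labels.append(label)
    return labels


if __name__ == "__main__":
    rng = random.Random(0)
    points = [0.1, 0.2, 4.9, 5.1]
    centroids = [0.0, 5.0]
    # Low temperature: samples concentrate on the minimum-cost partition.
    print(sample_partition(points, centroids, 1e-3, rng))
    # Higher temperature: a broader set of partitions receives probability mass.
    print(sample_partition(points, centroids, 50.0, rng))
```

In this sketch the temperature plays the role of the approximation precision: lowering it narrows the set of sampled partitions toward the cost minimizer, while raising it spreads probability over more partitions.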
DFG Programme
Research Units
Subproject of
FOR 916: Statistical Regularisation and Qualitative Constraints - Inference, Algorithms, Asymptotics and Applications
International Connection
Switzerland