Project Details
Multimodal sparsity models for segmentation of visual objects
Applicants
Professor Dr. Daniel Cremers; Professor Dr.-Ing. Klaus Diepold, since 8/2016
Subject Area
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term
from 2015 to 2021
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 263894508
Automatic segmentation of objects in a 3D scene is a fundamental problem in computer vision. It is a core component of many high-level algorithms, including object recognition, semantic scene analysis and 3D object reconstruction. To date, unsupervised segmentation faces many challenges, such as complex textures and lighting variations in real-world scenes. The proposed project is based on the assumption that an object is determined by characteristic properties of the signals captured through different modalities. We intend to demonstrate that a rigorous mathematical derivation of multimodal segmentation approaches leads to drastic improvements in unsupervised segmentation, ultimately allowing for a robust and precise segmentation of the physical objects in a scene. More specifically, we will focus on the following challenges:
- Inspired by biological systems, which perceive their environment through many different signal modalities at once, we will devise algorithms that fuse sensory information from different modalities in order to substantially improve the performance of unsupervised segmentation methods.
- Unlike existing fusion schemes, which combine information at higher processing levels, we will focus on fusion across several modalities at the signal level.
- We will generalize existing techniques for sparse representation and inference from the unimodal to the multimodal setting and demonstrate that accurately modeling the sparsity and the interdependencies of the multiple channels allows for a much better discrimination of the objects of interest. Our methods will be general enough to work on different types of visual data and their cross-modal dependencies, without the need for hand-crafted, signal-specific features. We will investigate sparse signal representations in general and the so-called co-sparse analysis model in particular (a standard formulation is sketched after this list).
- We will derive variational segmentation algorithms that exploit multimodal sparsity for unsupervised object segmentation, thereby combining multimodal sparsity with powerful convex regularization methods (a generic formulation is also sketched below).
- We will develop a demonstrator that generates semantic 3D segmentations from multimodal data captured from multiple views of a scene. We hope to demonstrate that the proposed method is robust to noise and to the challenging lighting conditions of real-world environments.
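For reference, a standard textbook statement of the co-sparse analysis model named above; the notation (signal $s$, operator $\Omega$, dimensions $n$, $k$) is generic and not taken from the project. A signal $s \in \mathbb{R}^n$ is analyzed by a learned operator $\Omega \in \mathbb{R}^{k \times n}$ with $k > n$, and the model assumes that the analyzed vector $\Omega s$ has many zero entries:

\ell \;=\; k - \|\Omega s\|_0 \qquad \text{(the co-sparsity of } s \text{ with respect to } \Omega\text{)} .

The larger the co-sparsity $\ell$, the more rows of $\Omega$ the signal is orthogonal to, and the more strongly the model constrains $s$. This contrasts with the synthesis model, which writes $s = Dz$ with a dictionary $D$ and a sparse coefficient vector $z$. A natural multimodal extension, stated here only as an illustrative assumption, is to stack co-registered channels $s_1, \dots, s_m$ and learn a joint operator whose rows couple the modalities, so that shared structure such as object boundaries produces co-occurring non-zeros across channels.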
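Similarly, the variational segmentation step can be illustrated by the generic convex relaxation of the two-region problem; this is a textbook formulation, not the project's specific functional. With image domain $D \subset \mathbb{R}^d$, a relaxed labeling $u : D \to [0,1]$, per-pixel costs $f_1, f_2$ (which, in the spirit of the project, would be derived from the multimodal sparse representations) and a weight $\lambda > 0$, one minimizes

\min_{u : D \to [0,1]} \; \int_D u(x)\, f_1(x)\, dx \;+\; \int_D \bigl(1 - u(x)\bigr)\, f_2(x)\, dx \;+\; \lambda \int_D |\nabla u(x)|\, dx .

The total variation term $\int_D |\nabla u|\, dx$ is the convex regularizer favoring segmentations with short boundaries; the relaxed problem is convex, and thresholding a minimizer at any level in $(0,1)$ yields a globally optimal solution of the original binary two-region problem.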
DFG Programme
Research Grants
Former Applicant
Professor Dr. Martin Kleinsteuber, until 8/2016