Project Details

Learning How to Interact with Scenes through Part-Based Understanding

Subject Area Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term since 2022
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 495902919
 
In recent years, we have seen remarkable advances in machine perception of real-world environments driven by data-driven deep learning techniques. These advances have opened new possibilities in many applications, including robotics, mixed reality, Industry 4.0, and medical understanding. In particular, success in object recognition has paved the way toward more complex machine perception and higher-level scene understanding. Understanding interactions with environments is fundamental to understanding both human behavior and the structure and design of man-made environments. The ability to predict and generate interactions with scenes will impact many applications across research and industry: for instance, robots or virtual agents must understand plausible interactions with their environments to move beyond navigation-based tasks, and diverse, complex virtual environments can be generated for mixed reality scenarios. Unfortunately, recognition at the level of objects remains insufficient to inform interactions, which occur with object parts rather than with objects as a whole. For instance, to find an object in a drawer, one must know that the drawer handle is used to open the drawer and gain access to its contents. This requires inferring and understanding the part decomposition of the objects in an environment. Such a part-based decomposition then enables efficient operations at the part level, supporting approaches that can reason about object functionality and propose potential interactions with objects in the environment. In this project, we therefore develop new machine learning methods to infer a part-based understanding of objects in real-world environments and to reason about their functionality.
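The drawer example above can be made concrete as a mapping from recognized object parts to candidate interactions. The sketch below is purely illustrative (the part labels, action names, and `PART_AFFORDANCES` table are hypothetical, not from the project), showing how a part-level decomposition, rather than object-level recognition alone, is what makes interaction proposals possible:

```python
# Illustrative sketch: a recognized (object, part) pair maps to plausible
# interactions. All labels and action names here are hypothetical.
PART_AFFORDANCES = {
    ("drawer", "handle"): ["pull_to_open", "push_to_close"],
    ("door", "knob"): ["turn_to_open"],
    ("lamp", "switch"): ["toggle_power"],
}

def propose_interactions(detected_parts):
    """Collect candidate interactions for all parts detected in a scene."""
    proposals = []
    for obj, part in detected_parts:
        for action in PART_AFFORDANCES.get((obj, part), []):
            proposals.append((obj, part, action))
    return proposals

# A scene containing a drawer handle yields open/close proposals; a part
# with no known affordance (the table top) contributes nothing.
scene = [("drawer", "handle"), ("table", "top")]
print(propose_interactions(scene))
# → [('drawer', 'handle', 'pull_to_open'), ('drawer', 'handle', 'push_to_close')]
```

Note that an object-level detection ("drawer") alone could not index this table; the part label is the key that carries the interaction semantics.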
We will develop new deep learning architectures for recognizing and segmenting object parts in scenes, exploring different data representations (sparse voxels, points, multi-view images, meshes) and geometric operators (volumetric or geodesic convolutions) to best achieve effective semantic part understanding of an observed scene. Additionally, to mitigate expensive data annotation processes, we will focus on developing weakly- and self-supervised approaches that transfer knowledge from existing synthetic shape datasets with part annotations to real-world observations without any part annotations. Finally, building on the predicted part decompositions, we will develop new learning algorithms that understand the functionality of the objects in a scene from their parts and propose possible interactions with the objects in the environment.
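To illustrate the per-point representation mentioned above, the following minimal sketch (with random weights, purely for shape-level illustration and in no way the project's actual architecture) shows a shared point-wise MLP that maps each 3D point of a scene to a distribution over part labels, the basic form of a point-based part segmentation head:

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_PARTS = 4  # hypothetical label set, e.g. handle, front, side, top

def shared_mlp_segment(points, w1, w2):
    """points: (N, 3) array -> per-point part probabilities, shape (N, NUM_PARTS).

    The same MLP weights are applied independently to every point
    (a "shared" MLP), then a softmax yields a part-label distribution.
    """
    h = np.maximum(points @ w1, 0.0)                 # shared hidden layer + ReLU
    logits = h @ w2                                  # per-point part logits
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)          # softmax over part labels

points = rng.normal(size=(128, 3))                   # a toy point cloud
w1 = rng.normal(size=(3, 16))                        # untrained, random weights
w2 = rng.normal(size=(16, NUM_PARTS))
probs = shared_mlp_segment(points, w1, w2)
labels = probs.argmax(axis=1)                        # predicted part per point
print(probs.shape, labels.shape)                     # → (128, 4) (128,)
```

A real architecture would add context aggregation across points (e.g. pooling or convolution on a sparse voxel grid), since a point's part label depends on its neighborhood, not on its coordinates alone.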
DFG Programme Research Grants
 
 
