Project Details
Deep learning for robust audio-visual processing (A06)
Subject Area
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Cognitive, Systems and Behavioural Neurobiology
Term
since 2020
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 261402652
In project A06, we will develop novel algorithms that jointly process visual and acoustic cues to improve signal processing in both domains. We will develop robust methods that cope with cluttered real-world data, various noise types, and deliberately crafted adversarial data. Using attentional mechanisms and the principle of information gain, we will determine which parts of the data are most promising. Finally, we will exploit both visual and audio data for sound source separation and speech enhancement. The developed methods will be embedded into a multi-modal robotics platform, which entails dealing with practical constraints such as limited training data, limited computational power, and real-time requirements.
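The summary mentions attentional mechanisms for fusing audio and visual cues. As a purely illustrative sketch (not the project's actual architecture; all function names, shapes, and the fusion-by-concatenation choice are assumptions), cross-modal fusion can be expressed as scaled dot-product attention in which audio frames attend to video frames:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(audio_feats, visual_feats):
    """Audio frames (queries) attend to visual frames (keys/values),
    yielding an audio-rate fusion of both modalities.
    Shapes: audio (Ta, d), visual (Tv, d); returns (Ta, 2d)."""
    d = audio_feats.shape[-1]
    scores = audio_feats @ visual_feats.T / np.sqrt(d)   # (Ta, Tv)
    weights = softmax(scores, axis=-1)                   # attention over video frames
    attended_visual = weights @ visual_feats             # (Ta, d)
    # Concatenate each audio frame with its visually attended counterpart.
    return np.concatenate([audio_feats, attended_visual], axis=-1)

# Toy example: 100 audio frames, 25 video frames, 64-dim features.
rng = np.random.default_rng(0)
fused = cross_modal_attention(rng.standard_normal((100, 64)),
                              rng.standard_normal((25, 64)))
print(fused.shape)  # (100, 128)
```

In a downstream system, such a fused representation could feed a mask-estimation network for speech enhancement or source separation; the attention weights naturally handle the mismatch between audio and video frame rates.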
DFG Programme
CRC/Transregios
International Connection
China
Applicant Institution
Universität Hamburg
Project Heads
Professor Dr. Simone Frintrop; Professor Dr.-Ing. Timo Gerkmann; Professor Dr. Xiaolin Hu; Dr. Cornelius Weber, until 12/2019