Project Details
Deep learning for robust audio-visual processing (A06)
Subject Area
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Cognitive, Systems and Behavioural Neurobiology
Term
since 2020
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 261402652
In project A06, we will develop novel algorithms that jointly process visual and acoustic cues to improve signal processing in both domains. We will develop robust methods that cope with cluttered real-world data, various noise types, and deliberately crafted adversarial data. Using attentional mechanisms and the principle of information gain, we will determine which parts of the data are most promising. Finally, we will exploit both visual and audio data for sound source separation and speech enhancement. The developed methods will be embedded into a multi-modal robotics platform, which entails dealing with practical constraints such as limited training data, limited computational power, and real-time requirements.
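The summary mentions attentional mechanisms for fusing audio and visual cues. As a purely illustrative sketch (not the project's actual architecture; all function names, shapes, and the fusion-by-concatenation choice are assumptions), cross-modal fusion can be expressed as scaled dot-product attention in which audio frames attend to video frames:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(audio_feats, visual_feats):
    """Audio frames (queries) attend to visual frames (keys/values),
    yielding an audio-rate fusion of both modalities.
    Shapes: audio (Ta, d), visual (Tv, d); returns (Ta, 2d)."""
    d = audio_feats.shape[-1]
    scores = audio_feats @ visual_feats.T / np.sqrt(d)   # (Ta, Tv)
    weights = softmax(scores, axis=-1)                   # attention over video frames
    attended_visual = weights @ visual_feats             # (Ta, d)
    # Concatenate each audio frame with its visually attended counterpart.
    return np.concatenate([audio_feats, attended_visual], axis=-1)

# Toy example: 100 audio frames, 25 video frames, 64-dim features.
rng = np.random.default_rng(0)
fused = cross_modal_attention(rng.standard_normal((100, 64)),
                              rng.standard_normal((25, 64)))
print(fused.shape)  # (100, 128)
```

In a downstream system, such a fused representation could feed a mask-estimation network for speech enhancement or source separation; the attention weights naturally handle the mismatch between audio and video frame rates.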
DFG Programme
CRC/Transregios
International Connection
China
Applicant Institution
Universität Hamburg
Project Heads
Professor Dr. Simone Frintrop; Professor Dr.-Ing. Timo Gerkmann; Professor Dr. Xiaolin Hu; Dr. Cornelius Weber, until 12/2019