Project Details

Simultaneous Interpretation of Lectures into/from German

Subject Area General and Comparative Linguistics, Experimental Linguistics, Typology, Non-European Languages
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term from 2017 to 2022
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 326928774
 
Owing to substantial improvements in speech recognition and machine translation, the first applications of speech translation have shown the potential of this technology. One successful example is the lecture translator at the Karlsruhe Institute of Technology (KIT). This system transcribes German lectures given at the university and translates them into English in real time. International students who cannot follow a lecture in German can instead read the English subtitles presented on a website. While this first system demonstrates the technology's potential, significant improvements are still necessary before a simultaneous interpretation service can be offered at German universities. The task is especially challenging because machine translation involving German is well known to be difficult. This proposal will therefore research techniques to generate better translations from German to English and will also provide a translation service for English lectures into German. The additional English-to-German system allows international students to participate actively in the lectures.
To achieve this, several research questions have to be addressed. First, while statistical machine translation can generate fairly good translations for many language pairs, several phenomena of the German language have to be modeled better. For example, the difference in word order between German and English is difficult for current machine translation systems to handle. Furthermore, better modeling of morphology and word agreement has to be integrated into the translation system. Here we will focus on methods that can be used in low-latency scenarios.
Second, techniques to structure the output of the speech recognition system are necessary in order to develop a helpful application. This task involves marking and removing disfluencies. In addition, the speech input is not segmented into sentences and paragraphs, yet students can read the automatic transcript and translation fast enough to follow the lecture only if the text is presented in a well-structured form. A high-quality segmentation into sentences and paragraphs is therefore necessary.
Finally, it is nearly impossible to collect dedicated training data for the speech recognition and machine translation components that matches the domain of the specific university lectures addressed. We will therefore investigate techniques to collect feedback and corrections from the user. This data can then be used to improve the different components of the system.
DFG Programme Research Grants
Co-Investigator Dr.-Ing. Sebastian Stüker
 
 
