
Simultaneous Interpretation of Lectures from/to German

Subject area: General and Comparative Linguistics, Experimental Linguistics, Typology, Non-European Languages; Image and Language Processing, Computer Graphics and Visualization, Human Computer Interaction, Ubiquitous and Wearable Computing
Funding: 2017 to 2022
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 326928774
 
Year of creation: 2022

Summary of project results

Speech translation (ST) is one of the most challenging tasks, yet also one of the most attractive and interesting from an application point of view. In this project, KIT addressed one of the most challenging conditions for speech translation: streaming speech translation of lectures from and to German. By identifying the main issues (data sparsity, quality and latency of the components, domain mismatches, etc.) and then investigating novel, advanced techniques to tackle them, we have built a high-quality lecture translation system. The results were presented at well-known international conferences and led to successful participation in several international evaluation campaigns. For example, our English speech recognition system achieves super-human performance on a standard test set of conversational speech at low latency, and our multilingual translation system pioneered the idea of making the learned common representation interlingual. In addition, the techniques were integrated into a real-world speech translation application, the KIT lecture translator. Besides being a useful tool for lecturers and students, our speech translation framework also helps us collect more lecture data and user feedback, opening the way for further research on how to leverage such data to improve lecture translation systems. We also built an initial prototype model for direct speech translation, which underlines the need for the community to build more and larger end-to-end speech translation corpora. Within the project, we also made a significant contribution to one of the most actively researched questions in the speech and speech translation community at the moment: the comparison of end-to-end ASR with hybrid ASR, and of end-to-end speech translation with cascaded speech translation. The techniques developed are a valuable contribution to reducing the gap between the different approaches.
The most valuable lesson learned from this project is how to foresee and estimate the potential of research directions and to pursue modern, advanced research along them. Doing this early enough allows us to contribute substantially to the research community and to achieve high-quality research and applications.

