Project Details
Conversational Speech Quality Evaluation in the Crowd
Applicant
Professor Dr.-Ing. Sebastian Möller
Subject Area
Acoustics
Security and Dependability, Operating-, Communication- and Distributed Systems
Term
since 2022
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 501877996
The success of modern telecommunication services largely depends on the quality perceived by their users. For speech services, including Voice-over-IP, perceived quality is quantified in subjective experiments in which participants either listen to recorded speech samples transmitted through the system under test, or engage in a conversation to obtain a more realistic evaluation result. Typically, evaluations are conducted in a laboratory environment, which makes it possible to control confounding factors such as background noise or the participants' hearing characteristics. The International Telecommunication Union (ITU) has issued recommendations on how to carry out such laboratory experiments so that they deliver reliable results. However, these experiments involve high effort in terms of cost and time.

Crowdsourcing (CS) offers new possibilities for quality assessment by providing a global pool of participants who carry out evaluations over the Internet. The potential benefits of crowdsourced quality evaluations are a) the investigation of participant influence factors, owing to the diverse population of users, b) the investigation of environment influence factors, owing to the real-life environments of the participants, and c) reduced costs and turnaround times. ITU-T Recommendation P.808 provides guidelines on using CS for speech quality assessment. However, the method is so far limited to the listening-only situation and does not provide any recommendation on conversation tests.

In this project, we aim to overcome this limitation by systematically answering the following key research questions: How should crowdsourcing-based conversation tests be set up in order to provide valid and reliable results? Specifically, how can a real-time communication platform be set up to control all test-external factors (stemming from the Internet connection or the hardware of the test participants)? How can the characteristics of the test participants and the test environment be assessed in online tests?
How should the test procedure be designed, and which differences are to be expected between crowdsourcing- and laboratory-based conversation tests?

The research questions will be answered by analyzing and quantifying the impact of the most important characteristics of the participants, their devices, and the test environment in CS-based conversation tests, in comparison to standard laboratory experiments. Furthermore, valid and reliable inspection methods will be specified to remotely analyze relevant characteristics, such as environmental noise or hearing abilities. The analysis will lead to data-driven guidelines for conducting CS-based conversation tests, which will be proposed for a new ITU-T Recommendation and made openly available to the research community.
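In subjective quality experiments of the kind described above, each test condition is commonly summarized as a Mean Opinion Score (MOS): the arithmetic mean of the participants' ratings, typically collected on a 5-point Absolute Category Rating scale (1 = bad, 5 = excellent). The sketch below illustrates this computation with hypothetical ratings; the function name, the sample ratings, and the use of a fixed Student-t quantile for the confidence interval are illustrative assumptions, not part of the project description.

```python
import math
import statistics

def mos_with_ci(ratings, t_value=2.262):
    """Mean Opinion Score and an approximate 95% confidence interval.

    ratings: integer scores on a 5-point ACR scale (1 = bad ... 5 = excellent).
    t_value: Student-t quantile matching the sample size (2.262 for n = 10,
             i.e. 9 degrees of freedom) -- an illustrative default.
    """
    n = len(ratings)
    mos = statistics.mean(ratings)          # the MOS is simply the mean rating
    sd = statistics.stdev(ratings)          # sample standard deviation
    ci = t_value * sd / math.sqrt(n)        # half-width of the 95% CI
    return mos, ci

# Hypothetical ratings from ten crowd participants for one test condition.
ratings = [4, 3, 4, 5, 3, 4, 4, 2, 4, 3]
mos, ci = mos_with_ci(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Comparing such per-condition MOS values (and their confidence intervals) between crowdsourcing- and laboratory-based tests is one way the expected differences between the two settings can be quantified.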
DFG Programme
Research Grants