Project Details
Human Perception and Automatic Detection of Speaker Personality and Likability - Influence of Modern Telecommunication Channels
Applicant
Laura Fernández Gallardo, Ph.D.
Subject Area
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Acoustics
Term
from 2015 to 2018
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 284757262
When listening to unknown voices, humans tend to make spontaneous inferences about the perceived personality and voice likability of their interlocutors. The voices heard are generally transmitted through communication channels, e.g. in telephone-based speech applications. However, the effects of the transmission channel have not yet been addressed in investigations of human and automatic detection of personality traits and likability. Moreover, the automatic prediction of these speaker characteristics has mostly been treated as a binary classification task, despite the continuous nature of the perceptual ratings. The proposed project will examine how transmission channels with different settings, such as bandwidth, codec and user interface, influence the detection of speaker personality and likability by humans and machines. The conversational speech data in German needed for the proposed analyses will be recorded. On the human side, crowdsourcing will be employed to gather listeners' assessments of a large body of transmitted speech material rapidly and reliably. On the automatic side, regression models will be considered for personality and likability prediction, employing state-of-the-art techniques such as deep neural networks. The validity of speech quality measures as predictors of these speaker characteristics will also be studied. The outcomes will show which transmission channels preserve the voice properties that determine perceived personality and likability, and how these characteristics can be automatically predicted. The findings can be used in telephone-speech applications that aim to estimate perceived speaker characteristics and to anticipate subsequent user behavior.
DFG Programme
Research Grants