Project Details
Comprehensive Modeling of Conversational Contributions in Prose Texts
Applicant
Professor Dr. Sebastian Padó
Subject Area
General and Comparative Linguistics, Experimental Linguistics, Typology, Non-European Languages
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term
from 2017 to 2022
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 350397899
In many kinds of prose texts, both literary and newswire, reported speech plays an important role as a source of information about characters, their attitudes, and their relationships. Going further, such information can aid in the analysis of patterns of behavior and the construction of social networks.
While readers have no problem assembling representations of complete situations from individual instances of reported speech, this is still a challenging task for computers. Current state-of-the-art methods are generally organized as "pipelines" which start from individual instances of reported speech and proceed incrementally to more global properties of the situation or characters. Since individual instances of reported speech are often short and uninformative, a pipeline procedure often causes prediction errors which cannot be rectified in retrospect.
In this project, we develop joint inference methods to model the various aspects of reported speech (Who is the speaker? Who is the hearer? What is the content? What is the relationship between speaker and hearer?) together instead of individually. The resulting joint model takes account of the interdependencies between these aspects, so that information from the different aspects can complement each other. The result of this part of the project is a solid starting point (in terms of natural language processing methods) for applying such methods to the automatic analysis of reported speech in the digital humanities and social sciences.
This algorithmic goal is complemented by a goal from corpus and computational linguistics, namely elucidating the relationship between reported speech and other aspects of semantic analysis. In particular, there appears to be a close relationship between reported speech and (a subset of) semantic roles. Yet no comprehensive formal analysis has been carried out so far. We will provide a linguistic characterization of the relationship and exploit it algorithmically to further improve the recognition of reported speech as discussed above. The result of this part of the project is the (at least partial) consolidation of two strands of research that have largely been treated as independent so far.
DFG Programme
Research Grants
Co-Investigator
Professor Dr. Roman Klinger