Project Details

Semantic Methods for Computer-supported Writing Aids

Subject Area: Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term: from 2014 to 2017
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 249088706
 
This research proposal in the field of language technology asks whether it is possible to develop a semantic writing aid that helps users reformulate texts by paraphrasing. The aid works much like a spelling or grammar checker: within a text-processing environment, it offers suitable paraphrases, allowing texts to be formulated faster and with a more varied vocabulary. A special feature is the investigation of a mechanism that uses usage data to improve paraphrasing quality, and thereby the writing aid itself. The main hypothesis of this proposal is that unsupervised and knowledge-free methods can yield suitable sources of paraphrases for this application context. A paraphrasing component forms the core of a prototypical implementation of the writing aid. In addition to questions concerning the combination of several data sources and research into a suitable user interface, we will work on transferring the methodology to other languages using data-driven methods. Further, we explore the possibility of improving the writing aid with implicit feedback. For developing the individual components, as well as for simulating usage, we rely heavily on crowdsourcing as a means of data collection and evaluation.
First, we explore how paraphrases from different data sources can be characterized, and we combine these heterogeneous sources in a paraphrasing component. We contextualize distributional semantic methods to produce context-dependent paraphrase candidates with unsupervised and knowledge-free methods. Here, we focus especially on the data-driven nature of the approach, which should ideally work without any existing lexical resources. This is motivated by language and domain independence and will be demonstrated by transferring the methodology from English to German.
Supported by user studies, we implement a prototype of the writing aid and optimize the solution with regard to user interaction, presentation and proactivity. We use this prototype to examine to what extent 'weak signals', i.e. mere interaction data from the prototype, can be used to improve the paraphrasing component. This form of implicit user feedback, which has not been utilized in language technology before, enables the iterative refinement of language processing components in order to bring pre-processing steps up to the quality level required by applications.
DFG Programme: Research Grants
 
 
