Project Details
Hybrid grammars for discontinuous phrase structure trees
Applicant
Professor Dr.-Ing. Heiko Vogler
Subject Area
Theoretical Computer Science
Term
from 2014 to 2015
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 255344147
The syntactic structure of sentences of a natural language is usually described by a context-free grammar. Such a grammar can be used to parse a given sentence and to build its parse tree. Each parse tree is continuous in the following sense: for every node, the frontier of the i-th subtree (i.e., the left-to-right concatenation of its leaf labels) lies to the left of the frontier of the j-th subtree whenever i is smaller than j. For languages with relatively free word order (e.g., German and Dutch), discontinuous parse trees also occur. For instance, in the sentence "Sie hat oft geschrieben" the two words "hat" and "geschrieben" of the verb phrase do not occur consecutively in the sentence; they are separated, i.e., discontinuous. The phenomenon of cross-serial dependencies in Dutch also leads to discontinuous parse trees. For instance, in the sentence "omdat ik Peter Cecilia de nijlpaarden zag helpen voeren" the two words "ik" and "zag" belong together and, e.g., have to be inflected together, but they are placed discontinuously in the sentence; the same applies to the word pairs "Peter-helpen" and "Cecilia-voeren". In this short research project I would like to introduce a new grammar formalism, called hybrid grammars, which generates discontinuous parse trees. Each rule of a hybrid grammar carries a probability, which makes it possible to rank all the possible parse trees of a given sentence. In particular, I want to find out to what extent hybrid grammars are an appropriate tool for natural language processing. To this end, the following questions arise: How can one extract the rules of a hybrid grammar from a corpus? How can one train the probabilities of the rules? How efficient is a parser? The investigation of these theoretical questions should be accompanied by practical implementations.
DFG Programme
Research Grants