Project Details
Bridging Levels of Abstraction in Brains and Natural Language Processing Machines
Applicant
Professorin Mariya Toneva, Ph.D.
Subject Area
Human Cognitive and Systems Neuroscience
General, Cognitive and Mathematical Psychology
Biological Psychology and Cognitive Neuroscience
Term
since 2022
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 459426179
In the last few years, new Artificial Intelligence tools have emerged for Natural Language Processing (NLP) that significantly outperform previous methods across linguistic tasks, ranging from predicting upcoming words to answering comprehension questions. In particular, these methods learn to represent individual words and to flexibly combine these representations to account for the surrounding context and the task at hand. Recent work shows that these models can also predict brain activity during language comprehension to an impressive degree, revealing a significant similarity between representations of language in the brain and in these deep NLP models. Despite their successful task performance and the documented similarities of their linguistic representations to neuroimaging recordings, these deep NLP models exhibit several limitations: they require vast amounts of data to learn, and they struggle to maintain previously learned skills and knowledge when learning a new task. In this subproject of the ARENA research unit, we posit that these limitations are in part due to the formation of suboptimal representations of abstracted concepts, which are important for faster learning and improved generalization to new tasks, and that bridging the levels of abstraction in NLP models with those in the brain will result in improvements along several NLP-relevant metrics. To this end, we will address several important questions about the relationship between the brain and deep NLP models by leveraging three key sources of information about linguistic meaning: brain recordings of people comprehending language, deep NLP models, and, importantly, human judgements about the fine-grained semantic properties of words. What makes the representations of NLP models similar to the representations of language in the brain? How do these similarities relate to semantics at different levels of abstraction?
Do the same patterns of similarity across different levels of abstraction hold for AI systems that are trained on multi-modal input, such as language combined with visual input? Furthermore, how does an NLP model change when we incorporate brain recordings into its training process? How do its representations of semantics at different levels of abstraction change, and what is the effect on its performance on linguistic tasks? Broadly, what are the benefits and limitations of incorporating brain recordings into NLP models?
We envision that integrating the fundamental insights gained from addressing these questions can lead to more robust and data-efficient systems. We further envision that the work conducted in this subproject of the ARENA research unit, in close cooperation with the subprojects of the remaining ARENA PIs, will contribute both to these fundamental insights and to a comprehensive understanding of the benefits and limitations of the data-driven incorporation of these insights into AI systems.
DFG Programme
Research Units