Project Details
Individual text corpora for the language modeling of personality, knowledge and intelligence
Applicant
Dr. Markus Josef Hofmann
Subject Area
General, Cognitive and Mathematical Psychology
Personality Psychology, Clinical and Medical Psychology, Methodology
Term
since 2024
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 539645652
In the present research project, we seek to answer six questions: (i) Can we use human web behavior as a sample of individual human experience to reproducibly predict personality and individual knowledge data from surveys? (ii) Is web search history or web tracking the better sample for these predictions? (iii) Do the predictions, and not only the survey data themselves, reflect the structural relations among the Big Five personality factors, individual interests, and crystallized and fluid intelligence? (iv) If so, do the data available from many years of web search experience allow us to simulate intellectual development retrospectively? (v) How "intelligent" are (large) language models when we compare them with the norm samples of an intelligence test? (vi) Is it possible to successfully replace human intelligence test answers by training language models on the individual human experiences reflected in individual text corpora? To answer these questions, we rely on the web tracking data of a GESIS panel study with ~1,000 participants, which also allows us to examine level of education. For this sample, we collect web search, personality and knowledge data. The predictive modeling methods developed with these participants will be tested in another sample of ~500 participants, for which we additionally collect data on interests and fluid intelligence. From this screening, we select a subsample of ~200 participants for an extensive examination with a well-established intelligence test. Our results may inform an honest scientific discussion about the possibilities and limitations that come with Big Data and Artificial Intelligence. Although predictive modeling methods may facilitate, improve or even replace future psychodiagnostics, these new data spaces come with ethical and data protection challenges, which we address in the supplement to this grant proposal.
DFG Programme
Infrastructure Priority Programmes