Project Details
CodeInspector - e-Research tool for data-driven search and analysis of social science research software
Applicants
Dr. Arnim Bleier; Professor Dr.-Ing. Brigitte Mathiak; Professor Dr. Ansgar Scherp; Professor Dr. Matthias Tichy
Subject Area
Data Management, Data-Intensive Systems, Computer Science Methods in Business Informatics
Empirical Social Research
Term
since 2022
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 504226141
Research software is increasingly coming into the focus of the Open Science movement and the FAIR principles. It is the link between research data and scientific publications and allows the reproduction of scientific results. Hence, open software is indispensable for the subsequent re-use of research results. In recent years, demand has risen to integrate software, in addition to literature and research data, into information infrastructure services that support the research data cycle.
Since the integration of research software into information infrastructure services is still very new, there are no projects yet that leverage the nature of research software as a self-documenting entity. Previous approaches only consider the use case of searching for code based on external metadata. However, software code itself carries much further relevant information, such as the data and packages used, the language and creation date, the authors, and the methods used as well as their combination. This information can be extracted automatically, without the software author or curators having to enter it manually into information infrastructure services.
The goal of this project is to explore and exploit this potential of software code. An e-Research tool will be developed that automatically extracts relevant information from research software to provide value to users in various use cases. We identified these use cases in expert interviews with prospective users. The project will enable users to search for research software and to understand how the code works, how research data is processed by the code, and which parts of the software perform which processing function. Motivated by the interviews, we focus on research software that performs statistical tests. Statistical analysis is a central component of gaining knowledge in many disciplines, such as the social sciences and psychology.
We furthermore focus on R, an Open Source programming language commonly used for statistical data analysis in these disciplines. Likewise, we will publish the developed e-Research tool as Open Source. The extracted metadata, the semantic code graph, and the links among pieces of software code and between code, research data, and literature will be published and shared via Scholix so that they can be further processed by aggregators like OpenAIRE.
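To illustrate the kind of information that can be read directly from code, the following minimal Python sketch scans the text of an R script for loaded packages, calls to a few common statistical tests, and data files that are read. It is an assumption-laden illustration, not the project's extraction method: the function name `extract_metadata`, the regular expressions, the short list of test functions, and the example file name are all hypothetical, and a real tool would more plausibly parse the R syntax tree than rely on pattern matching.

```python
import re

# Hypothetical patterns for illustration only; not the project's extractor.
# Matches library(pkg), library("pkg"), require(pkg), require('pkg').
PACKAGE_CALL = re.compile(
    r"""\b(?:library|require)\(\s*["']?([A-Za-z][A-Za-z0-9.]*)["']?"""
)

# A small, assumed selection of base R statistical test functions.
STAT_TESTS = ("t.test", "wilcox.test", "chisq.test", "shapiro.test", "cor.test")
TEST_CALL = re.compile(
    r"(?<![A-Za-z0-9._])("
    + "|".join(re.escape(name) for name in STAT_TESTS)
    + r")\s*\("
)

# Matches read.csv("file"), read.table('file'), read.delim("file").
DATA_READ = re.compile(
    r"""\bread\.(?:csv|table|delim)\(\s*["']([^"']+)["']"""
)


def strip_comments(source):
    """Drop everything after '#' on each line (naive: ignores '#' in strings)."""
    return "\n".join(line.split("#", 1)[0] for line in source.splitlines())


def extract_metadata(r_source):
    """Return packages, statistical tests, and data files found in R source text."""
    code = strip_comments(r_source)
    return {
        "packages": sorted(set(PACKAGE_CALL.findall(code))),
        "statistical_tests": sorted(set(TEST_CALL.findall(code))),
        "data_files": sorted(set(DATA_READ.findall(code))),
    }


# Hypothetical R script used only as sample input.
example = '''
library(dplyr)
require("psych")
survey <- read.csv("survey.csv")  # hypothetical file
result <- t.test(income ~ gender, data = survey)
'''

metadata = extract_metadata(example)
print(metadata)
```

Pattern matching of this kind misses indirect usage such as `pkg::function` calls or file paths built at run time, which is one reason a syntax-aware approach is likely to be needed in practice.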
DFG Programme
Research data and software (Scientific Library Services and Information Systems)