Socio-semantic congruence in collaborative ontology development projects
Final Report Abstract
In the KonSKOE project, we investigated the congruence between social and semantic structures in collaborative ontology development projects. We distinguished two representative approaches to building and curating ontologies. The first approach is expert-driven and top-down. Typical representatives are projects created in WebProtégé, which supports the development of formal ontologies based on OWL. Such ontologies are provided, for example, on BioPortal, the largest online repository of biological and biomedical ontologies. The number of people involved in these ontology-building projects is rather low, which reduces the coordination work required. The second approach is community-driven and bottom-up. A representative is Wikidata, where the community creates structured data based on an evolving schema. The knowledge base is automatically converted into an RDF-based knowledge graph. In the KonSKOE project, we investigated both approaches independently in a first stage and synthesized the results in a second stage to inform a future research program. In the context of WebProtégé, we investigated the influence of the ontology structure on human behavior in two scenarios: (i) editing behavior on WebProtégé, where experts and practitioners can collaboratively create and edit ontologies, and (ii) browsing behavior on BioPortal. For the first scenario, we formally defined several hypotheses about how users edit an ontology and used HypTrails to rank these hypotheses by their relative plausibility for describing the edit trails of four real-world ontology-engineering projects. We found that the hierarchical structure of an ontology exerts the strongest influence on the observed user behavior, followed by the similarity of concepts. These findings are remarkably consistent across the four real-world WebProtégé projects.
For the second scenario, we studied how users browse BioPortal by clustering users according to their exploration strategies across ontologies. We identified seven distinct browsing types, each relying on different functionality provided by BioPortal. Search Explorers, for example, use the search functionality extensively, while Ontology Tree Explorers rely mainly on the class hierarchy for exploring ontologies. In the context of Wikidata, we investigated the development approach from two angles: (i) by considering the editing behavior of people on Wikidata and the evolution of their user groups, and (ii) by looking at the types of tasks the community hands over to automatic assistants. We classified all edits in Wikidata according to their type and derived action sets that allowed us to identify six different participation patterns. Our results suggest that editors specialize in one specific task (over at least one month), which is consistent with research carried out in the context of Wikipedia. Only a small minority of editors (2 %) in Wikidata are involved in defining its schema information. Moreover, our results showed that many tasks are transferred to bots, which show participation patterns similar to those of human editors. This particular role of bots can be attributed to the bottom-up ontology development process. In a subsequent study, we showed that editors who focus on schema development are more sustained contributors, whereas editors who primarily edit data contribute less consistently. Furthermore, people who join later are less likely to stay in the community. We assume that these editors belong instead to the group of data contributors. Because our results indicated the important role of bots in the project, we adapted our research procedure and analyzed requests for permission for bot tasks in order to understand the role of bots in the ontology building process.
We found that editors define bot tasks mainly to improve the completeness of existing data by adding data. The results of the KonSKOE project informed the design and development of several tools (e.g., QueryBilder) and methods (e.g., HopRank), as well as the improvement of graphical user interfaces for structured data generation and ontology exploration.
Publications
-
Approving Automation: Analyzing Requests for Permissions of Bots in Wikidata. In Proceedings of the 15th International Symposium on Open Collaboration, OpenSym ’19, page 10, Skövde, Sweden, 2019. ACM
M. Farda-Sarbas, H. Zhu, M. F. Nest, and C. Müller-Birn
-
Peer-production system or collaborative ontology engineering effort: What is Wikidata? In Proceedings of the 11th International Symposium on Open Collaboration, page 20. ACM, 2015
C. Müller-Birn, B. Karran, J. Lehmann, and M. Luczak-Rösch
-
Understanding how users edit ontologies: Comparing hypotheses about four real-world projects. In Proceedings of the 14th International Conference on The Semantic Web - ISWC 2015 - Volume 9366, pages 551–568, New York, NY, USA, 2015. Springer-Verlag New York
S. Walk, P. Singer, L. Espín-Noboa, T. Tudorache, M. A. Musen, and M. Strohmaier
-
Applicability of sequence analysis methods in analyzing peer-production systems: A case study in Wikidata. In Social Informatics - 8th International Conference, SocInfo 2016, Bellevue, WA, USA, November 11-14, 2016, Proceedings, Part II, pages 142–156. 2016
T. T. Cuong and C. Müller-Birn
-
How users explore ontologies on the web: A study of NCBO’s BioPortal usage logs. In Proceedings of the 26th International Conference on World Wide Web (WWW’17), pages 775–784. International World Wide Web Conferences Steering Committee, 2017
S. Walk, L. Espín-Noboa, D. Helic, M. Strohmaier, and M. A. Musen
-
Janus: A hypothesis-driven Bayesian approach for understanding edge formation in attributed multigraphs. Applied Network Science, 2(1):16, 2017
L. Espín-Noboa, F. Lemmerich, M. Strohmaier, and P. Singer
-
HopRank: How semantic structure influences teleportation in PageRank (a case study on BioPortal). In Proceedings of the 2019 World Wide Web Conference (WWW’19). International World Wide Web Conferences Steering Committee, 2019
L. Espín-Noboa, F. Lemmerich, S. Walk, M. Strohmaier, and M. Musen