Project Details
Skeleton-based Clustering in Big and Streaming Social Networks
Subject Area
Theoretical Computer Science
Term
from 2014 to 2022
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 254952906
We devise novel methods to cluster large-scale static and dynamic online social networks. Our approach is based on skeleton structures that represent and amplify variation in local cohesion, and that are defined locally to facilitate efficient computation. In addition to simplifying the clustering problem, they shall also provide a novel understanding of community dynamics, capturing more directly the agency of social actors.Finding patterns in online social relationships and interactions is one of the most prominent applications of big data today, largely driven by the relative ease of data collection and its economic and political value. While this process-generated data is massive, it stems from the actions of individuals with bounded knowledge of the entire system. Our working hypothesis is therefore that local structures capture major trends to an extent sufficient to inform algorithms for partitioning or overlapping clusters. Moreover, we expect the integration of individual attribute data to boost empirical relevance, and the definition of persistent skeleton structures to yield more reliable concepts of community dynamics.Combining sparsification with imputation and nodal attributes, we will advance recent approaches to locally-determined skeleton structures and study their utility for novel and approximate graph clustering algorithms that scale to big data. We are aiming at algorithms with near-linear worst-case running times by exploiting special characteristics of social networks or settling for approximate results. While overall algorithmic efficiency is important, we will place additional emphasis on algorithm engineering to increase and exploit locality. Consideration of streaming data will be a necessary requirement as online social networks typically generate sequences of dyadic events. In order to cope with massive graphs we also develop external memory algorithms for graph clustering and related problems. In support of the validation and experimental evaluation of algorithms we place particular emphasis on generative models for social networks. Due to the pervasive nature of clustering problems, the expected outcome of this project includes tools for big data analysis in other application areas.
DFG Programme
Priority Programmes
Subproject of
SPP 1736:
Algorithms for Big Data
International Connection
Switzerland