Project Details
Big-Data-DynAmO: Dynamic, Approximate, and Online Methods for Big Data
Applicant
Professor Dr.-Ing. Ulrich Meyer
Subject Area
Theoretical Computer Science
Term
from 2014 to 2023
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 255326380
Big data is often characterized by the so called 3Vs: Volume (the amount of data), Variety (different types of data), and Velocity (the speed of input/output-data). In this proposal we are particularly interested in algorithmic challenges stemming from the combination of volume and velocity in the context of resource constraints: with a focus on huge graph problems that frequently arise in various application areas we aim to consider the possibilities and limitations of dynamically processing data after input updates, i.e., building on previous solutions instead of recomputing everything from scratch. Despite parallel computation power and efficient usage of memory hierarchies, given the sheer size of the data sets to be considered, we will often be forced to give up exact computations and rather consider effective ways to trade solution quality versus computational efforts, for example using embeddings to approximate shortest paths.Actually, not only high-frequency input data rates but also the computational platform itself may contribute to an online scenario. Resources like main memory or cache are often shared between several non-cooperating processes, thus resulting in fluctuating individual cache sizes which often cause tremendous paging delays for the processes involved. We aim to investigate both cache-robust algorithms and methods of fair cache-distribution, partially based on algorithmic game theory.In all aspects to be considered we do not only strive for theoretical results but intend to follow the whole algorithm engineering development cycle in order to engineer improved algorithms and data structures for modern hardware and integrate them in libraries like STXXL or Thrill so that other participants of the priority programme (and beyond) can use them.
DFG Programme
Priority Programmes
Subproject of
SPP 1736:
Algorithms for Big Data