Project Details
Efficient Discrete Optimization for Structured Prediction Problems in Computer Vision and Machine Learning
Applicant
Professor Dr. Paul Swoboda
Subject Area
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term
since 2023
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 524352575
The goal of this project is to research and develop a new, generally applicable, fast and scalable discrete optimization solver for structured prediction problems in computer vision and machine learning. Structured prediction problems are tasks whose high-dimensional output must satisfy structural constraints. Examples include tracking, where the output is a set of pairwise disjoint trajectories in a tracking graph; clustering, where the output is a partition of a set into pairwise disjoint clusters; and correspondence problems, where a 1:1 mapping between points is sought. Pure neural network pipelines may not be ideal in this setting, since modelling explicit constraints on their output is often difficult or unnatural. Optimization problems, by contrast, can be formulated straightforwardly so that they take the desired constraints into account. Unfortunately, standard solvers are often not applicable to structured prediction tasks, since they do not scale to the very high-dimensional setting. Specialized solvers that do scale are hard to develop, and whenever a new type of constraint is needed they must be adapted or even rewritten from scratch, which limits their applicability. This project aims to go beyond this dichotomy and combine the generality of standard solvers with the efficiency of specialized ones. To this end, we will distill the efficient algorithmic design principles found in specialized solvers and generalize them so that they work in broader settings. We will put special emphasis on massive GPU parallelism. The solver will also be machine-learning friendly in two respects. First, the solver will be trainable: it will be possible to improve it by training on previously seen optimization problems, thereby increasing its performance on unseen ones.
Second, the solver will be embedded in neural network pipelines for specific structured prediction tasks, making it possible to train the neural network backbone together with the solver and thereby improve the performance of the whole system on the task at hand.
DFG Programme
Research Grants