Project Details

Domain Transfer with Generative Models and Neural Rendering

Subject Area: Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term: since 2021
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 413611294
 
In recent years, neural networks have seen tremendous success in almost every field of computer science. Despite this success, a fundamental limitation remains: the availability of labeled training data, which is generally costly and difficult to obtain, in particular for image understanding tasks such as semantic segmentation, where class labels must be provided manually for each pixel. A potential approach to tackling this problem is to exploit synthetic imagery as training data for neural networks: ground truth labels come for free, and a virtually arbitrarily large amount of imagery from different viewpoints can be synthesized from a given 3D scene description. This potential has already inspired computer vision research to develop simulation environments, such as Habitat and Gibson, that generate training data from such scene representations.

The overarching goal of this proposal is to leverage training data across domains by bridging the domain gap between simulated and real-world visual data. Early works have proposed domain adaptation techniques to address this challenging problem, such as the popular open set domain adaptation method; however, the problem remains difficult due to the mismatch in the underlying data statistics. To address it, we propose to develop new generative models that enable domain transfer by learning to match the underlying data distributions of the source (simulated) and target (real-world) domains. We believe this is a very timely direction given recent developments in the research community, where generative neural networks for visual data have shown very promising results. In particular, generative adversarial networks (GANs) can now produce photo-realistic imagery from a random distribution, as demonstrated by Progressive GAN, BigGAN, and the recent StyleGAN methods. Probabilistic auto-regressive models have likewise made tremendous progress, with works ranging from the early PixelCNN to recent high-quality results such as VQ-VAE-2. With these advances, we see a compelling opportunity to develop such techniques towards bridging the synthetic-real domain gap; that is, to leverage generative approaches to transform synthetic data into its photo-realistic counterpart.

Our main insight is to leverage graphics-based 3D understanding of imagery to inform generative neural networks that address the domain gap. By learning explicit 3D parameterizations of the scenes captured in images, we can take advantage of physically-based modeling of image formation and of 3D spatial consistency; the network then does not need to learn these properties and can instead focus on bridging the domain-specific characteristics of synthetic and real data.
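As a concrete illustration of matching the data distributions of the two domains, the following sketch sets up an adversarial translator in the spirit of unpaired image-to-image translation. It is a minimal, hypothetical PyTorch example, not the method developed in this project; the network sizes, placeholder batches, and loss setup are assumptions for illustration only.

    # Illustrative sketch only (PyTorch, hypothetical sizes): an adversarial
    # translator that maps synthetic renderings toward the statistics of real
    # photographs, trained without paired supervision.
    import torch
    import torch.nn as nn

    class Translator(nn.Module):
        # Toy synthetic-to-real generator.
        def __init__(self, channels=3):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(),
                nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
                nn.Conv2d(64, channels, 3, padding=1), nn.Tanh())
        def forward(self, x):
            return self.net(x)

    class Discriminator(nn.Module):
        # Toy patch discriminator scoring how "real" an image looks.
        def __init__(self, channels=3):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(channels, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Conv2d(128, 1, 4, padding=1))
        def forward(self, x):
            return self.net(x)

    G, D = Translator(), Discriminator()
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    bce = nn.BCEWithLogitsLoss()

    # Placeholder batches; in practice these would come from a renderer
    # (synthetic domain) and from an unpaired photo collection (real domain).
    synthetic = torch.rand(4, 3, 64, 64) * 2 - 1
    real = torch.rand(4, 3, 64, 64) * 2 - 1

    # Discriminator step: distinguish real photos from translated synthetic images.
    fake = G(synthetic).detach()
    d_real, d_fake = D(real), D(fake)
    d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: push translated images toward the real-image distribution.
    d_out = D(G(synthetic))
    g_loss = bce(d_out, torch.ones_like(d_out))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()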
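The benefit of explicit 3D parameterizations can be illustrated with standard pinhole-camera geometry: for synthetic data, geometry and camera poses are known, so pixel correspondences across viewpoints follow directly from projection instead of having to be learned. The following numpy sketch uses hypothetical camera parameters and is purely illustrative of this general principle.

    # Illustrative sketch only (numpy, hypothetical camera parameters): with known
    # geometry and poses, multi-view pixel correspondences come for free.
    import numpy as np

    def project(point_world, K, R, t):
        # Pinhole projection of a 3D world point into pixel coordinates.
        p_cam = R @ point_world + t      # world -> camera coordinates
        p_img = K @ p_cam                # camera coordinates -> image plane
        return p_img[:2] / p_img[2]      # perspective divide -> (u, v)

    K = np.array([[500.0, 0.0, 320.0],   # focal lengths and principal point
                  [0.0, 500.0, 240.0],
                  [0.0, 0.0, 1.0]])

    point = np.array([0.2, -0.1, 3.0])   # a surface point of the synthetic scene

    # Two viewpoints of the same scene: a reference pose and a small lateral shift.
    R1, t1 = np.eye(3), np.zeros(3)
    R2, t2 = np.eye(3), np.array([-0.3, 0.0, 0.0])

    uv1 = project(point, K, R1, t1)
    uv2 = project(point, K, R2, t2)
    print(uv1, uv2)  # the same scene point observed at two pixel locations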
DFG Programme: Research Units
 
 
