Model Selection for Surface Approximation and Scene Interpretation
Final Report Abstract
We have successfully demonstrated geometric and semantic analysis of building facades. The work was split into two parts: bottom-up geometric modeling under the direction of PI Prof. Olaf Hellwich, and top-down semantic modeling under the direction of PI Prof. Helmut Mayer. The two parts were linked by jointly acquiring and processing a novel data set that comprises depth information and, in particular, has a resolution high enough to detect detailed structures of windows, namely their frames as well as transoms and mullions.
Geometric Modeling. The geometric modeling handles the bottom-up path from the acquisition of street-level image data to the three-dimensional reconstruction of the facade surface. These surface estimates are imprecise and incomplete and must be subjected to domain-specific regularization to achieve clean results. The focus of the geometric side of the project is therefore on maximizing the quality of the reconstruction itself and on finding and implementing adequate geometric priors for the regularization.
We devised a filtering technique that enhances the texture in input images and, by extension, the completeness of surface reconstructions in the absence of strong texture. A modified structure-from-motion pipeline that can process sets of long-focal-length images acquired as a panorama, i.e. from a single viewpoint, allows high-resolution surface estimation from street level without relying on photographs from drones. The Depth Map-Based Facade Abstraction allows the extraction of clean, simplified meshes from noisy real-world point clouds. Rigorous handcrafted geometric constraints on the resulting surface, informed by our observations of common geometric attributes of building facades, ensure an abstraction of the surface that retains only the most important features.
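To make the idea of a geometric prior on a facade depth map concrete, the following is a minimal sketch. It is not the project's Depth Map-Based Facade Abstraction; it only illustrates a much simpler, assumed prior, namely that a facade consists of a few dominant depth levels (for example, the wall plane and recessed windows). The function name `snap_to_depth_levels`, its parameters, and the synthetic test data are hypothetical and chosen for illustration only.

```python
import numpy as np

def snap_to_depth_levels(depth, k=2, iters=20):
    """Snap a noisy depth map to k dominant depth levels.

    Hypothetical helper for illustration; uses a simple 1D
    k-medians-style clustering of the valid depth values.
    """
    valid = np.isfinite(depth)
    values = depth[valid]
    # Initialise the levels at evenly spaced quantiles of the valid depths.
    levels = np.quantile(values, np.linspace(0.0, 1.0, k + 2)[1:-1])
    for _ in range(iters):
        # Assign every valid depth value to its nearest level.
        labels = np.argmin(np.abs(values[:, None] - levels[None, :]), axis=1)
        for j in range(k):
            members = values[labels == j]
            if members.size:  # keep the old level if a cluster is empty
                levels[j] = np.median(members)
    # Build the regularised map; missing pixels (NaN) stay missing.
    labels = np.argmin(np.abs(values[:, None] - levels[None, :]), axis=1)
    out = np.full(depth.shape, np.nan)
    out[valid] = levels[labels]
    return out, np.sort(levels)

# Synthetic example: a flat wall at depth 0.0 with one window recessed by 0.2.
rng = np.random.default_rng(0)
truth = np.zeros((40, 60))
truth[10:30, 15:45] = 0.2
noisy = truth + rng.normal(0.0, 0.01, truth.shape)
noisy[0, 0] = np.nan  # simulate a pixel without a depth estimate
clean, levels = snap_to_depth_levels(noisy, k=2)
```

The median is used instead of the mean so that a few outlying depth values have little influence on the recovered levels. A real facade prior would also have to handle slanted planes, occlusions, and spatial coherence, all of which this sketch ignores.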
While the textured meshes can already be used in a stand-alone fashion, e.g. for visualization purposes, the intermediate results of the geometric processing provide a facade depth map and rectified input images to the semantic processing. In addition, we demonstrate that the approach can be used not only on individual facades, but also on entire cities. However, small geometric details are lost in the abstraction itself. While they could be retained in displacement maps, their geometric regularization needs to be governed by priors that do not derive purely from building architecture. We demonstrate a learning-based approach that uses ConvNets to learn and encode geometric priors for small details and the radiometric cues to them.
Semantic Modeling. Our hybrid pipeline for facade segmentation combines traditional and modern machine learning approaches: the Structured Random Forest is based on feature extraction, the object detector employs deep learning, and the final model fitting step uses dynamic programming. We have examined the relevance of the features and of the optimization functions of the nodes in the decision trees. Additionally, we have shown experimentally the effect of adding pseudo depth information, determined by the geometric modeling, for our joint novel data set. The pipeline has been evaluated on several data sets and was found to perform better than, or at least on a par with, currently published methods; it is the state of the art for small data sets.
We are the first to build a system for the segmentation of window frames, transoms and mullions, thus giving a detailed description of a window. This system also uses a pipeline consisting of an object detector, semantic segmentation employing deep learning, and model fitting based on the geometric shape and appearance of the windows. For semantic segmentation, an architecture is chosen that can capture thin elements in an image.
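As an illustration of how model fitting with dynamic programming can turn a per-pixel segmentation into a clean rectangular window, the sketch below finds the axis-aligned rectangle that maximizes the summed score `probability - 0.5` inside it, using Kadane's algorithm on accumulated row sums. This is an assumed, simplified stand-in and not the fitting procedure used in the project; the function name `best_rectangle` and the toy probability map are hypothetical.

```python
import numpy as np

def best_rectangle(score):
    """Return (top, left, bottom, right, total) of the maximum-sum
    axis-aligned rectangle in `score`, with inclusive bounds.

    Hypothetical helper for illustration only; O(h^2 * w) time.
    """
    h, w = score.shape
    best = (0, 0, 0, 0, -np.inf)
    for top in range(h):
        col_sum = np.zeros(w)
        for bottom in range(top, h):
            # Column sums of the row band top..bottom.
            col_sum += score[bottom]
            # Kadane's algorithm: best contiguous column range for this band.
            run, start = 0.0, 0
            for x in range(w):
                if run <= 0.0:
                    run, start = col_sum[x], x  # start a new range at x
                else:
                    run += col_sum[x]           # extend the current range
                if run > best[4]:
                    best = (top, start, bottom, x, run)
    return best

# Toy window probability map: one window with a hole, plus a false positive.
prob = np.full((12, 16), 0.1)
prob[3:9, 4:12] = 0.9   # the window (rows 3..8, columns 4..11)
prob[5, 7] = 0.1        # a misclassified pixel inside the window
prob[0, 0] = 0.9        # an isolated false positive
top, left, bottom, right, total = best_rectangle(prob - 0.5)
# (top, left, bottom, right) == (3, 4, 8, 11)
```

Subtracting 0.5 makes likely window pixels contribute positively and likely background pixels negatively, so the fitted rectangle bridges small holes but does not grow to include isolated false positives.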
The window segmentation system has been trained and evaluated on the joint novel high-resolution data set. In summary, we have demonstrated a reliable system for facade segmentation that can be used as a source of information for building high-quality 3D building models.
Tangential Achievements. Overall, the results are very competitive, as supported by the acceptance of the published work at good scientific conferences. The work not only created new insights concerning the geometric and semantic processing of facades, it also led to the development of useful tools and a novel data set. Care was taken to keep the geometric tools in particular general enough to be of use outside the scope of the project. Through their publication, in some cases at top-tier conferences, these tools were made available to the scientific community. Some of the results in this report are not yet published and await final polishing.
Publications
- A. Ley and O. Hellwich. “Depth Map based Facade Abstraction from Noisy Multi-View Stereo Point Clouds”. In: German Conference on Pattern Recognition (GCPR). 2016, pp. 155–165
- A. Ley, R. Hänsch, and O. Hellwich. “Reconstructing White Walls: Multi-View, Multi-Shot 3D Reconstruction of Textureless Surfaces”. In: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences III-3 (2016), pp. 91–98
- A. Ley, R. Hänsch, and O. Hellwich. “SyB3R: A Realistic Synthetic Benchmark for 3D Reconstruction from Images”. In: European Conference on Computer Vision (ECCV). Vol. VII. 2016, pp. 236–251
- A. Ley, R. Hänsch, and O. Hellwich. “Automatic Building Abstraction from Aerial Photogrammetry”. In: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences 4-2/W4 (2017), pp. 243–250
- K. Rahmani, H. Huang, and H. Mayer. “Facade Segmentation with a Structured Random Forest”. In: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. IV-1-W1. 2017, pp. 175–181
- K. Rahmani and H. Mayer. “High Quality Facade Segmentation Based on Structured Random Forest, Region Proposal Network and Rectangular Fitting”. In: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. IV-2. 2018, pp. 223–230
- O. Bielova, R. Hänsch, A. Ley, and O. Hellwich. “A Digital Image Processing Pipeline for Modelling of Realistic Noise in Synthetic Images”. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. 2019