Schnelle inhaltsbasierte Suche in großen Multimedia-Datenbanksystemen mittels der Earth Mover's Distance.
Final Report Abstract
The Earth Mover's Distance was introduced in computer vision by Rubner et. al. as a highly flexible similarity measure for discrete feature distributions which can closely match the human notion of similarity in a large number of application domains such as image, music, and video retrieval. Its high degree of flexibility stems from two properties. First, the Earth Mover's Distance is defined on adaptive binning histograms that can tailor the feature space partitioning to individual objects in a database. Second, it utilizes a cost function to define dissimilarities between feature space partitions. This property allows the Earth Mover's Distance to break with the common but often unwarranted assumption of perceptual independence of the feature space partitions. However, its flexibility comes at a comparatively high computational cost as a linear optimization problem has to be solved for determining the Earth Mover's Distance between two objects in a database. The aim of the DFG-funded project "Fast Earth Mover's Distance Search" was to develop techniques that on the one hand enable fast retrieval of similar objects from multimedia databases and on the other hand exploit the adaptability of the cost function inherent to the Earth Mover's Distance in order to improve the relevancy of search results and the applicability of the Earth Mover's Distance to complex data types such as time series and graphs. Important results of the project include efficiency-boosting techniques based on lower bounding approximations, dimensionality reduction, and direct indexing of the Earth Mover's Distance. The effectiveness of the retrieval process was addressed by relevance feedback and user-centric cost adaptation techniques. Finally, the applicability of the Earth Mover's Distance in context of graph and time-structured data was investigated. Feature extraction and transformation methods that allow for the Earth Mover's Distance to be employed in graph similarity search scenarios were proposed and evaluated. In the context of timestructured data, a model based on Dynamic Time Warping with the Earth Mover's Distance as the so-called ground distance was introduced and techniques that allow for a tremendous improvement regarding the efficiency of the model through a reduction of the overall number of ground distance computations and through indexing of the time series in the database were developed. With a refined understanding of distance-based similarity search and object feature representations, we set out to develop new similarity models that combine the flexibility of the feature representation of the Earth Mover's Distance with the correlation-based interpretation of dissimilarity of quadratic form distance measures. The experience gained through working on the Earth Mover's Distance is also fundamental for approaching new application areas in future research. Relying on the group's experience with similarity search, research projects BioKeyS (funded by the Bundesamt für Sicherheit in der Informationstechnik) and THESEUS MachlnNet (funded by the Bundesministerium für Wirtschaft und Technologie) have been initiated in the course of our work on the Earth Mover's Distance. In addition, our group joined the DFG-funded Collaborative Research Center SFB 686 on "Model-Based Control of Homogenized Low-Temperature Combustion". The exploration of the applicability of similarity search techniques in the domains of computer security (biometrics) and industrial engineering will be the focus of members of our group throughout the next few years.
Publications
- Adaptable Distance Functions for Similarity-based Multimedia Retrieval. Datenbank-Spektrum Nr. 19, 2006
Assent I., Wichterich M., Seidl T.
- Approximation Techniques for Indexing the Earth Mover's Distance in Multimedia Databases. International Conference on Data Engineering (ICDE), 2006
Assent I., Wenning A., Seidl T.
- AttentionAttractor: Efficient Video Stream Similarity Query Processing in Real Time. International Conference on Data Engineering (ICDE), 2007
Assent I., Krieger R., Seidl T.
- Efficient EMD-based Similarity Search in Multimedia Databases via Flexible Dimensionality Reduction. International Conference on Management of Data (SIGMOD), 2008
Wichterich M., Assent I., Kranen P., Seidl T.
- Efficient Similarity Search Using the Earth Mover's Distance for Large Multimedia Databases. International Conference on Data Engineering (ICDE), 2008
Assent I., Wichterich M., Meisen T., Seidl T.
- Anticipatory DTW for Efficient Similarity Search in Time Series Databases. International Conference on Very Large Data Bases (VLDB), 2009
Assent I., Wichterich M., Krieger R., Kremer H., Seidl T.
- Exploring Multimedia Databases via Optimization-Based Relevance Feedback ond the Earth Mover's Distance. International Conference on Information and Knowledge Management (CIKM), 2009
Wichterich M., Beecks C., Sundermeyer M., Seidl T.
- Metrische Anpassung der Earth Mover's Distanz zur Ähnlichkeitssuche in Multimedia-Datenbanken. Gl Conference on Database Systems for Business, Technology, and the Web (BTW), 2009
Beecks C., Wichterich M., Seidl T.
- Robust Adaptable Video Copy Detection. International Symposium on Spatial and Temporal Databases (SSTD), 2009
Assent I., Kremer H.
- Speeding up Complex Video Copy Detection Queries. International Conference on Database Systems for Advanced Applications (DASFAA), 2010
Assent l., Kremer H., Seidl T.