Mathematical methods and algorithms for learning effective embeddings of semi-structured information for anomaly detection problems

Applicant Professor Dr. Martin Spindler

Subject Area Statistics and Econometrics
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing

Term since 2021

Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 448795504

Project Description

The rise of digitization has made large and novel data sets available, many of them semi-structured. Although analysing such data is challenging, it offers great opportunities for researchers. The goal of the project is to develop models for better anomaly detection based on semi-structured data.

The health care industry poses challenges tied to important applications such as fraud detection, recommendation systems and decision support systems, and these challenges can be addressed by learning from collected data. Economics and finance likewise require outlier and novelty detection as an important first step in processing time series data. In these domains it is vital to detect anomalies and outliers because they are highly relevant. For example, fraudulent claims, which usually differ considerably from regular claims, should be detected. In clinical and medical decision support systems, unusual cases that need special treatment should be filtered out. For economic and financial data, outlier and change detection must be performed automatically.

The goal of this project is to develop deep learning and machine learning methods for anomaly and outlier detection and to apply them to the tasks mentioned above, namely fraud detection in insurance and outlier detection in financial time series. This is possible because all of these tasks in health care, economics and finance share the same type of input data: sequences of varying length, which are a form of semi-structured data.

The project consists of three parts. First, we will develop efficient deep representations and embeddings of semi-structured information such as graphs and sequences. In doing so, we will construct efficient semantic-level similarity measures, which will allow us to establish what counts as normal and thereby detect anomalies. Second, we will develop effective end-to-end learnable approaches to anomaly detection and imbalanced classification for semi-structured information. Third, we will develop problem-oriented data mining approaches for fraud detection, outlier detection in (financial) time series, recommendation systems and decision support systems, with applications in health care, insurance, finance and economics.

To sum up, the final goal of this proposal is to enable effective representations of semi-structured information and to develop end-to-end approaches for anomaly detection that are ready to use on real-world applied problems.

DFG Programme Research Grants

International Connection Russia

Partner Organisation Russian Foundation for Basic Research

Cooperation Partner Professor Dr. Evgeny Burnaev
