Project Details

Mathematical methods and algorithms for learning effective embeddings of semi-structured information for anomaly detection problems

Subject Area Statistics and Econometrics
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term since 2021
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 448795504
 
The rise of digitization has made huge, novel data sets available, many of them semi-structured. Although analyzing such data sets is challenging, it offers great opportunities for researchers. The goal of the project is to develop models for better anomaly detection on the basis of such semi-structured data. The health care industry poses challenges related to important applications such as fraud detection, recommendation systems and decision support systems. These challenges can be addressed by learning from collected data. The economic and financial sectors also require outlier and novelty detection as an important first step in processing time series data.

In these domains it is vitally important to detect anomalies and outliers, as they are highly relevant. For example, fraudulent claims, which usually differ considerably from ordinary claims, should be detected. In clinical and medical decision support systems, unusual cases that need special treatment should be filtered out. For economic and financial data, it is very important to perform outlier and change detection automatically. The goal of this project is to develop Deep Learning and Machine Learning methods for anomaly and outlier detection and to apply them to the tasks mentioned above, namely fraud detection in insurance and outlier detection in financial time series. This will be possible because all of these tasks, which relate to important problems in health care, economics and finance, share the same type of input data: sequences of varying length, which are a form of semi-structured data.

The project consists of three parts. First, we will develop efficient deep representations and embeddings of semi-structured information such as graphs and sequences. In doing so, we will construct efficient semantic-level similarity measures, which will allow us to establish what constitutes the norm and thereby detect anomalies. Second, we will develop effective end-to-end learnable approaches to anomaly detection and imbalanced classification for semi-structured information. Third, we will develop problem-oriented data mining approaches for fraud detection, outlier detection in (financial) time series, recommendation systems and decision support systems, with applications in health care, insurance, finance and economics.

To sum up, the final goal of this proposal is to enable effective representations of semi-structured information and to develop end-to-end approaches for anomaly detection that are ready to use for solving real-world applied problems.
DFG Programme Research Grants
International Connection Russia
Cooperation Partner Professor Dr. Evgeny Burnaev
 
 
