Format-aware Detection of Malicious Documents (FORMAD)

Applicant Professor Pavel Laskov, Ph.D.

Subject Area Software Engineering and Programming Languages

Term Funded from 2013 to 2015

Project Identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 217981196

The project addresses the problem of detecting malicious content in formatted documents. Embedding malicious code in documents is a frequent technique in modern attacks against computer systems. Successful detection of document-based attacks is possible only if detection methods are fully aware of format-specific syntax and semantics. In previous work, format-aware analysis was carried out only for special cases, for example, embedded JavaScript code.

The goal of the proposed project is to develop a general methodology for format-aware analysis that can be used to detect malicious documents. The main idea is to use an intermediate document representation in the form of hierarchical key/value pairs (HKV) for the essential processing steps. Such a representation decouples the analysis from format peculiarities while retaining the general semantics of the document content. Adapting the proposed methodology to a new document format would require only a conversion to the HKV format instead of a complete redesign of the detection methods.

The main scientific challenge of the project is to develop analysis techniques suitable for the HKV representation. Only limited prior work has addressed such a representation, and previous methods lack the scalability required for complex document formats. This challenge will be addressed by applying machine learning methods suited to the analysis of large amounts of high-dimensional data. New methods will be developed for assessing the plausibility of the values of specific keys as well as the overall risk associated with a document.
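
To make the idea of hierarchical key/value pairs more concrete, the following is a minimal sketch of how an already-parsed document tree might be flattened into HKV form. It is not the project's implementation: the function name `to_hkv`, the `/` path separator, and the sample document (loosely modeled on a PDF-like object tree) are all assumptions chosen for illustration.

```python
from typing import Any, Iterator, Tuple


def to_hkv(node: Any, path: Tuple[str, ...] = ()) -> Iterator[Tuple[str, Any]]:
    """Yield (hierarchical key, value) pairs for a nested dict/list structure.

    Hypothetical helper: the hierarchical key is the path from the root to a
    leaf, joined with "/". Empty dicts and lists produce no pairs.
    """
    if isinstance(node, dict):
        for key, child in node.items():
            yield from to_hkv(child, path + (str(key),))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from to_hkv(child, path + (str(index),))
    else:
        # Leaf value: emit the accumulated path as the hierarchical key.
        yield "/".join(path), node


# Hypothetical parsed document; the structure and names are illustrative only.
doc = {
    "Catalog": {
        "Pages": {"Count": 2},
        "OpenAction": {"S": "JavaScript", "JS": "app.alert(1)"},
    },
    "Info": {"Producer": "ExampleWriter"},
}

pairs = list(to_hkv(doc))
for key, value in pairs:
    print(key, "=", value)
```

Under these assumptions, a detector would operate only on pairs such as `Catalog/OpenAction/S = JavaScript`, so supporting a new format would mean writing a new parser that produces the nested structure, while the downstream analysis stays unchanged.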

DFG Programme Research Grants
