Project Details
"Scalable Methods of Text and Structure Recognition for the Full-Text Digitization of Historical Prints" Part 2: Layout Analysis
Applicant
Professor Dr. Andreas Dengel
Subject Area
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term
from 2018 to 2019
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 394346204
The project “Skalierbare Verfahren der Text- und Strukturerkennung für die Volltextdigitalisierung historischer Drucke” aims to develop a complete OCR workflow for the high-quality mass digitization of historical prints from the 16th to 18th centuries. For each step in the workflow, innovative methods are to be made available as tools. Module 2, layout recognition (Layouterkennung), is the most important step after OCR itself. It directly improves the OCR results and also improves the general understanding of the digitized document by providing insight into the layout and the relations between the document components. For each optimization step a wide variety of algorithms is available, but not all of them are suited to the specific challenges of this project. Building on prior experience and work, the DFKI plans to identify, develop, and integrate suitable methods.
DFG Programme
Research data and software (Scientific Library Services and Information Systems)
Co-Investigator
Dr.-Ing. Syed Saqib Bukhari