Project Details

"Scalable Methods of Text and Structure Recognition for the Full-Text Digitization of Historical Prints" Part 2: Layout Analysis

Subject Area Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term from 2018 to 2019
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 394346204
 
The project “Skalierbare Verfahren der Text- und Strukturerkennung für die Volltextdigitalisierung historischer Drucke” (Scalable Methods of Text and Structure Recognition for the Full-Text Digitization of Historical Prints) aims to develop a complete OCR workflow for high-quality mass digitization of historical prints from the 16th to 18th centuries. For each step in the workflow, innovative methods are to be made available as tools. Module 2, layout analysis (Layouterkennung), is the most important step next to OCR itself. It directly improves the OCR results and also improves the general understanding of the digitized document by providing insight into its layout and the relations between document components. A wide variety of algorithms is available for each optimization step, but not all of them are suited to the specific challenges of this project. Building on prior experience and work, the DFKI plans to identify, develop, and integrate suitable methods.
DFG Programme Research data and software (Scientific Library Services and Information Systems)
Co-Investigator Dr.-Ing. Syed Saqib Bukhari
 
 
