
Inproceedings3955: Difference between revisions

(The page was newly created: "{{Publikation Erster Autor |ErsterAutorNachname=Vafaie |ErsterAutorVorname=Mahsa }} {{Publikation Author |Rank=2 |Author=Oleksandra Bruns }} {{Publikation Auth…")
 
m (Added PDF.)
 
Line 30:
 
{{Publikation Details
|Abstract=Historical archival records present many challenges for OCR systems to correctly encode their content, due to visual complexity, e.g. mixed printed text and handwritten annotations, paper degradation, and faded ink. This paper addresses the problem of automatic identification and separation of handwritten and printed text in historical archival documents, including the creation of an artificial pixel-level annotated dataset and the presentation of a new FCN-based model trained on historical data. Initial test results indicate 18% IoU performance improvement on recognition of printed pixels and 10% IoU performance improvement on recognition of handwritten pixels in synthesised data when compared to the state-of-the-art trained on modern documents. Furthermore, an extrinsic OCR-based evaluation on the printed layer extracted from real historical documents shows 26% performance increase.
|Download=Handwrittend_and_Printed_Text_Identification_VAFAIE_Archiving2022.pdf
|Link=https://library.imaging.org/archiving/articles/19/1/4
|DOI Name=10.2352/issn.2168-3204.2022.19.1.4
|Forschungsgruppe=Information Service Engineering
}}

Current version as of 31 October 2022, 09:52


Handwritten And Printed Text Identification in Historical Archival Documents





Published: 2022

Book title: Archiving Conference
Pages: 15-20
Publisher: Society for Imaging Science and Technology
Place of publication: IS&T 7003 Kilworth Lane, Springfield, VA 22151 USA

Peer-reviewed publication


Abstract
Historical archival records present many challenges for OCR systems to correctly encode their content, due to visual complexity, e.g. mixed printed text and handwritten annotations, paper degradation, and faded ink. This paper addresses the problem of automatic identification and separation of handwritten and printed text in historical archival documents, including the creation of an artificial pixel-level annotated dataset and the presentation of a new FCN-based model trained on historical data. Initial test results indicate 18% IoU performance improvement on recognition of printed pixels and 10% IoU performance improvement on recognition of handwritten pixels in synthesised data when compared to the state-of-the-art trained on modern documents. Furthermore, an extrinsic OCR-based evaluation on the printed layer extracted from real historical documents shows 26% performance increase.
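
The IoU (intersection over union) figures quoted in the abstract are per-class pixel scores. As a minimal illustration of how such a score can be computed for the printed and handwritten classes, the sketch below compares a predicted label mask with a ground-truth mask. The label values, the function name `class_iou`, the toy masks, and the convention for an empty union are assumptions made for this example; they are not taken from the paper.

```python
# Hypothetical label encoding for this example only; the encoding used in
# the paper is not stated in this record.
BACKGROUND, PRINTED, HANDWRITTEN = 0, 1, 2


def class_iou(pred, truth, label):
    """Return intersection over union for one class.

    `pred` and `truth` are equally sized 2-D lists of integer labels.
    """
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            in_pred = p == label
            in_truth = t == label
            if in_pred and in_truth:
                intersection += 1
            if in_pred or in_truth:
                union += 1
    # Assumed convention: a class absent from both masks scores 1.0.
    return intersection / union if union else 1.0


# Toy 2x4 masks, invented for illustration.
truth = [
    [1, 1, 0, 2],
    [1, 0, 2, 2],
]
pred = [
    [1, 1, 0, 0],
    [1, 2, 2, 2],
]

printed_iou = class_iou(pred, truth, PRINTED)          # 3 shared / 3 in union
handwritten_iou = class_iou(pred, truth, HANDWRITTEN)  # 2 shared / 4 in union
print(printed_iou, handwritten_iou)
```

In this toy case every printed pixel is matched, while one handwritten pixel is missed and one background pixel is mislabelled as handwritten, so the handwritten score is lower than the printed one.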

Download: Media:Handwrittend_and_Printed_Text_Identification_VAFAIE_Archiving2022.pdf
Further information: https://library.imaging.org/archiving/articles/19/1/4
DOI Link: 10.2352/issn.2168-3204.2022.19.1.4



Research group

Information Service Engineering


Research area