Proceedings of the Institute for System Programming of the RAS
Proceedings of the Institute for System Programming of the RAS, 2022, Volume 34, Issue 6, Pages 137–146
DOI: https://doi.org/10.15514/ISPRAS-2022-34(6)-10
(Mi tisp744)
 

Automatic data labeling for document image segmentation using deep neural networks

A. A. Mikhailov (a, b)

a Ivannikov Institute for System Programming of the RAS
b Matrosov Institute for System Dynamics and Control Theory of Siberian Branch of Russian Academy of Sciences
Abstract: The article proposes a new method for automatically annotating data for document image segmentation with deep object detection neural networks. Tagged (marked-up) PDF files serve as the source data for annotation. A distinctive feature of this format is that it contains hidden tags describing the logical and physical structure of the document. To extract them, a tool was developed that simulates a stack-based print machine in accordance with the PDF format specification. For each page of the document, an image and an annotation in PASCAL VOC format are generated. The classes and bounding box coordinates are computed from the tags while the tagged PDF file is interpreted. To test the method, a collection of tagged PDF files was assembled, from which page images and annotations for three segmentation classes (text, table, figure) were obtained automatically. A neural network with the EfficientDet D2 architecture was trained on these data. The model was then tested on manually labeled data from the same domain, which confirmed that automatically generated data are effective for solving applied problems.
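The annotation step described in the abstract can be illustrated with a minimal Python sketch. It is not the author's tool: the function names `pdf_box_to_pixels` and `voc_annotation`, the 150 dpi default, and the fixed image depth of 3 are assumptions made for illustration. The sketch assumes that tag interpretation has already produced bounding boxes in PDF points (origin at the bottom-left corner of the page) and shows how such boxes could be converted to pixel coordinates and written as a PASCAL VOC annotation.

```python
import xml.etree.ElementTree as ET

# The three segmentation classes named in the abstract.
CLASSES = {"text", "table", "figure"}


def pdf_box_to_pixels(box, page_height_pt, dpi=150):
    """Convert a box (x0, y0, x1, y1) in PDF points, origin bottom-left,
    to integer pixel coordinates (xmin, ymin, xmax, ymax), origin top-left.

    The dpi default is an assumption; it must match the rendered page image.
    """
    scale = dpi / 72.0  # one PDF point is 1/72 inch
    x0, y0, x1, y1 = box
    xmin, xmax = sorted((x0 * scale, x1 * scale))
    # Flip the vertical axis: PDF y grows upward, image y grows downward.
    ymin, ymax = sorted(((page_height_pt - y0) * scale,
                         (page_height_pt - y1) * scale))
    return round(xmin), round(ymin), round(xmax), round(ymax)


def voc_annotation(filename, width_px, height_px, objects):
    """Build a PASCAL VOC XML string.

    `objects` is a list of (class_name, (xmin, ymin, xmax, ymax)) in pixels.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    # depth=3 assumes an RGB page image.
    for tag, value in (("width", width_px), ("height", height_px), ("depth", 3)):
        ET.SubElement(size, tag).text = str(value)
    for name, (xmin, ymin, xmax, ymax) in objects:
        if name not in CLASSES:
            raise ValueError(f"unknown class: {name}")
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        ET.SubElement(obj, "difficult").text = "0"
        bnd = ET.SubElement(obj, "bndbox")
        for tag, value in (("xmin", xmin), ("ymin", ymin),
                           ("xmax", xmax), ("ymax", ymax)):
            ET.SubElement(bnd, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Usage: a table region on a US Letter page (612 x 792 pt) rendered at 72 dpi.
table_box = pdf_box_to_pixels((72, 72, 144, 144), page_height_pt=792, dpi=72)
print(voc_annotation("page_1.png", 612, 792, [("table", table_box)]))
```

The vertical flip is the step most easily missed: PDF user space places the origin at the bottom-left of the page, whereas image annotations conventionally measure from the top-left.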
Keywords: document layout analysis, PDF accessibility, ANN models, artificial intelligence
Document Type: Article
Language: Russian
Citation: A. A. Mikhailov, “Automatic data labeling for document image segmentation using deep neural networks”, Proceedings of ISP RAS, 34:6 (2022), 137–146
Citation in format AMSBIB
\Bibitem{Mik22}
\by A.~A.~Mikhailov
\paper Automatic data labeling for document image segmentation using deep neural networks
\jour Proceedings of ISP RAS
\yr 2022
\vol 34
\issue 6
\pages 137--146
\mathnet{http://mi.mathnet.ru/tisp744}
\crossref{https://doi.org/10.15514/ISPRAS-2022-34(6)-10}
Linking options:
  • https://www.mathnet.ru/eng/tisp744
  • https://www.mathnet.ru/eng/tisp/v34/i6/p137