Programming & Computer Software
Table recognition technology in tax documents of the Russian Federation
O. A. Slavin (a, b)
a Federal Research Center “Computer Science and Control” RAS, Moscow, Russian Federation
b LLC “Smart Engines Service”, Moscow, Russian Federation
Abstract:
This paper investigates the recognition of cells in table images, using the Russian tax document 2-NDFL as an example. Although the tables have a simple structure, the document is printed from a flexible template. This flexibility appears both in the textual information and in the table area; for tables, it means that the number and size of columns vary. A structural method is proposed for table detection. Its input is the set of detected horizontal and vertical segments, which are found by the Smart Document Reader system. The method was also implemented and tested in Smart Document Reader. Beyond detecting the area where tables may be located, the method searches for table cells, names them, and validates the table area. Validation was performed both for separate tables and for table sets. Using aggregate descriptions of tables provided high reliability in linking table sets.
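To illustrate the general idea of deriving cells from detected segments, here is a minimal Python sketch. It is not the method of the paper or the Smart Document Reader API. The segment representation, the function names `covers` and `find_cells`, and the tolerance `tol` are all assumptions made for this example. The sketch assumes that segment coordinates are already snapped to a common grid, and it ignores merged cells.

```python
from typing import List, Tuple

# Hypothetical representations, chosen only for this sketch.
HSeg = Tuple[float, float, float]          # (y, x_start, x_end)
VSeg = Tuple[float, float, float]          # (x, y_start, y_end)
Cell = Tuple[float, float, float, float]   # (left, top, right, bottom)


def covers(segments: List[Tuple[float, float, float]],
           pos: float, lo: float, hi: float, tol: float) -> bool:
    """Return True if some segment lying at ~pos spans the interval [lo, hi]."""
    return any(
        abs(p - pos) <= tol and a <= lo + tol and b >= hi - tol
        for p, a, b in segments
    )


def find_cells(h_segs: List[HSeg], v_segs: List[VSeg],
               tol: float = 2.0) -> List[Cell]:
    """List grid rectangles whose four borders are all backed by segments."""
    ys = sorted({y for y, _, _ in h_segs})   # candidate row boundaries
    xs = sorted({x for x, _, _ in v_segs})   # candidate column boundaries
    cells: List[Cell] = []
    for top, bottom in zip(ys, ys[1:]):
        for left, right in zip(xs, xs[1:]):
            if (covers(h_segs, top, left, right, tol)
                    and covers(h_segs, bottom, left, right, tol)
                    and covers(v_segs, left, top, bottom, tol)
                    and covers(v_segs, right, top, bottom, tol)):
                cells.append((left, top, right, bottom))
    return cells
```

Because the column boundaries are taken from the vertical segments themselves, a sketch of this kind does not depend on a fixed number or width of columns, which is the kind of variability the abstract attributes to the 2-NDFL template.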
Keywords:
table recognition, line detection, table layout.
Received: 14.11.2023
Citation:
O. A. Slavin, “Table recognition technology in tax documents of the Russian Federation”, Vestnik YuUrGU. Ser. Mat. Model. Progr., 17:1 (2024), 75–85
Linking options:
https://www.mathnet.ru/eng/vyuru713
https://www.mathnet.ru/eng/vyuru/v17/i1/p75