Doklady Rossijskoj Akademii Nauk. Mathematika, Informatika, Processy Upravlenia
Doklady Rossijskoj Akademii Nauk. Mathematika, Informatika, Processy Upravlenia, 2023, Volume 514, Number 2, Pages 297–307
DOI: https://doi.org/10.31857/S2686954323601720
(Mi danma474)
 

SPECIAL ISSUE: ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING TECHNOLOGIES

Text reuse detection in handwritten documents

A. V. Grabovoy (a, b), M. S. Kaprielova (a, b, c), A. S. Kildyakov (a), I. O. Potyashin (a), T. B. Seyil (a), E. L. Finogeev (a), Yu. V. Chekhovich (a, c)

a Antiplagiat Company, Moscow, Russian Federation
b Moscow Institute of Physics and Technology, Moscow, Russian Federation
c Federal Research Center "Computer Science and Control" of Russian Academy of Sciences, Moscow
Abstract: Plagiarism detection in student assignments is becoming increasingly relevant. The rapidly growing popularity of online education and the active expansion of online educational platforms for secondary and high school education create demand for an automatic reuse detection system for handwritten assignments. Existing approaches to this problem cannot search for potential sources of reuse in large collections, which significantly limits their applicability. Moreover, real-life data is likely to consist of low-quality photographs taken with mobile devices. We propose an approach for detecting text reuse in handwritten documents. Each document is an image, and the search is performed over a large collection of potential sources. The proposed method consists of three stages: handwritten text recognition, candidate search, and precise source retrieval. We present experimental results that estimate the quality and latency of our system. Recall reaches 83.3% on higher-quality images and 77.4% on lower-quality images. The average search time is 3.2 seconds per document on a CPU. The results show that the system is scalable and can be used in production settings that require fast reuse detection for hundreds of thousands of student assignments against a large collection of potential reuse sources. All experiments were conducted on the public HWR200 dataset.
Keywords: optical character recognition, handwriting, text reuse detection, computer vision, handwritten text recognition, plagiarism detection.
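The three stages named in the abstract (handwritten text recognition, candidate search, precise source retrieval) can be illustrated with a minimal Python sketch. The abstract does not specify the techniques used at each stage, so everything below is an assumption made for illustration: the function names are hypothetical, recognition is left as a placeholder, candidate search is approximated with a word-shingle inverted index, and precise retrieval with the standard library's `difflib.SequenceMatcher`. It is not the authors' implementation.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def recognize_text(image_path: str) -> str:
    """Stage 1 placeholder (hypothetical): a real system would run an HTR model."""
    raise NotImplementedError("plug in a handwritten text recognition model")


def shingles(text: str, n: int = 3) -> set[str]:
    """Return the set of word n-grams of the lowercased text."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def build_index(sources: dict[str, str], n: int = 3) -> dict[str, set[str]]:
    """Map every shingle to the ids of the sources that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for source_id, text in sources.items():
        for shingle in shingles(text, n):
            index[shingle].add(source_id)
    return index


def candidate_search(query: str, index: dict[str, set[str]],
                     top_k: int = 5, n: int = 3) -> list[str]:
    """Stage 2 (assumed technique): rank sources by the number of shared shingles."""
    counts: dict[str, int] = defaultdict(int)
    for shingle in shingles(query, n):
        for source_id in index.get(shingle, ()):
            counts[source_id] += 1
    # Sort by descending overlap, breaking ties by id for determinism.
    ranked = sorted(counts, key=lambda s: (-counts[s], s))
    return ranked[:top_k]


def precise_retrieval(query: str, candidates: list[str], sources: dict[str, str],
                      threshold: float = 0.5) -> list[tuple[str, float]]:
    """Stage 3 (assumed technique): keep candidates whose character-level
    similarity to the query reaches the threshold, best match first."""
    scored = []
    for source_id in candidates:
        ratio = SequenceMatcher(None, query.lower(),
                                sources[source_id].lower()).ratio()
        if ratio >= threshold:
            scored.append((source_id, ratio))
    return sorted(scored, key=lambda pair: -pair[1])
```

In use, the recognized text of a photographed assignment would be passed to `candidate_search` against an index built once over the source collection, and only the few returned candidates would go through the more expensive `precise_retrieval` comparison. Splitting the work this way is one common reason such pipelines scale to large collections; the shingle size and threshold here are arbitrary illustrative values.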
Funding agency: Fund for Assistance to Small Innovative Enterprises in the Scientific and Technical Sphere
Grant number: 79068
The work is supported by the Innovation Promotion Fund, project no. 79068, request no. II-208298.
Presented: A. L. Semenov
Received: 02.09.2023
Revised: 15.09.2023
Accepted: 18.10.2023
English version:
Doklady Mathematics, 2023, Volume 108, Issue suppl. 2, Pages S424–S433
DOI: https://doi.org/10.1134/S106456242370120X
Document Type: Article
UDC: 004.(89+93)
Language: Russian
Citation: A. V. Grabovoy, M. S. Kaprielova, A. S. Kildyakov, I. O. Potyashin, T. B. Seyil, E. L. Finogeev, Yu. V. Chekhovich, “Text reuse detection in handwritten documents”, Dokl. RAN. Math. Inf. Proc. Upr., 514:2 (2023), 297–307; Dokl. Math., 108:suppl. 2 (2023), S424–S433
Citation in format AMSBIB
\Bibitem{GraKapKil23}
\by A.~V.~Grabovoy, M.~S.~Kaprielova, A.~S.~Kildyakov, I.~O.~Potyashin, T.~B.~Seyil, E.~L.~Finogeev, Yu.~V.~Chekhovich
\paper Text reuse detection in handwritten documents
\jour Dokl. RAN. Math. Inf. Proc. Upr.
\yr 2023
\vol 514
\issue 2
\pages 297--307
\mathnet{http://mi.mathnet.ru/danma474}
\crossref{https://doi.org/10.31857/S2686954323601720}
\elib{https://elibrary.ru/item.asp?id=56717840}
\transl
\jour Dokl. Math.
\yr 2023
\vol 108
\issue suppl. 2
\pages S424--S433
\crossref{https://doi.org/10.1134/S106456242370120X}
Linking options:
  • https://www.mathnet.ru/eng/danma474
  • https://www.mathnet.ru/eng/danma/v514/i2/p297