K. B. Bulatov, E. V. Emelianova, D. V. Tropin, N. S. Skoryukina, Y. S. Chernyshova, A. V. Sheshkus, S. A. Usilin, Z. Ming, J.-Ch. Burie, M. M. Luqman, V. V. Arlazarov, “MIDV-2020: a comprehensive benchmark dataset for identity document analysis”, Computer Optics, 46:2 (2022), 252–270

Computer Optics, 2022, Volume 46, Issue 2, Pages 252–270
DOI: https://doi.org/10.18287/2412-6179-CO-1006 (Mi co1015)

This publication is cited in 19 scientific articles (19 articles in total)

IMAGE PROCESSING, PATTERN RECOGNITION

MIDV-2020: a comprehensive benchmark dataset for identity document analysis

K. B. Bulatov^ab, E. V. Emelianova^bc, D. V. Tropin^abd, N. S. Skoryukina^ab, Y. S. Chernyshova^ab, A. V. Sheshkus^ab, S. A. Usilin^ab, Z. Ming^e, J.-Ch. Burie^e, M. M. Luqman^e, V. V. Arlazarov^ab

^a Federal Research Center «Computer Science and Control» of the Russian Academy of Sciences, Moscow, Russia
^b Smart Engines Service LLC, Moscow, Russia
^c National University of Science and Technology «MISIS», Moscow, Russia
^d Moscow Institute of Physics and Technology (State University), Moscow, Russia
^e L3i Laboratory, La Rochelle University, La Rochelle, France

Abstract: Identity document recognition is an important subfield of document analysis. It covers robust document detection, type identification, and text field recognition, as well as identity fraud prevention and document authenticity validation, given photos, scans, or video frames of a captured identity document. A significant amount of research on this topic has been published in recent years; however, a chief difficulty for such research is the scarcity of datasets, because the subject matter is protected by security requirements. The few identity document datasets that are available lack diversity in document types, capture conditions, or document field values. In this paper, we present MIDV-2020, a dataset consisting of 1000 video clips, 2000 scanned images, and 1000 photos of 1000 unique mock identity documents, each with unique text field values and a unique artificially generated face, with rich annotation. The dataset contains 72409 annotated images in total, making it the largest publicly available identity document dataset as of the date of publication. We describe the structure of the dataset, its content, and its annotations, and present baseline experimental results to serve as a basis for future research. For the task of document location and identification, content-independent, feature-based, and semantic segmentation-based methods were evaluated. For the task of document text field recognition, the Tesseract system was evaluated at the field and character levels, with results grouped by field alphabet and document type. For the task of face detection, the performance of a method based on Multi-Task Cascaded Convolutional Neural Networks was evaluated separately for each image input mode. The baseline evaluations show that existing methods of identity document analysis leave considerable room for improvement given modern challenges.
We believe that the proposed dataset will prove invaluable for the advancement of the field of document analysis and recognition.

Keywords: document analysis, document recognition, identity documents, open data, video recognition, document location, text recognition, face detection

Funding agency	Grant number
Russian Foundation for Basic Research	19-29-09066, 19-29-09092
This work is partially supported by the Russian Foundation for Basic Research (projects 19-29-09066 and 19-29-09092). All source images for the MIDV-2020 dataset were obtained from Wikimedia Commons. Author attributions for each source image are listed in the original MIDV-500 description table (ftp://smartengines.com/midv-500/documents.pdf). Face images by Generated Photos (https://generated.photos).

Received: 01.07.2021
Accepted: 18.11.2021

Publication type: Article

Citation: K. B. Bulatov, E. V. Emelianova, D. V. Tropin, N. S. Skoryukina, Y. S. Chernyshova, A. V. Sheshkus, S. A. Usilin, Z. Ming, J.-Ch. Burie, M. M. Luqman, V. V. Arlazarov, “MIDV-2020: a comprehensive benchmark dataset for identity document analysis”, Computer Optics, 46:2 (2022), 252–270

Citation in AMSBIB format

\RBibitem{BulEmeTro22}
\by K.~B.~Bulatov, E.~V.~Emelianova, D.~V.~Tropin, N.~S.~Skoryukina, Y.~S.~Chernyshova, A.~V.~Sheshkus, S.~A.~Usilin, Z.~Ming, J.-Ch.~Burie, M.~M.~Luqman, V.~V.~Arlazarov
\paper MIDV-2020: a comprehensive benchmark dataset for identity document analysis
\jour Компьютерная оптика
\yr 2022
\vol 46
\issue 2
\pages 252--270
\mathnet{http://mi.mathnet.ru/co1015}
\crossref{https://doi.org/10.18287/2412-6179-CO-1006}

Linking options:

https://www.mathnet.ru/rus/co1015

https://www.mathnet.ru/rus/co/v46/i2/p252
