A. I. Get'man, M. N. Goryunov, A. G. Matskevich, D. A. Rybolovlev, “Methodology for collecting a training dataset for an intrusion detection model”, Proceedings of ISP RAS, 33:5 (2021), 83–104

Proceedings of the Institute for System Programming of the RAS

Proceedings of the Institute for System Programming of the RAS, 2021, Volume 33, Issue 5, Pages 83–104
DOI: https://doi.org/10.15514/ISPRAS-2021-33(5)-5 (Mi tisp629)

This article is cited in 3 scientific papers.

Methodology for collecting a training dataset for an intrusion detection model

A. I. Get'man^ab, M. N. Goryunov^c, A. G. Matskevich^c, D. A. Rybolovlev^c

^a Ivannikov Institute for System Programming of the RAS
^b National Research University Higher School of Economics
^c The Academy of Federal Security Guard Service of the Russian Federation

Abstract: The paper addresses the training of machine-learning models for detecting computer attacks. It analyzes, in turn, publicly available training datasets and the tools used to analyze network traffic and extract features of network sessions. It notes the drawbacks of existing tools and the errors they may introduce into the datasets built with them. The authors conclude that one's own training data must be collected, because the reliability of public datasets is not guaranteed and pre-trained models are of limited use in networks whose characteristics differ from those of the network where the training traffic was collected. A practical approach to generating training data for computer attack detection models is proposed. The proposed solutions were tested by evaluating both the quality of model training on the collected data and the quality of attack detection in a real network infrastructure.

Keywords: information security, network intrusion detection system, machine learning, dataset, transfer learning, random forest, network traffic, computer attack.

Document Type: Article

Language: Russian

Citation: A. I. Get'man, M. N. Goryunov, A. G. Matskevich, D. A. Rybolovlev, “Methodology for collecting a training dataset for an intrusion detection model”, Proceedings of ISP RAS, 33:5 (2021), 83–104

Citation in format AMSBIB

\Bibitem{GetGorMat21}
\by A.~I.~Get'man, M.~N.~Goryunov, A.~G.~Matskevich, D.~A.~Rybolovlev
\paper Methodology for collecting a training dataset for an intrusion detection model
\jour Proceedings of ISP RAS
\yr 2021
\vol 33
\issue 5
\pages 83--104
\mathnet{http://mi.mathnet.ru/tisp629}
\crossref{https://doi.org/10.15514/ISPRAS-2021-33(5)-5}

Linking options:

https://www.mathnet.ru/eng/tisp629

https://www.mathnet.ru/eng/tisp/v33/i5/p83

This publication is cited in the following 3 articles:

Ismaeel Abiodun Sikiru, Ahmed Dooguy Kora, Eugène C. Ezin, Agbotiname Lucky Imoize, Chun-Ta Li, “Hybridization of Learning Techniques and Quantum Mechanism for IIoT Security: Applications, Challenges, and Prospects”, Electronics, 13:21 (2024), 4153
B. B. Borisenko, S. D. Erokhin, I. D. Martishin, 2023 Systems of Signal Synchronization, Generating and Processing in Telecommunications (SYNCHROINFO), 2023, 1
S. D. Erokhin, B. B. Borisenko, I. D. Martishin, A. S. Fadeev, 2022 Systems of Signal Synchronization, Generating and Processing in Telecommunications (SYNCHROINFO), 2022, 1
