Methods for intrinsic plagiarism detection
K. F. Safin (a, b), M. P. Kuznetsov (c), M. V. Kuznetsova (b, a)
a Antiplagiat JSC, 33 Varshavskoe Shosse, Moscow 117105, Russian Federation
b Moscow Institute of Physics and Technology, 9 Institutskiy Per., Dolgoprudny, Moscow Region 141700, Russian Federation
c “Forecsys” LLC, 42 Vavilov Str., Moscow 119333, Russian Federation
Abstract:
There are two approaches to finding plagiarism in documents: "external" and "intrinsic" plagiarism detection. External plagiarism detection compares a document against a known set of possible sources. Intrinsic plagiarism detection aims to discover plagiarism by analyzing only the document itself, without any reference collection. The paper investigates methods of intrinsic plagiarism detection. The authors developed a plagiarism detection method that constructs statistics from the features of the document's parts and then detects outliers among them. The proposed algorithm was tested on the PAN-2011 collection for intrinsic plagiarism detection.
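The abstract does not say which features or which outlier test the authors use, so the following is only a minimal sketch of the general idea (per-segment statistics followed by outlier detection), not the paper's algorithm. The character trigram features, the sliding-window sizes, the z-score rule, and all function names are assumptions chosen for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def char_ngrams(text, n=3):
    """Count character n-grams of the lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def split_into_segments(text, size, step):
    """Cut the text into overlapping character windows."""
    last_start = max(len(text) - size, 0)
    return [text[i:i + size] for i in range(0, last_start + 1, step)]


def segment_score(segment, doc_freq, n=3):
    """Average document-level relative frequency of the segment's n-grams.

    A low score means the segment uses n-grams that are rare in the
    document as a whole, i.e. its style differs from the rest.
    """
    grams = char_ngrams(segment, n)
    total = sum(grams.values())
    if total == 0:
        return 0.0
    return sum(c * doc_freq.get(g, 0.0) for g, c in grams.items()) / total


def find_outlier_segments(text, size=1000, step=500, threshold=2.0, n=3):
    """Return indices of segments whose score is unusually low.

    Assumed rule: flag a segment when its score lies more than
    `threshold` standard deviations below the mean segment score.
    """
    doc_counts = char_ngrams(text, n)
    doc_total = sum(doc_counts.values())
    if doc_total == 0:
        return []
    doc_freq = {g: c / doc_total for g, c in doc_counts.items()}

    segments = split_into_segments(text, size, step)
    scores = [segment_score(s, doc_freq, n) for s in segments]

    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:  # all segments look alike, so nothing stands out
        return []
    return [i for i, s in enumerate(scores) if (mu - s) / sigma > threshold]


if __name__ == "__main__":
    native = "the cat sat on the mat. "
    foreign = "zzqxj vkwpb qjxzv kkwqz "
    document = native * 40 + foreign * 4 + native * 40
    # Window 20 starts at character 960 and covers the inserted passage.
    print(find_outlier_segments(document, size=96, step=48))
```

In this toy document the inserted passage shares almost no trigrams with the surrounding text, so its window scores far below the others. Real documents would need richer features and a more robust outlier rule than a plain z-score.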
Keywords:
natural language processing; intrinsic plagiarism detection; outlier detection.
Received: 30.01.2017
Citation:
K. F. Safin, M. P. Kuznetsov, M. V. Kuznetsova, “Methods for intrinsic plagiarism detection”, Inform. Primen., 11:3 (2017), 73–79
Linking options:
https://www.mathnet.ru/eng/ia487
https://www.mathnet.ru/eng/ia/v11/i3/p73