Sistemy i Sredstva Informatiki [Systems and Means of Informatics]
Sistemy i Sredstva Informatiki [Systems and Means of Informatics], 2015, Volume 25, Issue 1, Pages 34–53
DOI: https://doi.org/10.14357/08696527150103
(Mi ssi392)
 

This article is cited in 1 scientific paper (total in 1 paper)

Multicriteria method for detecting near-duplicates in a stream of text messages

A. Andreev, D. Berezkin, I. Kozlov, K. Simakov

Bauman Moscow State Technical University, 5 Baumanskaya 2nd Str., Moscow 105005, Russian Federation
Abstract: The paper considers the problem of near-duplicate detection in a stream of text messages. A model of a text document and a multicriteria duplicate identification method are proposed. The model can be flexibly adjusted to different domains. The method is based on binary classification using a support vector machine. The paper also presents a candidate prefiltering method intended to keep the approach efficient. Several experiments were carried out on data obtained from a stream of news articles. The results demonstrate the feasibility of the suggested approach.
Keywords: near-duplicate detection; similarity measure; binary classification.
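
The abstract names three ingredients: several similarity criteria per document pair, a binary classifier over those criteria, and a prefilter that limits the candidates. The following is only a minimal sketch of how such a pipeline could be wired together, not the authors' method. The three criteria (shingle overlap, vocabulary overlap, length ratio), the function names, the `min_overlap` threshold, and the hand-picked `WEIGHTS` and `BIAS` are all assumptions made for illustration. In particular, the fixed linear rule merely stands in for a support vector machine trained on labelled pairs.

```python
from typing import List, Set, Tuple


def shingles(text: str, k: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of k-word shingles of a lowercased text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def criteria(doc_a: str, doc_b: str) -> List[float]:
    """Hypothetical criteria vector: shingle overlap, vocabulary overlap, length ratio."""
    words_a, words_b = doc_a.lower().split(), doc_b.lower().split()
    longest = max(len(words_a), len(words_b))
    length_ratio = min(len(words_a), len(words_b)) / longest if longest else 1.0
    return [
        jaccard(shingles(doc_a), shingles(doc_b)),
        jaccard(set(words_a), set(words_b)),
        length_ratio,
    ]


# Hand-picked values standing in for a trained linear SVM (assumption, not from the paper).
WEIGHTS = [2.0, 2.0, 1.0]
BIAS = -3.0


def is_near_duplicate(doc_a: str, doc_b: str) -> bool:
    """Linear decision rule: positive score w . x + b means 'near-duplicate'."""
    score = sum(w * x for w, x in zip(WEIGHTS, criteria(doc_a, doc_b))) + BIAS
    return score > 0


def prefilter(new_doc: str, stream: List[str], min_overlap: float = 0.3) -> List[str]:
    """Cheap candidate prefilter: keep documents with enough vocabulary overlap."""
    vocab = set(new_doc.lower().split())
    return [d for d in stream if jaccard(vocab, set(d.lower().split())) >= min_overlap]
```

In use, an incoming message would first go through `prefilter` against recently seen messages, and only the surviving candidates would be scored with `is_near_duplicate`. In a realistic setting the weights would be learned from labelled duplicate and non-duplicate pairs rather than fixed by hand.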
Received: 30.12.2014
Document Type: Article
Language: Russian
Citation: A. Andreev, D. Berezkin, I. Kozlov, K. Simakov, “Multicriteria method for detecting near-duplicates in a stream of text messages”, Sistemy i Sredstva Inform., 25:1 (2015), 34–53
Citation in format AMSBIB
\Bibitem{AndBerKoz15}
\by A.~Andreev, D.~Berezkin, I.~Kozlov, K.~Simakov
\paper Multicriteria method for detecting near-duplicates in a stream of text messages
\jour Sistemy i Sredstva Inform.
\yr 2015
\vol 25
\issue 1
\pages 34--53
\mathnet{http://mi.mathnet.ru/ssi392}
\crossref{https://doi.org/10.14357/08696527150103}
\elib{https://elibrary.ru/item.asp?id=23875693}
Linking options:
  • https://www.mathnet.ru/eng/ssi392
  • https://www.mathnet.ru/eng/ssi/v25/i1/p34