Computer Research and Modeling

Computer Research and Modeling, 2012, Volume 4, Issue 4, Pages 707–719
DOI: https://doi.org/10.20537/2076-7633-2012-4-4-707-719
(Mi crm523)
 

NUMERICAL METHODS AND THE BASIS FOR THEIR APPLICATION

Automated citation graph building from a corpora of scientific documents

V. A. Polezhaev

RUKONT-PhysTech Laboratory, CMAM department, MIPT, 9 Institutskii per., Dolgoprudny, Moscow Region, 141700, Russia
Abstract: This paper treats the automated construction of a citation graph from a collection of scientific documents as a sequence of machine learning tasks. It describes the overall data processing pipeline, which consists of six stages: preprocessing, metainformation extraction, bibliography list extraction, splitting bibliography lists into individual bibliography records, standardization of each bibliography record, and record linkage. The goal of the paper is to survey approaches and algorithms suitable for each stage, motivate the choice of the best combination of algorithms, and adapt some of them to multilingual bibliography processing. For some of the tasks, new algorithms and heuristics are proposed and evaluated on a mixed corpus of English and Russian documents.
Keywords: text mining, machine learning, information extraction, citation graph, bibliography, matching, record linkage, labeling, segmentation, conditional random fields.
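As a rough illustration of how the six stages named in the abstract could be chained, the Python sketch below runs them over plain-text documents. Every function name, regular expression, and matching rule in it is an assumption made for illustration. The paper frames these stages as machine learning tasks (the keywords mention conditional random fields), whereas this sketch substitutes simple hand-written heuristics and is not taken from the paper.

```python
import re


def preprocess(text):
    """Stage 1: normalize line endings and collapse runs of spaces/tabs."""
    text = text.replace("\r\n", "\n")
    return re.sub(r"[ \t]+", " ", text).strip()


def extract_metainformation(text):
    """Stage 2 (placeholder heuristic): take the first non-empty line as the title."""
    for line in text.split("\n"):
        if line.strip():
            return line.strip()
    return ""


def extract_bibliography(text):
    """Stage 3: return the text after an English or Russian bibliography heading."""
    heading = r"^[ \t]*(references|литература|список литературы)[ \t]*$"
    match = re.search(heading, text, flags=re.IGNORECASE | re.MULTILINE)
    return text[match.end():].strip() if match else ""


def split_records(bibliography):
    """Stage 4: split at lines that start with a marker such as "[1]" or "1."."""
    parts = re.split(r"^\s*(?:\[\d+\]|\d+\.)\s*", bibliography, flags=re.MULTILINE)
    return [" ".join(part.split()) for part in parts if part.strip()]


def standardize(record):
    """Stage 5 (crude stand-in for field labeling): lowercase, drop punctuation."""
    no_punct = re.sub(r"[^\w\s]", " ", record.lower())
    return " ".join(no_punct.split())


def link_records(records_by_doc, title_by_doc):
    """Stage 6: link a record to a document whose standardized title it contains."""
    edges = set()
    for citing, records in records_by_doc.items():
        for record in records:
            for cited, title in title_by_doc.items():
                if cited != citing and title and title in record:
                    edges.add((citing, cited))
    return edges


def build_citation_graph(corpus):
    """Run all six stages over {doc_id: raw_text}; return (citing, cited) edges."""
    title_by_doc, records_by_doc = {}, {}
    for doc_id, raw_text in corpus.items():
        text = preprocess(raw_text)
        title_by_doc[doc_id] = standardize(extract_metainformation(text))
        records = split_records(extract_bibliography(text))
        records_by_doc[doc_id] = [standardize(r) for r in records]
    return link_records(records_by_doc, title_by_doc)
```

Calling `build_citation_graph` on a dictionary that maps document identifiers to raw text returns a set of directed `(citing, cited)` pairs. Substring matching on titles is only a placeholder for the record linkage stage; real bibliographies would need approximate matching that tolerates abbreviations, transliteration, and typos.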
Funding agency: Ministry of Education and Science of the Russian Federation
Grant number: 07.524.11.4002
Received: 06.09.2012
Document Type: Article
UDC: 004.852
Language: Russian
Citation: V. A. Polezhaev, “Automated citation graph building from a corpora of scientific documents”, Computer Research and Modeling, 4:4 (2012), 707–719
Citation in AMSBIB format:
\Bibitem{Pol12}
\by V.~A.~Polezhaev
\paper Automated citation graph building from a corpora of scientific documents
\jour Computer Research and Modeling
\yr 2012
\vol 4
\issue 4
\pages 707--719
\mathnet{http://mi.mathnet.ru/crm523}
\crossref{https://doi.org/10.20537/2076-7633-2012-4-4-707-719}
Linking options:
  • https://www.mathnet.ru/eng/crm523
  • https://www.mathnet.ru/eng/crm/v4/i4/p707